Performance Analysis of a Proposed Parallel Architecture on Matrix Vector Multiply Like Routines

Abstract: This paper presents some ideas on the implementation of a compiler that uses knowledge of the finite geometry underlying a novel parallel hardware architecture for sparse matrix computations [KAR90]. The compiler takes a data flow graph as input, rearranges it to avoid memory, switch, and processor conflicts, and schedules operations to maximize the efficiency of the parallel hardware. The action of the compiler can be viewed as a discrete-time-driven simulation of the execution of the data flow graph on the parallel machine, with the simulation capturing the state of the hardware at each time instant. The paper presents extensive simulation results for matrix-vector multiply routines on the parallel hardware. The matrices are drawn from diverse LP applications as well as other scientific computations. The results indicate that uniformly high efficiency (above 90%) is achievable on problems with regular as well as arbitrary structure.
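
To make the scheduling idea concrete, here is a minimal sketch of a discrete-time-driven scheduler. It is not the compiler described in the paper: the `Op` record, the `schedule` function, the greedy first-fit policy, and the efficiency formula (operations issued divided by processor-steps available) are all assumptions made for illustration. Data dependencies, switch conflicts, and the finite-geometry assignment of operations to processors and memories are left out.

```python
from collections import namedtuple

# Hypothetical record for one operation (e.g. one multiply-add of y = A x).
# `processor` is the unit it runs on; `memories` are the modules it touches.
Op = namedtuple("Op", ["name", "processor", "memories"])


def schedule(ops, num_processors):
    """Greedy time-stepped scheduling (illustrative assumption only).

    At each time step an operation is issued only if its processor and
    all of its memory modules are still free in that step; otherwise it
    is deferred to a later step. Returns (steps, efficiency).
    """
    pending = list(ops)
    steps = []
    while pending:
        busy_procs, busy_mems = set(), set()
        this_step, deferred = [], []
        for op in pending:
            conflict = op.processor in busy_procs or busy_mems & set(op.memories)
            if conflict:
                deferred.append(op)
            else:
                busy_procs.add(op.processor)
                busy_mems.update(op.memories)
                this_step.append(op)
        steps.append(this_step)
        pending = deferred
    # Efficiency here means: fraction of processor-steps doing useful work.
    efficiency = len(ops) / (len(steps) * num_processors) if steps else 0.0
    return steps, efficiency


# Toy 2x2 product: row i runs on processor i, x[j] lives in memory "m<j>".
ops = [
    Op("a00*x0", 0, ("m0",)),
    Op("a01*x1", 0, ("m1",)),
    Op("a10*x0", 1, ("m0",)),
    Op("a11*x1", 1, ("m1",)),
]
steps, efficiency = schedule(ops, num_processors=2)
for t, step in enumerate(steps):
    print(t, [op.name for op in step])
print("efficiency:", efficiency)
```

In this toy case the scheduler reorders the operations so that the two processors never touch the same memory module in the same step, which is the kind of conflict avoidance the abstract refers to.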

Download: pdf

Citation

Performance Analysis of a Proposed Parallel Architecture on Matrix Vector Multiply Like Routines (pdf, software)
I. Dhillon, N. Karmarkar, K. Ramakrishnan.
Technical Memorandum 11216-901004-13TM, 1990.

Bibtex:
@techreport{dhillon1990performanc,
  author = "Inderjit S. Dhillon AND Narendra K. Karmarkar AND K. G. Ramakrishnan",
  title = "Performance Analysis of a Proposed Parallel Architecture on Matrix Vector Multiply Like Routines",
  institution = "AT&T Bell Laboratories",
  number = "11216-901004-13TM",
  year = "1990",
  abstract = "This paper presents some ideas on the implementation of a compiler that uses knowledge of the finite geometry underlying a novel parallel hardware architecture for sparse matrix computations [KAR90]. The compiler takes a data flow graph as input, rearranges it to avoid memory, switch, and processor conflicts, and schedules operations to maximize the efficiency of the parallel hardware. The action of the compiler can be viewed as a discrete-time-driven simulation of the execution of the data flow graph on the parallel machine, with the simulation capturing the state of the hardware at each time instant. The paper presents extensive simulation results for matrix-vector multiply routines on the parallel hardware. The matrices are drawn from diverse LP applications as well as other scientific computations. The results indicate that uniformly high efficiency (above 90%) is achievable on problems with regular as well as arbitrary structure."
}
