\begin{abstract}

We consider the problem of developing an efficient multi-threaded
implementation of the matrix-vector multiplication algorithm for sparse
matrices with structural symmetry. Matrices are stored in the
\textit{compressed sparse row-column} (CSRC) format, which is designed to
exploit the symmetric non-zero pattern observed in global finite element
matrices.
Unlike with classical compressed storage formats, computing the sparse
matrix-vector product with the CSRC format requires thread-safe access to the
destination vector. To avoid race conditions, we have implemented two
partitioning strategies. In the first one, each thread allocates an array for
storing its contributions, which are later combined in an accumulation step.
We analyze four different ways of performing this accumulation.
The second strategy employs a coloring algorithm to group rows that can be
processed concurrently by threads. Our results indicate that, although it
increases the working set size, the first strategy yields the best
performance improvements for most matrices.

\bigskip

\noindent\textbf{Keywords}: structurally symmetric matrix; sparse matrix-vector product; compressed sparse
row-column; parallel implementation; multi-core architectures; finite element method.

\end{abstract}