Abstracts: John Reid
Direct solution of large sparse sets of equations on a multicore computer
It has long been recognized that blocking is the key to fast solution of large sets of linear equations. The usual approach for sparse problems is to break the computation into major steps within which efficient full-matrix kernels can be employed. Over the past several years, we have been developing efficient uniprocessor kernels for the multifrontal method. These further subdivide the calculation and use Level-3 BLAS, mostly GEMM, for the inner tasks. We have recently extended these kernels to use OpenMP.
Recent work of Buttari, Langou, Kurzak, and Dongarra has shown that good speed for dense linear algebra computations on multicore computers may be achieved by dynamically scheduling the inner tasks, taking into account dependencies between them. We have developed a kernel that uses these ideas for the positive-definite case.
We report on the multicore performance of our resulting multifrontal codes on large problems. There is a limitation of the multifrontal framework in that work at a node does not start until all work at its children is complete. We will show that this can be circumvented in the positive-definite case by holding at a node only those columns that contain the node's pivots and assembling blocks of the generated elements directly in the matrices of its ancestors. This allows more independent tasks to be scheduled and we report on the resulting improved performance.
This is joint work with Jonathan Hogg and Jennifer Scott.