
Thesis defenses



Block Low-Rank multifrontal solvers: complexity, performance, and scalability

Théo MARY - APO Team - IRIT

Friday, 24 November 2017, 9:30 am
INP-ENSEEIHT, Salle des thèses (thesis defense room)


Patrick AMESTOY, INP-IRIT (Thesis supervisor)
Cleve ASHCRAFT, Livermore Soft. Tech. Corp. (Examiner)
Olivier BOITEAU, EDF (Examiner)
Alfredo BUTTARI, CNRS-IRIT (Thesis co-supervisor)
Iain DUFF, STFC-Rutherford Appleton Laboratory (Examiner)
Xiaoye Sherry LI, Lawrence Berkeley Nat. Lab (Reviewer)
Gunnar MARTINSSON, Univ. of Oxford (Reviewer)
Pierre RAMET, INRIA-LaBRI (Examiner)


We investigate the use of low-rank approximations to reduce the cost of sparse direct multifrontal solvers. Among the different matrix representations that have been proposed to exploit the low-rank property within multifrontal solvers, we focus on the Block Low-Rank (BLR) format, whose simplicity and flexibility make it easy to use in a general-purpose, algebraic multifrontal solver. We present different variants of the BLR factorization, which differ in how the low-rank updates are performed and in the constraints required to handle numerical pivoting.
We first investigate the theoretical complexity of the BLR format, which, unlike that of other formats such as hierarchical ones, was previously unknown. We prove that the theoretical complexity of the BLR multifrontal factorization is asymptotically lower than that of the full-rank solver. We then show how the BLR variants can further reduce that complexity. We provide an experimental study with numerical results to support our complexity bounds.
After proving that BLR multifrontal solvers can achieve a low complexity, we turn to the problem of translating that low complexity into actual performance gains on modern architectures. We first present a multithreaded BLR factorization and analyze its performance in shared-memory multicore environments on a large set of real-life problems. We put forward several algorithmic properties of the BLR variants that are necessary to efficiently exploit multicore systems by improving the arithmetic intensity and the scalability of the BLR factorization. We then move on to the distributed-memory BLR factorization, for which additional challenges are identified and addressed.
The algorithms presented throughout this thesis have been implemented within the MUMPS solver. We illustrate the use of our approach in three industrial applications drawn from geosciences and structural mechanics. We also compare our solver with the STRUMPACK package, which is based on Hierarchically Semi-Separable approximations. We conclude this thesis by reporting results on a very large problem (90 million unknowns), which illustrates future challenges posed by BLR multifrontal solvers at scale.
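
As a rough illustration of why storing blocks in low-rank form can reduce cost, the sketch below counts the entries stored for a dense matrix and for a simplified block low-rank layout. It is not taken from the thesis or from MUMPS: the function names, the uniform block size `b`, and the single rank `r` shared by all off-diagonal blocks are assumptions made only for this example, and the count says nothing about the complexity bounds proved in the thesis.

```python
def dense_entries(n: int) -> int:
    """Entries stored for a dense n x n matrix."""
    return n * n


def blr_entries(n: int, b: int, r: int) -> int:
    """Entries stored for an n x n matrix in a simplified BLR layout.

    Assumptions (illustrative only): b divides n, the n/b diagonal blocks
    stay dense, and every off-diagonal b x b block is stored as the product
    of a b x r factor and an r x b factor with the same rank r.
    """
    if n % b != 0:
        raise ValueError("b must divide n in this sketch")
    p = n // b                               # number of blocks per row/column
    diagonal = p * b * b                     # dense diagonal blocks
    off_diagonal = p * (p - 1) * 2 * b * r   # two factors of b*r entries each
    return diagonal + off_diagonal


if __name__ == "__main__":
    n, b, r = 4096, 256, 8
    print("dense:", dense_entries(n))
    print("BLR  :", blr_entries(n, b, r))
```

In practice the ranks vary from block to block and depend on the problem and on the approximation threshold, so a count like this is only a back-of-the-envelope estimate.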