Periodic Reporting for period 2 - NLAFET (Parallel Numerical Linear Algebra for Future Extreme-Scale Systems)

Summary

Today’s largest High Performance Computing (HPC) systems have a serious gap between the peak capabilities of the hardware and the performance realized by HPC applications. Future extreme scale systems present new challenges that could widen the gap to the point where effective use of these systems becomes even harder to achieve. However, there is considerable potential to improve algorithm and software design in order to better exploit the hardware performance features already present in the fastest systems of today and the heterogeneous multi-core systems that will be available in the next five years.

The NLAFET project contributes to a new generation of computational tools and parallel software for problems in numerical linear algebra. These tools are developed with a focus on extreme scale systems. Since linear algebra is both fundamental and ubiquitous in computational science and its vast application areas, the results of NLAFET have applicability to computational science in general.

For a set of fundamental linear algebra operations, the NLAFET project spans the whole spectrum from idea to application. Only improved and new algorithms implemented in new scalable software and integrated into real-world applications will be utilized efficiently in extreme scale HPC systems. The algorithms and software developed are integrated and validated in four complementary real-world application domains: particle physics, power systems, linear elasticity problems, and data analysis in astrophysics.

The main overall objectives of NLAFET are:
• development of novel algorithms that expose as much parallelism as possible, exploit heterogeneity, avoid communication bottlenecks, respond to escalating fault rates, and help meet emerging power constraints;
• exploration of advanced scheduling strategies and runtime systems focusing on the extreme scale and strong scalability in multi/many-core and hybrid environments;
• design and evaluation of novel strategies and software support for both offline and online auto-tuning.

Work performed

The work in the NLAFET project has focused on fundamental scientific issues and software development, including novel algorithms and prototype and library software for a critical set of fundamental linear algebra operations. The results are presented in 39 deliverables and 23 NLAFET Working Notes available on the NLAFET website (www.nlafet.eu). A brief overview of the work performed, including some highlights, follows.

The work package Dense Linear Systems and Eigenvalue Problem Solvers has designed algorithms and efficient, scalable, and robust parallel software for fundamental dense matrix computations that go well beyond those covered by standard packages today. This WP delivers:
• novel linear system solvers
• novel BLAS for heterogeneous systems
• novel non-symmetric standard and generalized eigenvalue problem solvers including robust eigenvector computation
• novel singular value problem solvers

The work package Direct Solution of Sparse Linear Systems has designed parallel algorithms and software for the direct solution of sparse linear equations. This WP delivers:
• novel lower bounds on communication for sparse matrices
• novel direct solvers for (near-)symmetric systems
• novel direct solvers for highly unsymmetric systems
• novel hybrid direct-iterative solvers

The work package Communication-Optimal Algorithms for Iterative Methods has designed new iterative methods that drastically reduce communication, and even minimize it whenever possible. This WP delivers:
• novel computational kernels for preconditioned iterative methods
• novel iterative (enlarged Krylov) methods
• novel multilevel preconditioners

The work package Cross-Cutting Issues has investigated and designed a sustainable set of methods and tools for scheduling and runtime systems, auto-tuning, and algorithm-based fault tolerance (ABFT) packaged into open source library modules. This WP delivers:
• novel parallel critical path (PCP) scheduling for improving scalability, and evaluation of several scheduling strategies for solving dense linear algebra problems on various HPC systems
• novel self-adaptive approaches for on-line tuning
• novel implementation of a task-based ABFT Cholesky factorization
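
To give a feel for what algorithm-based fault tolerance means, here is a minimal sketch of the classic checksum idea applied to a small matrix product. It is a generic textbook illustration and is not taken from the NLAFET library or its task-based ABFT Cholesky factorization; the function names (matmul, with_column_checksum, with_row_checksum, find_corruption) and the tolerance are assumptions made for this example only.

```python
# Illustrative checksum-based ABFT for C = A * B (generic technique,
# NOT the NLAFET implementation; all names here are hypothetical).

def matmul(a, b):
    """Plain dense product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def with_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append one extra column holding the sum of each row of b."""
    return [row + [sum(row)] for row in b]

def find_corruption(c_full, tol=1e-9):
    """Return (bad_rows, bad_cols) whose data disagree with the checksums."""
    n_rows, n_cols = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n_rows)
                if abs(sum(c_full[i][:n_cols]) - c_full[i][n_cols]) > tol]
    bad_cols = [j for j in range(n_cols)
                if abs(sum(c_full[i][j] for i in range(n_rows))
                       - c_full[n_rows][j]) > tol]
    return bad_rows, bad_cols

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

# The product of the checksum-extended inputs carries checksums of C itself.
C_full = matmul(with_column_checksum(A), with_row_checksum(B))
clean = find_corruption(C_full)    # no fault injected yet

C_full[0][1] += 1.0                # simulate a soft error in one entry
faulty = find_corruption(C_full)   # the row/column intersection locates it
```

In this sketch a single corrupted entry shows up as exactly one inconsistent row and one inconsistent column, so it can be located and, in principle, corrected from the checksum difference without recomputing the whole product.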

The work package Challenging applications – a selection has integrated and evaluated software parts of the NLAFET Library in the following applications:
• Task-based shared-memory parallelism in the 2DRMP software for modelling electron scattering from H-like atoms and ions
• Load-flow-based calculations in large-scale power systems and the PowerFactory code
• Communication avoiding iterative methods for solving linear systems arising from several different applications, in particular linear elasticity problems
• Data analysis in astrophysics and Midapack

One main achievement of the NLAFET project is the software developed and deployed, and made available via the NLAFET website http://www.nlafet.eu/software/. The public repositories that together constitute the NLAFET library software are structured in five groups:
• Dense matrix factorizations and solvers
• Solvers and tools for standard and generalized dense eigenvalue problems
• Sparse direct factorizations and solvers
• Communication optimal algorithms for iterative methods
• Cross-cutting tools
The parallel programming models used in various combinations are MPI, OpenMP, PaRSEC and StarPU.

Dissemination and outreach activities for the NLAFET results have been extensive and have received much positive attention. They include presentations at a number of highly ranked conferences and workshops; in total, 25 invited/keynote presentations, more than 70 contributed and poster presentations, and the organization of 17 workshops/minisymposia. More is to come, including both invited and contributed presentations as well as various training opportunities for users of the NLAFET Library software.

Final results

The NLAFET project has developed a new generation of computational tools and software for problems in numerical linear algebra; tools that will simplify the transition to the next generation of HPC architectures. NLAFET thus provides a software infrastructure that many leading-edge applications must have for attaining high performance on extreme-scale systems. Many of the methodologies, functionalities and solutions developed are applicable to the development of numerical solutions for a wide range of applications, so we expect that the impact of the project will be very broad.

In summary, our results include theory, algorithms, prototype and library software and represent progress far above and beyond the state of the art when the NLAFET project started. The goal is to further develop, push, and deploy software into the scientific community to make it competitive on a world scale, and to contribute to standardisation efforts in the area.

In addition, there will be benefits for computational science education in offering a set of components ready for extreme scale computing; for application developers by giving them a single point of contact for registering their requirements; and for vendors by organizing a community to help them assemble the complete software environment that their systems need for success.

We also emphasize the positive effects with regard to the career development of the participating young researchers (students, postdocs). Several of them now hold positions at top vendors and universities.

Website & more info

More info: http://www.nlafet.eu/.