EXA2PRO-EoCoE joint workshop
from Monday, February 22, 2021 (9:30 AM) to Wednesday, February 24, 2021 (6:00 PM)
Monday, February 22, 2021
9:30 AM - 10:00 AM
EXA2PRO framework overview & success stories (EXA2PRO)
Lazaros Papadopoulos (ICCS/NTUA)
Overview of the EXA2PRO project and EXA2PRO framework
10:00 AM - 10:45 AM
EXA2PRO High-level programming interface: SkePU and ComPU (EXA2PRO)
Christoph Kessler (Linköping University)
We briefly present the main concepts of the EXA2PRO high-level programming model: SkePU skeletons (i.e., generic C++ program constructs with multiple backends supporting heterogeneous systems and clusters), multi-variant software components with explicit metadata annotation, smart data-containers for array-based data types, and the XPDL platform modeling framework. [https://skepu.github.io/tutorials/eocoe-exa2pro-2021/](https://skepu.github.io/tutorials/eocoe-exa2pro-2021/)
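For readers unfamiliar with skeleton programming, the idea of a generic construct with interchangeable backends can be sketched in a few lines. The sketch below is illustrative only: it is written in Python rather than C++, it is not SkePU's API, and the `MapReduce` class, its `backend` parameter, and the `dot_product` helper are hypothetical names introduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class MapReduce:
    """Hypothetical skeleton: elementwise map followed by a reduction.

    This is NOT SkePU's API; it only mirrors the idea of one generic
    construct that can run on interchangeable backends.
    """

    BACKENDS = ("sequential", "threads")

    def __init__(self, map_fn, reduce_fn, backend="sequential"):
        if backend not in self.BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        self.map_fn = map_fn
        self.reduce_fn = reduce_fn
        self.backend = backend

    def __call__(self, *arrays):
        if self.backend == "threads":
            # Same user function, different execution strategy.
            with ThreadPoolExecutor() as pool:
                mapped = list(pool.map(self.map_fn, *arrays))
        else:
            mapped = [self.map_fn(*args) for args in zip(*arrays)]
        return reduce(self.reduce_fn, mapped)


def dot_product(backend):
    # The user supplies only the per-element and combining functions.
    dot = MapReduce(lambda a, b: a * b, lambda x, y: x + y, backend=backend)
    return dot([1, 2, 3], [4, 5, 6])
```

Calling `dot_product("sequential")` and `dot_product("threads")` runs the same user code through two execution strategies, which is the property the skeleton approach relies on.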
10:45 AM - 11:15 AM
Break
11:15 AM - 12:00 PM
EXA2PRO Runtime system: StarPU (EXA2PRO)
Samuel Thibault (University of Bordeaux)
We present the concepts of the EXA2PRO low-level programming model: StarPU task-based programming (https://starpu.gitlabpages.inria.fr/), which provides optimized execution on clusters of heterogeneous platforms. We will start with the basic principles of task-based programming. We will then give an overview of the features and optimizations that this model makes possible at little extra cost to the programmer, from optimized scheduling to efficient distributed execution.
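The basic principle of task-based programming can be sketched as follows: the programmer submits tasks in program order and declares which data each task reads and writes, and the runtime infers the dependencies and decides what may run concurrently. The sketch below is a toy illustration under that assumption, not StarPU's API; `TaskGraph`, `submit`, and `run` are hypothetical names, and tasks within a wave are executed one after another rather than in parallel.

```python
from collections import defaultdict


class TaskGraph:
    """Toy task runtime (hypothetical, not StarPU's API)."""

    def __init__(self):
        self.tasks = []                      # (name, fn, reads, write)
        self.deps = {}                       # task index -> indices it waits for
        self._last_writer = {}               # data handle -> last writing task
        self._readers = defaultdict(list)    # handle -> readers since last write

    def submit(self, name, fn, reads=(), write=None):
        idx = len(self.tasks)
        deps = set()
        for handle in reads:                 # read-after-write
            if handle in self._last_writer:
                deps.add(self._last_writer[handle])
        if write is not None:
            if write in self._last_writer:   # write-after-write
                deps.add(self._last_writer[write])
            deps.update(self._readers[write])  # write-after-read
        for handle in reads:
            self._readers[handle].append(idx)
        if write is not None:
            self._last_writer[write] = idx
            self._readers[write] = []
        self.tasks.append((name, fn, tuple(reads), write))
        self.deps[idx] = deps
        return idx

    def run(self, data):
        """Execute all tasks; return the names grouped into independent waves."""
        done, waves = set(), []
        while len(done) < len(self.tasks):
            ready = [i for i in range(len(self.tasks))
                     if i not in done and self.deps[i] <= done]
            for i in ready:                  # tasks in one wave are independent
                _, fn, reads, write = self.tasks[i]
                result = fn(*(data[h] for h in reads))
                if write is not None:
                    data[write] = result
            done.update(ready)
            waves.append([self.tasks[i][0] for i in ready])
        return waves


def demo():
    graph, data = TaskGraph(), {}
    graph.submit("a", lambda: 2, write="x")
    graph.submit("b", lambda: 3, write="y")
    graph.submit("c", lambda x, y: x + y, reads=("x", "y"), write="z")
    graph.submit("d", lambda z: z * 10, reads=("z",), write="z")
    return graph.run(data), data
```

In `demo()`, tasks `a` and `b` touch different data and land in the same wave, while `c` and `d` are ordered after them purely from the declared accesses; no explicit synchronization is written by the programmer.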
12:00 PM - 2:00 PM
Lunch break
2:00 PM - 2:30 PM
EoCoE framework overview & success stories (EoCoE)
Edouard Audit (CEA)
Overview of the EoCoE project
2:30 PM - 3:15 PM
EoCoE - The Parallel Data Interface (EoCoE)
Julien Bigot (MdlS/CEA)
Links:
- project website
- https://pdi.julien-bigot.fr/master/
3:15 PM - 3:45 PM
Break
3:45 PM - 4:30 PM
EoCoE - FTI - State-of-the-art multi-level checkpointing library (EoCoE)
Leonardo Bautista-Gomez (Barcelona Supercomputing Center)
Large-scale infrastructures for distributed and parallel computing offer their users thousands of computing nodes. As the need for massively parallel computing increases in industry and development, cloud infrastructures and computing centers are being forced to grow in size and to transition to new computing technologies. While the advantage for users is clear, this evolution imposes significant challenges, such as energy consumption and fault tolerance. Fault tolerance is even more critical in infrastructures built on commodity hardware. Recent work has shown that large-scale machines built with commodity hardware experience more failures than previously thought. Leonardo Bautista-Gomez, Senior Researcher at the Barcelona Supercomputing Center, will focus on how to guarantee high reliability for high-performance applications running on large infrastructures. In particular, the session will cover the technical content necessary to implement scalable multilevel checkpointing for tightly coupled applications. This will include an overview of the internals of the FTI library and an explanation of how multilevel checkpointing is implemented today, together with examples that the audience can test and analyze on their own laptops, so that they learn how to use FTI in practice and can ultimately transfer that knowledge to their production systems.
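As a rough illustration of the multilevel idea, the sketch below writes frequent checkpoints to a fast but fragile location and less frequent ones to a slower, more resilient location, then recovers from the most recent checkpoint that survived. It is a simplified assumption-based sketch, not FTI's API: `MultiLevelCheckpointer`, its methods, the two directories, and the intervals are all hypothetical, and both "levels" are plain temporary directories on one machine.

```python
import pickle
import tempfile
from pathlib import Path


class MultiLevelCheckpointer:
    """Toy multilevel checkpointing (hypothetical, not FTI's API)."""

    def __init__(self, levels):
        # levels: (directory, interval) pairs, cheapest level first.
        self.levels = [(Path(d), interval) for d, interval in levels]

    def maybe_checkpoint(self, step, state):
        written = []
        for level, (directory, interval) in enumerate(self.levels, start=1):
            if step % interval == 0:
                tmp = directory / "ckpt.tmp"
                with open(tmp, "wb") as f:
                    pickle.dump((step, state), f)
                # Rename last so a crash never leaves a half-written file.
                tmp.replace(directory / "ckpt.pkl")
                written.append(level)
        return written

    def recover(self):
        """Return the newest surviving (step, state), or None."""
        best = None
        for directory, _ in self.levels:
            path = directory / "ckpt.pkl"
            if path.exists():
                with open(path, "rb") as f:
                    step, state = pickle.load(f)
                if best is None or step > best[0]:
                    best = (step, state)
        return best


def demo():
    with tempfile.TemporaryDirectory() as local, \
            tempfile.TemporaryDirectory() as shared:
        # Level 1: every 2 steps (fast, lost with the node).
        # Level 2: every 5 steps (slow, survives a node failure).
        ckpt = MultiLevelCheckpointer([(local, 2), (shared, 5)])
        total = 0
        for step in range(1, 10):
            total += step
            ckpt.maybe_checkpoint(step, total)
        before = ckpt.recover()
        (Path(local) / "ckpt.pkl").unlink()   # simulate losing the node
        after = ckpt.recover()
        return before, after
```

In `demo()`, the cheap level holds the most recent state while it is available; once it is deleted, recovery falls back to the older checkpoint on the resilient level, trading some recomputation for survival of the run.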
Tuesday, February 23, 2021
9:00 AM - 12:30 PM
SkePU Skeleton Programming Hands-on Session (EXA2PRO)
Johan Ahlqvist (Linköping University)
August Ernstsson (Linköping University)
Christoph Kessler (Linköping University)
*This session is limited to 20 participants.* __Tutorial website:__ [https://skepu.github.io/tutorials/eocoe-exa2pro-2021/](https://skepu.github.io/tutorials/eocoe-exa2pro-2021/)
12:30 PM - 2:00 PM
Lunch break
2:00 PM - 5:30 PM
Performance Engineering and code generation techniques (EoCoE)
Thomas Gruber (FAU)
Markus Holzer (FAU)
Sebastian Kuckuk (FAU)
*This hands-on session is limited to 20 participants.*
Wednesday, February 24, 2021
9:00 AM - 12:30 PM
StarPU task-based programming hands-on session (EXA2PRO)
N. Furmento
Olivier Aumage
Samuel Thibault (University of Bordeaux)
*This hands-on session is limited to 20 participants.* [https://starpu.gitlabpages.inria.fr/tutorials/2021-02-EoCoE/](https://starpu.gitlabpages.inria.fr/tutorials/2021-02-EoCoE/)
12:30 PM - 2:00 PM
Lunch break
2:00 PM - 3:30 PM
Solving large linear systems with parallel solvers designed on top of runtime systems (EoCoE)
Florent Pruvost (INRIA)
The HiePACS Inria team co-develops linear algebra libraries to solve very large numerical systems on supercomputers. To achieve good performance regardless of the machine, these libraries are designed as task-based algorithms and make use of runtime systems such as [OpenMP](https://www.openmp.org/) (task), [PaRSEC](http://icl.utk.edu/parsec/) or [StarPU](https://starpu.gitlabpages.inria.fr/). One main advantage is that a single algorithm can be deployed on different architectures (homogeneous, heterogeneous with GPUs, with few or many cores, different kinds of architectures and networks) and achieve relatively high performance without requiring much parameter tuning. Three of these libraries will be highlighted in a thirty-minute presentation, followed by a one-hour demonstration on our [PlaFRIM supercomputer](https://www.plafrim.fr/): [Chameleon](https://gitlab.inria.fr/solverstack/chameleon) (parallel dense linear algebra), [PaStiX](https://gitlab.inria.fr/solverstack/pastix) (parallel sparse direct solver) and [MaPHyS](https://gitlab.inria.fr/solverstack/maphys/maphys) (parallel hybrid solver). We will show how to install each library and how to use it through examples, discuss how to get good performance by tuning some parameters, and finally visualize execution traces. The demonstration will emphasize the reproducibility of experiments and performance, which we achieve thanks to the [GNU Guix](https://guix.gnu.org/) distribution.
3:30 PM - 4:00 PM
Break
4:00 PM - 5:30 PM
Extreme-scale computation with PSBLAS and AMG4PSBLAS (EoCoE)