Melissa

Description

Melissa is a file-avoiding, adaptive, fault-tolerant and elastic framework that enables very efficient execution of ensemble runs on large-scale supercomputers. The storage needed for ensemble runs can quickly become overwhelming, and the long read times that come with it make computing statistics time-consuming. To avoid this pitfall, scientists usually reduce their study size by running low-resolution simulations or by down-sampling output data in space and time. Melissa bypasses this limitation by avoiding intermediate file storage: it processes the data in transit, which enables very large-scale sensitivity analysis. Outputs are never stored on disk. This makes it possible to compute ubiquitous statistics maps, that is, statistics on every mesh element for every timestep, on a full-scale study. Experiments have demonstrated Melissa's scalability up to 29,000 cores while processing 273 TB of data online.
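Processing data in transit requires statistics that can be updated one result at a time, so that each simulation output can be discarded as soon as it has been folded in. The sketch below illustrates that idea with Welford's one-pass update for the mean and variance of every mesh element. It is only an illustration in Python: the class name, method names and sample values are hypothetical and are not Melissa's API or implementation.

```python
# Illustrative sketch only: RunningFieldStats and its methods are hypothetical
# names chosen for this example, not part of Melissa.

class RunningFieldStats:
    """One-pass mean and variance for each mesh element (Welford's update)."""

    def __init__(self, n_elements):
        self.count = 0                    # number of ensemble members seen so far
        self.mean = [0.0] * n_elements    # running mean per mesh element
        self.m2 = [0.0] * n_elements      # running sum of squared deviations

    def update(self, field):
        """Fold in one member's field for a given timestep; it can then be dropped."""
        if len(field) != len(self.mean):
            raise ValueError("field size does not match the mesh size")
        self.count += 1
        for i, x in enumerate(field):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        """Unbiased sample variance per mesh element (needs at least 2 members)."""
        if self.count < 2:
            raise ValueError("at least two members are required")
        return [m / (self.count - 1) for m in self.m2]


# Three made-up ensemble members on a 3-element mesh, arriving one at a time.
stats = RunningFieldStats(n_elements=3)
for member_field in ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0]):
    stats.update(member_field)
```

Memory use depends only on the mesh size, not on the number of ensemble members, which is why no intermediate files are needed for this kind of statistic.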

Integration

Melissa is designed to be flexible. It relies on a client/server model in which each ensemble member, an instance of the parallel simulation, runs as an independent job that dynamically connects to the parallel server in charge of data aggregation. This loosely coupled model leads to a very elastic framework in which the number of concurrently executing members adapts to machine availability. Melissa currently supports the SLURM, OAR and LSF batch schedulers and can easily be extended to others.

Sophistication

Melissa will be extended to support data aggregation beyond sensitivity analysis, including data assimilation, deep reinforcement learning and surrogate model computation.

Please visit https://gitlab.inria.fr/melissa for more information on Melissa.