BeBiDa

Description

BeBiDa is a resource management tool that enables the collocation of HPC and Big Data workloads by leveraging the idle resources of an HPC system. The technique is seamless for end users, requires no change to the underlying HPC resource manager, and relies on the simple job prolog/epilog mechanism that is typical of HPC resource managers. It exploits the resilience and elasticity of Big Data frameworks by using a dynamic resource pool. HPC applications are not disturbed: Big Data tasks run as low-priority best-effort jobs that are removed when an HPC job needs the resources, and BeBiDa ensures that no Big Data processes are left on the compute nodes after decommissioning.
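
The prolog/epilog idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not BeBiDa's actual code: the class `DynamicPool`, its methods, and the node and task names are hypothetical, and a real deployment would hook into the resource manager's own prolog and epilog scripts and into the Big Data framework's decommissioning commands.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicPool:
    """Hypothetical model of the nodes currently lent to the Big Data framework.

    `nodes` maps a node name to the list of best-effort task ids running on it.
    """
    nodes: dict = field(default_factory=dict)

    def submit(self, node, task_id):
        """Place a best-effort Big Data task on a node that is in the pool."""
        if node not in self.nodes:
            raise KeyError(f"{node} is not available for Big Data tasks")
        self.nodes[node].append(task_id)

    def prolog(self, hpc_nodes):
        """Run before an HPC job starts: take its nodes out of the pool.

        Returns the evicted task ids so that the (resilient) Big Data
        framework can reschedule them elsewhere. Removing the node entry
        models the guarantee that nothing is left running on it.
        """
        evicted = []
        for node in hpc_nodes:
            evicted.extend(self.nodes.pop(node, []))
        return evicted

    def epilog(self, hpc_nodes):
        """Run after an HPC job ends: hand its nodes back to the pool, empty."""
        for node in hpc_nodes:
            self.nodes[node] = []


# Usage: three idle nodes, two of them running best-effort tasks.
pool = DynamicPool(nodes={"n1": [], "n2": [], "n3": []})
pool.submit("n1", "t1")
pool.submit("n2", "t2")

evicted = pool.prolog(["n1"])   # an HPC job is about to start on n1
pool.epilog(["n1"])             # the HPC job finished; n1 is idle again
```

In this model the HPC job never waits for the Big Data side to agree: the prolog removes the node unconditionally, which mirrors the low-priority, best-effort status of the Big Data tasks.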

Integration

BeBiDa was initially developed for OAR, but it can be easily adapted to SLURM or other resource managers. BeBiDa supports Spark deployed on bare metal on the HPC system. In REGALE, BeBiDa will be adapted to support the Singularity containerization platform, which will provide flexibility in environment deployment and make it possible to use different Big Data frameworks on top of the traditionally rigid HPC system.

Sophistication

BeBiDa will be extended to support the execution of Big Data workloads in two new classes: ‘deadline-aware’, which offers no guarantees on real-time resource requirements but guarantees that execution completes before a deadline, and ‘time-critical’, which offers strong guarantees on real-time resource requirements. Both new classes will draw on pools of spare resources to meet the scaling needs of elastic Big Data workloads. Furthermore, in REGALE, BeBiDa will be integrated with the RYAX workflow engine and enhanced with estimation techniques based on resource monitoring and historical data.
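
As a rough illustration of how a deadline-aware workload might draw on a spare pool, the sketch below computes how many extra nodes to request. It is an assumption-laden example, not the planned BeBiDa design: the function name, its parameters, and the linear-scaling model are all hypothetical, and `rate_per_node` stands in for whatever estimate monitoring and historical data would provide.

```python
import math


def extra_nodes_for_deadline(remaining_work, seconds_left, rate_per_node,
                             current_nodes, spare_available):
    """Hypothetical helper: spare nodes to request so the job meets its deadline.

    Assumes work scales linearly with node count, which real Big Data
    workloads only approximate.
    remaining_work  -- work units still to process
    seconds_left    -- time remaining until the deadline
    rate_per_node   -- estimated work units per node per second
    current_nodes   -- nodes the job already holds
    spare_available -- nodes currently free in the spare pool
    """
    if seconds_left <= 0 or rate_per_node <= 0:
        raise ValueError("seconds_left and rate_per_node must be positive")
    needed = math.ceil(remaining_work / (seconds_left * rate_per_node))
    shortfall = max(needed - current_nodes, 0)
    # Never ask for more than the spare pool can offer.
    return min(shortfall, spare_available)


# Usage: 1000 work units, 100 s left, 1 unit per node-second, 6 nodes held.
request = extra_nodes_for_deadline(1000, 100, 1.0, current_nodes=6,
                                   spare_available=10)
```

When the shortfall exceeds the spare pool, the returned value is capped, which is the case where a deadline guarantee could not be honored without further measures.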

Please visit https://gitlab.inria.fr/mmercier/bebida for more information on BeBiDa.