Description
COUNTDOWN is an open-source runtime library that identifies and automatically reduces the power consumption of computing elements during the communication and synchronization phases of MPI-based applications. It saves energy without imposing a significant performance penalty by lowering CPU power consumption only during waiting times long enough that the overhead of performance-state transitions is negligible. This is done transparently to the user. Because COUNTDOWN targets performance-neutral energy savings across a large set of MPI-based applications, it saves energy only when doing so has no effect on performance.
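One plausible way to act only on sufficiently long waits is a simple timeout filter. The sketch below is purely illustrative: the function name, the threshold value, and the sample wait times are assumptions made for this example, not COUNTDOWN's actual parameters or code.

```python
from typing import List

# Hypothetical threshold: waits shorter than this are left untouched because
# the cost of changing performance state would not be amortized.
THRESHOLD_US = 500.0


def low_power_time(wait_times_us: List[float], threshold_us: float = THRESHOLD_US) -> float:
    """Return the total time (in microseconds) spent in a low-power state.

    Each wait runs at full speed for the first `threshold_us`; only the
    remainder of a longer wait is spent in the low-power state.
    """
    return sum(max(0.0, wait - threshold_us) for wait in wait_times_us)


waits = [20.0, 800.0, 100.0, 5000.0]  # made-up MPI wait durations
saved = low_power_time(waits)
print(f"{saved:.0f} us of {sum(waits):.0f} us waited at low power")
```

With these made-up numbers, the two short waits are ignored entirely, so they incur no transition overhead, while most of the two long waits is spent at reduced power.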
Integration
COUNTDOWN exposes the same interface as a standard MPI library and intercepts all MPI calls made by the application. It implements two wrappers for this purpose: one for C/C++ MPI libraries and one for Fortran MPI libraries. When an application is instrumented with COUNTDOWN, every MPI call is enclosed in a corresponding wrapper function that has the same signature. Interface extensions are planned for integration with the job scheduler and with system-level monitoring frameworks.
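The interception pattern described above can be illustrated in a few lines. This is a minimal sketch of the general idea only: it is written in Python rather than C or Fortran, and `intercept`, `call_log`, and `barrier` are hypothetical names that stand in for a wrapper and a blocking communication call. It does not reproduce COUNTDOWN's code.

```python
import functools
import time
from typing import Any, Callable, List, Tuple

# Hypothetical in-memory log of intercepted calls: (name, duration in seconds).
call_log: List[Tuple[str, float]] = []


def intercept(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap `func` so that callers see the same name and signature."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # Prologue: a power-management runtime could arm its logic here.
        start = time.perf_counter()
        try:
            # Forward the call unchanged to the real implementation.
            return func(*args, **kwargs)
        finally:
            # Epilogue: a runtime could restore full performance here.
            call_log.append((func.__name__, time.perf_counter() - start))

    return wrapper


@intercept
def barrier(comm_id: int) -> int:
    """Stand-in for a blocking communication call (not a real MPI call)."""
    time.sleep(0.001)
    return comm_id


result = barrier(7)
print(result, call_log[0][0])
```

Because the wrapper forwards its arguments and return value unchanged, the calling code needs no modification, which is what makes this style of instrumentation transparent to the application.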
Sophistication
Different power management strategies can be implemented by exploiting specific application characteristics. COUNTDOWN also implements a strategy that separates slack regions from data-copy regions within MPI primitives, and it can profile these regions to quantify the workload imbalance of the application. Furthermore, COUNTDOWN can extract fine-grained traces from the application, which can be used to estimate application performance and MPI communication characteristics. In REGALE, these capabilities will be used to provide run-time information on application slack and imbalance, as well as to reduce energy consumption.
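To make the distinction between slack and data copy concrete, the sketch below splits the time each rank spends in one collective call into the two parts. It rests on a simplifying assumption that is not taken from COUNTDOWN itself: data movement can only begin once the last rank has entered the call. The function name and the timestamps are hypothetical.

```python
from typing import List, Tuple


def split_regions(enter: List[float], exit_: List[float]) -> List[Tuple[float, float]]:
    """Split each rank's time in a collective into (slack, copy).

    Assumption: useful data movement can only start once the last rank has
    entered the call, so time before that is slack and time after is copy.
    """
    last_arrival = max(enter)
    return [(last_arrival - e, x - last_arrival) for e, x in zip(enter, exit_)]


# Made-up entry and exit timestamps (seconds) for four ranks in one collective.
enter_times = [1.0, 1.5, 3.0, 2.0]
exit_times = [3.5, 3.5, 3.5, 3.5]

for rank, (slack, copy) in enumerate(split_regions(enter_times, exit_times)):
    print(f"rank {rank}: slack={slack:.1f}s copy={copy:.1f}s")
```

In this example, rank 2 arrives last and has no slack, while rank 0 waits the longest. Under the stated assumption, the spread of slack across ranks is a direct indicator of workload imbalance.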
Please visit https://github.com/EEESlab/countdown for more information on COUNTDOWN.