Bandit Learning for Energy-Efficient DVFS in an HPC Context

Context The energy consumption of computers is becoming a major concern in the context of global warming. To optimise the power consumption of computing applications, precise information on their behavior is needed. With this information, it becomes possible to choose the right processor frequency. However, the chosen frequency can strongly degrade an application's performance or, on the contrary, have no visible effect for the user. Objective The objective of this project will be pursued in several steps
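As an illustration of the bandit framing, here is a minimal epsilon-greedy sketch in which each arm is a candidate CPU frequency; the frequency list and the reward function (trading off performance against energy) are illustrative assumptions, not part of the project specification.

```python
import random

def epsilon_greedy_dvfs(frequencies, measure_reward, rounds=1000, epsilon=0.1):
    """Epsilon-greedy bandit: each arm is a candidate CPU frequency (GHz).

    measure_reward(f) is assumed to return a scalar score that balances
    performance against energy consumption (higher is better)."""
    counts = {f: 0 for f in frequencies}
    means = {f: 0.0 for f in frequencies}
    for _ in range(rounds):
        if random.random() < epsilon:
            f = random.choice(frequencies)        # explore a random frequency
        else:
            f = max(frequencies, key=means.get)   # exploit the best estimate
        r = measure_reward(f)
        counts[f] += 1
        means[f] += (r - means[f]) / counts[f]    # incremental mean update
    return max(frequencies, key=means.get)
```

In a real setting, the reward would be measured online (e.g., via hardware performance and power counters) rather than given as a function.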

Monitoring the energy performance of GPU programming

Supervisors Georges Da Costa, Loïc Barthe, Nicolas Mellado Context This internship falls within the research themes of the SEPIA and STORM teams at IRIT. The SEPIA team works on energy savings in datacenters. Indeed, datacenters are made up of several thousand computers, and their ecological impact puts them on a par with the aviation industry. The SEPIA team's work ranges from the algorithmic level (task scheduling, application reconfiguration) to supporting tools (running experiments on several hundred machines, low-level performance and energy monitoring).

Exploring the balance between energy and performance of federated learning algorithms

Context There is an increasing interest in a new distributed ML paradigm called Federated Learning (FL)[La17], in which nodes compute their local gradients and communicate them to a central server. This centralised server then orchestrates rounds of training over large data volumes created and stored locally at a large number of nodes. This training procedure repeats until some criteria are met. This enables the participating nodes (e.g., IoT devices, mobile phones, etc.) to keep their data local and address the data security and privacy requirements imposed by law.
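The round-based procedure described above can be sketched as a minimal federated-averaging loop; the single-parameter-vector model, least-squares gradient, and learning rate below are illustrative assumptions, not the exact algorithm of [La17].

```python
import numpy as np

def fedavg(global_model, client_data, rounds=10, lr=0.1):
    """Minimal federated-averaging sketch: each client takes one local
    gradient step on its own (X, y) data, then the server averages the
    resulting models, weighted by local dataset size."""
    w = np.array(global_model, dtype=float)
    for _ in range(rounds):
        updates, sizes = [], []
        for X, y in client_data:
            grad = 2 * X.T @ (X @ w - y) / len(y)   # local least-squares gradient
            updates.append(w - lr * grad)           # one local SGD step
            sizes.append(len(y))
        w = np.average(updates, axis=0, weights=sizes)  # server-side aggregation
    return w
```

The energy/performance question of the internship then amounts to measuring what each of these local computations and communications costs on real nodes.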

Sufficient data center: off-grid scheduling for environmentally responsible users

Topic Avoiding the ecological catastrophe will require a joint effort from every actor of society. Our intensive and growing use of digital technologies must be questioned. We postulate that some environmentally aware individuals are willing to reflect upon and reduce the footprint associated with their digital usage. Similarly to the Low-tech Magazine[1], a solar-powered and very lightweight website, this internship will study an off-grid “sufficient”[2] data center in which some of the users accept to contribute to the environmental effort.

Frugal prediction of supercomputer load to reduce its energy impact

Keywords prediction, load, HPC, scheduling, uncertainty, energy Supervisors Millian Poquet, Georges Da Costa Context In the world of high-performance computing, a supercomputer is a computing platform used by many users to run applications, notably to launch large-scale scientific simulation campaigns. Recent supercomputers can have a very large number of resources (on the order of a million cores), so users do not access the resources directly; they go through a resource manager (such as SLURM[1]) to reserve compute nodes/cores and to run applications on them.

Replaying with feedback: towards more realistic HPC simulations

Topic Researchers use simulations to compare the performance (execution time, energy efficiency, …) of different scheduling algorithms in High-Performance Computing (HPC) platforms. The most common method is to replay historical workloads recorded on real HPC infrastructures (like the ones available in the Parallel Workloads Archive): jobs are submitted to the simulation at the same timestamp as in the original log. A major drawback of this method is that it does not preserve the submission behavior of the users of the platform.
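The classic replay method can be sketched as follows; the job format and the first-come-first-served policy are illustrative assumptions. Note how each job enters the simulation at its recorded submit time, regardless of how the simulated schedule unfolds — this is precisely the rigidity the internship questions.

```python
import heapq

def replay_fcfs(jobs, num_cores):
    """Minimal trace replay: jobs = [(submit_time, cores, runtime)].
    Jobs are injected at their recorded submit times, as in classic
    workload replay; returns the sorted simulated finish times
    (FCFS scheduling, no backfilling, illustrative only)."""
    jobs = sorted(jobs)
    running = []                    # heap of (finish_time, cores)
    waiting = []                    # FCFS queue of (cores, runtime)
    free, i, finished = num_cores, 0, []
    while i < len(jobs) or waiting or running:
        # next event: a recorded submission or a simulated completion
        next_sub = jobs[i][0] if i < len(jobs) else float("inf")
        next_fin = running[0][0] if running else float("inf")
        clock = min(next_sub, next_fin)
        if next_fin <= next_sub:
            t, c = heapq.heappop(running)
            free += c
            finished.append(t)
        else:
            waiting.append((jobs[i][1], jobs[i][2]))
            i += 1
        # start jobs in FCFS order while the head of the queue fits
        while waiting and waiting[0][0] <= free:
            c, r = waiting.pop(0)
            free -= c
            heapq.heappush(running, (clock + r, c))
    return sorted(finished)
```

A feedback-aware replay would instead delay later submissions when a user's earlier jobs finish late, which is what "replaying with feedback" aims to capture.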

Game Theory for Green Datacenters

In order to operate a datacenter only with renewable energies, a negotiation has to be undertaken between the sources providing and storing the energy (solar panels, wind turbines, batteries, hydrogen tanks) and the consumers of the energy (basically the IT infrastructure). In the context of the ANR DATAZERO2 project, a negotiation module has to be improved, starting from an existing proof of concept already published. The improvement will be included in a dedicated module, interoperable with a functioning middleware developed in the project.

Federation of clouds: Multi-Clouds overflow

To cover data analytics needs, Cloud providers need to adapt their IaaS services to fluctuations in resource consumption and demand. This requires geographical distribution of task executions and flexible services. A federation of cloud providers makes it possible to offer such services to users. In this project, users submit their applications to a cloud broker. The aim is to find resources in one or several clouds to be able to answer the request.

DVFS-aware performance and energy model of HPC applications

Power consumption of computers is becoming a major concern. To optimise their power consumption, it is necessary to have precise information on the behavior of applications. With this information, it is possible to choose the right frequency of a processor. The speed of some applications is not really impacted by changes of this frequency, while for some applications it has an important effect. The goal of this internship is to model the fine-grained behavior of applications and to link this behavior with the impact (on performance and energy) of frequency changes.
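A common simple model from the literature illustrates why frequency sensitivity differs between applications: the compute-bound part of the execution time scales with frequency while the memory-bound part does not, and dynamic power grows roughly cubically with frequency. The constants below are illustrative assumptions, not measured values.

```python
def exec_time(f, w_cpu, t_mem):
    """Compute-bound work (w_cpu, in cycles-worth of seconds at 1 GHz)
    scales inversely with frequency f; the memory-bound time t_mem does not."""
    return w_cpu / f + t_mem

def energy(f, w_cpu, t_mem, p_static=10.0, c=1.5):
    """Energy = power * time, with static power plus a cubic
    dynamic-power term (illustrative coefficients)."""
    return (p_static + c * f**3) * exec_time(f, w_cpu, t_mem)
```

Under this model, lowering the frequency barely slows a memory-bound application (small w_cpu, large t_mem) while cutting its energy, whereas a compute-bound application pays a large slowdown — exactly the distinction the internship's fine-grained model aims to capture per application phase.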

Fast scheduling under energy and QoS constraints in a Fog computing environment

Location: LAAS-CNRS - Team SARA or IRIT - Team SEPIA Supervisors: Da Costa Georges / Guérout Tom Duration: 6 months, with the possibility of a thesis afterwards. Context The explosion of the volume of data exchanged within today’s IT systems, due to increasingly wide use by an increasingly broad audience (large organizations, companies, the general public, etc.), has for several years led to questioning the architectures used until now. Indeed, for the past few years, Fog computing [1], which extends the Cloud computing paradigm to the edge of the network, has been developing steadily, offering more and more possibilities and thus extending the field of Internet of Things applications.