PYRAMIDE Team

Head: Franck MORVAN

Dynamic Query Optimization in Large-Scale Distributed Environments  

In large-scale parallel and distributed environments (cluster, grid, and cloud computing), we address query processing and optimization over very large volumes of distributed data (“Big Data”).

Currently, our research activities focus on the design and development of new elastic resource allocation models for query optimization. These models build on fundamental results obtained in parallel and distributed systems, particularly those related to the types of parallelism (partitioned, independent, and pipeline parallelism) and to the minimization of inter-operation communication costs.

Our approach seeks the best trade-off between (i) efficiency (satisfying multiple tenants in terms of Quality of Service, QoS) and (ii) profitability (for PaaS service providers). The originality of these new elastic resource allocation models lies in: (i) an economic model that integrates profitability into the objective function by taking the providers’ pricing into account, (ii) decentralized control, which ensures scalability through a proactive migration policy based on mobile agents, and (iii) revisited cost estimation methods and search strategies for finding an optimal or near-optimal execution plan.

The two main research issues addressed by the PYRAMIDE team are described below.

I1: Elastic Resource Allocation for Query Optimization

The objective is to design and develop elastic resource allocation models for query optimization. In Cloud Computing environments, the resources allocated on the provider side should grow or shrink with the demand for services on the tenants’ side, in order to maintain QoS and meet the Service Level Agreements (SLAs). The main QoS criteria considered are performance (e.g., query response time) and service availability. An SLA is a contract between provider and consumer that specifies the constraints to meet and the objectives to reach for these QoS criteria. The main challenge is to find the best trade-off between the tenants’ satisfaction in terms of QoS and the providers’ gain in terms of profitable resource management.
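
As a rough illustration of this trade-off (not the team's actual model), the sketch below picks the number of nodes for one query so that the provider's profit is highest once SLA penalties are counted. The names `SLA`, `estimated_response_time`, `provider_profit` and `choose_allocation`, the toy cost model, and all numeric values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class SLA:
    """Hypothetical SLA terms for one tenant query."""
    max_response_time_s: float    # response-time threshold agreed with the tenant
    revenue_per_query: float      # what the tenant pays for the query
    penalty_per_violation: float  # what the provider pays back on a violation


def estimated_response_time(work_units: float, nodes: int) -> float:
    """Toy cost model (assumption): partitioned work plus a per-node overhead."""
    return work_units / nodes + 0.05 * nodes


def provider_profit(work_units: float, nodes: int, sla: SLA,
                    node_cost_per_query: float) -> float:
    """Revenue minus resource cost minus the penalty if the SLA is violated."""
    rt = estimated_response_time(work_units, nodes)
    penalty = sla.penalty_per_violation if rt > sla.max_response_time_s else 0.0
    return sla.revenue_per_query - nodes * node_cost_per_query - penalty


def choose_allocation(work_units: float, sla: SLA,
                      node_cost_per_query: float, max_nodes: int = 64) -> int:
    """Return the node count with the highest profit (ties go to fewer nodes)."""
    return max(
        range(1, max_nodes + 1),
        key=lambda n: (provider_profit(work_units, n, sla, node_cost_per_query), -n),
    )


sla = SLA(max_response_time_s=2.0, revenue_per_query=10.0, penalty_per_violation=8.0)
for work in (10, 4, 20):
    print(work, choose_allocation(work, sla, node_cost_per_query=0.5))
```

With these illustrative numbers, the allocation shrinks from 6 nodes to 3 as the work drops from 10 to 4 units. At 20 units, paying the penalty on a single node becomes more profitable than scaling out, which shows why tenant QoS and provider gain have to be balanced explicitly.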

I2: Data Replication in Cloud Systems

The data replication strategies proposed for parallel, distributed, and grid systems are difficult to adapt to Cloud systems. The objective is to propose data replication strategies that integrate an economic model of the provider’s profitability, including possible penalties. The main challenge is to define a mechanism that dynamically adjusts the number of replicas to its optimal value, so as to allow elastic resource management.
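
One simple way to picture such a dynamic mechanism is a threshold-based controller that is run once per monitoring period. The sketch below is only an illustration under stated assumptions: the function `adjust_replicas`, its parameters, and the `slack` factor are hypothetical and do not come from the team's published strategies.

```python
def adjust_replicas(current: int,
                    observed_rt_s: float,
                    sla_rt_s: float,
                    penalty_per_period: float,
                    replica_cost_per_period: float,
                    min_replicas: int = 1,
                    max_replicas: int = 10,
                    slack: float = 0.5) -> int:
    """Return the replica count to use for the next monitoring period."""
    if observed_rt_s > sla_rt_s and current < max_replicas:
        # SLA violated: add a replica only if that is cheaper than the penalty.
        if replica_cost_per_period < penalty_per_period:
            return current + 1
        return current
    if observed_rt_s < slack * sla_rt_s and current > min_replicas:
        # Response time is well below the threshold: release a replica.
        return current - 1
    return current


replicas = 2
# Hypothetical observed response times (seconds) over four periods.
for rt in (3.0, 2.5, 1.5, 0.4):
    replicas = adjust_replicas(replicas, rt, sla_rt_s=2.0,
                               penalty_per_period=5.0,
                               replica_cost_per_period=1.0)
    print(rt, replicas)
```

The economic side appears in the comparison between the replica cost and the penalty: a replica is created only when it is expected to pay for itself, and it is removed once the load no longer justifies it.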

Skills

Large-Scale Parallel and Distributed Databases
Cloud Computing
Query Optimization
Elastic Resource Allocation
Economic Model

Team Members

Permanent members
Non-permanent members
External members

Team Publications

International journal articles
  • Jorge Martinez-Gil, Riad Mokadem, Josef Küng, Abdelkader Hameurlain

    Neurofuzzy semantic similarity measurement

    Data and Knowledge Engineering, 2023, 145, pp.102155. ⟨10.1016/j.datak.2023.102155⟩

    Access: https://hal.science/hal-04060914

  • Tarek Hamrouni, Riad Mokadem, Amel Khelifa

    Review on data replication strategies in single vs. interconnected cloud systems: Focus on data correlation‐aware strategies

    Concurrency and Computation: Practice and Experience, 2023, 7758, ⟨10.1002/cpe.7758⟩

    Access: https://hal.science/hal-04133143

  • Riad Mokadem, Jorge Martinez-Gil, Abdelkader Hameurlain, Josef Küng

    A review on data replication strategies in cloud systems

    International Journal of Grid and Utility Computing, 2022, 13 (4), pp.347-362. ⟨10.1504/IJGUC.2022.125135⟩

    Access: https://hal.science/hal-03828293

  • Jorge Martinez-Gil, Shaoyi Yin, Josef Küng, Franck Morvan

    Matching Large Biomedical Ontologies Using Symbolic Regression

    Journal of Data Intelligence, 2022, 3 (3), pp.316-332. ⟨10.26421/JDI3.3-2⟩

    Access: https://hal.science/hal-03842681

  • Jorge Martinez-Gil, Riad Mokadem, Franck Morvan, Josef Küng, Abdelkader Hameurlain

    Interpretable entity meta-alignment in knowledge graphs using penalized regression: a case study in the biomedical domain

    Progress in Artificial Intelligence, 2022, 11 (1), pp.93-104. ⟨10.1007/s13748-021-00263-1⟩

    Access: https://hal.science/hal-03841296

  • Amel Khelifa, Riad Mokadem, Tarek Hamrouni, Faouzi Ben Charrada

    Data correlation and fuzzy inference system-based data replication in federated cloud systems

    Simulation Modelling Practice and Theory, 2022, 115, pp.102428. ⟨10.1016/j.simpat.2021.102428⟩

    Access: https://hal.science/hal-03621837

  • Amel Khelifa, Tarek Hamrouni, Riad Mokadem, Faouzi Ben Charrada

    Combining task scheduling and data replication for SLA compliance and enhancement of provider profit in clouds

    Applied Intelligence, 2021, 51, pp.7494-7516. ⟨10.1007/s10489-021-02267-9⟩

    Access: https://hal.science/hal-03481554

  • Abdenour Lazeb, Riad Mokadem, Ghalem Belalem

    A new popularity-based data replication strategy in cloud systems

    Multiagent and Grid Systems – An International Journal of Cloud Computing, 2021, 17 (2), pp.159-177. ⟨10.3233/MGS-210348⟩

    Access: https://hal.science/hal-03481577

  • Uras Tos, Riad Mokadem, Abdelkader Hameurlain, Tolga Ayav

    Achieving query performance in the cloud via a cost-effective data replication strategy

    Soft Computing, 2021, 25, pp.5437-5454. ⟨10.1007/s00500-020-05544-w⟩

    Access: https://hal.science/hal-03116152

  • Amel Khelifa, Tarek Hamrouni, Riad Mokadem, Faouzi Ben Charrada

    Triadic Concept Analysis-based Data Replication Strategy while satisfying Tenant Performance and provider Profit Guarantees

    International Journal of High Performance Computing and Networking, In press

    Access: https://hal.science/hal-03116182

National journal articles
Special issues of journals
International conference articles
  • Mira El Danaoui, Shaoyi Yin, Abdelkader Hameurlain, Franck Morvan

    A Cost-Effective Query Optimizer for Multi-tenant Parallel DBMSs

    European Conference on Advances in Databases and Information Systems (ADBIS 2023), Sep 2023, Barcelona, Spain. pp.25-34, ⟨10.1007/978-3-031-42941-5_3⟩

    Access: https://hal.science/hal-04200855

  • Antoine Bugnicourt, Riad Mokadem, Franck Morvan, Nadia Bebeshina

    An Error-Based Measure for Concept Drift Detection and Characterization

    Learning and Intelligent Optimization: 17th International Conference, LION 17, Jun 2023, Nice, France. pp.239-253, ⟨10.1007/978-3-031-44505-7_17⟩

    Access: https://hal.science/hal-04206125

  • Damien T Wojtowicz, Shaoyi Yin, Jorge Martinez-Gil, Franck Morvan, Abdelkader Hameurlain

    Multi-Cloud Query Optimisation with Accurate and Efficient Quoting

    IEEE International Conference on BigData (BigData 2022), Dec 2022, Osaka, Japan. pp.228-233, ⟨10.1109/BigData55660.2022.10020835⟩

    Access: https://hal.science/hal-03841516

  • Morgan Séguéla, Riad Mokadem, Jean-Marc Pierson

    Dynamic Energy and Expenditure Aware Data Replication Strategy

    IEEE International Conference on Cloud Computing Technical Program (CLOUD 2022), IEEE, Jul 2022, Barcelona, Spain

    Access: https://hal.science/hal-03696210

  • Jorge Martinez-Gil, Shaoyi Yin, Josef Küng, Franck Morvan

    Matching Large Biomedical Ontologies Using Symbolic Regression

    23rd International Conference on Information Integration and Web Intelligence (iiWAS 2021), Nov 2021, Linz, Austria. pp.162-167, ⟨10.1145/3487664.3487781⟩

    Access: https://hal.science/hal-03853319

  • Jorge Martinez-Gil, Riad Mokadem, Josef Küng, Abdelkader Hameurlain

    A Novel Neurofuzzy Approach for Semantic Similarity Measurement

    23rd International Conference on Big Data Analytics and Knowledge Discovery (DaWaK 2021), Sep 2021, Virtual, France. pp.192-203, ⟨10.1007/978-3-030-86534-4_18⟩

    Access: https://hal.science/hal-03481565

  • Damien T Wojtowicz, Shaoyi Yin, Franck Morvan, Abdelkader Hameurlain

    Cost-Effective Dynamic Optimisation for Multi-Cloud Queries

    IEEE 14th International Conference on Cloud Computing (CLOUD 2021), IEEE Computer Society under the auspice of the Technical Committee on Services Computing (TCSVC), Sep 2021, Chicago (virtual), United States. pp.387-397, ⟨10.1109/CLOUD53861.2021.00052⟩

    Access: https://ut3-toulouseinp.hal.science/hal-03428073v2

  • Morgan Séguéla, Riad Mokadem, Jean-Marc Pierson

    Energy and Expenditure Aware Data Replication Strategy

    14th IEEE International Conference on Cloud Computing (CLOUD 2021), Sep 2021, Chicago (virtual), United States. pp.421-426, ⟨10.1109/CLOUD53861.2021.00056⟩

    Access: https://hal.science/hal-03481520

  • Damien T Wojtowicz, Shaoyi Yin, Franck Morvan

    SLA Definition for Multi-Cloud Queries

    36ème Conférence sur la Gestion de Données : Principes, Technologies et Applications (BDA 2020), LIP6 Sorbonne Université, Oct 2020, Paris (online), France. pp.80

    Access: https://ut3-toulouseinp.hal.science/hal-03211098

  • Morgan Séguéla, Riad Mokadem, Jean-Marc Pierson

    Comparing energy-aware vs. cost-aware data replication strategy

    10th International Green and Sustainable Computing Conference (IGSC 2019), Oct 2019, Alexandria, United States. pp.1-8

    Access: https://hal.science/hal-02950735

National conference articles
  • Morgan Séguéla, Riad Mokadem, Jean-Marc Pierson

    Étude des Stratégies de réplication de données prenant en compte la consommation énergétique vs. le profit économique dans les systèmes Cloud

    Conférence d’informatique en Parallélisme, Architecture et Système (ComPAS 2019), Jun 2019, Anglet, France. pp.1-8

    Access: https://hal.science/hal-02887496

Conference articles without published proceedings
Books
  • Abdelkader Hameurlain, A Min Tjoa

    Transactions on Large-Scale Data- and Knowledge-Centered Systems LIII

    Springer, 13840, 2023, Lecture Notes in Computer Science (LNCS), 0302-9743. ⟨10.1007/978-3-662-66863-4⟩

    Access: https://hal.science/hal-04044446

  • Jorge Martinez-Gil, Shaoyi Yin, Josef Küng, Franck Morvan

    Knowledge Graph Augmentation for Increased Question Answering Accuracy

    Abdelkader Hameurlain; A Min Tjoa. Transactions on Large-Scale Data- and Knowledge-Centered Systems LII, 13470, Springer, pp.70-85, 2022, Lecture Notes in Computer Science book series (LNCS), 978-3-662-66145-1. ⟨10.1007/978-3-662-66146-8_3⟩

    Access: https://hal.science/hal-03842359

  • Abdelkader Hameurlain, A Min Tjoa

    Transactions on Large-Scale Data- and Knowledge-Centered Systems LII

    Springer, 13470, pp.IX, 149, 2022, Lecture Notes in Computer Science book series (LNCS), 978-3-662-66145-1. ⟨10.1007/978-3-662-66146-8⟩

    Access: https://hal.science/hal-03842710

  • Abdelkader Hameurlain, A Min Tjoa, Philippe Lamarre, Karine Zeitouni

    Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIV – Special Issue on Data Management – Principles, Technologies, and Applications

    2020, ⟨10.1007/978-3-662-62271-1⟩

    Access: https://hal.science/hal-03102005

  • Abdelkader Hameurlain, A Min Tjoa, Philippe Lamarre, Karine Zeitouni

    Transactions on Large-Scale Data- and Knowledge-Centered Systems – XLIV

    Springer, LNCS 12380, pp.1–204, 2020, Book series: Transactions on Large-Scale Data- and Knowledge-Centered Systems, 978-3662622711. ⟨10.1007/978-3-662-62271-1⟩

    Access: https://hal.science/hal-04458571

  • Sven Hartmann, Hua Ma, Abdelkader Hameurlain, Günther Pernuel, Roland Wagner

    Database and Expert Systems Applications – Proceedings (Parts 1 and 2) of the 29th International Conference, DEXA 2018, Regensburg, 03/09/2018 – 06/09/2018

    Hartmann, Sven; Ma, Hua; Hameurlain, Abdelkader; Pernuel, Günther; Wagner, Roland. Springer, 11029 (Part 1) and 11030 (Part 2), 2018, Lecture Notes in Computer Science book series (LNCS)

    Access: https://hal.science/hal-03033974

  • Abdelkader Hameurlain, Riad Mokadem

    Special Issue: Elastic Data Management in Cloud Systems

    Abdelkader Hameurlain; Riad Mokadem. CRL Publishing, 32 (4), 2017, International Journal of Computer Systems Science and Engineering, ISSN: 0267-6192

    Access: https://hal.science/hal-03109260

  • Djamal Benslimane, Ernesto Damiani, William Grosky, Abdelkader Hameurlain, Amit P. Sheth, Roland Wagner

    Database and Expert Systems Applications – 28th International Conference, DEXA 2017, Part II

    2017

    Access: https://inria.hal.science/hal-01857556

Book parts
Theses and HDR
  • Damien T Wojtowicz

    Optimisation de requêtes en environnements multi-clouds

    Information and communication sciences. Université Paul Sabatier – Toulouse III, 2023. French. ⟨NNT : 2023TOU30043⟩

    Access: https://theses.hal.science/tel-04202698

  • Morgan Séguéla

    Stratégie de réplication de données prenant en compte la consommation énergétique et la dépense dans les systèmes à grandes échelles

    Information and communication sciences. Université Paul Sabatier – Toulouse III, 2022. French. ⟨NNT : 2022TOU30126⟩

    Access: https://theses.hal.science/tel-03924082

  • Riad Mokadem

    Contribution à la réplication de données dans les systèmes de gestion de données à grande échelle

    Computer Science [cs]. Université Toulouse III – Paul Sabatier (UPS), 2020

    Access: https://hal.science/tel-03116229

  • Max Halford

    Statistical learning for selectivity estimation in relational databases

    Statistics [math.ST]. Université Toulouse 3 – Paul Sabatier, 2020. English.

    Access: https://theses.hal.science/tel-03231204

  • Mohamed Mehdi Kandi

    Allocation de ressources élastique pour l’optimisation de requêtes

    Information Retrieval [cs.IR]. Université Paul Sabatier – Toulouse III, 2019. French. ⟨NNT : 2019TOU30172⟩

    Access: https://theses.hal.science/tel-02619755

  • Damla Oğuz

    Méthodes d’optimisation pour le traitement de requêtes réparties à grande échelle sur des données liées

    Web. Université Paul Sabatier – Toulouse III, 2017. French. ⟨NNT : 2017TOU30067⟩

    Access: https://theses.hal.science/tel-01820773

  • Uras Tos

    Data replication in large-scale data management systems

    Web. Université Paul Sabatier – Toulouse III, 2017. English. ⟨NNT : 2017TOU30066⟩

    Access: https://theses.hal.science/tel-01820748

  • Chiraz Moumen

    Une méthode d’optimisation hybride pour une évaluation robuste de requêtes

    Computer Arithmetic. Université Paul Sabatier – Toulouse III, 2017. French. ⟨NNT : 2017TOU30070⟩

    Access: https://theses.hal.science/tel-01820739

    • Djaouad Benachir

      Méthodes de séparation aveugle de sources pour le démélange d’images de télédétection

      Master’s Thesis, Université de Toulouse, November 2014.


    • Igor Epimakhov

      Allocation des ressources pour l’optimisation de requêtes dans les systèmes de grille de données.

      Master’s Thesis, Université Paul Sabatier, July 2013.


    • Imen Ketata

      Méthode de découverte de sources de données en environnement de grille de données en tenant compte de la sémantique

      Master’s Thesis, Université Paul Sabatier, January 2012.


    • Deniz Cokuslu

      Resource Discovery and Allocation for Query Processing in Grid Systems

      Master’s Thesis, Université Paul Sabatier, November 2012.


    • Raddad Al King

      Localisation de sources de données et optimisation de requêtes réparties en environnement pair-à-pair

      Master’s Thesis, Université Paul Sabatier, May 2010.


    • Mahmoud El Samad

      Découverte et monitoring de ressources pour le traitement de requêtes dans une grille de données

      Master’s Thesis, Université Paul Sabatier, December 2009.


    • Christelle Pierkot

      Gestion de la mise à jour de données géographiques répliquées

      Master’s Thesis, Université Paul Sabatier, July 2008.


    • Belgin Ergenç

      Query Execution for Restricted Sources in a Large Scale Data Integration Environment

      Master’s Thesis, Université Paul Sabatier, January 2008.


    • Nadhem Marsit

      Traitement des requêtes dépendant de la localisation avec des contraintes de temps réel

      Master’s Thesis, Université Paul Sabatier, December 2007.


    • Franck Morvan

      Optimisation dynamique de requêtes : du centralisé au décentralisé

      HDR, Université Paul Sabatier, December 2006.


Reports
  • Shaoyi Yin, Franck Morvan, Jorge Martinez-Gil, Abdelkader Hameurlain

    MTD-DS: an SLA-aware Decision Support Benchmark for Multi-tenant Parallel DBMSs

    IRIT/RR–2023–05–FR, IRIT – Institut de Recherche en Informatique de Toulouse. 2023

    Access: https://hal.science/hal-04312262

  • Morgan Séguéla, Riad Mokadem, Jean-Marc Pierson

    Energy and Expenditure Aware Data Replication Strategy

    [Research Report] IRIT/RR–2021–07–FR, Institut de Recherche en Informatique de Toulouse (IRIT), Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse cedex 9. 2021

    Access: https://hal.science/hal-03378928

  • Riad Mokadem, Abdelkader Hameurlain

    An Elastic Data Replication Strategy with Performance and Availability Guarantees in Cloud Systems

    [Research Report] IRIT. 2017

    Access: https://hal.science/hal-03109256

  • Chiraz Moumen, Franck Morvan, Abdelkader Hameurlain

    An Hybrid Method to Robust Query Processing with Respect to Estimation Errors

    [Research Report] IRIT. 2017

    Access: https://hal.science/hal-03109255


Team Contracts

    Acronym: Labex CIMI
    Title: Centre International de Mathématiques et d’Informatique (de Toulouse)
    Scientific leads: Joseph GERGAUD, Christine ROCHANGE, Franck MORVAN, Denis KOUAMÉ, Lotfi CHAARI, Urtzi AYESTA, Boris TEABE, Thomas CARLE, Emmanuel SOUBIES, José Henrique DE MORAIS GOULART
    Start – end: 2012 – 2024
    Wednesday 26 April 2023, 10:00
    Optimisation de requêtes en environnements multi-clouds
    Damien WOJTOWICZ – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Auditorium J. Herbrand
    #thesis
    Wednesday 4 May 2022, 14:00
    Stratégie de réplication de données dans des systèmes larges échelles pour prendre en compte la consommation énergétique et la dépense
    Morgan SEGUELA – Team PYRAMIDE, Team SEPIA, IRIT
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #thesis
    Monday 12 October 2020, 14:00
    Apprentissage statistique pour l’estimation de sélectivité en bases de données relationnelles
    Max HALFORD – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Auditorium J. Herbrand
    #thesis
    Friday 29 November 2019, 10:00
    Elastic resource allocation for query optimization
    Mohamed Mehdi KANDI – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #thesis
    Wednesday 28 June 2017, 10:00
    Optimization Methods for Large-scale Distributed Query Processing on Linked Data
    Damla DEMIRTAS – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #thesis
    Tuesday 27 June 2017, 10:00
    Data Replication in Large Scale Data Management Systems
    Uras TOS – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #thesis
    Monday 29 May 2017, 10:30
    Une méthode d’optimisation hybride pour une évaluation robuste de requêtes
    Chiraz MOUMEN – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #thesis
    Tuesday 4 September 2018
    BDMICS 2018: 3rd International Workshop on Big Data Management in Cloud Systems
    Regensburg (Germany)
    #conference
    Monday 3 September 2018 – Thursday 6 September 2018
    29th DEXA Conferences and Workshops
    Regensburg (Germany)
    #conference
    Monday 25 June 2018 – Friday 29 June 2018
    AstroInfo2018: Ecole Thématique AstroInformatique 2018
    Polytech Marseille – Parc scientifique et technologique de Luminy, Marseille
    #conference
    Tuesday 29 August 2017 – Wednesday 30 August 2017
    2nd International Workshop on Big Data Management in Cloud Systems
    Lyon
    #conference
    Monday 28 August 2017 – Thursday 31 August 2017
    28th DEXA Conferences and Workshops
    Lyon
    #conference
    Tuesday 6 September 2016 – Wednesday 7 September 2016
    1st International Workshop on Big Data Management in Cloud Systems
    Porto (Portugal)
    #conference
    Monday 5 September 2016 – Thursday 8 September 2016
    27th International Conference on Database and Expert Systems Applications
    Porto (Portugal)
    #conference
    Tuesday 5 July 2016 – Wednesday 6 July 2016
    Journée Action MAESTRO du GDR MADICS: Masse de données en Astrophysique
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #conference
    Tuesday 1 September 2015 – Friday 4 September 2015
    26th International Conference on Database and Expert Systems Applications
    Valencia (Spain)
    #conference
    Tuesday 1 September 2015 – Wednesday 2 September 2015
    8th International Conference on Data Management in Cloud, Grid and P2P Systems
    Valencia (Spain)
    #conference
    Monday 29 August 2022, 16:00 – 18:00
    Indexation des ressources dans les environnements connectés
    Richard CHBEIR – LIUPPA (France)
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #seminar
    Wednesday 13 November 2019, 10:30 – 12:30
    L’évolution de la découverte des dépendances fonctionnelles à partir des données
    Noël NOVELLI – Université d’Aix-Marseille
    UT3 Paul Sabatier, IRIT, Salle 003
    #seminar
    Wednesday 6 February 2019, 14:00 – 16:00
    Scaling OLTP throughput over data centres
    Joseph VELLA – University of Malta
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #seminar
    Tuesday 10 July 2018, 11:00 – 12:00
    Utilisation des Dépendances fonctionnelles pour l’optimisation des Requêtes multidimensionnelles
    Sofian MAABOUT – LaBRI, Bordeaux
    UT3 Paul Sabatier, IRIT, Salle des Thèses
    #seminar
    Tuesday 17 November 2020, 10:00
    Contribution à la réplication de données dans les systèmes de gestion de données à grande échelle
    Riad MOKADEM – Team PYRAMIDE, IRIT
    UT3 Paul Sabatier, IRIT, Salle des Thèses and by videoconference
    #hdr