


Nowadays, large amounts of data are produced by many different sources (e.g. scientific observation, simulation, sensors, logs, social networks, finance). These data, often referred to as Big Data and characterized by the 4Vs (Volume, Variety, Velocity, and Value), are distributed on a large scale, heterogeneous, and produced continuously. Managing them raises new problems and presents real challenges: modeling, storage, processing, optimization, cost models, replication, data privacy and security, and monitoring services, among others. In parallel and distributed large-scale environments (Cluster, Grid, Cloud), the Pyramid team addresses the main problems of query processing and optimization, targeting large volumes of data (the first of the Vs: Volume) distributed on a large scale.

There are two main approaches to managing huge amounts of data: parallel database systems and cloud systems (e.g. Hadoop, MapReduce, HDFS). Parallel database systems have been a major success, first in research in the early 1990s and now in industry. They have enabled many applications handling large data volumes to meet their requirements in terms of performance (e.g. response time) and resource availability. It is recognized, however, that parallel database systems are very expensive and require high-level skills within the company to administer the systems and databases. Cloud systems, for their part, allow a company to reduce these infrastructure costs, either by purchasing a cluster of low-cost commodity machines or by renting resources from a service provider on a pay-per-use basis. Public clouds provide on-demand resources and services with the advantages of scalability and elasticity. However, the elasticity paradigm raises a new challenge regarding the design of efficient and profitable resource allocation models. Moreover, as data volumes grow every day, cloud systems should provide replication mechanisms that ensure high performance and data availability and that integrate fault tolerance.

Thus, the main characteristics of public cloud systems are: (i) traditional infrastructures are replaced by clusters of low-cost commodity hardware, (ii) users become tenants of a shared infrastructure (multi-tenancy), because public cloud systems are neither owned nor managed by their customers, (iii) elasticity and pay-per-use: services are provided on demand and invoiced according to the resources consumed, and (iv) performance isolation: a minimum QoS (Quality of Service) should be ensured for every tenant.

The research activities of the Pyramid team focus on the design and development of new elastic resource allocation models for dynamic query optimization, while making the most of fundamental results obtained in parallel and distributed systems, particularly those relating to the types of parallelism (partitioned, independent, and pipeline parallelism) and to the minimization of inter-operation communication costs.

Our approach is based on the best trade-off between: (i) efficiency (tenant satisfaction/QoS) and (ii) cost-effectiveness (for service providers, with respect to IaaS/SaaS, while meeting the SLA (Service Level Agreement)). The originality of these new resource allocation models lies in: (i) the introduction of a profitability dimension (i.e. an economic model) into the objective function, and (ii) the decentralization of control, which ensures scalability through the integration of a proactive migration policy.
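To make the idea of a profitability term in the objective function more concrete, here is a minimal sketch in Python. It is not the Pyramid team's model: the response-time formula, the `Pricing` fields, the function names, and all numeric values are hypothetical, chosen only to illustrate how an allocation could weigh tenant QoS (an SLA on response time) against the provider's resource cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical economic parameters for one query."""
    revenue_per_query: float     # amount invoiced to the tenant
    cost_per_node_second: float  # provider's cost of running one node for one second
    sla_penalty: float           # amount refunded when the SLA is violated


def response_time(work_seconds: float, nodes: int, comm_overhead: float) -> float:
    """Toy model (an assumption): the work is evenly partitioned across nodes,
    plus a communication cost that grows with the number of nodes."""
    return work_seconds / nodes + comm_overhead * (nodes - 1)


def profit(nodes: int, work_seconds: float, sla_seconds: float,
           pricing: Pricing, comm_overhead: float) -> float:
    """Objective function: revenue minus resource cost minus SLA penalty."""
    rt = response_time(work_seconds, nodes, comm_overhead)
    resource_cost = nodes * rt * pricing.cost_per_node_second
    penalty = pricing.sla_penalty if rt > sla_seconds else 0.0
    return pricing.revenue_per_query - resource_cost - penalty


def best_allocation(max_nodes: int, work_seconds: float, sla_seconds: float,
                    pricing: Pricing, comm_overhead: float) -> int:
    """Exhaustively pick the node count that maximizes the provider's profit."""
    return max(
        range(1, max_nodes + 1),
        key=lambda n: profit(n, work_seconds, sla_seconds, pricing, comm_overhead),
    )


# Illustrative values only: 120 s of sequential work, a 20 s SLA, up to 16 nodes.
pricing = Pricing(revenue_per_query=1.0, cost_per_node_second=0.001, sla_penalty=0.5)
nodes = best_allocation(16, 120.0, 20.0, pricing, comm_overhead=0.5)
print(nodes)  # 8 under these toy numbers
```

With these toy numbers, fewer than 8 nodes miss the SLA and incur the penalty, while more than 8 nodes only add resource cost, so the most profitable allocation is the smallest one that still satisfies the tenant. A realistic model would replace the exhaustive search and the toy response-time formula with a proper cost model.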


The main research issues tackled by the Pyramid team are briefly described below:



Elastic Resource Allocation

The objective is to design and develop elastic resource allocation models. In fact, the allocated resources (on the provider side) should increase or decrease in accordance with the demand of (...)


Performance Isolation

From the point of view of a DaaS (Database-as-a-Service) provider, multi-tenancy allows the resources of a single database server to be shared in order to achieve cost-effectiveness. However, for an (...)
