
Data storage elasticity – quick view on master thesis work (part 2)

In this second part, I welcome Nicolas Degroodt, who explains how he extended YCSB to implement a TPC-C benchmark for NoSQL systems. In this post, we use "DBMS" to mean any storage framework, whether it is an RDBMS or a NoSQL store.

When dealing with new DBMS in a cloud environment, traditional benchmarks (like TPC-C [1]) need to be redesigned. First, the semantics of their queries are too rigid and cannot be applied, as defined, to a NoSQL system. For example, a key-value store cannot directly execute a SQL GROUP BY statement. These queries need to be translated for the storage system under test.
Second, their data schemas are designed for relational databases. Storing data as described by TPC in a non-relational database would perform poorly. The data schema needs to be modeled with the underlying DBMS in mind: its indexing scheme, its data sharding policy, etc.
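To make the first point concrete, here is a minimal sketch of how a GROUP BY could be translated for a key-value store. Everything in it is an assumption made for illustration: a plain Python dict stands in for the store, and the key layout "order:<warehouse>:<district>:<order_id>" is a hypothetical schema chosen so that a prefix scan can replace the WHERE clause. It is not the translation used in the thesis.

```python
from collections import Counter

# Hypothetical key-value store: a plain dict stands in for the real DBMS.
# Assumed key layout: "order:<warehouse>:<district>:<order_id>".
store = {
    "order:1:1:1001": {"customer": 7, "lines": 5},
    "order:1:1:1002": {"customer": 9, "lines": 8},
    "order:1:2:1003": {"customer": 7, "lines": 6},
    "order:2:1:1004": {"customer": 3, "lines": 5},
}


def scan(kv, prefix):
    """Yield (key, value) pairs whose key starts with prefix, in key order."""
    for key in sorted(kv):
        if key.startswith(prefix):
            yield key, kv[key]


def count_orders_by_district(kv, warehouse):
    """Client-side equivalent of:
    SELECT district, COUNT(*) FROM orders WHERE warehouse = ? GROUP BY district
    """
    counts = Counter()
    for key, _value in scan(kv, f"order:{warehouse}:"):
        district = int(key.split(":")[2])  # third key component is the district
        counts[district] += 1
    return dict(counts)


result = count_orders_by_district(store, 1)
```

The grouping happens in the client, and the key layout decides how cheap the scan is. This is why the data schema and the query translation have to be designed together for each tested store.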

We proposed to adapt TPC-C to these new DBMS and to study the related issues. We proposed a new, TPC-C-like benchmark, which defines data and queries as functional values (instead of semantic fields).
Meanwhile, the Yahoo! Cloud Serving Benchmark (YCSB) provides a framework [2] and a common set of workloads for evaluating the performance of different key-value and cloud serving stores. Its approach consists of testing DBMS using only their common basic functions (i.e. put, get and delete operations).
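The idea of benchmarking through a small common interface can be sketched as follows. This is a hypothetical Python illustration, not YCSB's actual Java API: the names KeyValueStore, InMemoryStore and run_workload are invented here, and an in-memory dict replaces a real store.

```python
import time
from abc import ABC, abstractmethod


class KeyValueStore(ABC):
    """Hypothetical minimal interface: only the operations every store offers."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def delete(self, key): ...


class InMemoryStore(KeyValueStore):
    """Stand-in binding; a real one would call the client library of the DBMS."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


def run_workload(db, operations):
    """Apply (op, key, value) tuples; return per-operation counts and elapsed seconds."""
    counts = {"put": 0, "get": 0, "delete": 0}
    start = time.perf_counter()
    for op, key, value in operations:
        if op == "put":
            db.put(key, value)
        elif op == "get":
            db.get(key)
        elif op == "delete":
            db.delete(key)
        else:
            raise ValueError(f"unknown operation: {op}")
        counts[op] += 1
    return counts, time.perf_counter() - start


db = InMemoryStore()
workload = [("put", "k1", "v1"), ("get", "k1", None), ("delete", "k1", None)]
counts, elapsed = run_workload(db, workload)
```

Because the workload only knows the three operations, the same list of operations can be replayed against any store for which a binding exists.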

Figure: The Yahoo! Cloud Serving Benchmark (YCSB)

The modular architecture of YCSB (see figure above, from [3]) is very powerful and easy to configure. For instance, scenarios are defined by command-line parameters such as the number of clients. We adapted this architecture to meet our requirements: (1) we added a pseudo-random data generator module that produces not only data but also parameters for TPC-C transactions, and (2) we added a queries generator module that produces queries for loading data into the database (according to the data schema) and for running the workload against the DBMS (according to its syntax).
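The two added modules can be sketched roughly as below. This is an illustrative assumption, not the thesis code. The nurand function follows the non-uniform random formula of the TPC-C specification [1]. The constants (1023 for customer ids, 8191 for item ids), the value ranges, and the key layout used by new_order_to_puts are not given in this post and should be checked against the specification and the target schema.

```python
import random


def nurand(rng, a, x, y, c=0):
    """Non-uniform random integer in [x, y], after the TPC-C NURand formula."""
    return (((rng.randint(0, a) | rng.randint(x, y)) + c) % (y - x + 1)) + x


def new_order_params(rng, warehouse_id, districts=10, customers=3000, items=100000):
    """Data generator: draw the input parameters of one New-Order-like transaction."""
    line_count = rng.randint(5, 15)  # assumed number of order lines per order
    return {
        "w_id": warehouse_id,
        "d_id": rng.randint(1, districts),
        "c_id": nurand(rng, 1023, 1, customers),
        "lines": [
            {"i_id": nurand(rng, 8191, 1, items), "quantity": rng.randint(1, 10)}
            for _ in range(line_count)
        ],
    }


def new_order_to_puts(params, order_id):
    """Queries generator (hypothetical key layout): turn parameters into puts."""
    prefix = f"order:{params['w_id']}:{params['d_id']}:{order_id}"
    header = {"c_id": params["c_id"], "ol_cnt": len(params["lines"])}
    ops = [("put", prefix, header)]
    for number, line in enumerate(params["lines"], start=1):
        ops.append(("put", f"{prefix}:line:{number}", line))
    return ops


rng = random.Random(42)  # fixed seed so that a run can be reproduced
params = new_order_params(rng, warehouse_id=1)
ops = new_order_to_puts(params, order_id=1)
```

Keeping the parameter generation separate from the query generation means only new_order_to_puts has to change when the data schema or the target DBMS changes.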


We successfully tested this adapted framework in the EURA NOVA lab on Cassandra 0.5 deployed on a 4-node cluster. To test another DBMS, we need to adapt the Queries Generator and the DB Interface Layer modules. In our next work, we will continue working with YCSB, focusing on the elasticity aspects (definition, criteria and test scenarios).

References:

[1] Transaction Processing Performance Council, http://www.tpc.org/
[2] YCSB Application, https://github.com/brianfrankcooper/YCSB
[3] YCSB Article, http://research.yahoo.com/files/ycsb.pdf


 
