
Data storage elasticity – quick view on master thesis work (part 2)

In this second part I welcome Nicolas Degroodt, who explains how he extended YCSB to implement a TPC-C benchmark for NoSQL stores. In this post, we use the term DBMS for any storage framework, whether it is an RDBMS or a NoSQL store.

When dealing with new DBMSs in a cloud environment, traditional benchmarks (such as TPC-C [1]) need to be redesigned. First of all, the semantics of their queries are too rigid and cannot be applied, as defined, to a NoSQL system. For example, a key-value store cannot directly execute a GROUP BY SQL statement. These queries need to be translated for the storage system under test.
Secondly, their data schemas are designed for relational databases. Storing the data as described by TPC in a non-relational database would perform poorly. The data schema needs to be modeled by taking into account the underlying DBMS, its indexing scheme, its data sharding policy, etc.
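
To illustrate the first point, here is a minimal Python sketch of how a GROUP BY aggregation might be emulated on top of a store that only maps keys to records. The in-memory dict, the key layout ("order_line:<order>:<line>") and the function name are assumptions made for illustration; they are not taken from the thesis work.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a key-value store: key -> record.
store = {
    "order_line:1:1": {"o_id": 1, "amount": 10.0},
    "order_line:1:2": {"o_id": 1, "amount": 5.5},
    "order_line:2:1": {"o_id": 2, "amount": 7.0},
}


def sum_amount_by_order(kv, prefix):
    """Client-side equivalent of:
    SELECT o_id, SUM(amount) FROM order_line GROUP BY o_id
    """
    totals = defaultdict(float)
    # The store cannot group on its own, so the client scans every key
    # with the given prefix and aggregates the records itself.
    for key, record in kv.items():
        if key.startswith(prefix):
            totals[record["o_id"]] += record["amount"]
    return dict(totals)


totals = sum_amount_by_order(store, "order_line:")
```

The full scan is what makes the naive translation expensive, which is one reason the data schema usually has to be remodeled for the target store rather than copied from the relational one.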

We proposed to adapt TPC-C to these new DBMSs and to study the related issues. We proposed to describe a new TPC-C-like benchmark, which defines data and queries as functional values (instead of semantic fields).
On the other hand, the Yahoo! Cloud Serving Benchmark (YCSB) proposes a framework [2] and a common set of workloads for evaluating the performance of different key-value and cloud serving stores. Its approach consists of testing DBMSs using only their common basic functions (i.e. put, get and delete operations).
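
The idea of testing every store through the same basic operations can be sketched as a small abstract interface. The class and method names below follow the wording of this post (put, get, delete) and are hypothetical; they are not YCSB's actual Java API, and the in-memory implementation only stands in for a real store.

```python
from abc import ABC, abstractmethod


class KeyValueStore(ABC):
    """Hypothetical minimal interface that every tested store must provide."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def delete(self, key): ...


class InMemoryStore(KeyValueStore):
    """Stand-in implementation backed by a dict, for illustration only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)  # None when the key is absent

    def delete(self, key):
        self._data.pop(key, None)


def run_workload(db, operations):
    """Apply (operation, key, value) tuples and collect the results of gets."""
    results = []
    for op, key, value in operations:
        if op == "put":
            db.put(key, value)
        elif op == "get":
            results.append(db.get(key))
        elif op == "delete":
            db.delete(key)
        else:
            raise ValueError(f"unknown operation: {op}")
    return results


results = run_workload(
    InMemoryStore(),
    [
        ("put", "user1", {"name": "a"}),
        ("get", "user1", None),
        ("delete", "user1", None),
        ("get", "user1", None),
    ],
)
```

Because the workload only depends on the interface, the same list of operations could be replayed against any store that implements these three methods.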

Figure: architecture of the Yahoo! Cloud Serving Benchmark (YCSB), from [3]

The modular architecture of YCSB (see figure above, from [3]) is very powerful and easy to configure: for instance, scenarios are defined by inline parameters such as the number of clients. We adapted this architecture to meet our requirements: (1) we added a pseudo-random data generator module that produces not only data but also parameters for TPC-C transactions, and (2) we added a queries generator module that produces queries for loading data into the database (according to the data schema) and for querying the DBMS (according to its syntax).
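
As a rough illustration of these two modules, the Python sketch below generates the parameters of a simplified TPC-C New-Order transaction and turns them into put operations. The nurand function follows the NURand definition of the TPC-C specification; everything else (function names, the key layout, the constant C defaulting to 0, and the omission of remote warehouses and rollback cases) is an assumption for illustration and not the code of the adapted framework.

```python
import random


def nurand(rng, a, x, y, c=0):
    """Non-uniform random integer in [x, y], as defined by TPC-C (NURand)."""
    return (((rng.randint(0, a) | rng.randint(x, y)) + c) % (y - x + 1)) + x


def new_order_params(rng, w_id):
    """Data generator: parameters of a simplified New-Order transaction."""
    ol_cnt = rng.randint(5, 15)  # number of order lines
    return {
        "w_id": w_id,
        "d_id": rng.randint(1, 10),          # district
        "c_id": nurand(rng, 1023, 1, 3000),  # customer
        "lines": [
            {"i_id": nurand(rng, 8191, 1, 100000), "quantity": rng.randint(1, 10)}
            for _ in range(ol_cnt)
        ],
    }


def to_put_operations(o_id, params):
    """Queries generator: map the parameters to (key, value) puts.

    The key layout is hypothetical; a real one depends on the target store.
    """
    base = f"order:{params['w_id']}:{params['d_id']}:{o_id}"
    ops = [(base, {"c_id": params["c_id"], "ol_cnt": len(params["lines"])})]
    for n, line in enumerate(params["lines"], start=1):
        ops.append((f"{base}:line:{n}", line))
    return ops


rng = random.Random(42)  # fixed seed so that a run can be reproduced
params = new_order_params(rng, w_id=1)
ops = to_put_operations(o_id=1, params=params)
```

Keeping the parameter generation separate from the translation into store operations means only to_put_operations would need to change when the target DBMS or its data schema changes.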


We successfully tested this adapted framework in the EURA NOVA lab on Cassandra 0.5 deployed on a 4-node cluster. To test another DBMS, we need to adapt the Queries Generator and DB Interface Layer modules. In our next work, we will continue working with YCSB, focusing on the elasticity aspects (definition, criteria and test scenarios).

References:

[1] Transaction Processing Performance Council, http://www.tpc.org/
[2] YCSB Application, https://github.com/brianfrankcooper/YCSB
[3] YCSB Article, http://research.yahoo.com/files/ycsb.pdf


 
