An empirical comparison of graph databases

In this blog post we briefly describe our new contribution to the big data domain, specifically graph database (graphDB) benchmarking. This work was accepted for publication at the 2013 ASE/IEEE International Conference on Big Data.

In this work, we presented a distributed graph database benchmarking framework. We used this tool to analyze the performance of four graph databases: Neo4j 1.9M05, Titan 0.3, OrientDB 1.3 and DEX 4.7.

We developed a distributed benchmarking framework in Java to test and compare Blueprints-compliant graph databases. The tool can simulate realistic graph database workloads with any number of concurrent clients performing any type of operation on any type of graph.

The main purpose of our solution is to compare graph databases objectively using common graph operations. Operations such as exploring a node's neighborhood, finding the shortest path between two nodes, or simply retrieving all vertices that share a given property are frequent when working with graph databases, so it is interesting to compare how each database behaves when performing them.
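To make these operations concrete, here is a minimal sketch of two of them (neighborhood exploration and shortest path) on a plain in-memory adjacency-list graph. This is not the framework's code and the class and method names are our own; it only illustrates the kind of traversal work the benchmark exercises:

```java
import java.util.*;

// Minimal undirected graph illustrating the operations benchmarked.
class TinyGraph {
    private final Map<Integer, Set<Integer>> adj = new HashMap<>();

    void addEdge(int a, int b) {
        adj.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adj.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Neighborhood exploration: all vertices directly connected to v.
    Set<Integer> neighbors(int v) {
        return adj.getOrDefault(v, Collections.emptySet());
    }

    // Shortest path length between two vertices via breadth-first search;
    // returns -1 when no path exists.
    int shortestPath(int from, int to) {
        if (from == to) return 0;
        Map<Integer, Integer> dist = new HashMap<>();
        dist.put(from, 0);
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(from);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            for (int w : neighbors(v)) {
                if (!dist.containsKey(w)) {
                    dist.put(w, dist.get(v) + 1);
                    if (w == to) return dist.get(w);
                    queue.add(w);
                }
            }
        }
        return -1;
    }
}
```

A real graph database exposes the same operations through its own API (or through Blueprints), which is precisely what makes their execution times comparable across systems.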

Concretely, we compared the performance of the graph databases by means of different workloads. A workload represents a unit of work for a graph database. We defined three types of workloads:

1. Load workload

2. Traversal workload

  • Shortest path workload
  • Neighborhood exploration workload

3. Intensive workload

  • GET vertices/edges by ID
  • GET vertices/edges by property
  • GET vertices by ID and UPDATE property
  • GET two vertices and ADD an edge between them

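As an illustration of the intensive workloads listed above, the following sketch times repeated "GET vertex by ID, then UPDATE a property" operations. The map-backed store, class name, and property key are hypothetical stand-ins, not the framework's actual code:

```java
import java.util.*;

// Hypothetical sketch of an intensive workload: repeated
// "get vertex by ID, then update a property" operations, timed per call.
class IntensiveWorkloadSketch {
    // Stand-in for a graph database: vertex ID -> property map.
    static final Map<Integer, Map<String, Object>> store = new HashMap<>();

    // Runs nOps operations against random vertex IDs and records
    // the latency of each operation in nanoseconds.
    static long[] run(int nOps, Random rnd) {
        long[] latenciesNanos = new long[nOps];
        for (int i = 0; i < nOps; i++) {
            int id = rnd.nextInt(store.size());
            long t0 = System.nanoTime();
            Map<String, Object> vertex = store.get(id);   // GET by ID
            vertex.put("visited", Boolean.TRUE);          // UPDATE property
            latenciesNanos[i] = System.nanoTime() - t0;
        }
        return latenciesNanos;
    }
}
```

In the actual framework these operations run against a live graph database through its Blueprints interface, possibly from many concurrent clients, rather than against an in-memory map.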

Architecture:
Our solution works as follows: the user defines a benchmark, which contains a list of databases to compare and a series of workloads to run on each database. This benchmark is then executed by a module called the Operational Module, whose responsibility is to start the databases and measure the time required to perform the operations specified in the benchmark. Finally, the Statistics Module gathers and aggregates all these measurements and produces a summary report together with some visualizations of the results.
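The flow just described can be sketched as follows. The class names mirror the module names above, but the code itself is illustrative only, not the framework's implementation:

```java
import java.util.*;

// Illustrative sketch of the benchmark flow: run each workload,
// measure its elapsed time, then aggregate the measurements.
class BenchmarkSketch {
    // A workload is just a named unit of work here.
    static class Workload {
        final String name;
        final Runnable work;
        Workload(String name, Runnable work) { this.name = name; this.work = work; }
    }

    // Operational Module: runs each workload and records elapsed time.
    static Map<String, Long> runAll(List<Workload> workloads) {
        Map<String, Long> timesNanos = new LinkedHashMap<>();
        for (Workload w : workloads) {
            long t0 = System.nanoTime();
            w.work.run();
            timesNanos.put(w.name, System.nanoTime() - t0);
        }
        return timesNanos;
    }

    // Statistics Module: aggregates raw measurements into a summary value.
    static double meanMillis(Map<String, Long> timesNanos) {
        return timesNanos.values().stream()
                .mapToLong(Long::longValue).average().orElse(0) / 1e6;
    }
}
```

In the real framework the Operational Module also starts and stops each database under test, and the Statistics Module produces a full report with visualizations rather than a single mean.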
See the following figure:

[Figure: overview of the benchmarking framework architecture]

If you are interested in learning more about the results, please read the paper “An empirical comparison of graph databases” and/or contact us.

Salim Jouili
Twitter: @jouilis
E-mail: [email protected]
