An empirical comparison of graph databases

Engineering

October 10, 2013

In this blog post we briefly describe our new contribution to the big data domain, especially, graphDB benchmarking. This work was accepted for publication at 2013 ASE/IEEE International Conference on Big Data.

In this work, we presented a distributed graph database benchmarking framework. We used this tool to analyze the performance of four graph databases: Neo4j 1.9M05 , Titan 0.3, OrientDB 1.3 and DEX 4.7.

We developed a Java distributed benchmarking framework, in order to test and compare different Blueprints-compliant graph databases. This tool can be used to simulate real graph database workloads with any number of concurrent clients performing any type of operation on any type of graph.

The main purpose of our solution is to objectively compare graph databases using usual graph operations. Indeed, some operations like exploring a node neighborhood, finding the shortest path between two nodes, or simply getting all vertices that share a specific property are frequent when working with graph databases. It can thus be interesting to compare their behavior when performing this type of operation.

Concretely, we compared the performance of the graphDB by means of different workloads. A workload represents a unit of work for a graph database. We defined three types of workloads:

1. Load workload

2. Traversal workload

Shortest path workload
Neighborhood exploration workload

3. Intensive workload

GET vertices/edges by ID
GET vertices/edges by property
GET vertices by ID and UPDATE property
GET two vertices and ADD an edge between them

Architecture:
Our solution works as follows: the user defines a benchmark, that first contains a list of databases to compare and then a series of operations, the workloads to realize on each database. This benchmark is then executed by a module called Operational Module, whose responsibility is to start the databases and measure the time required to perform the operations specified in the benchmark. Finally, the Statistics Module gathers and aggregates all these measures and produces a summary report together with some results visualizations.
See the following figure:

If you’re interested to know more about the results, please read the paper “An empirical comparison of graph databases” and/or contact us.

Salim Jouili
Twitter: @jouilis
E-mail: [email protected]

Releated Posts

Evaluation of GraphRAG Strategies for Efficient Information Retrieval

24.12.2025 / Engineering / Papers

Traditional RAG systems struggle to capture relationships and cross-references between different sources unless explicitly mentioned. This challenge is common in real-world scenarios, where information is often distributed and interlinked, making graphs a more effective representation. Our work provides a technical contribution through a comparative evaluation of retrieval strategies within GraphRAG, focusing on context relevance rather than abstract metrics. We aim to offer practitioners actionable insights into the retrieval component of the GraphRAG pipeline.

Flight Load Factor Predictions based on Analysis of Ticket Prices and other Factors

22.12.2025 / Data science / Papers

The ability to forecast traffic and to size the operation accordingly is a determining factor, for airports. However, to realise its full potential, it needs to be considered as part of a holistic approach, closely linked to airport planning and operations. To ensure airport resources are used efficiently, accurate information about passenger numbers and their effects on the operation is essential. Therefore, this study explores machine learning capabilities enabling predictions of aircraft load factors.

An empirical comparison of graph databases

Releated Posts

Evaluation of GraphRAG Strategies for Efficient Information Retrieval

Flight Load Factor Predictions based on Analysis of Ticket Prices and other Factors

Recent Posts

Evaluation of GraphRAG Strategies for Efficient Information Retrieval

Flight Load Factor Predictions based on Analysis of Ticket Prices and other Factors

Investigating a Feature Unlearning Bias Mitigation Technique for Cancer-type Bias in AutoPet Dataset

Muppet: A Modular and Constructive Decomposition for Perturbation-based Explanation Methods

Tracks

Mjolnir

Rune

Vadgelmir

Yggdrasil

Field of expertises

Data architecture

Data governance

Data science

Engineering

Academic collaboration

SERVE

Expertise

CRAFT

digazu

CONTACT

Belgium

France

Tunisia

CAREER

Job Offers

Social media