An empirical comparison of graph databases

In this blog post we briefly describe our new contribution to the big data domain, specifically to graph database benchmarking. This work was accepted for publication at the 2013 ASE/IEEE International Conference on Big Data.

In this work, we presented a distributed graph database benchmarking framework. We used this tool to analyze the performance of four graph databases: Neo4j 1.9M05, Titan 0.3, OrientDB 1.3 and DEX 4.7.

We developed a distributed benchmarking framework in Java to test and compare different Blueprints-compliant graph databases. This tool can be used to simulate real graph database workloads with any number of concurrent clients performing any type of operation on any type of graph.

The main purpose of our solution is to objectively compare graph databases on common graph operations. Operations such as exploring a node's neighborhood, finding the shortest path between two nodes, or simply getting all vertices that share a specific property are frequent when working with graph databases. It is therefore useful to compare how the databases behave when performing these operations.
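To make these three operations concrete, here is a minimal sketch on a toy in-memory graph. It does not use the Blueprints API or any of the benchmarked databases, and it is not the framework's code: the class `ToyGraph` and all of its method names are hypothetical, chosen only for illustration. Edges are assumed to be undirected and unweighted, so the shortest path is measured in hops with a breadth-first search.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy graph for illustration only (not the Blueprints API).
class ToyGraph {
    private final Map<Integer, Set<Integer>> adjacency = new HashMap<>();
    private final Map<Integer, Map<String, String>> properties = new HashMap<>();

    void addVertex(int id, String key, String value) {
        adjacency.computeIfAbsent(id, k -> new HashSet<>());
        properties.computeIfAbsent(id, k -> new HashMap<>()).put(key, value);
    }

    // Adds an undirected edge, creating the endpoints if needed.
    void addEdge(int a, int b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Neighborhood exploration: every vertex within maxHops of start, excluding start.
    Set<Integer> neighborhood(int start, int maxHops) {
        Set<Integer> result = new HashSet<>(bfs(start, maxHops).keySet());
        result.remove(start);
        return result;
    }

    // Shortest path length in hops, or -1 if no path exists.
    int shortestPathLength(int from, int to) {
        Integer distance = bfs(from, Integer.MAX_VALUE).get(to);
        return distance == null ? -1 : distance;
    }

    // All vertex IDs whose property `key` equals `value`, in ascending order.
    List<Integer> verticesWithProperty(String key, String value) {
        List<Integer> matches = new ArrayList<>();
        for (Map.Entry<Integer, Map<String, String>> e : properties.entrySet()) {
            if (value.equals(e.getValue().get(key))) {
                matches.add(e.getKey());
            }
        }
        Collections.sort(matches);
        return matches;
    }

    // Breadth-first search returning hop distances, limited to maxHops.
    private Map<Integer, Integer> bfs(int start, int maxHops) {
        Map<Integer, Integer> distance = new HashMap<>();
        if (!adjacency.containsKey(start)) {
            return distance;
        }
        Deque<Integer> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            int hops = distance.get(current);
            if (hops == maxHops) {
                continue; // do not expand beyond the hop limit
            }
            for (int next : adjacency.get(current)) {
                if (!distance.containsKey(next)) {
                    distance.put(next, hops + 1);
                    queue.add(next);
                }
            }
        }
        return distance;
    }

    public static void main(String[] args) {
        ToyGraph g = new ToyGraph();
        g.addVertex(1, "type", "person");
        g.addVertex(2, "type", "city");
        g.addVertex(3, "type", "person");
        g.addEdge(1, 2);
        g.addEdge(2, 3);
        System.out.println(g.neighborhood(1, 1));
        System.out.println(g.shortestPathLength(1, 3));
        System.out.println(g.verticesWithProperty("type", "person"));
    }
}
```

A real database would answer these queries through its own indexes and storage engine, which is exactly where the performance differences measured by a benchmark come from.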

Concretely, we compared the performance of the graph databases by means of different workloads. A workload represents a unit of work for a graph database. We defined three types of workloads:

1. Load workload

2. Traversal workload

  • Shortest path workload
  • Neighborhood exploration workload

3. Intensive workload

  • GET vertices/edges by ID
  • GET vertices/edges by property
  • GET vertices by ID and UPDATE property
  • GET two vertices and ADD an edge between them
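As an illustration of what an intensive workload might look like, the sketch below runs several concurrent clients that each repeat a "GET vertex by ID and UPDATE property" operation. It is an assumption-laden toy, not the framework's implementation: the vertex store is a plain `ConcurrentHashMap` from vertex ID to a counter property, and the names `IntensiveWorkloadSketch`, `createStore` and `run` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of an intensive workload with concurrent clients.
class IntensiveWorkloadSketch {

    // Stand-in for a database: vertex ID -> value of a single counter property.
    static ConcurrentMap<Integer, Integer> createStore(int vertexCount) {
        ConcurrentMap<Integer, Integer> store = new ConcurrentHashMap<>();
        for (int id = 0; id < vertexCount; id++) {
            store.put(id, 0);
        }
        return store;
    }

    // Runs `clients` threads, each performing `opsPerClient` get-and-update
    // operations, and returns the total number of completed operations.
    static int run(ConcurrentMap<Integer, Integer> store, int clients, int opsPerClient) {
        final int vertexCount = store.size();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                final int clientId = c;
                futures.add(pool.submit(() -> {
                    int done = 0;
                    for (int i = 0; i < opsPerClient; i++) {
                        int vertexId = (clientId + i) % vertexCount;
                        // merge() on a ConcurrentHashMap is an atomic read-modify-write.
                        store.merge(vertexId, 1, Integer::sum);
                        done++;
                    }
                    return done;
                }));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("workload failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Integer> store = createStore(10);
        System.out.println("completed operations: " + run(store, 4, 250));
    }
}
```

The other intensive operations listed above would follow the same pattern, with the body of the loop swapped for a lookup by property or for adding an edge between two fetched vertices.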

Our solution works as follows: the user defines a benchmark, which contains a list of databases to compare followed by a series of operations, that is, the workloads to run on each database. This benchmark is then executed by a module called the Operational Module, whose responsibility is to start the databases and measure the time required to perform the operations specified in the benchmark. Finally, the Statistics Module gathers and aggregates all these measurements and produces a summary report together with some visualizations of the results.
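The flow described above (benchmark definition, timed execution, aggregation) can be sketched in a few lines of Java. This is only an illustrative sketch under stated assumptions, not the actual framework: databases are represented by their names, a workload is a hypothetical `Workload` interface, `execute` plays the role of the Operational Module, and `meanNanosPerDatabase` is a deliberately simple stand-in for the Statistics Module.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the benchmark flow; all names are illustrative.
class BenchmarkSketch {

    // A unit of work to run against one database (identified here by name).
    interface Workload {
        void run(String database);
    }

    // One timed execution of one workload on one database.
    static final class Measurement {
        final String database;
        final String workload;
        final long nanos;

        Measurement(String database, String workload, long nanos) {
            this.database = database;
            this.workload = workload;
            this.nanos = nanos;
        }
    }

    // "Operational Module": runs every workload on every database and times it.
    static List<Measurement> execute(List<String> databases, Map<String, Workload> workloads) {
        List<Measurement> measurements = new ArrayList<>();
        for (String database : databases) {
            for (Map.Entry<String, Workload> w : workloads.entrySet()) {
                long start = System.nanoTime();
                w.getValue().run(database);
                long elapsed = System.nanoTime() - start;
                measurements.add(new Measurement(database, w.getKey(), elapsed));
            }
        }
        return measurements;
    }

    // "Statistics Module": aggregates measurements into a mean time per database.
    static Map<String, Double> meanNanosPerDatabase(List<Measurement> measurements) {
        Map<String, long[]> sumAndCount = new LinkedHashMap<>();
        for (Measurement m : measurements) {
            long[] acc = sumAndCount.computeIfAbsent(m.database, k -> new long[2]);
            acc[0] += m.nanos;
            acc[1]++;
        }
        Map<String, Double> means = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> e : sumAndCount.entrySet()) {
            means.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    public static void main(String[] args) {
        List<String> databases = new ArrayList<>();
        databases.add("database-A");
        databases.add("database-B");
        Map<String, Workload> workloads = new LinkedHashMap<>();
        workloads.put("load", db -> { /* placeholder: load a graph into db */ });
        workloads.put("traversal", db -> { /* placeholder: run traversals on db */ });
        System.out.println(meanNanosPerDatabase(execute(databases, workloads)));
    }
}
```

Keeping measurement and aggregation in separate steps mirrors the split between the two modules: the raw timings can be collected once and then summarized or visualized in different ways.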
See the following figure:


If you’re interested in learning more about the results, please read the paper “An empirical comparison of graph databases” and/or contact us.

Salim Jouili
Twitter: @jouilis
