A distributed approach for graph-oriented multidimensional analysis

The importance of graphs as the fundamental structure underpinning many real-world applications no longer needs to be proved. Large graphs have emerged in various fields, such as biological, social and transportation networks. The sheer volume of these networks poses challenges to traditional techniques for storing and analyzing graph data. In particular, OLAP analysis requires access to large portions of the data to extract key information and to feed strategic decision making. OLAP provides multilevel, multiperspective views of the data. Most current techniques are optimized for centralized graph processing. A distributed approach providing horizontal scalability is therefore required to handle the analysis workload.
In this paper, we focus on applying OLAP analysis to large, distributed graph data. We describe Distributed Graph Cube, our distributed framework for computing and aggregating graph-based OLAP cubes. Experimental results on large, real-world datasets demonstrate that our method significantly outperforms its centralized counterparts. We also evaluate the performance of both Hadoop and Spark for distributed cube computation.
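
To give a feel for what a multiperspective view of a graph can look like, here is a minimal single-machine sketch in Python. It is not the Distributed Graph Cube implementation: the toy data, the `aggregate` helper and the dimension names are all hypothetical, chosen only to illustrate the general idea of rolling a graph up along a set of node attributes.

```python
from collections import Counter

# Hypothetical toy data: node id -> dimension attributes.
nodes = {
    1: {"gender": "F", "city": "Brussels"},
    2: {"gender": "M", "city": "Brussels"},
    3: {"gender": "F", "city": "Prague"},
    4: {"gender": "M", "city": "Prague"},
}
# Edges are treated as undirected in this sketch.
edges = [(1, 2), (1, 3), (2, 4), (3, 4), (1, 4)]


def aggregate(nodes, edges, dims):
    """Roll the graph up along `dims` (illustrative helper, not from the paper).

    Nodes sharing the same values for the chosen dimensions collapse into
    one group; edges are counted between (and within) groups.
    """
    def key(n):
        return tuple(nodes[n][d] for d in dims)

    node_counts = Counter(key(n) for n in nodes)
    # Sort the two endpoint groups so (A, B) and (B, A) count as one pair.
    edge_counts = Counter(tuple(sorted((key(u), key(v)))) for u, v in edges)
    return node_counts, edge_counts


# Two different perspectives on the same underlying graph.
by_gender = aggregate(nodes, edges, ["gender"])
by_city = aggregate(nodes, edges, ["city"])
```

Each call produces one aggregated view: a count of nodes per group and a count of edges between groups. Passing more dimensions gives a finer view, and passing fewer gives a coarser one.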

 

Benoît Denis, Amine Ghrab, and Sabri Skhiri, A Distributed Approach for Graph-Oriented Multidimensional Analysis, proceedings of the 2013 IEEE International Conference on Big Data, Santa Clara, CA, USA, October 2013.


DEXA, DaWaK 2013

I spent last week in Prague, Czech Republic, presenting my paper entitled “An Analytics-Aware Conceptual Model For Evolving Graphs”. More details on the paper can be found in my previous article.

In this post I will focus on the trends and share some of the lessons I learned during the trip.



The Dirty Dozen


While it’s not necessary to be dirty, this film tells us one clear thing: you are stronger with a small team of specialists than with a big group of clones. Better still, such a team can accomplish impossible missions.

We can observe this in innovation whenever the goal is to reach great, ambitious, disruptive business objectives. It is a matter of mastering the market context, customer behavior and the way value can be created; of stepping off the beaten track; of knowing the channels that reach your targets; and of exploiting every advantage you have over your competitors, whether in terms of timing, vision or technology.


An empirical comparison of graph databases

In recent years, more and more companies provide services that can no longer be delivered efficiently using relational databases. As a result, these companies are forced to use alternative database models such as XML databases, object-oriented databases, document-oriented databases and, more recently, graph databases. Graph databases have existed for only a few years. Although there have been some attempts at comparison, they mostly focus on certain aspects only.
In this paper, we present a distributed graph database comparison framework and the results we obtained by comparing four important players in the graph database market: Neo4j, OrientDB, Titan and DEX.

 

Salim Jouili and Valentin Vansteenberghe, An empirical comparison of graph databases, proceedings of the 2013 ASE/IEEE International Conference on Big Data, Washington D.C., USA, September 2013.


Brussels Cassandra Users


Last Tuesday, EURA NOVA hosted an interesting Cassandra meetup. Several tech enthusiasts and Cassandra fans gathered for two talks. I briefly introduced the challenges and potential solutions involved in building a distributed graph layer on top of Cassandra. The second talk, presented by Ansar Rafique, was a general introduction to Cassandra with some performance comparisons against MySQL. Ansar is a PhD student at the Department of Computer Science of the Katholieke Universiteit Leuven, in the Language technology and middleware taskforce (LANTAM) of the DistriNet research group. The talk presented the results of Ansar’s previous research at Cinnober Financial Technology on the factors influencing read/write operations in MySQL and Cassandra.


Knowledge Sharing

 


I have now been working at EURA NOVA for more than six months, and this article is an opportunity to share what I consider a key factor in my successful integration into the company and in the conduct of my work: the way we create links and share knowledge between colleagues.
