
ASE/IEEE Big Data Washington 2013


This week I had the chance to attend the ASE/IEEE International Conference on Big Data here in Washington DC. EURA NOVA’s presence was well noticed, as we had two papers: imGraph (big up to @AldemarReynaga) and a paper presenting our empirical performance analysis of some of the major graph databases currently on the market.

Big Data is all about applications

Research on Big Data has been around for several years now, and at this stage much of it focuses on actual use cases of Big Data in large-scale scenarios. Indeed, past developments and research have paved the way for (1) massive data collection, (2) storage and (3) processing.

It is only natural that most research now focuses on combining these building blocks in end-to-end scenarios.

As such, I have seen many presentations of Big Data applications: from monitoring fish in seas around the world (fish4knowledge), classifying them and allowing marine biologists to run queries over a time frame (e.g. how many goldfish in the South Pacific between 1/2001 and 6/2006), to improving disaster management using mobile data and machine learning. It is now an acknowledged fact that Big Data can ease the work in many scenarios, bring additional knowledge and even save lives.

Big Data becomes social
Another big trend, which I had already noticed at last year’s edition of SIGKDD, is social computing, mainly driven by the proliferation of easily accessible data from social networks such as Twitter, Epinions, … Social computing has gained much momentum, and nearly 3 out of 5 presented projects revolved around inferring information about human social behaviour (e.g. retweet models for users, trust and friendship transfer between members) from public data acquired from social networks.

But hey, where is the Big Data?
The thing is, while one might think of Big Data when speaking about the data that abounds in social networks, the volumes of data generally involved in these projects are ridiculously low. Most of these social computing projects end up working on pre-processed datasets ranging from tens to hundreds of megabytes and restricted to a few users: microscopic volumes compared with the actual volume of data streaming from those social networks.

From my experience in dealing with the storage and distributed processing of data, I can tell from what I have seen at the conference that there is currently a gap between (1) the technological building blocks that make it possible to design resilient applications where data volumes and complexity are high and (2) the applications built on the features enabled by (1). Even though many of the presented social computing applications have shown interesting results and perspectives, there is still some way to go before they actually scale to usable real-time applications.

Closing thought
Some might argue that Big Data is only a passing trend and that many of its focuses have already been widely covered in past years by disciplines such as very large databases, parallel processing, real-time computing and machine learning. I believe that, beyond the overhyped buzzword, the intelligent combination of these domains in specific applications will actually form the “Big Data” evolution and can ultimately change the game.

 

Nam-Luc Tran


 
