Euranova has 3 fundamental pillars: explore, craft and serve. The explore pillar of Euranova is an independent research centre dedicated to data science, software engineering and AI.
Through the exploration of tomorrow’s engineering and data science to answer today’s problems, our research centre is dedicated to anticipating the challenges that European businesses face. We find solutions to current and future digital challenges with passion, creativity and integrity.
EURA NOVA R&D has a new rallying cry: Join The Pack!
After launching our first bootcamp, we are organising our first workshop, co-located with an IEEE conference. The workshop will take place in December in Washington D.C. and will bring together industrial and academic stakeholders to discuss, explore and refine new opportunities and use cases in stream processing and real-time analytics for big data. Both topics have recently caught the interest of industry, and many use cases are still waiting for relevant and efficient solutions. Such use cases include event-driven marketing, dynamic network management and optimization, real-time recommendation, context-aware applications and real-time fraud detection. The workshop will showcase prototypes or products leveraging big data technologies, as well as models and efficient algorithms for scalable complex event processors and context detection engines. Here is a short list of research topics to inspire you: new stream processing architectures for big data; complex event processing and pattern matching engines for big data; scalable real-time decision algorithms; scalable stream processing architectures, algorithms or models; stream SQL and other continuous query languages on big data frameworks; algorithms for high-speed data stream mining; online/incremental learning on data streams. Your paper will be reviewed by a panel of academic and industrial experts. Find more information about the program co-chairs and committee members on the workshop website, and submit your paper to join the Euranovian pack! Don’t miss the chance to be part of an IEEE conference and to see Washington under the snow.
BOOT CAMP 2017
EURA NOVA is launching an intense 3-month I.T. boot camp starting in September 2017.
Installing TensorFlow with distributed GPU support.
Today, I wrote my first “Hello World” script using the freshly open-sourced version of TensorFlow with distributed GPU support. At the time of writing, the binary releases of TensorFlow do not include distributed GPU support, so I had to build TensorFlow from source. All the documentation needed to do this already exists, but it is scattered across multiple websites. Here is a condensed version of the install process (on an Ubuntu 14.04 Linux platform).
My internship at EURA NOVA
Renaud Vilmart (Mines de Nancy) did an engineering internship at EURA NOVA from June to September. In this article, Renaud describes his experience as an intern.
Flink Forward 2015 – Slides & video
The first edition of Flink Forward took place on October 12th and 13th in Berlin. Flink Forward is a two-day conference exclusively dedicated to Apache Flink, the distributed pipelined batch and stream processing framework. EURA NOVA was among the speakers at the event (http://flink-forward.org/?session=stale-synchronous-parallel-iterations-on-flink). Here is the talk we presented.
IEEE Big Data 2015
This year we had the opportunity to publish a paper, DISTRIBUTED FRANK-WOLFE UNDER PIPELINED STALE SYNCHRONOUS PARALLELISM, at the IEEE Big Data conference in Santa Clara, CA. This was an excellent opportunity to write a short summary of the trends in the big data area and of our personal impressions after one week under the sun with tacos and enchiladas.
EURA NOVA Internships & Master Thesis
As it has every year since its foundation, EURA NOVA proposes Master's thesis subjects and research internships, conducted in collaboration with academic institutions.
Distributed Frank-Wolfe under pipelined stale synchronous parallelism
Iterative-convergent algorithms represent an important family of applications in big data analytics. These are typically run on distributed processing frameworks deployed on a cluster of machines. On the other hand, we are witnessing the move towards data center operating systems (OS), where resources are unified by a resource manager and processing frameworks coexist with each other. In this context, different processing framework job tasks can be scheduled on the same machine and slow down a worker (straggler problem). Existing work has shown that an iteration model with relaxed consistency such as the Stale Synchronous Parallel (SSP) model, while still guaranteeing convergence, is able to cope with stragglers. In this paper we propose a model for the integration of the SSP model on a pipelined distributed processing framework. We then apply SSP on a distributed version of the Frank-Wolfe algorithm. We theoretically show its sparsity bounds and convergence under SSP. Finally, we experimentally show that the Frank-Wolfe algorithm applied on LASSO regression under SSP is able to converge faster than its BSP counterpart, especially under load conditions similar to those encountered in a data center OS. Nam-Luc Tran, Thomas Peel, Sabri Skhiri, Distributed Frank-Wolfe under Pipelined Stale Synchronous Parallelism, proceedings of the 2015 IEEE Conference on Big Data, November 2015, Santa Clara, CA, USA. Click here to access the paper in its preprint form.
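For readers unfamiliar with Frank-Wolfe, here is a minimal single-machine sketch of the algorithm applied to LASSO in its constrained form (minimise 0.5 * ||Ax - b||^2 subject to ||x||_1 <= tau). It is not the distributed SSP implementation from the paper: the function name `frank_wolfe_lasso`, the step size 2/(k+2) and the duality-gap stopping rule are assumptions chosen for illustration. It does show why the iterates are sparse: each iteration moves towards a single vertex of the L1 ball, so starting from zero, the solution has at most k non-zero coordinates after k iterations.

```python
def frank_wolfe_lasso(A, b, tau, iterations=200, tol=1e-12):
    """Illustrative Frank-Wolfe for: min 0.5*||A x - b||^2  s.t. ||x||_1 <= tau.

    A is a list of rows, b a list of targets. Hypothetical helper, not the
    paper's distributed version.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols  # start at the origin, which lies inside the L1 ball
    for k in range(iterations):
        # Residual r = A x - b
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        # Gradient g = A^T r
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Linear minimisation over the L1 ball: the minimiser is the vertex
        # -tau * sign(g_j) * e_j at the coordinate with the largest |g_j|.
        j_star = max(range(n_cols), key=lambda j: abs(grad[j]))
        vertex_value = -tau if grad[j_star] > 0 else tau
        # Duality gap g.(x - s); it upper-bounds the distance to the optimum.
        gap = sum(g * xj for g, xj in zip(grad, x)) - grad[j_star] * vertex_value
        if gap <= tol:
            break
        # Convex combination of the current iterate and the chosen vertex.
        gamma = 2.0 / (k + 2.0)
        x = [(1.0 - gamma) * xj for xj in x]
        x[j_star] += gamma * vertex_value
    return x


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    b = [1.0, 2.0, 3.0]
    print(frank_wolfe_lasso(A, b, tau=0.5))
```

Because every update is a convex combination of points inside the L1 ball, the constraint holds at every iteration without any projection step.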
Graph Data Management: Status and Trends
Today’s social environments are becoming more interconnected, and the business market is becoming increasingly open and competitive. Organisations require a better awareness of their state and an accurate prediction of their evolution. To cope with this surging demand, new models and tools need to be developed. In my opinion, graph models are of crucial interest for addressing these challenges.
Flink Forward 2015
The first edition of Flink Forward took place on October 12th and 13th in Berlin. Flink Forward is a two-day conference exclusively dedicated to Apache Flink, the distributed pipelined batch and stream processing framework. EURA NOVA was among the speakers at the event. Here is our field report.
High Availability in RoQ
In the last year, we have worked with Benjamin Van Melle on implementing High Availability in RoQ, our proof-of-concept distributed pub-sub messaging system. As a consequence, we needed to expand our JUnit tests to cover individual component failure scenarios and prove they were handled as expected. This piece will show how we used Docker to achieve this.
ICML 2015
Introduction
The International Conference on Machine Learning is one of the most important annual events in the world of machine learning. It is where the most renowned researchers in the field gather to present and share their often diverging visions and directions for the future. As such, the event is sponsored by most of the biggest companies in IT, such as Google, Baidu and Facebook. It also attracts numerous smaller companies with a particular interest in big data.