Flink Forward 2015 – Slides & video

The first edition of Flink Forward took place on October 12th and 13th in Berlin. Flink Forward is a two-day conference dedicated exclusively to Apache Flink, the distributed pipelined batch and stream processing framework. EURA NOVA was among the speakers at the event.

Here is the talk we presented.

Stale Synchronous Parallel Iterations

We started the project in April, based on the publication by Cui et al. After discussions with the Flink team at Data Artisans, we went on to implement SSP within Flink.

Along with SSP on Flink, we also proposed a distributed version of the Frank-Wolfe algorithm under SSP. The results have been published at the IEEE Big Data Conference 2015.

The abstract of the talk says:

While Bulk Synchronous Parallel is a model suitable for distributed bulk iterations, it has the overhead of synchronizing each worker with a fresh view of the working set between each iteration. It has however been shown that algorithms still converge when distributed workers hold an outdated or inconsistent view of the solution between iterations within defined bounds. This has led to the concept of Stale Synchronous Parallel iterations, in which workers work on cached model data from other workers covering previous iterations within defined bounds. This introduces two new notions: first the “clock” representing the smallest amount of work performed by a worker in an iteration, and second the “slack”, defining the maximum amount of clocks a worker can be ahead of the slowest.

In this project we implement the SSP iteration model on top of Flink iterations and introduce an important element of the model: the parameter server. We will present our contribution at the model level, our implementation within Flink using Apache Ignite and show the use cases benefitting from this iteration model.
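To make the "clock" and "slack" notions from the abstract more concrete, here is a minimal, single-process sketch of the staleness bound. It is not the Flink or Apache Ignite implementation presented in the talk: the names `SspClockTable`, `can_start_next_clock`, `finish_clock` and `simulate` are hypothetical, and the schedule (one fast worker, one worker that only runs every third step) is an assumption chosen purely for illustration. The sketch assumes a worker may start its next clock only while the clock it is about to execute is at most `slack` clocks ahead of the slowest worker.

```python
class SspClockTable:
    """Toy clock table (hypothetical), such as a parameter server might keep."""

    def __init__(self, num_workers, slack):
        self.slack = slack
        # Number of clocks each worker has completed so far.
        self.completed = [0] * num_workers

    def can_start_next_clock(self, worker):
        # The clock a worker is about to execute has index completed[worker].
        # It may run only if that index is at most `slack` ahead of the
        # clock the slowest worker is currently on.
        return self.completed[worker] - min(self.completed) <= self.slack

    def finish_clock(self, worker):
        if not self.can_start_next_clock(worker):
            raise RuntimeError("worker %d must wait for slower workers" % worker)
        self.completed[worker] += 1


def simulate(slack, steps=12):
    """Run a fast worker (0) and a slow worker (1) under the SSP bound.

    Returns the completed clocks per worker and how many times the fast
    worker had to wait.
    """
    table = SspClockTable(num_workers=2, slack=slack)
    blocked = 0
    for step in range(steps):
        # The fast worker tries to run one clock at every step.
        if table.can_start_next_clock(0):
            table.finish_clock(0)
        else:
            blocked += 1
        # The slow worker only gets to run every third step.
        if step % 3 == 2 and table.can_start_next_clock(1):
            table.finish_clock(1)
    return list(table.completed), blocked


if __name__ == "__main__":
    for slack in (0, 2):
        completed, blocked = simulate(slack)
        print("slack=%d completed=%s fast worker waited %d times"
              % (slack, completed, blocked))
```

With `slack=0` the sketch degenerates to a barrier after every clock, which is the Bulk Synchronous Parallel behaviour described above. With a larger slack the fast worker waits less often, at the cost of computing on a view of the model that may be several clocks out of date.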

Here is the video of the talk:

Feeling adventurous?

The code of the contribution is available on GitHub:

