The first edition of Flink Forward took place on October 12th and 13th in Berlin. Flink Forward is a two-day conference exclusively dedicated to Apache Flink, the distributed, pipelined batch and stream processing framework. EURA NOVA was among the speakers at the event (http://flink-forward.org/?session=stale-synchronous-parallel-iterations-on-flink).
Here is the talk we presented.
Stale Synchronous Parallel Iterations
We started the project in April, based on the publication of Cui et al. After discussions with the Flink team at Data Artisans, we went on to implement SSP within Flink.
Along with SSP on Flink, we also proposed a distributed version of the Frank-Wolfe algorithm under SSP. The results have been published at the IEEE Big Data Conference 2015.
The abstract of the talk says:
While Bulk Synchronous Parallel is a model suitable for distributed bulk iterations, it has the overhead of synchronizing each worker with a fresh view of the working set between each iteration. It has however been shown that algorithms still converge when distributed workers hold an outdated or inconsistent view of the solution between iterations within defined bounds. This has led to the concept of Stale Synchronous Parallel iterations, in which workers work on cached model data from other workers covering previous iterations within defined bounds. This introduces two new notions: first the “clock” representing the smallest amount of work performed by a worker in an iteration, and second the “slack”, defining the maximum amount of clocks a worker can be ahead of the slowest.
In this project we implement the SSP iteration model on top of Flink iterations and introduce an important element of the model: the parameter server. We will present our contribution at the model level, our implementation within Flink using Apache Ignite and show the use cases benefitting from this iteration model.
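To make the "clock" and "slack" notions from the abstract more concrete, here is a minimal Python sketch of the staleness rule. It is only an illustration under our own assumptions: the names SspClock, may_start_next and finish_clock are hypothetical and are not taken from the Flink or Ignite code. A worker may begin its next clock only while it is at most "slack" clocks ahead of the slowest worker, so a slack of 0 behaves like Bulk Synchronous Parallel.

```python
class SspClock:
    """Toy model of the SSP staleness bound (hypothetical names, not the Flink API)."""

    def __init__(self, num_workers, slack):
        if num_workers <= 0:
            raise ValueError("num_workers must be positive")
        if slack < 0:
            raise ValueError("slack must be non-negative")
        self.slack = slack
        # clocks[w] = number of clocks worker w has completed so far;
        # worker w is therefore currently working on clock index clocks[w].
        self.clocks = [0] * num_workers

    def may_start_next(self, worker):
        # The clock a worker is about to work on may be at most `slack`
        # clocks ahead of the clock the slowest worker is working on.
        return self.clocks[worker] - min(self.clocks) <= self.slack

    def finish_clock(self, worker):
        # Finishing a unit of work always succeeds; the bound is enforced
        # when the worker asks to start the following clock.
        self.clocks[worker] += 1
        return self.clocks[worker]


if __name__ == "__main__":
    ssp = SspClock(num_workers=2, slack=1)
    ssp.finish_clock(0)
    ssp.finish_clock(0)
    # Worker 0 is now two clocks ahead of worker 1 and has to wait.
    print("worker 0 may continue:", ssp.may_start_next(0))
    ssp.finish_clock(1)
    # Worker 1 caught up by one clock, so worker 0 is within the bound again.
    print("worker 0 may continue:", ssp.may_start_next(0))
```

In a real deployment the blocked worker would wait on the parameter server rather than poll a local list; the sketch only shows the bound itself.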
Here is the video of the talk:
https://www.youtube.com/watch?v=SVnRFrEYE3s
Feeling adventurous?
The code of the contribution is available on GitHub:
https://github.com/apache/flink/pull/1102
https://github.com/apache/flink/pull/967