Automatic Parameter Tuning for Big Data Pipelines

Engineering, Mjolnir

March 10, 2022

Tuning big data frameworks is essential to getting the best performance from a given application. However, these frameworks are rarely used in isolation: they are generally chained into a pipeline in which each framework plays a different role. This makes tuning a big data pipeline important yet difficult, given the size of the combined search space. Moreover, the frameworks interact, so these interactions have to be considered when tuning the configuration parameters of the pipeline. Achieving the best end-to-end performance therefore requires a trade-off.

Machine learning-based methods have shown great success in automatic tuning systems, but they rely on a large number of high-quality training examples, which are difficult to obtain. In this context, we propose using a deep reinforcement learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3), to tune a fraud detection big data pipeline.
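For readers unfamiliar with TD3, the sketch below illustrates the critic target that the algorithm is generally known for: the minimum of two target critics, evaluated at a target action smoothed with clipped noise. It is a minimal NumPy illustration, not the implementation from the paper. The function name td3_target, the toy actor and critics, the state and action dimensions, and all hyperparameter values are assumptions made for this example. TD3's delayed actor updates are not shown.

```python
import numpy as np

def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0):
    """Compute the TD3 critic target for a batch of transitions."""
    next_action = target_actor(next_state)
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, action_low, action_high)
    # Clipped double-Q: use the smaller of the two target critic estimates.
    q_min = np.minimum(target_q1(next_state, next_action),
                       target_q2(next_state, next_action))
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * q_min

# Hypothetical setup: 4 observed pipeline metrics as the state and
# 3 configuration parameters, normalized to [-1, 1], as the action.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
target_actor = lambda s: np.tanh(s @ W)                  # toy stand-in policy
target_q1 = lambda s, a: s.sum(axis=1) + a.sum(axis=1)   # toy stand-in critic
target_q2 = lambda s, a: s.sum(axis=1) - a.sum(axis=1)   # toy stand-in critic

states = rng.normal(size=(5, 4))
rewards = rng.normal(size=5)
dones = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
targets = td3_target(rewards, dones, states,
                     target_actor, target_q1, target_q2, rng)
```

In a tuning setting, the reward would typically be derived from a measured end-to-end performance metric of the pipeline; how the paper defines state, action, and reward is not described in this abstract.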

Houssem Sagaama, Nourchene Ben Slimane, Maher Marwani, Sabri Skhiri, Automatic Parameter Tuning for Big Data Pipelines, In Proc. of The 26th IEEE Symposium on Computers and Communications (ISCC 2021), September 2021.

