Skip to content

A distributed data mining framework accelerated with graphics processing units

In the context of processing high volumes of data, the recent developments have led to numerous models and frameworks of distributed processing running on clusters of commodity hardware. On the other side, the Graphics Processing Unit (GPU) has seen much enthusiastic development as a device for general-purpose intensive parallel computation. In this paper we propose a framework which combines both approaches and evaluates the relevance of having nodes in a distributed processing cluster that make use of GPU units for further fine-grained parallel processing. We have engineered parallel and distributed versions of two data mining problems, the naive Bayes classifier and the k-means clustering algorithm, to run on the framework and have evaluated the performance gain. Finally, we also discuss the requirements and perspectives of integrating GPUs in a distributed processing cluster, introducing a fully distributed heterogeneous computing cluster.

Nam-Luc Tran, Quentin Dugauthier, and Sabri Skhiri, A Distributed Data Mining Framework Accelerated with Graphics Processing Units, proceedings of the 2013 International Conference on Cloud Computing and Big Data (CloudCom-Asia), FuZhou, China, December 2013.

Click here to access the paper in its preprint form.

Releated Posts

Investigating a Feature Unlearning Bias Mitigation Technique for Cancer-type Bias in AutoPet Dataset

We proposed a feature unlearning technique to reduce cancer-type bias, which improved segmentation accuracy while promoting fairness across sub-groups, even with limited data.
Read More

Muppet: A Modular and Constructive Decomposition for Perturbation-based Explanation Methods

The topic of explainable AI has recently received attention driven by a growing awareness of the need for transparent and accountable AI. In this paper, we propose a novel methodology to decompose any state-of-the-art perturbation-based explainability approach into four blocks. In addition, we provide Muppet: an open-source Python library for explainable AI.
Read More