Trustworthy machine learning motivates a large share of the ML community's work, with the aim of improving the acceptance and adoption of ML. In this paper, we show a first link between uncertainty and explainability by studying the relation between calibration and interpretation.
Galaxy cluster masses can be inferred indirectly from measurements in the X-ray band, from the Sunyaev-Zeldovich (SZ) effect signal, or from optical observations. Unfortunately, all of these are affected by some bias. As an alternative, we provide an independent estimation of the cluster masses from the Planck PSZ2 catalogue of galaxy clusters using a machine-learning method.
Big data frameworks are generally chained into a pipeline, each with a different role. This makes tuning big data pipelines an important yet difficult task, given the size of the search space. We propose to use a deep reinforcement learning algorithm to tune a fraud detection big data pipeline.
We propose a multi-modal framework to tackle the SPARK Challenge by classifying satellites using RGB and depth images. Our framework is mainly based on Auto-Encoders to embed the two modalities in a common latent space in order to exploit redundant and complementary information between the two types of data.
In this paper, we propose an automated framework for multi-view image classification tasks. The proposed framework is able to, all at once, train a model to find a common latent representation and perform data imputation, choose the best classifier and tune all necessary hyper-parameters.
Under the GDPR requirements and privacy-by-design guidelines, access control for personal data should not be limited to a simple role-based scenario. For the processing to be compliant, additional attributes, such as the purpose of processing or legal basis, should be verified against an established data processing agreement or policy.
Anomaly detection is a widely explored domain in machine learning. Many models have been proposed in the literature and compared using different metrics measured on various datasets.
The most popular metrics used to compare performance are F1-score, AUC and AVPR.
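As a rough illustration of the three metrics named above, the following is a minimal pure-Python sketch, not the evaluation code of the paper. It assumes binary labels (1 = anomaly), that AUC refers to the area under the ROC curve, and that AVPR denotes average precision (area under the precision-recall curve). The function names are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (anomaly) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random anomaly scores higher than a random normal point."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values measured at the rank of each true anomaly."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Toy data, made up for illustration only.
labels = [0, 0, 1, 1]
anomaly_scores = [0.1, 0.4, 0.35, 0.8]
predictions = [0, 1, 0, 1]  # scores thresholded at 0.4

print(f1_score(labels, predictions))
print(roc_auc(labels, anomaly_scores))
print(average_precision(labels, anomaly_scores))
```

Note that F1-score depends on a decision threshold, whereas AUC and average precision are computed from the ranking of the scores alone.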
Missing data is a recurrent and challenging problem, especially when using machine learning algorithms for real-world applications. For this reason, missing data imputation has become an active research area, in which recent deep learning approaches have achieved state-of-the-art results. We propose DAEMA: Denoising Autoencoder with Mask Attention.
Uncertainty in the predictions of probabilistic classifiers is a key concern when models are used to support human decision making, when they feed broader probabilistic pipelines, or when sensitive automatic decisions have to be taken.
Studies have shown that most models are not intrinsically well calibrated, meaning that their decision scores are not consistent with the observed frequencies of the predicted outcomes.
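One common way to quantify how well calibrated a classifier is, shown here only as a minimal sketch and not as the method of the paper, is the expected calibration error (ECE) with equal-width confidence bins. The function name, the default bin count and the toy data are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probability of the predicted class, in [0, 1].
    correct: 1 if the corresponding prediction was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("inputs must have the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(mean_conf - accuracy)
    return ece


# Toy example: a model that always reports 0.9 confidence.
scores = [0.9] * 10
well_calibrated = [1] * 9 + [0]      # right 9 times out of 10
overconfident = [1] * 5 + [0] * 5    # right only 5 times out of 10

print(expected_calibration_error(scores, well_calibrated))
print(expected_calibration_error(scores, overconfident))
```

A well-calibrated model yields a value close to zero, while the overconfident one yields a gap of roughly 0.4 between its reported confidence and its accuracy.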
We propose a framework using contrastive learning as a pre-training task to perform image classification in the presence of noisy labels. Recent strategies, such as pseudo-labelling, sample selection with Gaussian mixture models, and weighted supervised contrastive learning, have been combined into a fine-tuning phase following the pre-training.