Data is at the heart of digital transformation. However, a significant entry barrier still stands between data and companies' business cases:
With our ASGARD research project, Euranova aims to be on the market within two to three years with the technologies and knowledge necessary to create value for our customers. The goal is to help companies create more value with less investment, greater reliability and, above all, full compliance with the legislation.
The ASGARD research project came into being thanks to co-financing from the Walloon Region. It concentrates its research on the four main stages of the data value chain and addresses, through its research tracks, the following challenges:
The RUNE track addresses the "making data available" challenge while respecting the law, and more specifically the GDPR: how can access to data be automated according to the legal basis under which the data scientist/analyst exploits it and the legal basis under which it was acquired? This track covers NLP, legal NLP, GDPR ontology and first-order inference, among others.
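The core of such an access decision can be pictured as matching the purpose of a data request against the legal basis under which the data was collected. The sketch below is purely illustrative: the legal bases are real GDPR categories, but the purpose mapping and function names are invented for the example and do not reflect RUNE's actual model (which relies on ontologies and first-order inference rather than a hard-coded table).

```python
# Hypothetical sketch: automating a data-access decision by checking that the
# requested purpose of exploitation is compatible with the legal basis under
# which the data was acquired. The ALLOWED_PURPOSES mapping is illustrative only.
ALLOWED_PURPOSES = {
    "consent": {"marketing", "analytics"},
    "contract": {"service_delivery", "billing"},
    "legitimate_interest": {"analytics", "fraud_detection"},
}

def may_access(acquisition_basis: str, exploitation_purpose: str) -> bool:
    """Grant access only if the requested purpose is compatible
    with the legal basis under which the data was collected."""
    return exploitation_purpose in ALLOWED_PURPOSES.get(acquisition_basis, set())

print(may_access("consent", "analytics"))   # True
print(may_access("contract", "marketing"))  # False
```

A real system would derive these compatibility rules automatically, by inference over a GDPR ontology, instead of maintaining them by hand.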
The YGGDRASIL track addresses "data exploitation and modelling" challenges. The largest ML conferences (ICML, IJCAI, NeurIPS, AAAI, ICLR, INTELISYS, ECML, KDD, ECAI, ACL, ICIP, Big Data) generate around 2,500 papers per year, so it is extremely difficult for data scientists to stay up to date with the state of the art in their field, even more so when they spend 100% of their time on industrial projects. In addition, a new category of problems requires analysing data from different media (images, text, databases, graphs, etc.). This track therefore aims to assist or automate machine learning tasks in these multi-media cases and covers the following topics: AutoML, multi-modal representation learning and GAN-based feature extraction.
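The core AutoML idea can be sketched very simply: automatically pick a model and its hyperparameters by scoring candidate configurations on held-out data. Everything below (the toy 1-D dataset, the two candidate models, the search loop) is a made-up minimal example, not YGGDRASIL's actual method.

```python
# Minimal AutoML sketch: search over (model, hyperparameter) candidates and
# keep the one with the best validation accuracy. Toy models and data only.
import random

def knn_predict(train, x, k):
    """1-D k-nearest-neighbours majority vote."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return round(sum(label for _, label in nearest) / k)

def threshold_predict(train, x, t):
    """Classify by a fixed decision threshold (ignores the training set)."""
    return 1 if x >= t else 0

def accuracy(predict, train, valid, **hp):
    hits = sum(predict(train, x, **hp) == y for x, y in valid)
    return hits / len(valid)

# Toy data: points drawn near integers 0..9; class 1 iff the integer is >= 5.
data = [(x + random.random(), int(x >= 5)) for x in range(10) for _ in range(5)]
random.shuffle(data)
train, valid = data[:35], data[35:]

# Candidate (model, hyperparameters) configurations to search over.
candidates = [(knn_predict, {"k": k}) for k in (1, 3, 5)] + \
             [(threshold_predict, {"t": t}) for t in (3.0, 5.0, 7.0)]

best = max(candidates, key=lambda c: accuracy(c[0], train, valid, **c[1]))
print(best[0].__name__, best[1])
```

Real AutoML systems scale this same loop to large model and hyperparameter spaces, with smarter search strategies than exhaustive scoring.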
The MJOLNIR track addresses "deployment in production" challenges, which require implementing data pipelines generally composed of several data processing and storage frameworks. However, optimising these frameworks is very complicated when one parameter can positively or negatively influence the 500 others. We should therefore try to use ML to automatically find the configuration parameters that optimise the pipeline according to the jobs, the data and the available budget. Such an optimisation could improve performance, and therefore reduce cost, by 10 to 60%. This track covers auto-tuning, reinforcement learning, batch and stream processing, and meta-learning.
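The auto-tuning idea described above can be sketched as a search over a configuration space, scoring each candidate with a cost model. The knobs and the cost function below are made-up placeholders; a real tuner would measure actual job runtimes or learn a surrogate model of them.

```python
# Sketch of ML-driven auto-tuning: random search over a pipeline's
# configuration space, scoring each configuration with a toy cost model.
import random

SEARCH_SPACE = {
    "parallelism":   [2, 4, 8, 16],
    "batch_size_mb": [32, 64, 128, 256],
    "memory_gb":     [4, 8, 16],
}

def runtime_cost(cfg):
    """Toy stand-in for a measured pipeline runtime (seconds).
    Interactions between knobs are what make manual tuning hard."""
    return (1000 / cfg["parallelism"]          # more workers finish faster...
            + cfg["batch_size_mb"] / cfg["memory_gb"]  # ...unless memory is tight
            + 5 * cfg["parallelism"])          # ...and coordination overhead grows

def random_search(n_trials=50, seed=0):
    rng = random.Random(seed)
    best_cfg, best_cost = None, float("inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        cost = runtime_cost(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost

cfg, cost = random_search()
print(cfg, round(cost, 1))
```

Random search is only the simplest baseline; the track's research looks at reinforcement learning and meta-learning to reuse knowledge across jobs instead of exploring blindly.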
The VADGELMIR track addresses "control in production, data quality" challenges. It is usually in production that we realise that the real data is very different from our training data, and a large part of the difference comes from data quality. Many data-quality tools exist today, but they are very expensive to implement and deployment can sometimes take three to five years. Instead, we are looking for a solution that does not try to recover the true data but limits the impact of the quality defect on the ML task, just as a denoising filter can be applied to an image blurred by noise. We consider here that the quality defect is noise added to the initial data, so we only have to denoise it. This track covers GANs, missing-data imputation, autoencoders and deep neural networks.
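The "quality defect as noise" principle can be illustrated with the simplest possible repair: mean imputation of missing values. The toy table and helper below are invented for the example; the track's actual research targets learned denoisers (autoencoders, GANs) that handle far richer defects than missing cells.

```python
# Toy illustration of repairing a quality defect so a downstream ML task
# can run: replace missing cells (the "noise") with the column mean.

def mean_impute(rows):
    """Replace None cells with the column mean of the observed values."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)]
            for r in rows]

noisy = [
    [1.0, 10.0],
    [2.0, None],   # quality defect: a missing cell
    [None, 30.0],
    [4.0, 40.0],
]
print(mean_impute(noisy))
```

Note that the imputed values are not the true ones; the point is that the repaired table is close enough for the ML task, which is exactly the trade-off the track studies with learned denoisers.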