10 Papers in 2021

To answer today’s problems, our research centre is dedicated to anticipating the challenges that European businesses face. Find out the impacts of our latest published papers.


Our Research Director Invited as PC Member at IEEE Big Data

We are very proud of our research director Sabri Skhiri for joining the program committee of IEEE Big Data 2021!

He will be the only Belgian and one of the few Europeans on the program committee of this top-tier research conference in Big Data.

Congratulations, Sabri!


INTERNSHIPS 2021

This document presents internships supervised by our software engineering department or by our research & development department. Each project is an opportunity to feel both empowered and responsible for your own professional development and for your contribution to the company.


MASTER THESIS & PFE 2021

This document introduces you to master thesis and graduation projects supervised by our research & development department. Each project offers you the chance to be actively involved in the development of solutions to address tomorrow’s challenges in ICT and implementing them today!


ECML 2020 – The Keynotes

A few weeks ago, the biggest European conference on machine learning was held: ECML 2020. Our research engineer Nourchène, our R&D consultant Gianmarco, and our data scientist Ronan attended the event from Tunisia, Belgium and Marseille. In this article, they tell you about the different keynote talks they attended.

Gemma Galdon-Clavell – Algorithmic Auditing: how to open the black-box of ML

Nourchène says: “I loved the talk given by Gemma Galdon-Clavell, in which she addressed the problem of ethics in AI, as computer science engineers do not often question what they are producing from a moral standpoint. In her talk, Gemma pointed out the importance of the data used to train a machine learning model. Data are provided by humans, but people are not perfect and are likely to make wrong decisions. The model will then learn to behave the same way, so we might end up creating an unethical model. This can lead to two different behaviours: users will either follow the system’s recommendations at any cost, or decide not to if they find the decisions unreasonable. Data will then continue to be biased, which creates a sort of deadlock.”

Ronan adds: “Algorithms do not produce biases out of nowhere; they reproduce and amplify the biases they find in the data they ingest. As a result, we have to pay attention first to the quality of the data we use. Gemma emphasized that algorithmic auditing is the key to understanding whether the algorithm meets expectations and complies with regulations. The audit does not only cover the technical part and the way the algorithm was coded. It also focuses on how the problem was approached and the means deployed to solve it.”

Nourchène explains: “The speaker suggested that before creating a product, computer science engineers and developers need to ask the following questions: Is the product desirable, and what problem does it try to solve? Is it acceptable, and does it involve users? Is it legal? Finally, does it use the right data? Gemma also suggested that ethics be taught in engineering schools. I totally agree with that, because nowadays technology does not always seek to solve real problems; its goal is rather to make a fortune out of the proposed product.”

Max Welling – Amortized and Neural Augmented Inference

Gianmarco says: “My favourite talk was the one given by Max Welling. It clearly showed and unified the theoretical grounds underlying many superficially different models, while still providing real-world applications. More concretely, the talk showed how to develop hybrid amortized methods that combine classical learning, inference and optimization algorithms with learned neural networks, which is of strong interest, especially in physics-related fields.

It provided a comprehensive exposition of amortized neural inference and, as a consequence, brought the audience up to date with applications in that area. Max Welling presented how a learned neural network can augment or correct a classical solution (obtained by means of expert knowledge or classical equations) or, conversely, how a neural network can be fed useful information computed by a classical method.”

Been Kim – Interpretability for everyone

Gianmarco says: “I was exposed to many new topics and applications I was not familiar with. Talks like Interpretability for everyone, which offered more abstract research, were the ones that caught my attention the most. The talk presented the latest discoveries and tools for quantifying interpretability. It also introduced how to extract interpretability from a black-box end-to-end model, which I find very important for building more robust models and for model diagnosis.”

Doina Precup – Building Knowledge For AI Agents With Reinforcement Learning

Ronan says: “I really liked the talk given by Doina Precup on how to build knowledge in the field of reinforcement learning. I had only a little knowledge of this field. Thankfully, Doina quickly introduced us to the key concepts of reinforcement learning. She also presented some big successes of RL, described different RL mechanisms, and moved on to the problem of using existing knowledge to build a life-long learning agent. Doina concluded her talk with a lot of open and inspiring questions: How can we exploit previously learned knowledge and apply it to new environments that are not related in any way to the previous ones? How well does an agent preserve and enhance its knowledge? These questions might not have definitive answers, or any answers at all, but I found the questions she raised about how we can represent knowledge very relevant and interesting.”

Stephan Günnemann – Certifiable Robustness of ML Models for Graphs

Ronan says: “In this technical talk, Stephan presented different methods to assess GNN robustness. To certify the robustness of a GNN, its sensitivity to perturbations needs to be evaluated. For example, you can search for the worst-case perturbation and verify that the classification margin remains positive, which ensures that the model is robust. Stephan’s talk was very pleasant to listen to, as he accompanied it with several examples and applications of the methods he presented. Finally, he concluded that ML models for graphs are not reliable by default, but that we can apply certificates and robustification principles to provide guarantees for a reliable use of GNNs.”
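To make the "worst case with a positive margin" idea concrete, here is a minimal sketch. It is not one of the methods from the talk: it assumes a plain linear classifier rather than a GNN, and an adversary that may shift each input feature by at most `eps`. The function name and the toy numbers are hypothetical. For a linear model the worst case can be computed in closed form, so no search is needed.

```python
def worst_case_margin(weights, biases, x, eps):
    """Return (predicted class, smallest margin under any perturbation
    delta with |delta_i| <= eps) for logits = weights @ x + biases.

    Assumption: a linear classifier, used only to illustrate the idea of
    a robustness certificate. A GNN needs dedicated techniques.
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    pred = max(range(len(logits)), key=logits.__getitem__)

    worst = float("inf")
    for k, w_k in enumerate(weights):
        if k == pred:
            continue
        # The adversary lowers the margin most by moving every feature by
        # eps against the sign of (w_pred - w_k): a drop of eps * L1 norm.
        l1 = sum(abs(wp - wk) for wp, wk in zip(weights[pred], w_k))
        worst = min(worst, logits[pred] - logits[k] - eps * l1)
    return pred, worst


weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
x = [2.0, 1.0]

pred, margin_small = worst_case_margin(weights, biases, x, eps=0.1)
_, margin_large = worst_case_margin(weights, biases, x, eps=1.0)

# A positive worst-case margin certifies the prediction for that budget.
print(pred, margin_small > 0, margin_large > 0)
```

With the small budget the worst-case margin stays positive, so the prediction is certified. With the larger budget it becomes negative, so no guarantee can be given.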

Watch the talks:

If you wish to catch up on talks we mentioned or those you missed, all the sessions, paper and presentation recordings are available (for a limited time) from the ECML website.

Gemma Galdon-Clavell

Max Welling

Been Kim

Doina Precup

Stephan Günnemann


ECML 2020 – A Summary

A few weeks ago, the biggest European conference on machine learning was held: ECML 2020. Our research engineer Nourchène, our R&D consultant Gianmarco, and our data scientist Ronan attended the event from Tunisia, Belgium and Marseille. What were the big trends and their favourite talks? What did they think of the online remote format? Let’s find out with them!

The Big Trends

Overall, the conference was well up to date with the latest trends and needs of the outside world. Gianmarco explains: “The conference was rich in presentations covering nearly all possible topics in machine learning. However, I had the impression that Graph Neural Networks and Generative Models had a little more presence than other models. Transfer learning was another topic that seemed to be very relevant throughout the conference.”

Remote Format For The First Time

Due to the COVID-19 pandemic, the conference was fully virtual. The talks were pre-recorded and made available prior to the conference. The live sessions were dedicated to questions and answers, with a very brief presentation at the beginning of the session.

Nourchène explains: “The downside was that we had to watch the whole presentation beforehand, otherwise it was difficult to follow the discussion and to interact with the speaker. Fun fact: there was a session where even the moderator was not aware of this Q&A aspect and asked the speaker why the presentation was so short! The good thing is that, since the presentations were pre-recorded, it was possible to watch the presentations from sessions running in parallel.”

Gianmarco adds: “I have not attended many remote conferences in my life, but I was genuinely surprised to see how well-organised this one was. The remote framework was very well designed, the web interface was fully functional, and they took advantage of all the benefits that a remote event can have, like re-watchable presentations.”

Kudos to the organising committee for pulling it off!

The Keynotes

We wrote an article with more details about different keynotes that you can find on this link, but here is a teaser:

Gemma Galdon-Clavell – Algorithmic Auditing: how to open the black-box of ML

In her talk, Gemma points out the importance of data used to train a machine learning model. According to her, algorithmic auditing is the key to understanding if the algorithm meets the expectations and if it complies with the regulations. This audit does not only cover the technical part and the way the algorithm was coded. It also focuses on how the problem was approached and the means deployed to solve it. Read our detailed review here.

Max Welling – Amortized and Neural Augmented Inference

The talk showed and unified the theoretical grounds underlying many superficially different models, while still providing real-world applications. It provided a comprehensive exposition of amortized neural inference and, as a consequence, brought the audience up to date with applications in that area. Read more here.

Been Kim – Interpretability for everyone

The talk presented the latest discoveries and tools for quantifying interpretability. It also introduced how to extract interpretability from a black-box end-to-end model. Read more in our article.

Doina Precup – Building Knowledge For AI Agents With Reinforcement Learning

Doina Precup talked about how to build knowledge in the field of reinforcement learning. She also presented some big successes of RL, described different RL mechanisms, and moved on to the problem of using existing knowledge to build a life-long learning agent. Discover more!

Stephan Günnemann – Certifiable Robustness of ML Models for Graphs

Stephan presented different methods to assess GNN robustness, which requires evaluating the model’s sensitivity to perturbations. Learn more with Ronan here.

Interesting Paper?

Si-An Chen; Voot Tangkaratt; Hsuan-Tien Lin; Masashi Sugiyama – Active deep Q-learning with demonstration

Nourchène says: “The authors presented their paper proposing different groups of techniques for learning from demonstration in Reinforcement Learning, like RL Expert Demonstration (RLED) or Active RL Demonstration (ARLD). These techniques can be used to speed up the learning process of an RL agent. They also proposed an uncertainty-based query strategy named Active Deep Q-Network, based on DQN, to dynamically estimate the uncertainty of recent states and use the queried demonstration data.”

Favourite tutorial

Learning With Imbalanced Domains and Rare Event Detection

Ronan says: “This tutorial was interesting and well-structured. Imbalanced domains and rare-event prediction concern a lot of fields (financial, medical, data distribution…) and will always remain a centre of attention when designing the appropriate solution to a problem. As a consequence, they will remain a core research problem. I particularly liked this tutorial as it covered and compared a lot of different approaches: unsupervised (statistical-based, proximity-based, clustering-based), supervised and semi-supervised. As there is no ideal solution that can be applied to every problem, you have to know what exists before choosing the one that best fits your problem. The tutorial also covered different methods to properly evaluate the performance of an algorithm on an imbalanced task.”
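Why evaluation needs care on an imbalanced task can be shown with a small, self-contained sketch. The metrics chosen here (accuracy, recall, balanced accuracy) and the toy data are assumptions for illustration. The article does not say which metrics the tutorial actually covered.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = rare event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Share of rare events that were detected."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the rare class and the recall on the common class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2


# Toy data: 10 rare events among 1,000 samples, and a "model" that
# always predicts the majority class.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000

print(accuracy(y_true, y_pred))           # high, yet the model is useless
print(recall(y_true, y_pred))             # no rare event is detected
print(balanced_accuracy(y_true, y_pred))  # no better than chance
```

Plain accuracy rewards the majority-class predictor, while recall and balanced accuracy expose that it never detects a rare event.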

Conclusion

The conference provided a wide range of machine learning topics in the form of presentations about the latest trends, technologies and applications. As Nourchène says: “it is an optimal platform to stay up-to-date, to widen one’s perspectives and/or dig deeper into a specific topic.”

Watch the talks:

If you wish to catch up on talks we mentioned or those you missed, all the sessions, paper and presentation recordings are available (for a limited time) from the ECML website.

Gemma Galdon-Clavell

Max Welling

Been Kim

Doina Precup

Stephan Günnemann

Active deep Q-learning with demonstration: Read the paper


Internship & Master Thesis Offer – 2021

Our master thesis and internship offers for the coming year, supervised by our software engineering department or by our research & development department, will be available in the course of November and will cover the following research topics:

Regarding data privacy:

Legal entity relations with knowledge graph
Legal NLP
Privacy by design
Topic modeling
Text summarisation
…

Regarding data automation

GAN for multimodal representation
AutoML
Optimization methods
Computer vision
Graph Embeddings
…

Regarding data pipelines

Reinforcement learning
Optimisation methods
Stream Processing
CEP
Network compression
…

Regarding data quality

Denoising technique
GAN for missing data
Semi-Supervised learning
Data cleaning
Attention Model for Structural dep.
…

Each project is an opportunity to feel both empowered and responsible for your professional development and to address tomorrow’s challenges in ICT, coached by the Eura Nova crew. The detailed offers will be available mid-November. In the meantime, do not hesitate to contact us at [email protected] with any questions regarding internships and master theses!

As an example, the documents listed below present our 2020 master thesis and internships:


Internships 2020

If you are interested in one of our offers, please send us your application to [email protected], including your CV and motivation regarding your top three internship positions (described in the document).

If you wish to read the testimonies of students who have done an internship at EURA NOVA, visit our blog, or read their experiences directly:

Renaud, Elouan
[french-speaking article] Souheila, Léo

If you are interested in working on a topic that is not in our range of offers, we would be delighted to hear your proposal and invite you to get in touch.

Internship subjects and application guidelines are available here: Internship Offers.


Thirty-Fourth AAAI Conference On Artificial Intelligence: A Summary

Two weeks ago, our young research engineers Hounaida Zemzem and Rania Saidi were in New York for the Thirty-Fourth AAAI Conference On Artificial Intelligence. The conference promotes research in artificial intelligence and fosters scientific exchange between researchers, practitioners, scientists, students, and engineers in AI and its affiliated disciplines. Rania and Hounaida attended dozens of technical paper presentations, workshops, and tutorials on their favourite research areas: reinforcement learning for Hounaida and graph theory for Rania. What were the big trends and their favourite talks? Let’s find out with them!

The Big Trends:

Rania says: “The conference focused mostly on advanced AI topics such as graph theory, NLP, Online Learning, Neural Nets Theory and Knowledge Representation. It also looked into real-world applications such as online advertising, email marketing, health care, recommender systems, etc.”

Hounaida adds: “I thought it was very successful given the large number of attendees as well as the quality of the accepted papers (7,737 submissions were reviewed and 1,591 accepted). The talks showed the power of AI to tackle problems or improve situations in various domains.”

Favourite talks and tutorials

Hounaida explains: “Several of the sessions I attended were very insightful. My favourite talk was given by Mohammad Ghavamzadeh, an AI researcher at Facebook. He gave a tutorial on Exploration-Exploitation in Reinforcement Learning. The tutorial by William Yeoh, assistant professor at Washington University in St. Louis, was also amazing. He talked about Multi-Agent Distributed Constrained Optimization. Both their talks were clear and funny.”

Rania’s feedback? “One of my favourite talks was given by Yolanda Gil, the president of the Association for the Advancement of Artificial Intelligence (AAAI). She gave a personal perspective on AI and its watershed moments, demonstrated the utility of AI in addressing future challenges, and insisted on the fact that AI is now necessary to science. I also learned a lot about the state of the art in graph theory. The tutorial given by Yao Ma, Wei Jin, Lingfei Wu and Tengfei Ma, Graph Neural Networks: Models and Applications, was really interesting. Finally, the tutorial presented by Chengxi Zang and Fei Wang about Differential Deep Learning on Graphs and its Applications was excellent. Both were really inspiring and generated a lot of ideas about how to continue to expand my research in the field!”

Favourite papers

A personal selection by Rania & Hounaida of interesting papers to check out:

For Hounaida:

Generalizable Resource Allocation in Stream Processing via DRL, by Xiang Ni, Jing Li, Mo Yu, Wang Zhou, and Kun-Lung Wu. This paper considers the problem of resource allocation in stream processing, where continuous data flows must be processed in real-time in a large distributed system.
Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks, by Fabio Pardo, Vitaly Levdik, and Petar Kormushev. The authors propose to use convolutional network outputs (Q-values) to generate several sub-goals at once, in order to better guide the agents.
From Skills to Symbols: Learning Symbolic Representations for Abstract High-Level Planning, by George Konidaris, Leslie Pack Kaelbling, and Tomas Lozano-Perez. The paper tackles the problem of constructing abstract representations for planning in high-dimensional, continuous environments.

For Rania:

Optimizing Reachability Sets in Temporal Graphs by Delaying, by Argyrios Deligkas and Igor Potapov.
Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction, by Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, and Jie Wang. The authors propose a novel knowledge graph embedding model which maps entities into the polar coordinate system to reflect hierarchy.
Multi-View Multiple Clustering using Deep Matrix Factorization, by Shaowei Wei, Jun Wang, Guoxian Yu, Carlotta Domeniconi, and Xiangliang Zhang. The paper introduces a solution to discover multiple clusterings. It gradually factorizes multi-view data matrices into representational subspaces layer by layer and generates one clustering in each layer.

Final thoughts

After attending their first conference as Euranovians, what will Rania & Hounaida remember? Hounaida concludes: “Going to New York for the AAAI-20 Conference as one of the ENX data scientists was an amazing experience. I met many brilliant and sharp international experts in various fields. I enjoyed the one-week talks with so many special events, offline discussions, and the night strolls!”


Schloss Dagstuhl: Where Computer Science Meets

Which direction are stream and complex event processing going to take? Last week, the world’s best-known international researchers met in Schloss Dagstuhl, Germany, to present and discuss their research. Participants included Avigdor Gal, Professor at the Israel Institute of Technology, Alessandro Margara, Assistant Professor at the Polytechnic University of Milan, and Till Rohrmann, engineering lead at Ververica.

Invited to talk about the requirements and needs of the industry, our R&D director Sabri Skhiri explains: “The seminar brought together world-class computer scientists and practitioners working on complex event recognition, distributed systems, databases, stream reasoning and artificial intelligence. Our objective was to disseminate the recent foundational results in each of these isolated fields among all participants, to identify the open problems that need to be resolved, and to establish new research collaborations among these fields.”

What were the big trends and takeaways gathered by those brilliant minds? Let’s find out with Sabri!

The Big Trends

This seminar is a bit particular, as it does not show any trends but rather gives a picture of all the communities working on complex event recognition (CER) in one way or another. I was fascinated by the diversity of researchers. I did not expect to see such a rich variety of fields: knowledge representation, spatial reasoning, logic-based reasoning, data management, learning-based approaches, event-driven processing, process mining, database theory, stream mining… In my view, the composite event recognition models that are best at recognising complex events would include:

Data flow model
Ontology-based and reasoning model
Symbolic reasoning model
Automata-based model

We also identified common challenges across these models and communities. The three priority topic areas we identified are:

Expressivity: composability & hierarchies
Evaluation strategy, parallelization and distribution
Uncertainty management

Favourite Talk

Kurt Rothermel from TU Stuttgart – Time-sensitive Complex Event Processing

My first reaction to load shedding was: “It is useless since customers do not want to lose any event; that is why so much effort is spent today on exactly-once semantics…” However, there is a trend today in stream processing, which is the trade-off between cost, latency, and correctness. Tyler Akidau described this challenge as a choice between three propositions: fast and correct, cheap and correct, or fast and cheap. Tyler was talking about streaming, but that rule applies in the same way in a CEP context. The load shedding strategy falls directly under the third proposition. From this perspective, Kurt’s work is highly relevant.

Favourite Tutorial

Jacopo Urbani & Fredrik Heintz – Stream Reasoning

Concretely, stream reasoning is incremental reasoning over rapidly changing information. The tutorial opened new perspectives on stream processing for me. It tried to answer a very interesting question: how can you provide reasoning about context from streams of data? I definitely come from the database and event-based systems communities, and I did not know at all that stream reasoning was so mature. This community has evolved from having a continuous version of SPARQL to complete distributed stream reasoning semantics. It is interesting to see that the work we have done on the LEAD algebra and semantics is deeply inspired by this community. However, we have never used any reasoning logic on top of LEAD. After a few hours of the tutorial, I realised that (1) reasoning can be used for query rewriting and optimisation, and (2) it is worth evaluating at least BigSR, the LARS implementation on Flink.

Avigdor Gal & Ruben Mayer – Distributed and Event-Based Systems

Avigdor is a kind of pop star for the stream processing and distributed systems community, or at least for me! The papers he published about a probabilistic CEP engine with late arrival and event uncertainty were visionary.

The speakers started by explaining the basics of stream processing then went deeper into the event recognition language and architecture. They detailed pub/sub applied to event recognition and explained the data flow model, which consists of a single unified data processing model where the stream and batch paradigms are the same. This last part was based on Tyler Akidau’s paper.

The second part of the talk focused on elasticity on streams. Stream fission sorts operators into different categories:

Firstly, key-based operators, which correspond to a group-by operation (as in SQL).
Secondly, window-based operators, which make it possible to split processing that needs to correlate multiple event types with different keys within the same operator.
Finally, pane-based operators, which enable a split-merge strategy where you distribute the work and merge the results.
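The first category, key-based splitting, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the architecture presented in the talk: the function names, the CRC32 routing and the toy events are all hypothetical. The point is that every event with the same key is routed to the same operator instance, so each instance can compute its group-by independently and the partial results can simply be merged.

```python
import zlib
from collections import Counter, defaultdict


def route(key, parallelism):
    """Pick an operator instance for a key. CRC32 is used instead of the
    built-in hash() so that routing is stable across Python processes."""
    return zlib.crc32(key.encode("utf-8")) % parallelism


def key_based_count(events, parallelism):
    """Split a stream of (key, payload) events by key, count per key on
    each instance, then merge the partial counts."""
    partitions = defaultdict(list)
    for key, payload in events:
        partitions[route(key, parallelism)].append((key, payload))

    # Each instance only sees its own partition (this could run in parallel).
    partials = [Counter(key for key, _ in part) for part in partitions.values()]

    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged, partitions


events = [("road-A", 1), ("road-B", 2), ("road-A", 3), ("road-C", 4), ("road-A", 5)]
counts, partitions = key_based_count(events, parallelism=3)
print(dict(counts))
```

Because no key is spread over two instances, merging never has to reconcile conflicting per-key state.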

Interestingly, Avigdor presented his work about late-arrival processing from a probabilistic viewpoint and not from the watermark perspective. Usually, modern stream processing frameworks use watermarks in order to take into account events that arrive later. Avigdor presented a probabilistic approach to this issue.

What are late-arrival events?

Imagine we want to count the number of cars entering a road segment every three minutes: we have a “tumbling window” of 3 minutes. If an event (i.e. a car) occurs 2 minutes 55 seconds into the window but is stuck somewhere in the network for 6 seconds, it reaches the engine after the window has ended and is called a late-arrival event. The processing time (the time at which the CEP engine processes the event) is delayed compared to the event time (the time at which the event really occurred).

Note that for CEP, there is clearly a trade-off between timeliness and accuracy: waiting for a slack time before closing a window increases the delay before your result is delivered, but it also increases your accuracy. There is always a trade-off between cost, latency and correctness, and usually you can only pick two of the three.
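The car-counting example and the slack-time trade-off can be sketched as follows. This is a simplified illustration, assuming a fixed slack in seconds. It is neither the watermark mechanism of a real stream processing framework nor the probabilistic approach presented by Avigdor, and the function name and timestamps are made up.

```python
from collections import Counter

WINDOW = 180  # tumbling window of 3 minutes, in seconds


def count_cars(events, slack):
    """Count events per tumbling window based on event time.

    Each event is (event_time, processing_time) in seconds. A window
    [start, start + WINDOW) emits its result at processing time
    start + WINDOW + slack. Events arriving after that are dropped.
    """
    counts = Counter()
    dropped = 0
    for event_time, processing_time in events:
        window_start = (event_time // WINDOW) * WINDOW
        window_end = window_start + WINDOW
        if processing_time < window_end + slack:
            counts[window_start] += 1
        else:
            dropped += 1  # late arrival: its window has already been emitted
    return dict(counts), dropped


# The third car occurs at 2'55" (175 s) but is delayed by 6 s in the network.
events = [(10, 11), (90, 92), (175, 181), (200, 201)]

no_slack = count_cars(events, slack=0)     # result at 180 s, one car missed
with_slack = count_cars(events, slack=10)  # result at 190 s, all cars counted
print(no_slack, with_slack)
```

Without slack the first window is emitted on time but misses the delayed car. With 10 seconds of slack the count is correct, at the price of a result delivered 10 seconds later.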

Fun fact: if you need to explain what event time and processing time are to your mother (yeah, don’t underestimate the power of this kind of discussion at Christmas dinner), the best way is to use the Star Wars analogy. From an event time perspective (the time at which the story really happened), you should follow episodes 1, 2, 3, 4, 5, 6, 7, 8, 9. But if you take the processing time (the time at which we received each episode), the order is 4, 5, 6, 1, 2, 3, 7, 8, 9. Isn’t it great?!

Final Thoughts

CER has been explored from many viewpoints. However, never before had a meeting gathered representatives of all these research communities. This was the objective of this seminar. Having all these people in a castle in the middle of nowhere was a blast! I had very passionate discussions during meals, but also at night in the library, with the most brilliant minds in stream processing and CEP. On the other hand, I still had some fun discussions comparing Star Trek Discovery and Picard! Finally, the most important things I will remember after this seminar… are the endless ping pong games with Till Rohrmann and Alessandro Margara :-).
