7 Publications in 2018

At EURA NOVA, we believe investing in research allows us to continuously become more proficient, to maintain our know-how at the cutting edge of IT, and to share its benefits with our customers. As we look back on the year 2018, we are both proud and happy to announce that our R&D department has published 7 papers this year:

 

Firstly, our paper “Pairwise Image Ranking with Deep Comparative Network” was published at the 26th European Symposium on Artificial Neural Networks. The paper, written by our Lead R&D engineer Aymen Cherif and Salim Jouili, discusses how using a pairwise ranking model can provide better results for instance-level image retrieval.

Aymen Cherif, Salim Jouili, Pairwise Image Ranking with Deep Comparative Network. ESANN 2018: ES2018-200

 

Secondly, our R&D engineer Cécile Pereira co-authored a paper published in Bioinformatics in May 2018. The authors propose a novel end-to-end deep learning approach for biomedical named entity recognition (NER) tasks that leverages local contexts based on n-gram character and word embeddings via a Convolutional Neural Network.

Qile Zhu, Xiaolin Li, Ana Conesa, Cécile Pereira, GRAM-CNN: A deep learning approach with local context for named entity recognition in biomedical text, Bioinformatics – May 2018

 

In July, our R&D engineer Katsiaryna Krasnashchok was in Melbourne, Australia to attend the ACL conference, where she presented her poster on topic modelling. Her paper, co-written with Salim Jouili, shows that involving more named entities positively influences the overall quality of topics.

Katsiaryna Krasnashchok, Salim Jouili, Improving Topic Quality by Promoting Named Entities in Topic Modeling, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Vol. 2. 2018

 

Moreover, our paper “Graph BI & Analytics: Current State and Future Challenges” was accepted for publication and presented at the 20th International Conference on Big Data Analytics and Knowledge Discovery, held in Germany in September. The paper presents the state of the art of graph BI & analytics, with a focus on graph warehousing.

Amine Ghrab, Oscar Romero, Salim Jouili, Sabri Skhiri, Graph BI & Analytics: Current State and Future Challenges. DaWaK 2018: 3-18

 

Also in September, our paper “Data Mining and Machine Learning Techniques supporting Time-based Separation Concept Deployment”, co-written with Eurocontrol and WaPT, was presented at the 37th Digital Avionics Systems Conference (DASC) in London, U.K. The paper presents two methods to allow air traffic controllers to deliver separation minima accurately and safely, on the basis of time intervals instead of distances.

De Visscher, I.; Stempfel, G.; Rooseleer, F. & Treve, V.; Data mining and Machine Learning techniques supporting Time-Based Separation concept deployment, in 37th Digital Avionics Systems Conference (DASC), pp 594-603, London, UK, September 23-27, 2018

 

Finally, in October, our engineer Katsiaryna Krasnashchok presented her poster on the Hierarchical Attention-Based Neural Topic Model at the 6th International Conference on Statistical Language and Speech Processing. At the same conference, our Lead R&D engineer Aymen Cherif and our bootcamper Luca De Petris presented their poster on the LSTM Siamese Network.

Katsiaryna Krasnashchok, Salim Jouili, Hierarchical Attention-Based Neural Topic Model, SLSP 2018

Luca De Petris, Aymen Cherif, LSTM Siamese Network for Question Answering System, SLSP 2018

IEEE Big Data 2018: a summary

At the beginning of the month, our R&D director Sabri Skhiri and our R&D engineer Syrine Ferjaoui travelled to Seattle to attend IEEE Big Data. The conference is one of the most influential in this domain, gathering more than 1100 attendees, 5 keynotes, 9 tutorials, and 8 daily tracks in parallel. Back in Belgium, our R&D director gives you his opinion on the conference itself and the important elements from the keynotes, the tutorials, the workshops and the interesting papers.

 

Favourite Talks

 

Keynote 1: Decentralized Machine Learning – Google AI

The IEEE Big Data conference started with the inspiring keynote of Blaise Agüera y Arcas, a distinguished researcher at Google AI. Our director details: “The straightforward thesis of the talk is that we can, and we must, use the mobile device for local deep neural network computing. Blaise Agüera explained that since the launch of TensorFlow, Google Brain has built specialised hardware servers to run deep neural network computing jobs efficiently. Nowadays, we find on the market specialised chips that are smaller than a 1-cent coin and that cost less than a cappuccino. Using them, you can run deep neural net computing jobs very efficiently on mobile at low frequency, low energy and even continuously. For example, the Google camera embeds deep neural nets and does not need to send data to the server side for face or situation detection. But Blaise Agüera y Arcas is going further. He works on reusing the existing techniques from distributed neural net training: the gradients learned on each device are collected in a parameter server and the result is shared with all devices. This is what we call federated learning, and it has impacted many research areas. The first is edge computing. The idea of edge computing is to execute light tasks on the edge of the network in order to offload the server/cloud. But here the game changes, since the job is no longer light. In addition, federated learning does not try to offload the server; it changes the role of the server into that of a coordinator between edge devices. The second is neural net compression. The question is then: do we still need to compress networks when we can either distribute the neural net on the server side or have specialised chips on the device side?”
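To make the coordinator role of the server more concrete, here is a minimal sketch of one common flavour of federated learning, usually called federated averaging. It is our own illustration and not code shown in the keynote: the one-parameter model `y = w * x`, the function names `local_update`, `federated_round` and `train`, and the toy data are all assumptions. Each device trains on its private data, and the server only averages the resulting weights.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs that stay on one device


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one device's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the toy model y = w * x
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, devices: List[Dataset]) -> float:
    """One round: every device trains locally, the server averages the weights."""
    total = sum(len(d) for d in devices)
    updates = [(local_update(w_global, d), len(d)) for d in devices]
    # Average weighted by local sample count; raw data never leaves a device
    return sum(w * n for w, n in updates) / total


def train(devices: List[Dataset], rounds: int = 50, w0: float = 0.0) -> float:
    """Repeat the coordinate-and-average loop for a fixed number of rounds."""
    w = w0
    for _ in range(rounds):
        w = federated_round(w, devices)
    return w


if __name__ == "__main__":
    # Three hypothetical devices, each holding samples of y = 3 * x
    devices = [
        [(1.0, 3.0), (2.0, 6.0)],
        [(3.0, 9.0)],
        [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
    ]
    print(train(devices))
```

In this sketch the server never sees a single `(x, y)` pair; it only receives one number per device per round, which is the property that makes the approach attractive on mobile.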

 

Keynote 2: Big Data for Speech and Language Processing – Microsoft Research

The second keynote speaker, Xuedong Huang, is a Microsoft Technical Fellow in Microsoft Cloud and AI. He presented the latest advances in speech recognition and Text To Speech (TTS). The key papers behind this technology can be found here and on the research group page. Our director explains: “The first part of the keynote was about the Microsoft live captioning that will soon be integrated natively in PowerPoint. That is just impressive. Everything that the speaker says is captured by the tool. I personally tested the Translator Android application and it works just fine! The second part of the keynote focused on Text To Speech (TTS). The speaker showed a set of very interesting examples of how voice can be modelled. For instance, if the system learns a model out of hours of discussions, it can apply my voice in Chinese or Arabic, or it can learn from a group of people in order to get a better accent and expression”.

 

The Tutorials

This year, IEEE Big Data organised 9 tutorials. Our R&D director explains: “This is probably what I like the most at an academic conference. A research group presents a complete state-of-the-art review in their domain and usually positions their own work in the story. My favourite was Progress in Zeroth Order Optimization and Its Applications to Adversarial Robustness in Deep Learning. It was one of the coolest research topics I have seen so far. They discussed how you can fool a deep neural network in order to get a wrong classification. The idea is great: finding the minimal noise you can add to a picture in order to increase the probability of a wrong classification. In this setting, you don’t know anything about the classifier, but you can submit images and you will get a label. Indeed, that looks like a black box optimisation setting. That is precisely why they use zeroth order optimisation, which relies only on the outputs of the classifier and never on its gradients. The research topic is so cool: you can manage to fool the classifier into recognising a piano in an image picturing a bagel! Can you imagine the impact, in the era of the electronic passport, where image recognition starts to be used in the signature process? What if I can find how to fool an algorithm into classifying me as someone else with just a few grey pixels on my picture?”
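The black box setting can be illustrated with a small sketch. It is a simplified example of our own, not the method presented in the tutorial. It assumes the black box returns a confidence score rather than only a label, it uses a hypothetical four-pixel "image" and a made-up `black_box_score` classifier, and it stops as soon as the label flips instead of searching for the truly minimal noise. The point is only that the gradient is estimated from queries, without ever looking inside the model.

```python
import math
from typing import Callable, List

Vector = List[float]


def zo_gradient(f: Callable[[Vector], float], x: Vector, h: float = 1e-4) -> Vector:
    """Estimate the gradient of f at x from function evaluations only."""
    grad = []
    for i in range(len(x)):
        up, down = x[:], x[:]
        up[i] += h
        down[i] -= h
        # Symmetric difference quotient: two queries per coordinate
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def black_box_score(image: Vector) -> float:
    """Hypothetical classifier: confidence that the image shows a bagel."""
    weights = [0.8, -0.5, 0.3, 0.1]  # hidden from the attacker
    z = sum(w * p for w, p in zip(weights, image))
    return 1.0 / (1.0 + math.exp(-z))


def attack(image: Vector, step: float = 0.5, max_iters: int = 100) -> Vector:
    """Nudge the pixels until the confidence drops below 0.5 (label flips)."""
    adv = image[:]
    for _ in range(max_iters):
        if black_box_score(adv) < 0.5:
            break
        grad = zo_gradient(black_box_score, adv)
        # Descend along the estimated gradient of the confidence score
        adv = [p - step * g for p, g in zip(adv, grad)]
    return adv


if __name__ == "__main__":
    original = [1.0, 0.2, 0.5, 0.3]
    adversarial = attack(original)
    print(black_box_score(original), black_box_score(adversarial))
```

Each gradient estimate costs two queries per pixel, which is why query efficiency becomes a central concern on real images with many thousands of pixels.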

 

The Workshops

The EURA NOVA research centre organised the third workshop on Real-time and Stream analytics in Big Data, co-located with the 2018 IEEE conference on Big Data. Our Research Director Sabri Skhiri talked about data management, and stream and real-time analytics. Thank you to our keynote speaker Fabian Hueske, and all the attendees and speakers! Everyone had a great time, with captivating talks and a lot of interesting questions and comments. The summary of the event is available on our website. The slides of the opening session and the slides of the second keynote are available here.

 

Final Feelings

In its early years, IEEE Big Data was mainly focused on big data infrastructure. In the following years, the conference became data science oriented, with a significant increase in the number and the complexity of data science use cases. When we asked how he felt about the event, Sabri explained: “I have been attending this conference since the first edition. The most important shift I have seen is really about the content. This year, the infrastructure papers have almost disappeared. On the other hand, the vast majority of the publications are on data science. We can really see that it is becoming a conference for ML practitioners. The side effect is that the discussed topics are getting more complex. Machine learning notions are assumed to be known, and deep neural networks are becoming the norm. Going further, the authors are also good at using distributed frameworks, especially Spark. For them, the infrastructure is not a problem anymore; it is part of the daily job”.

 

The Papers

A personal selection of interesting papers: