Skip to content

A Combined Rule-Based and Machine Learning Approach for Automated GDPR Compliance Checking

The General Data Protection Regulation (GDPR) requires data controllers to implement end-to-end compliance. Controllers must therefore ensure that the terms agreed with the data subject and their own obligations under GDPR are respected in the data flows from data subject to controllers, processors and sub-processors (i.e. data supply chain). This paper seeks to contribute to bridging both ends of compliance checking through a two-pronged study. First, we conceptualize a framework to implement a document-centric approach to compliance checking in the data supply chain. Second, we develop specific methods to automate compliance checking of privacy policies. We test a two-modules system, where the first module relies on NLP to extract data practices from privacy policies. The second module encodes GDPR rules to check the presence of mandatory information. The results show that the text-to-text approach outperforms local classifiers and enables the extraction of both coarse-grained and fine-grained information with only one model. We implement a full evaluation of our system on a dataset of 30 privacy policies annotated by legal experts. We conclude that this approach could be generalized to other documents in the data supply as a means to improve end-to-end compliance.

Rajaa El Hamdani, Majd Mustapha, David Restrepo Amariles, Aurore Troussel, Sébastien Meeus, Katsiaryna Krasnashchok, A Combined Rule-Based and Machine Learning Approach for Automated GDPR Compliance Checking, Proc. of the 18th International Conference on Artificial Intelligence and Law, 2021

Watch the presentation on YouTube.

Click here to access the paper.

Releated Posts

Calibrate to Interpret

Trustworthy machine learning is driving a large number of the ML community works in order to improve ML acceptance and adoption. In this paper, we show a first link between uncertainty and explainability, by studying the relation between calibration and interpretation.
Read More

Mass Estimation of Planck Galaxy Clusters using Deep Learning

Galaxy cluster masses can be inferred indirectly using measurements from X-ray band, Sunyaev-Zeldovich (SZ) effect signal or optical observations. Unfortunately, all of them are affected by some bias. Alternatively, we provide an independent estimation of the cluster masses from the Planck PSZ2 catalogue of galaxy clusters using a machine-learning method.
Read More