A Survey of Maturity Models in Data Management

Maturity models are business tools that help organizations refine how they operate and benchmark their maturity against a defined scale or against industry peers. They also help prioritize improvement actions and track progress towards the target maturity stage.

Towards a Continuous Evaluation of Calibration

For safety-critical systems with AI components (for example in aircraft, cars, or healthcare), safety and the associated certification tasks are among the main challenges, and they can be costly and difficult to address.

One key aspect is to ensure that the decisions a machine-learning classifier makes are properly calibrated.
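To make "properly calibrated" concrete: a classifier is calibrated when, among the predictions it makes with confidence p, roughly a fraction p turn out to be correct. The sketch below computes the expected calibration error (ECE), one common way to measure the gap. It is an illustration only and not necessarily the metric used in the article; the function name, the equal-width binning, and the default of 10 bins are assumptions.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed default; the bin count is a free choice
) -> float:
    """Weighted average of |mean confidence - accuracy| over equal-width bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Group predictions by confidence; a confidence of 1.0 goes to the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to the share of predictions it holds.
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece


# Four predictions at 90% confidence, of which only three are correct:
# the model is overconfident by about 0.15.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
```

An ECE of zero means confidence matches accuracy in every bin; larger values indicate over- or under-confidence.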

Privacy Policy Classification with XLNet

The popularisation of privacy policies has become an attractive subject of research in recent years, notably after the General Data Protection Regulation (GDPR) came into force in the European Union. While GDPR gives Data Subjects more rights and control over the use of their personal data, the length and complexity of privacy policies can still prevent them from exercising those rights. An accepted way to improve the interpretability of privacy policies is to assign understandable categories to every paragraph or segment in these documents. The current state of the art in privacy policy analysis has established a baseline in multi-label classification on a dataset of 115 privacy policies, using BERT Transformers. In this paper, we propose a new classification model based on XLNet. Trained on the same dataset, our model improves the baseline F1 macro and micro averages by 1-3% for both the majority-vote and the union-based gold standards. Moreover, the results reported by our XLNet-based model were achieved without fine-tuning on domain-specific data, which reduces training time and complexity compared to the BERT-based model. To make our method reproducible, we report our hyper-parameters and provide access to all resources used, including code. This work may therefore be considered a first step towards establishing a new baseline for privacy policy classification.
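The evaluation terms in the abstract can be illustrated with a small sketch. It assumes that each segment is labelled by several annotators, that the union gold standard keeps every label any annotator chose, and that the majority-vote gold standard keeps labels chosen by more than half of them; it then computes macro- and micro-averaged F1 for multi-label predictions. The function names, category names, and data are hypothetical and are not taken from the paper's code.

```python
from collections import Counter


def gold_labels(annotations, mode):
    """Merge per-annotator label sets for one segment into one gold label set."""
    votes = Counter(label for labels in annotations for label in set(labels))
    if mode == "union":
        return set(votes)
    if mode == "majority":
        # Assumption: "majority" means more than half of the annotators.
        return {label for label, n in votes.items() if n > len(annotations) / 2}
    raise ValueError(f"unknown mode: {mode}")


def f1_score(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_micro_f1(y_true, y_pred, labels):
    """Return (macro F1, micro F1) for lists of gold and predicted label sets."""
    counts = {label: {"tp": 0, "fp": 0, "fn": 0} for label in labels}
    for gold, predicted in zip(y_true, y_pred):
        for label in labels:
            if label in predicted and label in gold:
                counts[label]["tp"] += 1
            elif label in predicted:
                counts[label]["fp"] += 1
            elif label in gold:
                counts[label]["fn"] += 1
    # Macro: average the per-label F1 scores, so rare labels weigh as much as common ones.
    macro = sum(f1_score(**c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    micro = f1_score(
        sum(c["tp"] for c in counts.values()),
        sum(c["fp"] for c in counts.values()),
        sum(c["fn"] for c in counts.values()),
    )
    return macro, micro


# Hypothetical segment labelled by three annotators.
annotations = [{"collection"}, {"collection", "sharing"}, {"collection"}]
union_gold = gold_labels(annotations, "union")        # both labels
majority_gold = gold_labels(annotations, "majority")  # only "collection"

# Hypothetical predictions for three segments: "sharing" is never predicted.
y_true = [{"collection"}, {"collection"}, {"collection", "sharing"}]
y_pred = [{"collection"}, {"collection"}, {"collection"}]
macro, micro = macro_micro_f1(y_true, y_pred, ["collection", "sharing"])
```

In this toy case the missed rare label pulls the macro average down to 0.5 while the micro average stays near 0.86, which is why results are commonly reported with both averages.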

Majd Mustapha, Katsiaryna Krasnashchok, Anas Al Bassit and Sabri Skhiri, Privacy Policy Classification with XLNet, Proc. of the 15th DPM International Workshop on Data Privacy Management, Surrey, UK, 2020.

Click here to access the paper in its preprint form.