Privacy Enhancing Technologies 2024: A Summary

The recent Privacy Enhancing Technologies (PETs) London Conference convened stakeholders across Legal, IT, and various business domains to delve into the evolving data privacy and security landscape. With a focus on collaborating and sharing data through cutting-edge PETs, the summit provided valuable insights into how organisations can optimise the value of their data while ensuring privacy and security. Here, we distil the key themes and insights shared during the conference and highlight the talks we found most interesting.

Tags: privacy, synthetic data, encryption, sharing data

Big Trends in PETs for 2024

The PET Summit 2024 showcased significant advancements and key trends in Privacy Enhancing Technologies (PETs), reflecting the growing importance of data privacy and security in the digital age. Major themes included:

  • Emerging Technologies Driving PET Development: Technologies such as Trusted Execution Environments (TEEs) and Large Language Models (LLMs) garnered attention for their potential to revolutionise data privacy and security, albeit with lingering concerns over overhead versus value.
  • Regulatory Challenges: The summit underscored the complexities of navigating a rapidly evolving regulatory landscape, with new regulations like the Data Act and AI Act adding layers of compliance obligations. The lack of clear technical implementation standards further complicates PET adoption.
  • Business Value Creation: Despite challenges, organisations demonstrated how PETs can unlock significant value from data while ensuring compliance. Use cases ranged from anonymisation solutions for data governance projects to Homomorphic Encryption (HE) for secure information sharing.

Keynote: Can We Encode Fundamental Rights into Code?

Data privacy is a highly regulated landscape, riddled with ongoing legal and political disputes regarding its interpretation. While regulations like GDPR (General Data Protection Regulation) strive to provide clear guidelines, companies face the ongoing challenge of balancing compliance with innovation. This keynote emphasised the importance of a risk-based approach to data privacy, advocating for measures that safeguard individual rights while enabling data movement within legal frameworks.

The keynote acknowledged the challenges but did not offer definitive answers. Instead, it served as a call to action for ongoing dialogue and collaboration. By fostering open discussions and exploring creative solutions, stakeholders can translate the fundamental right to privacy into a more secure and responsible digital ecosystem. Moving forward, collaboration across legal teams, IT, data scientists, and business users is essential to bridge the gap between regulations and practical implementation.

Favourite Talks

Unlocking Client Value with Data: Insights from PwC Germany

PwC Germany showcased their journey in leveraging PETs, particularly anonymisation solutions from Anonos, to unlock client value. Their approach involved gathering user requirements, conducting successful proofs of concept, and implementing use case-based strategies.

Challenges included communication gaps between stakeholders (understanding perspectives across diverse fields with differing terminologies and expectations) and the need for agile delivery to meet the business’s need for rapid implementation.

Another hurdle involved catering to various use cases requiring different anonymisation approaches. This is where Anonos Data Embassy proved valuable, offering a comprehensive suite of anonymisation methods.

Understanding the Legal Basis and Regulations for Data Synergies in PET and Legal Supply

This session delved into the legal implications of utilising PETs, focusing on privacy principles like transparency, anonymisation, and data sharing: 

  • Achieving transparency becomes challenging when translating legal concepts into clear and understandable IT requirements. This is further complicated by the potential disconnect between legal expertise and the intricate nature of these technologies.
  • Truly anonymised datasets are inherently difficult to achieve because re-identification techniques keep evolving and ever more data is available. The requirements, which focus on singling out individuals, re-identification risks, and attribute inference, make true anonymisation a complex endeavour.
  • Balancing transparency with data sharing necessitates explaining the context and purpose of data use in a clear and accessible manner. This is where strong data governance, particularly in terms of data lineage, becomes crucial.

This highlights the need for a clear understanding of which legal frameworks apply and how they translate to specific constraints, IT requirements, roles, and responsibilities. A balanced approach, combining PETs with robust legal frameworks and clear, user-friendly communication, is essential for fostering responsible data use.
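
The singling-out requirement mentioned above can be made concrete with a simple k-anonymity check. The following is a minimal Python sketch, not something presented in the session: the records, the column names, and the choice of quasi-identifiers are all hypothetical. It groups records by their quasi-identifier values and reports the smallest group size; any record in a group of size one can be singled out.

```python
from collections import Counter

def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)

def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group (0 for an empty dataset)."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0

def singled_out(rows, quasi_identifiers):
    """Return the rows whose quasi-identifier combination is unique."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [row for row in rows
            if sizes[tuple(row[q] for q in quasi_identifiers)] == 1]

# Hypothetical, already generalised records (age bands, truncated postcodes).
records = [
    {"age_band": "30-39", "postcode": "10**", "diagnosis": "A"},
    {"age_band": "30-39", "postcode": "10**", "diagnosis": "B"},
    {"age_band": "40-49", "postcode": "10**", "diagnosis": "A"},
]
qi = ["age_band", "postcode"]  # hypothetical choice of quasi-identifiers

print(k_anonymity(records, qi))   # smallest group size
print(singled_out(records, qi))   # records that are unique on the quasi-identifiers
```

A smallest group size of 1 means at least one person is unique on the chosen attributes. Note that k-anonymity on its own does not address attribute inference, which is one reason true anonymisation remains a complex endeavour.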

Mastercard Explores Homomorphic Encryption in Singapore's IMDA Sandbox

Mastercard’s collaboration with the Singaporean government showcased the potential of Homomorphic Encryption (HE) for secure information sharing in financial crime investigation. The core use case focused on enabling banks to share financial crime intelligence across borders while guaranteeing robust privacy protection.

The technical details involved utilising a Multi-Party Computation (MPC) encryption scheme, allowing banks to verify the validity of International Bank Account Numbers (IBANs) without revealing any sensitive user or transaction data. However, scaling this solution even within the limited scope of financial crime investigation presented significant challenges, including computational intensity and regulatory complexities. While HE enables cross-border data exchange, questions lingered regarding its effectiveness in resolving data privacy concerns.
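
To give a feel for what computing on encrypted data means, here is a toy Python sketch of textbook Paillier encryption, an additively homomorphic scheme. It is an illustration only: the summary above does not specify which scheme the sandbox used, the primes are far too small to be secure, and the scenario of two parties contributing encrypted values is hypothetical.

```python
import math
import random

def keygen(p=104729, q=1299709):
    """Toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # common simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(public_key, m, rng=random):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = public_key
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(public_key, private_key, c):
    """Recover the plaintext from a ciphertext."""
    n, _ = public_key
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return (((x - 1) // n) * mu) % n

def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = public_key
    return (c1 * c2) % (n * n)

public_key, private_key = keygen()
c_a = encrypt(public_key, 20)   # hypothetical value held by party A
c_b = encrypt(public_key, 22)   # hypothetical value held by party B
c_sum = add_encrypted(public_key, c_a, c_b)
print(decrypt(public_key, private_key, c_sum))  # prints 42
```

The party that combines the ciphertexts never sees 20 or 22; only the holder of the private key can read the sum. Real deployments use vetted libraries and much larger keys, which is part of the computational cost noted above.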

Azure Confidential Computing for Enhancing Privacy in AI

Azure Confidential Computing tackles the critical gap in data security by focusing on protecting data during processing within the cloud environment. It utilises Trusted Execution Environments (TEEs), such as Intel SGX, to create secure enclaves within the CPU and memory. This ensures that sensitive data remains encrypted and isolated from the main operating system, even during active processing. 

The increasing reliance on data for AI development raises critical questions about data privacy and sovereignty. Azure confidential computing enhances data security by encrypting data at rest, in transit, and in use. This protects data throughout the AI lifecycle and ensures that users maintain control over their data while complying with various regulations.

On top of these considerations, Microsoft proposes a multi-layered approach to achieving responsible AI with strong privacy safeguards:

  • Data Governance and Compliance: Establishing clear policies and adhering to regulations.
  • Privacy and Robustness of AI Models: Incorporating privacy-preserving techniques to enhance their security.
  • Confidential AI & Acceleration Platform: Utilising TEEs for securing CPU to GPU communications. 

For Large Language Models (LLMs), Azure confidential computing offers TEEs to protect data integrity throughout various stages of the LLM lifecycle, including prompts, fine-tuning, and inference. This ensures that all data used by LLMs remains encrypted and protected within the TEE, enhancing overall data security and privacy.

Favourite Panels

Data as The New Currency: Developments in Data and Where Do PETs Fit In

The panel discussed the balance between the value of data and associated risks, emphasising principles like lawfulness, trustworthiness, and privacy. They highlighted the importance of data governance as a foundational PET and advocated for a combined approach of data governance with a risk-based assessment. Challenges to PET adoption included: 

  • Complexity: Adding noise through Differential Privacy (DP) can introduce complexities into data processing pipelines. Generally speaking, PETs introduce complexity and costs.
  • Regulatory compliance: If encryption is applied, we lose the opportunity to evaluate fairness and bias. 
  • Return on Investment: Determining the value extracted from PET implementation can be challenging.
  • Technical Proficiency: Implementing PETs requires in-depth technical expertise, unlike user-friendly platforms for managing Data Subject Rights (DSR) requests. This complexity extends to the procurement process, necessitating collaboration between data governance, legal, and IT teams.
  • Limited Proactive Adoption: Without regulatory pressure, many organisations remain hesitant to adopt PETs, relying solely on a “compliance-driven” approach instead of proactively creating value through responsible data practices.
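
As a concrete picture of the noise addition mentioned in the first point above, the following minimal Python sketch applies the Laplace mechanism to a counting query. The dataset, the predicate, and the epsilon value are hypothetical, and a production system would rely on a vetted DP library rather than hand-rolled sampling.

```python
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(records, predicate, epsilon, rng=random):
    """Return a noisy count; a counting query has sensitivity 1."""
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical dataset: ages of 1,000 individuals.
data_rng = random.Random(42)
ages = [data_rng.randint(18, 90) for _ in range(1000)]

def over_65(age):
    return age > 65

noisy = dp_count(ages, over_65, epsilon=1.0, rng=random.Random(7))
print(round(noisy, 2))
```

A smaller epsilon means more noise and stronger privacy, so every downstream consumer of the count has to tolerate an answer that is no longer exact. That trade-off is one source of the pipeline complexity the panel described.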
 

Cross-Functional Collaboration: Overcoming Hurdles in Implementing PETs

This panel emphasised the crucial role of cross-functional collaboration in successfully implementing PETs. It addressed stakeholder involvement, technology gaps, and the need to shift focus from technical implementation to unlocking value across different functions. Mastercard’s experience showcased the importance of a dedicated PET team and understanding industry-specific regulatory constraints.

Leveraging Data and PETs for Business Success

The panel explored the significant role of data in unlocking the potential of PETs across sectors. It highlighted the shift in perspective within healthcare to view data as an opportunity and discussed strategies for navigating collaboration and value creation:

  • Open communication and establishing clear protocols for collaboration.
  • Identifying the inherent value for each party involved in the data chain.
  • Striking a balance between regulation and technological solutions. While some panellists advocated for stricter regulations, others emphasised the need for clear technical implementation guidelines to translate legal principles into actionable IT requirements.

Building trust in data processing, particularly regarding transparency and explainability, was identified as a crucial challenge. LLMs were recognised as potential catalysts for PET adoption, but the lack of clear technical implementation guidelines hinders regulatory impact.

Conclusion

The PET Summit 2024 provided valuable insights into the technical challenges and opportunities surrounding PET adoption. Azure confidential computing exemplifies Microsoft’s commitment to enhancing data privacy and security, particularly in the context of AI. However, challenges such as complexity, regulatory tension, and the need for cross-functional collaboration persist. Clear guidance and standards are essential to drive PET adoption and ensure responsible data practices across industries. As organisations continue to navigate the evolving landscape of data privacy and security, collaboration across disciplines, transparency, innovation, and a balanced approach that integrates PETs with robust legal frameworks will be key to unlocking the full potential of PETs.
