Between the collection of personal data and its use for a defined purpose, many risks tied to infrastructure or user practices can lead to the leakage of personal information. Because of the volume of data they generate, healthcare institutions are prime targets for attackers, whether through the theft of hard drives, the recovery of email content, or the extraction of data from unsecured workstations. Beyond the invasion of individual privacy, this type of incident severely damages the reputation of and trust in these institutions. Yet the use of health data is essential to the daily work of many units (research, training, etc.).
With its Avatar anonymization method, Octopize makes it possible to create synthetic datasets that protect the individuals behind the original data while preserving its statistical potential and original granularity. Avatar data is no longer considered personal data and can therefore be shared safely, for example within a research unit. In the event of a malicious or accidental leak of this anonymized data, re-identification of patients is practically impossible.
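To make the idea of synthetic, record-level data more concrete, here is a minimal toy sketch of one generic family of approaches: building each synthetic record as a random weighted mix of a real record's nearest neighbours in a reduced space. This is an illustration of the general principle only, not the Avatar algorithm itself; the function name and all parameters are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

def toy_knn_synthesize(df: pd.DataFrame, k: int = 5, seed: int = 0) -> pd.DataFrame:
    """Toy k-NN synthesizer for numeric data (illustration only, hypothetical name)."""
    rng = np.random.default_rng(seed)
    X = StandardScaler().fit_transform(df)                      # scale numeric columns
    Z = PCA(n_components=min(5, X.shape[1])).fit_transform(X)   # reduced space
    # k nearest neighbours of each record, excluding the record itself
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(Z).kneighbors(Z)
    neighbours = idx[:, 1:]
    synthetic = []
    for nbrs in neighbours:
        w = rng.dirichlet(np.ones(k))               # random convex weights summing to 1
        synthetic.append(w @ df.to_numpy()[nbrs])   # weighted mix of the k neighbours
    return pd.DataFrame(synthetic, columns=df.columns)
```

A production-grade method must additionally handle categorical variables, preserve multivariate structure, and come with a formal privacy evaluation, which is precisely what a dedicated solution provides.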
A research unit in oncology at a hospital wishes to improve its practices around the use of personal data while allowing its doctoral students to get to grips with clinical health data in order to set up analyses.
The dataset is a cohort of women with breast cancer whose tumor severity is determined from measurements taken on biopsies. The objective is to share these health data with PhD students so that they can study the pathology without holding personal data on their workstations.
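The hospital cohort itself is not public. Purely as a stand-in with a similar structure (biopsy-derived measurements plus a severity label), the sketch below loads scikit-learn's bundled Wisconsin breast cancer dataset; this is an assumption for illustration, not the unit's actual data.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Public stand-in with a similar structure: biopsy-derived measurements
# (radius, texture, concavity, ...) plus a malignant/benign label.
data = load_breast_cancer(as_frame=True)
df = data.frame                      # 30 measurement columns + 'target'
print(df.shape)                      # (569, 31)
print(df.columns[:5].tolist())       # first few measurement columns
print(df["target"].value_counts())   # class balance (0 = malignant, 1 = benign)
```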
The transformation of data by the Avatar solution is systematically accompanied by an evaluation of the security of the generated synthetic data, using dedicated metrics. These metrics were developed to verify compliance with the three criteria identified by the European Data Protection Board (EDPB, formerly the G29 working party) for qualifying data as anonymous under the GDPR, namely: singling out, linkability, and inference.
For our example, we obtain the following results:
The results obtained indicate that it is practically impossible for an attacker to re-identify the individuals in the cohort.
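Octopize's own metrics are specific to the Avatar solution and are not reproduced here. As a generic illustration of the kind of check involved, the sketch below computes a simple distance-to-closest-record test between synthetic and original rows; the function name and the returned indicators are illustrative assumptions, not the actual evaluation.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

def closest_record_check(original: pd.DataFrame, synthetic: pd.DataFrame) -> dict:
    """Distance-to-closest-record style check (illustrative, not Octopize's metrics).

    For each synthetic row, find its closest original row. Very small distances,
    or many synthetic rows pointing back to the same original row, would suggest
    residual re-identification risk.
    """
    scaler = StandardScaler().fit(original)
    orig, synth = scaler.transform(original), scaler.transform(synthetic)
    distances, indices = NearestNeighbors(n_neighbors=1).fit(orig).kneighbors(synth)
    return {
        "median_distance_to_closest_original": float(np.median(distances)),
        "share_of_exact_matches": float(np.mean(distances == 0.0)),
        "distinct_originals_hit": int(np.unique(indices).size),
    }
```

The real evaluation addresses all three EDPB criteria rather than a single distance-based indicator.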
The aim is to verify whether the dataset anonymized with the Avatar method has retained its pedagogical value and can be used by doctoral students to carry out analyses while respecting patients' privacy.
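As a sketch of how such a utility check could look, the snippet below compares column means and trains a simple classifier on the anonymized data while testing it on held-out original data (a "train on synthetic, test on real" check). The `target` column name and the choice of model are assumptions for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def compare_utility(original: pd.DataFrame, avatars: pd.DataFrame, target: str = "target"):
    """Illustrative utility checks: compare marginals and fit a simple model.

    `original` and `avatars` are assumed to share the same columns, with a
    binary outcome in `target` (hypothetical column name).
    """
    # 1. Do the univariate distributions still look alike?
    print(pd.concat({"original": original.describe().loc["mean"],
                     "avatars": avatars.describe().loc["mean"]}, axis=1))

    # 2. Rough "train on avatars, test on real" check: does the anonymized
    #    data still support the kind of analysis a PhD student would run?
    X_real, y_real = original.drop(columns=target), original[target]
    X_syn, y_syn = avatars.drop(columns=target), avatars[target]
    _, X_test, _, y_test = train_test_split(X_real, y_real, test_size=0.3, random_state=0)

    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_syn, y_syn)
    print("AUC (trained on avatars, tested on real):",
          round(roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]), 3))
```

If the avatar-trained model performs close to one trained on the original data, the anonymized dataset has kept the analytical signal the students need.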
The transformation of data into avatars makes internal data use both more secure and easier. The data in circulation is no longer personal data, so a malicious or accidental leak no longer puts patients' privacy at risk. At the same time, the transformed data remains useful for the purposes originally planned.