The Systolic Blood Pressure case uses artificial systolic blood pressure data. Measurements are taken on 20 patients at about 50 timepoints and form time series, a specific type of data in which successive values depend on one another. Time series are frequently used, especially in the health field. Their temporal structure makes them vulnerable to re-identification by singling out (individualization).
- 20 patients
- 25 measurements / patient
Objectives of anonymization
- First, make it impossible to re-identify the individuals in the dataset (personal data protection objective).
- Second, preserve the usefulness of the systolic blood pressure data by retaining the trends, patterns and possible seasonality of the time series.
The reference data were purposely generated as two clearly separated groups (a bimodal distribution) so that one can check whether this structure is retained after transformation into avatarized time series.
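As an illustration only, bimodal reference data of this shape could be simulated as in the sketch below. The baselines (120 and 150 mmHg), the noise levels, the sinusoidal pattern and the function name `make_reference_series` are all assumptions made for the example. They are not the values or the code used to produce the actual dataset.

```python
import math
import random


def make_reference_series(n_patients=20, n_points=25, seed=0):
    """Return {patient_id: [values]} with two clearly separated groups.

    Hypothetical generator: all numeric parameters are illustrative.
    """
    rng = random.Random(seed)
    series = {}
    for pid in range(n_patients):
        # Assumption: half the patients sit near 120 mmHg, half near 150 mmHg.
        baseline = 120.0 if pid < n_patients // 2 else 150.0
        offset = rng.gauss(0.0, 2.0)  # small per-patient shift
        series[pid] = [
            baseline
            + offset
            + 5.0 * math.sin(2 * math.pi * t / n_points)  # slow shared pattern
            + rng.gauss(0.0, 1.5)  # measurement noise
            for t in range(n_points)
        ]
    return series


reference = make_reference_series()
# Per-patient mean, used to look at the two groups.
patient_means = {pid: sum(v) / len(v) for pid, v in reference.items()}
```

With these assumed parameters, the per-patient means fall into two groups on either side of roughly 135 mmHg, which is the kind of structure the anonymized series should retain.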
After avatarization, the generated time series are superimposed on the reference series. This representation makes it possible to visualize how well the information is preserved in terms of clustering and trend.
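Alongside the visual overlay, a rough numeric check of cluster preservation is possible. The sketch below compares the widest gap between sorted per-patient means in the reference and in the generated series. The helper names are hypothetical, the toy values are invented for the example, and the `avatars` dictionary is only a noisy copy of the reference used as a placeholder. It is not the avatar method.

```python
import random


def sorted_patient_means(series):
    """Return the per-patient means of {patient_id: [values]}, sorted."""
    return sorted(sum(v) / len(v) for v in series.values())


def largest_gap(means):
    """Return (gap, split_index) for the widest gap between sorted means."""
    gaps = [(means[i + 1] - means[i], i + 1) for i in range(len(means) - 1)]
    return max(gaps)


# Toy reference data: 20 patients, 25 values each, two assumed baselines.
rng = random.Random(1)
reference = {
    pid: [(120.0 if pid < 10 else 150.0) + rng.gauss(0.0, 2.0) for _ in range(25)]
    for pid in range(20)
}
# Placeholder for the avatarized series: a noisy copy, NOT the avatar method.
avatars = {pid: [x + rng.gauss(0.0, 2.0) for x in v] for pid, v in reference.items()}

ref_gap, ref_split = largest_gap(sorted_patient_means(reference))
ava_gap, ava_split = largest_gap(sorted_patient_means(avatars))
print(f"reference: gap={ref_gap:.1f} mmHg, group sizes {ref_split}/{20 - ref_split}")
print(f"avatars:   gap={ava_gap:.1f} mmHg, group sizes {ava_split}/{20 - ava_split}")
```

If the two groups are preserved, both datasets show a wide gap at the same split. This check only looks at per-patient means, so it complements the overlay rather than replacing it: it says nothing about trend or seasonality.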