The personal data in this use case is a sample of 1,451,721 cab trips made in New York City in 2016.
The dataset, initially pseudonymized, carries a high risk of re-identification because it combines spatial information (GPS coordinates of departure and arrival) with temporal information (departure and arrival times). In particular, an attacker could infer an individual's place of residence from the information at their disposal.
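To make the risk concrete, the following is a minimal sketch (not the method used in this use case) of one common way to gauge it: counting how many trips are unique on a combination of rounded pickup coordinates and pickup hour. The function name `unique_fraction`, the record layout, and the toy values are assumptions made for illustration only.

```python
from collections import Counter


def unique_fraction(trips, decimals=3):
    """Return the fraction of trips whose (rounded pickup point, hour) is unique.

    Each trip is a hypothetical (latitude, longitude, pickup_hour) tuple.
    Rounding to 3 decimals groups points within roughly 100 m of each other.
    """
    if not trips:
        return 0.0

    def key(trip):
        lat, lon, pickup_hour = trip
        return (round(lat, decimals), round(lon, decimals), pickup_hour)

    counts = Counter(key(t) for t in trips)
    unique = sum(1 for t in trips if counts[key(t)] == 1)
    return unique / len(trips)


# Toy records with made-up values, not taken from the real dataset.
trips = [
    (40.7580, -73.9855, 8),
    (40.7580, -73.9855, 8),
    (40.7306, -73.9866, 23),
    (40.6782, -73.9442, 2),
]
print(unique_fraction(trips))
```

A high fraction suggests that many trips can be singled out from space and time alone, which is the kind of exposure that makes inferring a home address feasible.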
In this use case, several objectives are identified.
This information must be preserved while respecting the topographical plausibility of the original data: the avatars must not take impossible GPS coordinates, such as points in a branch of the East River or inside Central Park.
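A plausibility constraint of this kind could be checked with a point-in-polygon test. The sketch below is only an illustration under stated assumptions: the helper names `point_in_polygon` and `is_plausible` are hypothetical, and the `CENTRAL_PARK` polygon is a rough four-corner approximation, not an official boundary.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test: True if (lat, lon) lies inside the polygon.

    `polygon` is a list of (latitude, longitude) vertices in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            lon_cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < lon_cross:
                inside = not inside
    return inside


# Assumption: approximate corners of Central Park, for illustration only.
CENTRAL_PARK = [
    (40.7644, -73.9730),  # near 59th St and 5th Ave
    (40.7681, -73.9819),  # near 59th St and Central Park West
    (40.8005, -73.9582),  # near 110th St and Central Park West
    (40.7968, -73.9493),  # near 110th St and 5th Ave
]


def is_plausible(lat, lon, forbidden_zones):
    """True if the point falls in none of the forbidden polygons."""
    return not any(point_in_polygon(lat, lon, zone) for zone in forbidden_zones)


print(is_plausible(40.7829, -73.9654, [CENTRAL_PARK]))  # point inside the park
print(is_plausible(40.7580, -73.9855, [CENTRAL_PARK]))  # point near Times Square
```

In practice the forbidden zones (water bodies, parks) would come from real geographic boundary data rather than hand-written corners, and a dedicated geometry library would likely be preferable to a hand-rolled test.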