Data anonymization: a challenge to face!

Since it came into force in May 2018, the GDPR has strengthened the privacy of European users by requiring companies to protect the personal data they collect. Protecting citizens' first and last names, as well as their biometric data, is a matter of preserving their private lives.

There are several methods for protecting data. One of them is data anonymization.

What is data anonymization?

Data anonymization refers to an irreversible transformation of data to prevent the identification of a particular individual. Irreversible means that it must be impossible to re-identify the person in question, directly or indirectly.

An alternative is pseudonymization. Like anonymization, it transforms personal data to prevent the identification of an individual. However, it only aims to prevent direct identification: a person can still be re-identified using additional information. Pseudonymization is therefore a weaker form of protection than anonymization, since it does not completely and permanently prevent the identification of a person.
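To make the difference concrete, here is a minimal sketch of pseudonymization using a keyed hash. The key, the record fields and the names are all hypothetical, and this is only one possible way to build pseudonyms. It shows that anyone who holds the additional information (here, the key and a list of candidate names) can rebuild the link to the person.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored apart from the dataset.
SECRET_KEY = b"example-key-kept-apart-from-the-dataset"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (a pseudonym)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Made-up records, for illustration only.
records = [
    {"name": "Alice Martin", "city": "Paris"},
    {"name": "Bob Durand", "city": "Lyon"},
]

# The direct identifier is replaced, the rest of the record is kept.
pseudonymized = [
    {"pseudonym": pseudonymize(r["name"]), "city": r["city"]} for r in records
]

# With the key and a list of candidate names, the mapping can be rebuilt:
# the transformation is not irreversible, so this is not anonymization.
candidates = ["Alice Martin", "Bob Durand", "Chloe Petit"]
lookup = {pseudonymize(name): name for name in candidates}
reidentified = [lookup.get(r["pseudonym"]) for r in pseudonymized]
```

In this sketch, `reidentified` contains the original names again, which is exactly what an anonymized dataset must make impossible.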

Methods of anonymization

There are two main techniques for anonymizing data:

  1. Randomization: altering the data so that it can no longer be attributed to a real person.
  2. Generalization: making the data less precise so that it applies to a group of people rather than to one particular person.
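The two techniques above can be sketched on simple attributes. The function names, the noise range and the bucket width below are illustrative assumptions, not recommended settings, and adding noise to a single field is not sufficient on its own to anonymize a dataset.

```python
import random


def randomize_age(age: int, rng: random.Random, max_shift: int = 3) -> int:
    """Randomization sketch: add bounded random noise to a value."""
    return age + rng.randint(-max_shift, max_shift)


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization sketch: replace an exact age with a range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_postcode(postcode: str, keep: int = 2) -> str:
    """Generalization sketch: keep the first characters, mask the rest."""
    return postcode[:keep] + "*" * (len(postcode) - keep)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
ages = [23, 27, 34, 38]

noisy_ages = [randomize_age(a, rng) for a in ages]
age_ranges = [generalize_age(a) for a in ages]  # ['20-29', '20-29', '30-39', '30-39']
masked = generalize_postcode("75003")           # '75***'
```

After generalization, several people share the same age range and postcode prefix, so a value no longer points to one particular person.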

The challenges of anonymization

However, companies have to manage a growing volume of data that is increasingly difficult to handle. As a result, anonymization appears less effective: the more data is available, the easier it becomes to re-identify a person by cross-checking information.

To ensure effective anonymization, the CNIL(1) considers that a set of anonymized data must meet 3 main criteria:

  1. Individualization: is it still possible to isolate an individual? It must not be possible to single out a particular individual in the dataset.
  2. Correlation: is it possible to link separate datasets concerning the same individual? Cross-checking information must not be possible.
  3. Inference: is it possible to deduce information about an individual? No deduction that leads to the identification of a person should be possible.

A dataset that fails any one of these three criteria is poorly anonymized.
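As a rough illustration, two of these questions can be tested on a small table. The sketch below assumes a group-size check in the spirit of k-anonymity for individualization, and a simple "same value for everyone in the group" check for inference. These are common heuristics chosen for the example, not the CNIL's own test, and the column names and rows are made up.

```python
from collections import Counter, defaultdict


def can_single_out(rows, quasi_identifiers, k=2):
    """True if some combination of quasi-identifier values matches fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values()) < k


def can_infer(rows, quasi_identifiers, sensitive):
    """True if every row of some group shares the same sensitive value."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[q] for q in quasi_identifiers)].add(row[sensitive])
    return any(len(values) == 1 for values in groups.values())


# Made-up, already generalized records.
rows = [
    {"age": "20-29", "postcode": "75***", "diagnosis": "flu"},
    {"age": "20-29", "postcode": "75***", "diagnosis": "asthma"},
    {"age": "30-39", "postcode": "69***", "diagnosis": "flu"},
]

# The third row is alone in its group, so that person can be isolated.
isolated = can_single_out(rows, ["age", "postcode"])          # True
# The same group has a single diagnosis, so it can be deduced.
deducible = can_infer(rows, ["age", "postcode"], "diagnosis")  # True
```

Correlation is harder to check in isolation, because it depends on which other datasets an attacker could join on the same attributes.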

Case study: RATP and the anonymization of video streams

Data anonymization solutions must be built on a case-by-case basis, because each company and industry has different needs and requirements. The following example illustrates this point.

Deepomatic has set up a tailor-made system for RATP to anonymize the video streams from its video surveillance cameras. More specifically, anonymization is achieved through an automatic blurring module that operates in real time. This solution allows rapid image processing and protects the personal data (in this case, biometric data) of users of the Paris transport network.
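The article does not describe how the blurring module works. Purely as an illustration of the general idea, the sketch below pixelates one rectangular region of a grayscale frame stored as a list of rows. The function name, the frame and the detection box are hypothetical, and a real system would also need a detector to find the regions to hide and would have to run on every frame.

```python
def pixelate_region(image, top, left, height, width, block=2):
    """Return a copy of a grayscale image with one rectangle pixelated.

    image is a list of rows of integer pixel values (0-255).
    The rectangle is assumed to lie entirely inside the image.
    """
    out = [row[:] for row in image]
    for by in range(top, top + height, block):
        for bx in range(left, left + width, block):
            ys = range(by, min(by + block, top + height))
            xs = range(bx, min(bx + block, left + width))
            values = [image[y][x] for y in ys for x in xs]
            mean = sum(values) // len(values)
            # Every pixel of the block takes the block's average value.
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


# A tiny made-up 4x4 frame.
frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

# Hypothetical detection box covering the top-left 2x2 pixels.
blurred = pixelate_region(frame, top=0, left=0, height=2, width=2)
# blurred[0] == [35, 35, 30, 40] and blurred[1] == [35, 35, 70, 80]
```

Because the original pixel values inside the box are replaced by an average, the detail needed to recognize a face is discarded rather than merely hidden.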

If you also have a project to anonymize images or video streams, do not hesitate to contact us.

(1) G29 (Article 29 Working Party), opinion on anonymization techniques.
