Data drift is a phenomenon that causes Machine Learning applications to become obsolete during the production phase. In this article, we discuss how you can regain control over your projects despite this problem.
We saw in a previous post that data drift refers to changes in the environment that confuse an AI and degrade its performance. The concept illustrates the limits of current artificial intelligence applications, which still have a limited capacity for abstraction (which does not stop them from being very useful for companies).
In addition, Machine Learning projects have to deal with another problem: concept drift. Concept drift is close to data drift, except that it is no longer the environment that changes but the target the AI is asked to predict. Let’s return to our example of a camera that must recognise the characteristics of vehicles in order to charge them the right price on the highway. Initially, it had to determine the number of wheels touching the ground and the presence or absence of a trailer. Now suppose the rules change as a result of European harmonisation: the AI must also take into account the number of occupants (in compliance with the GDPR) in order to charge lower prices to carpoolers.
The AI must therefore adapt continuously: whether it faces data drift (a continuous change in the environment) or concept drift (a change of target), it must be able to evolve easily. Unfortunately, most of the time it has been set up by a team of technology specialists (developers, data scientists) who are not readily available to update the project, and when they are, maintenance costs are very high.
But don’t give up, and keep investing in AI, because there is a way to solve the problem. Since very few organisations overcome it, you will become one of the few real beneficiaries of this technology.
To do this, you must be able to anticipate changes, continuously measure performance drops and correct your AI to implement an effective continuous improvement loop:
Anticipate: next-generation image recognition works in a way that is analogous to the human brain, which makes it very intuitive. AI projects should therefore be entrusted to those in charge of field operations. They will be able to anticipate changes in conditions and prepare alternative approaches.
Measure: a decrease in AI performance, just like an increase, must be measured continuously, not just at the beginning of the project to give the green light for the system to go live. This is how you learn that data drift is occurring and how severe it is.
Correct: this is the most important point. Continuous improvement means putting into production not a fixed application but a complete, self-learning system capable of adapting to changes in its environment. The AI is trained through its interaction with humans, who correct its few errors on a daily basis.
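As an illustration of the measurement step, here is a minimal sketch of a rolling accuracy monitor. It assumes that operators verify a sample of predictions and that the accuracy measured at go-live is known. The class name `RollingAccuracyMonitor`, its parameters and the default values are hypothetical choices for this example, not something described in the article.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent operator-verified predictions.

    Hypothetical helper: names and default thresholds are illustrative.
    """

    def __init__(self, baseline: float, window: int = 100, tolerance: float = 0.05):
        self.baseline = baseline    # accuracy measured at go-live (assumed known)
        self.tolerance = tolerance  # accepted drop before drift is suspected
        self.outcomes = deque(maxlen=window)  # True when a prediction was correct

    def record(self, predicted: str, actual: str) -> None:
        """Store whether one verified prediction matched the true label."""
        self.outcomes.append(predicted == actual)

    def accuracy(self) -> float:
        """Accuracy over the current window (baseline if nothing is recorded yet)."""
        if not self.outcomes:
            return self.baseline
        return sum(self.outcomes) / len(self.outcomes)

    def drift_suspected(self) -> bool:
        """True once the window is full and accuracy fell below the tolerated level."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.baseline - self.tolerance
```

In use, `record` would be called each time an operator confirms or corrects a prediction, and `drift_suspected` would be checked on a schedule. A fixed window keeps the measure sensitive to recent changes rather than averaging them away over the whole life of the project.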
The key is to set up a production system that is maintained not by Deep Learning or IT project experts, but by experts in the field. They operate the AI on a daily basis, so they are best placed to detect drops in performance and to notify the AI of its shortcomings, enabling it to adapt to changes and improve throughout the life of the project.
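One possible shape for such a correction loop is sketched below. It assumes that field operators submit corrected labels and that a retraining routine exists elsewhere. The `CorrectionLoop` class, the `retrain` hook and the batch size are hypothetical placeholders for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (item identifier, label corrected by the operator)


@dataclass
class CorrectionLoop:
    """Collects operator corrections and hands them to a retraining hook.

    Hypothetical sketch: `retrain` stands in for whatever training job
    the real system exposes.
    """

    retrain: Callable[[List[Example]], None]
    batch_size: int = 50
    pending: List[Example] = field(default_factory=list)

    def report_error(self, item_id: str, corrected_label: str) -> bool:
        """Store one correction; return True if it triggered retraining."""
        self.pending.append((item_id, corrected_label))
        if len(self.pending) < self.batch_size:
            return False
        # Hand over the accumulated batch and start a fresh one.
        batch, self.pending = self.pending, []
        self.retrain(batch)
        return True
```

Batching corrections, rather than retraining on every single one, is a design choice made here for simplicity. A real system might instead retrain on a schedule or when a performance alert fires.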
An article originally posted on L’Usine Nouvelle