What is object recognition and how you can use it


Object recognition. A new concept?

Spoiler alert!

Despite the two different labels, there is no real difference between object recognition and image recognition. Both refer to technologies that recognize targeted subjects using techniques such as deep learning. They are closely related to computer vision, which we define as the art and science of making computers understand images.

In this article, we’ll review some basic concepts in object recognition: classification, tagging, detection, and segmentation.

Read our previous blog posts if you want to know more about image recognition and the technology behind it, i.e. deep learning.

What is object recognition?

Object recognition consists of recognizing, identifying, and locating objects within a picture with a given degree of confidence.

In this process, the four main tasks are:

  1. Classification.
  2. Tagging.
  3. Detection.
  4. Segmentation.

Classification and tagging

An important task in object recognition is to identify what is in the image and with what level of confidence. This confidence is shown as a percentage probability in the picture below.

Classification and tagging in object recognition.
Classification (left) and tagging (right).

The mechanism of this task is (relatively) straightforward. It starts with the definition of the ontology, i.e. the set of object classes to recognize. Then, both classification and tagging identify what is in the image and the associated level of confidence.

While classification returns only one class per image, tagging can return several. In other words, in classification the algorithm will only report that there is a dog, ignoring all other classes. In tagging, on the other hand, it will try to return all the classes that best match the image.
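To make the difference concrete, here is a minimal sketch in Python. It assumes a model has already produced one confidence score per class; the scores, the 0.5 threshold, and the function names `classify` and `tag` are all made up for illustration and do not come from any particular library.

```python
from typing import Dict, List, Tuple


def classify(scores: Dict[str, float]) -> Tuple[str, float]:
    """Classification: keep only the single most confident class."""
    label = max(scores, key=scores.get)
    return label, scores[label]


def tag(scores: Dict[str, float], threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Tagging: keep every class at or above the threshold, best first."""
    kept = [(label, conf) for label, conf in scores.items() if conf >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical per-class confidences for one image (assumed values).
scores = {"dog": 0.97, "grass": 0.81, "ball": 0.64, "cat": 0.03}

print(classify(scores))  # the single best class with its confidence
print(tag(scores))       # every class that passes the threshold
```

With these example scores, classification keeps only the dog, while tagging also keeps the grass and the ball. Raising the threshold makes tagging stricter; where to set it is a design choice that depends on how costly a wrong tag is in your use case.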

Detection and segmentation

Once we have identified what is in the image, we want to locate the objects. There are two ways to do so: detection and segmentation.

Detection outputs a rectangle, also called a bounding box, around each object. It is a very robust technology, although the boxes are prone to minor errors and imprecision. Segmentation, by contrast, identifies which object each pixel in the image belongs to, resulting in a very precise map. However, the accuracy of segmentation depends on extensive and often time-consuming training of the neural network.

Detection and segmentation in object recognition.
Detection (left) and segmentation (right).
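The relationship between the two outputs can be illustrated with a small Python sketch. It assumes a segmentation result is available as a grid of 0s and 1s (1 meaning "this pixel belongs to the object") and derives the tightest bounding box around it. The tiny mask and the helper names `mask_to_box` and `box_area` are hypothetical and only serve the example.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # 1 = object pixel, 0 = background
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tightest box around the object pixels, or None if there are none."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def box_area(box: Box) -> int:
    """Number of pixels covered by an inclusive bounding box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


# A made-up 4x5 segmentation mask of a cross-shaped object.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

box = mask_to_box(mask)
object_pixels = sum(sum(row) for row in mask)

print(box)                             # corners of the bounding box
print(object_pixels, box_area(box))    # object pixels vs. pixels inside the box
```

Here the box covers more pixels than the object actually occupies, which is the imprecision of detection in miniature: the rectangle includes some background, whereas the mask follows the object's outline.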

When the model is accurate enough, object recognition can deliver very impressive results in use cases like cancer detection. If you want to know more, read our blog post on image recognition and cancer detection.


In this article, we have seen that image and object recognition are the same concept. We then looked at the four main building blocks of the technology.

Although it may sound rather theoretical and abstract, object recognition has a lot of interesting use cases in business. For example, through object recognition, we developed an automated checkout system for a major player in the foodservice industry.

