According to research firms such as Gartner, Forrester and MarketsAndMarkets, the global computer vision market is set to reach USD 20 billion by 2025. Two trends explain this rapid growth. On a technological level, deep learning changed the game and the associated computing costs have decreased rapidly. On a business level, companies are doubling down on their AI investments after spending a decade on digital transformation.

The business world is still in the early stages of adoption, but there are already established solution providers, split between two approaches: vertical and horizontal.

Vertical players usually capture more of the value proposition as they go deeper into the specific use case they’re tackling. For instance, a number of companies already focus on verifying identification documents such as driver's licenses and passports. What they gain in depth, they lack in breadth, as each new use case requires finding, evaluating, buying, and integrating a new software provider. Vertical players also tend to generalize the use case they focus on in order to sell it to the largest possible audience, which makes their product hard to adapt if it’s not exactly what you need.

The horizontal approach is to develop a general platform that can tackle most computer vision use cases, and only then integrate it with the existing business workflow. Its strength is that it provides one centralized hub for all computer vision applications and drastically reduces time to market for new applications.

In this series of articles, we will focus on Enterprise Computer Vision Platform providers. This means:

Provider reviews


Recommended if you are a developer and want to build a quick hackathon app with a pretrained model and a few custom modifications. However, the platform does not allow you to build more complex projects, so the result will be less tailored to your needs.

Click here to read the full review.


Matroid has a strong presence in the San Francisco area, as it organizes the ScaledML conference. However, during our tests, it was very difficult to carry out a project from start to finish. Data entry is complicated, and the platform's detection capabilities are still in the early stages of development.

Click here to read the full review.

Google AutoML Vision

Google AutoML Vision is a bit different from other platforms as it focuses primarily on automatic model training, which is an important building block of the whole application lifecycle but not sufficient by itself to deploy applications in production.

Click here to read the full review.

Microsoft Custom Vision

Microsoft Custom Vision provides fairly good annotation and automatic training capabilities but is a bit more fragile when it comes to edge deployment. For now, however, the platform's performance is not high enough to meet the requirements of an enterprise-level application.

Click here to read the full review.

AWS Sagemaker

Amazon Sagemaker targets developers, providing finer control over some basic elements but requiring a fair amount of coding to get it working. This makes it the most complex platform in the survey, though it can deliver good results if properly configured.

Click here to read the full review.


Deepomatic is the go-to platform if you want to manage the whole lifecycle of your enterprise applications from a centralized place, with built-in industry best practices and state-of-the-art models. It is the most feature-rich platform while also requiring the least coding and development skill.


If you want to go beyond a simple proof of concept and you need high performance, the only platforms that allow it today are Deepomatic, AWS Sagemaker and Google AutoML.

If you would like to read concrete use cases, click here.

If you want to know the 6 steps to create your own image recognition system, click here.