Different Types of Computer Vision Problems


Computer vision was introduced as a way to derive repeated predictive insights from visual data.

But what kinds of problems can you solve with computer vision?

Since the 1960s, when neurophysiologists tried to understand cat vision for the first time, scientists have been trying to develop ways for machines to derive insights from visual input.

Although AI and computer vision were only academic fields of study in the 1960s, the development of the first robust optical character recognition (OCR) system in 1974 redirected the focus of AI.

Instead of exclusively focusing on academic studies, computer vision researchers began to address the human vision problem in daily life.

By the 2000s, computer vision was attempting to solve much more complex challenges, such as image classification, image segmentation, object detection, image recognition, pattern recognition, and facial recognition.

Let’s look at some of these problem types:

The most well-known computer vision task, image classification, groups images into different categories.

It’s mostly used for images with a single object.

The key here is assigning a label to an entire image.
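
As a minimal sketch of what "assigning a label to an entire image" can look like, the snippet below turns a list of raw class scores into probabilities with softmax and keeps the single most likely label. The class names and scores are made-up values for illustration; a real classifier would compute the scores from the image pixels.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, class_names):
    """Return one (label, probability) pair for the whole image."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical scores a model might output for one image.
class_names = ["rose", "tulip", "daisy"]
label, prob = classify([2.0, 0.5, 0.1], class_names)
print(label, round(prob, 3))
```

Whatever the image contains, the output is one label for the image as a whole, which is why this task suits images with a single dominant object.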

Unlike image classification, semantic segmentation does not assign a label to an entire image.

Instead, it partitions an image into multiple regions and assigns every pixel in the image to a category.

Then it labels each pixel in the image, including the background, with a color based on its class label.
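
The per-pixel labeling described here can be sketched as follows, assuming a model has already produced a list of class scores for every pixel. The class list, colors, and scores are hypothetical values chosen only to make the example small.

```python
# Hypothetical class list and display colors (RGB); not from any real model.
CLASSES = ["background", "rose", "leaf"]
COLORS = {"background": (0, 0, 0), "rose": (255, 0, 0), "leaf": (0, 255, 0)}

def label_pixels(score_map):
    """score_map[row][col] holds one score per class for that pixel."""
    return [
        [CLASSES[max(range(len(px)), key=px.__getitem__)] for px in row]
        for row in score_map
    ]

def colorize(label_map):
    """Replace each class label with its display color."""
    return [[COLORS[name] for name in row] for row in label_map]

# A tiny 2x2 "image" of per-pixel class scores.
scores = [
    [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]],
    [[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]],
]
labels = label_pixels(scores)
colored = colorize(labels)
print(labels)
```

Note that the output has the same height and width as the input: every pixel, background included, receives a label.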

The next type of problem is instance segmentation, which identifies the boundaries of the individual objects in an image and labels the pixels of each object with a different color.

Instance segmentation provides the exact outline of each object within an image.
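
To illustrate the kind of output instance segmentation produces, the sketch below gives every separate blob in a binary mask its own integer ID. This uses connected-component labeling as a stand-in: real instance segmentation models predict the per-object masks directly, and the function name and sample mask are assumptions made for this example.

```python
from collections import deque

def label_instances(mask):
    """Give each 4-connected blob of 1s in a binary mask its own integer ID."""
    rows, cols = len(mask), len(mask[0])
    ids = [[0] * cols for _ in range(rows)]  # 0 means background
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and ids[r][c] == 0:
                next_id += 1
                ids[r][c] = next_id
                queue = deque([(r, c)])
                # Breadth-first flood fill over the neighbors of this blob.
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] == 1 and ids[ny][nx] == 0:
                            ids[ny][nx] = next_id
                            queue.append((ny, nx))
    return ids, next_id

# Two separate objects of the same class in one small mask.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
instance_ids, count = label_instances(mask)
print(count, instance_ids)
```

The two blobs end up with different IDs even though they belong to the same class, which is the distinction between an instance-level and a purely class-level labeling.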

Image classification with localization is a more complex version of an image classification problem.

It assigns a class label to an image and also creates a bounding box around a single object in the image.
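
A rough sketch of the combined output, one class label plus one bounding box, is shown below. Here the box is derived from a binary object mask purely for illustration; a trained localization model would typically predict the box coordinates directly. The mask, the label, and the (x, y, width, height) convention used here are assumptions of the example.

```python
def bounding_box(mask):
    """Return (x, y, width, height) of the tight box around all 1s, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # no object pixels in the mask
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    x, y = cols[0], rows[0]
    return (x, y, cols[-1] - x + 1, rows[-1] - y + 1)

# Hypothetical mask of a single object inside a 4x4 image.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
result = {"label": "rose", "box": bounding_box(mask)}
print(result)
```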

Similarly, object recognition identifies the objects in an image by outputting the class labels and class probabilities of the objects.

For example, a class label could be “rose” and the associated class probability could be 0.1.

The key here is that the object recognition model recognizes whether an image has a rose in it.

However, it cannot detect where the object is located.

Object detection, as the name suggests, detects a specific object in an image.

It’s similar to image classification with localization, and especially useful when multiple types of objects are in a single image.
Bounding boxes are used for detection and localization of objects.

Unlike object recognition, object detection both tells you which objects are present in the image and outputs bounding boxes (x, y, width, height) that indicate where each object is located.
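
To make the (x, y, width, height) format concrete, here is a small sketch that represents detections as label, score, and box, and measures how much two boxes overlap using intersection over union (IoU), a common way to compare bounding boxes. The detections are made-up values, and treating (x, y) as the top-left corner is an assumption of this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (0 if the boxes are disjoint).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

# Hypothetical detector output: several objects, each with its own box.
detections = [
    {"label": "rose", "score": 0.92, "box": (10, 10, 40, 40)},
    {"label": "vase", "score": 0.81, "box": (30, 30, 40, 40)},
]
overlap = iou(detections[0]["box"], detections[1]["box"])
print(round(overlap, 3))
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.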

Pattern recognition detects and identifies repeated shapes, colors, and other visual indicators in visual inputs.

Popular pattern recognition applications for computer vision include facial recognition, movement recognition, OCR, and medical image recognition.

You’ll explore OCR services, which convert digital images of typed, handwritten, and printed text into machine-readable forms that can be used for data processing, editing, or searching.

Facial recognition is an advanced type of object detection where the main object is the human face.

Facial recognition can detect multiple faces within an image, along with key facial attributes, such as emotional state or the presence of headwear.

Some facial recognition models can also confirm identity.

Therefore, they can be used to control access to sensitive areas.

Edge detection is a technique used to extract the edges from an image by identifying the boundaries of objects within the image.
It is often an initial step for object recognition.

The main principle of edge detection is detecting changes in brightness and intensity levels.
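
One common way to detect such intensity changes is the Sobel operator, which estimates the horizontal and vertical brightness gradient at each pixel. The sketch below is a plain-Python version for a small grayscale image stored as nested lists; the sample image is made up, and border pixels are simply left at zero to keep the example short.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (KX) and vertical (KY) gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(image):
    """Gradient magnitude for the interior pixels of a grayscale image."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += KX[i][j] * pixel
                    gy += KY[i][j] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out

# Dark region on the left, bright region on the right: one vertical edge.
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
edges = sobel_magnitude(image)
for row in edges:
    print(row)
```

The response is zero inside the flat dark and bright regions and large only at the columns where brightness jumps, which is where the edge is.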

Feature matching is a type of pattern detection that compares the features of images that might differ in orientation, perspective, lighting, size, and color.
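
As a minimal sketch of the matching step, the code below compares binary feature descriptors from two images by Hamming distance and keeps a match only when the best candidate is clearly better than the second best (a ratio test). The descriptors are made-up 8-bit integers and the 0.75 ratio is an assumed value; in practice the descriptors would come from a feature extractor designed to tolerate changes in orientation, scale, and lighting.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")

def match_features(desc_a, desc_b, ratio=0.75):
    """Brute-force matching with a ratio test.

    Returns (index_in_a, index_in_b, distance) tuples.
    Assumes desc_b holds at least two descriptors.
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((hamming(da, db), j) for j, db in enumerate(desc_b))
        best, second = ranked[0], ranked[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best[0] < ratio * second[0]:
            matches.append((i, best[1], best[0]))
    return matches

# Hypothetical descriptors from two views of the same scene.
desc_a = [0b10110010, 0b01001101]
desc_b = [0b01001111, 0b10110011, 0b11111111]
matches = match_features(desc_a, desc_b)
print(matches)
```

The ratio test discards ambiguous features that look almost equally similar to several candidates, which tends to reduce false matches.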

Automatic object tracking, 3D object reconstruction, robot navigation, image retrieval, and indexing are just some of the applications of feature detection and matching.

Now that you’ve reviewed a brief history of computer vision and seen some of the different problem types, you’ll explore some examples of where organizations have applied this powerful technology.