Introduction to Image Annotation Blog • Feb 10, 2020 • 4 min read

Image annotation is used to train machine learning AI on how to perceive and identify objects. Using the information that is fed to the AI, the algorithm can accurately and independently label images with the correct annotations by recognizing and applying similar patterns from its previous learnings. In this post, we will discuss how computer vision applications can upgrade our lives moving into the future by taking a look at different annotation methods for computer vision models.

Types Of Image Annotation

Let us get familiar with different annotation methods before jumping into the use-cases for these annotation methods. Let’s examine the most common image annotation techniques.

Bounding Boxes

In computer vision bounding boxes are one of the most commonly used types of image annotation, this is due to this type of annotation being versatile and simple. Bounding boxes encompass objects which help the computer vision network to locate objects of interest. They are easy to create, using simple X and Y coordinates for the upper left and bottom right corners of the box.

The bounding box can be applied to almost any conceivable object, and they can substantially improve the accuracy of an object detection system. The only drawback to bounding boxes is the noise incorporated with using this type of annotation method.

Polygonal Segmentation

A more accurate type of image annotation is the polygonal segmentation. The theory behind it is just a more accurate representation than bounding boxes. Like bounding boxes, polygonal segmentation aids a computer vision model to look for an object, but as complex polygons are used instead of simply using a box, the object’s location and boundaries can be determined with much more accurate results.

Even though polygonal segmentation is more time consuming, the precision allows for a great advantage of using polygonal segmentation over bounding boxes. As it eliminates the noise around the object that can potentially confuse the classifier.

Line Annotation

Line annotation encompasses the creation of lines and splines used primarily to delineate boundaries between one part of an image and another. Line annotation is used when a region that needs to be annotated can be considered a boundary, but for bounding boxes or other types of annotations, it is too small or thin.

Splines and lines are easy to create annotations for and commonly used for situations like autonomous vehicles to recognizing lanes or for training robots to identify the difference between parts of a conveyor belt.

Semantic Segmentation

The fourth type of image annotation for computer vision systems, Semantic segmentation is a form of image annotation that involves separating images into different areas, labeling every pixel in an image.

Different regions of an image carry different semantic meanings and are considered separate from other regions. For example, one part of an image could be “rail track”, while another could be “ground”. The idea is that regions are defined based on semantic information and that the image classifier gives a label to every pixel that comprises that region.

Categorization

The fifth type of image annotation for computer vision systems is categorization, it involves simple yes and no labels for images for computer vision models to identify. This is a basic yet crucial type of labeling for computer vision models as this sets a base for images as a whole to determine objects of interest.

3D Cuboid annotation

3D cuboid annotation is a powerful type of image annotation, similar to bounding boxes as they distinguish where a computer vision model should look for objects. Additionally, 3D cuboid annotations have depth in addition to height and width.

The method of creating 3D cuboid annotations is to initially create anchor points which are typically placed at the edges of the item, and the space within these anchors is filled in with a line. Creating a 3D representation of the object which means the computer vision system can learn to distinguish features like volume and position in a 3D space.

Almost anything you want to do with computer vision can be accomplished, it's just a matter of selecting the right tools for the job. Here at Inresource , we strive to create the best training dataset for your innovative AI solutions. We work with various types of image annotation tools, the best thing to do is to see which annotation technique works the best for your application. Get upto 50 hours of free demo and let’s experiment by implementing them today!