Automatic Labelling: A Big Leap in Data Preparation for AI/ML Models
Tagging, or labeling, data is an essential step in training computer vision models. As training demands more and more data, labeling has to be hassle-free and fast. This is where automatic labeling comes into the picture.
Challenge
When we started building a computer vision model to identify objects in images and video, we did not realize that labeling the objects we needed to identify could take us over a year. We needed to label thousands of objects in millions of images to train our model. Necessity is the mother of invention, and we were forced to come up with options to automate our instrument labeling task.
We could not use solutions from companies such as Hive because they label objects with rectangular bounding boxes, and we needed to label the exact boundaries of instruments.
Solution
We analyzed multiple tools that could help us reduce the time and cost of labeling. We evaluated the following tools:
Tool | Pros | Cons |
Amazon SageMaker Ground Truth | Accurately labeled data, handles big data, competitive pricing ($8/100 objects) | Needs machine learning experience to carry out labeling jobs |
Lionbridge AI | Highly accurate labeled data, better project management features | Higher pricing |
V7 Darwin | Speeds up labeling dramatically | Bugs are not managed |
Label | Open-source tool, user friendly | No project management features |
Based on our selection criterion, 3-4 seconds for labeling, we selected Label.
We also needed to select a model that could help automate the labeling of objects. Among the options we considered, we selected Detectron2.
This approach allowed us to finish the labeling task in a week.
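The article does not describe how model predictions were turned into labels. One common pattern for model-assisted labeling is to accept confident predictions as draft labels and route uncertain ones to a human reviewer. The sketch below illustrates that pattern in plain Python. The `Prediction` type, the `triage` function, the threshold values, and the sample data are all hypothetical stand-ins, not Detectron2 output or the authors' pipeline.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Prediction:
    """One predicted object (hypothetical structure, not a Detectron2 type)."""
    label: str            # predicted class name
    score: float          # model confidence in [0, 1]
    polygon: List[Point]  # boundary of the object as (x, y) points


def triage(predictions: List[Prediction],
           accept_at: float = 0.9,
           discard_below: float = 0.3):
    """Split predictions into draft labels and items needing manual review.

    The thresholds are illustrative assumptions; in practice they would be
    tuned against a manually labeled sample.
    """
    drafts, review = [], []
    for p in predictions:
        if p.score >= accept_at:
            drafts.append(p)      # confident enough to use as a draft label
        elif p.score >= discard_below:
            review.append(p)      # uncertain: send to a human annotator
        # anything below discard_below is dropped as likely noise
    return drafts, review


# Made-up predictions for a single image.
predictions = [
    Prediction("instrument_a", 0.97, [(0, 0), (4, 0), (4, 3)]),
    Prediction("instrument_b", 0.55, [(5, 5), (9, 5), (9, 8)]),
    Prediction("instrument_c", 0.10, [(1, 1), (2, 1), (2, 2)]),
]
drafts, review = triage(predictions)
print([p.label for p in drafts], [p.label for p in review])
```

Keeping a review queue like this is one way to combine automation with the quality control and manual intervention that labeling still requires.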
Results and Learnings
- With the advent of Big Data, the future of data labeling is active learning.
- Data labeling requires quality control, manual intervention, and collaboration to produce high-quality training data.
- The cost of data annotation was reduced by a factor of 5.
- Automation created too many data points, so another algorithm was created to reduce their number.
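The article does not say which algorithm was used to reduce the number of data points. One standard choice for thinning a dense object boundary is the Ramer-Douglas-Peucker algorithm, which drops points that lie within a tolerance of the line between their neighbors. The following is a minimal sketch under that assumption. The function names, the tolerance value, and the sample boundary are illustrative, and the code treats the boundary as an open polyline.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # a and b coincide, so measure the distance to that single point
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points: List[Point], tolerance: float) -> List[Point]:
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord first-last
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Every interior point is close to the chord: keep only the ends
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


# Made-up dense boundary: a nearly straight edge followed by a vertical edge.
boundary = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 0.02),
            (4.0, 0.0), (4.0, 1.0), (4.0, 2.0), (4.0, 3.0)]
simplified = simplify(boundary, tolerance=0.1)
print(len(boundary), "->", len(simplified))  # 8 -> 3
```

A larger tolerance removes more points at the cost of a coarser outline, so the value would need to be chosen against how exact the labeled boundaries must be.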
Our Automatic Labelling solution is now available to all our customers at no charge for the models we are developing. If you have any questions about how to auto-label images, please contact us at [email protected].
Visit our website at www.nextgeninvent.com