How to build a labeled dataset for training your image recognition algorithm?

Image recognition algorithms are becoming common as artificial intelligence finds its way into day-to-day life, and the demand for machine learning development keeps growing with it. So the question is: how do you train an image recognition algorithm to build your own image classifier?

The answer is simple: you need a labeled dataset to train your algorithm. Don’t know how to get one? Not to worry. This article walks through, in detail, how you can build a labeled dataset. So let us jump right into the process.

Download The Images

The first thing you need to do is scrape the data itself. With the JavaScript console built into your web browser, it is easy to collect the URLs of many images from the Google Images search results in a single step and then download the images from that list. Here is how:

Launch Google Chrome and disable any ad blocker in the browser settings. Go to Google Images and search for the images you want, either with a specific keyword or with the reverse image search function. Scroll to the bottom of the results and wait until all the images have fully loaded. Now press CTRL + SHIFT + J (Cmd + Option + J on a Mac) to open the JavaScript console and paste in the following commands.

urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el => JSON.parse(el.textContent).ou);

window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));

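This saves the URLs of all the images to a file in your default download folder. The selectors .rg_di and .rg_meta depend on the markup of the Google Images results page, so they may need adjusting if the page has changed. Next, run the following Python commands, which use the fastai library (the call follows the version 1 API), to download the images automatically. Make sure the path to the URL file and the destination folder are both correct, or the download will fail.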

from fastai.vision import *

download_images('/path/to/download/file', destination_folder)
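If fastai is not available, the same download step can be roughly sketched with the Python standard library alone. This is only an illustration, not how fastai does it: the function names read_urls, target_name, and download_all are made up for this example, and the fallback to a .jpg extension is an assumption.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

KNOWN_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def read_urls(url_file):
    """Return the unique http(s) URLs in the file, one per line, in order."""
    seen, urls = set(), []
    for line in Path(url_file).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        if url.startswith(("http://", "https://")) and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def target_name(index, url):
    """Build a numbered file name, keeping the URL's extension if it is known."""
    suffix = Path(urlparse(url).path).suffix.lower()
    if suffix not in KNOWN_SUFFIXES:
        suffix = ".jpg"  # assumption: fall back to .jpg when the type is unclear
    return f"{index:05d}{suffix}"


def download_all(url_file, dest_dir, timeout=4):
    """Download every URL into dest_dir and return the paths that were saved."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(read_urls(url_file)):
        out = dest / target_name(i, url)
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                out.write_bytes(resp.read())
            saved.append(out)
        except Exception:
            continue  # dead link, timeout, or bad response: skip this URL
    return saved
```

Calling download_all with the path to the URL file and a destination folder would fill that folder with numbered image files. Failed links are skipped silently here, so it is worth comparing the number of saved files with the number of URLs.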

And that is all. You have your set of data ready to be labeled.

Label The Dataset

The images that you wish to use to train your algorithm need to be labeled using an image annotation tool. You need to understand that there are different types of image annotation, such as 3D cuboids, landmark annotation, semantic segmentation, polygon annotation, and bounding boxes. The type that you go with depends on your algorithm and your model type.
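To make the difference between two of these annotation types concrete, here is a small sketch of how a bounding box and a polygon might be represented in Python. The class and field names are hypothetical and do not come from any particular annotation tool. The pixel coordinate convention, with the origin at the top-left corner, is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


@dataclass
class BoundingBox:
    """Axis-aligned rectangle around an object."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


@dataclass
class Polygon:
    """Outline of an object given as an ordered list of vertices."""
    label: str
    points: List[Point]

    def to_bounding_box(self) -> BoundingBox:
        """Smallest axis-aligned box that contains every vertex."""
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return BoundingBox(self.label, min(xs), min(ys), max(xs), max(ys))
```

A polygon carries more detail than a box, which is why it can always be reduced to a bounding box, as to_bounding_box does, while the reverse is not possible. That trade-off between labeling effort and detail is part of what drives the choice of annotation type.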

Once you have that in mind, you can use any image annotation tool, such as Pigeon, to label your data, either sorting the images into separate classes or giving each image multiple labels for more precise learning. Simply add your downloaded images to the tool and start annotating, and you will have your own labeled dataset in no time.
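Whatever tool you use, the labels need to be stored in a form that your training code can read later. The sketch below assumes the tool gives you a list of (image path, label) pairs and writes them to a CSV file. The function names and the "image" and "label" column names are choices made for this example, not a format required by any tool.

```python
import csv
from collections import Counter


def save_labels(annotations, csv_path):
    """Write (image_path, label) pairs to a CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "label"])
        writer.writerows(annotations)


def load_labels(csv_path):
    """Read the CSV back into a list of (image_path, label) pairs."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [(row["image"], row["label"]) for row in csv.DictReader(f)]


def class_counts(annotations):
    """Count how many images carry each label, to spot unbalanced classes."""
    return Counter(label for _, label in annotations)
```

Checking class_counts before training is a quick way to see whether one class has far fewer images than the others, in which case it may be worth collecting and labeling more images for it.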

Final Words

Building a labeled dataset requires determination and hard work. Even though most tasks can be carried out automatically today, when it comes to labeling your images, the work has to be done manually. But once it is done, you have an automated, trained model that is capable of recognizing images for you. Cool, isn’t it?