7 steps of image pre-processing to improve OCR using Python

Posted on May 11, 2022 - 6:13 pm by Ashwani Mishra

[clear by=”” id=”” class=”seven_step_heading_dima_clear_item1″][text]

[/text][custom_heading level=”h2″ id=”” class=”” style=””]

7 steps of image pre-processing to improve OCR using Python

[image lightbox=”” width=”” is_gallert_item=”” src=”14865″ alt=”” href=”” title=”” popup_content=”” id=”” class=”” style=””]

[custom_heading level=”h2″ id=”” class=”” style=”margin-bottom: 0px;”]

What is OCR?

[/custom_heading][text]Optical Character Recognition (OCR) is the process of recognizing text inside images and converting it into an electronic form. These images could contain handwritten text, printed text such as documents, receipts, and name cards, or even a natural scene photo. OCR combines two techniques to extract text from an image. First, text detection determines where the text resides in the image. Second, text recognition reads and extracts that text. OCR is an active research area, and with the introduction of deep learning, the performance of OCR models has improved considerably.[/text]

[clear by=”15px” id=”” class=””]

[custom_heading level=”h2″ id=”” class=”” style=”margin-bottom: 0px;”]

What are the application areas for OCR?

[/custom_heading][text]OCR has many application areas in the real world, and one particularly important benefit is that it minimizes human effort across various industries in our everyday life. Some of the popular application areas for OCR are the digitization of paperwork, book scanning, reading signboards to translate them into other languages, reading signboards for self-driving cars, extracting registration numbers from vehicle number plates, and handwriting recognition.[/text][clear by=”15px” id=”” class=””]

[custom_heading level=”h2″ id=”” class=”” style=”margin-bottom: 0px;”]

Why is image pre-processing important for any OCR model’s performance?

[/custom_heading]

[text]The accuracy of an OCR model depends heavily on the quality of the image it receives. Skew, noise, low resolution, and poor contrast all make text harder to detect and recognize, so cleaning up the image first reduces recognition errors.

We have consolidated seven useful steps for pre-processing the image before providing it to OCR for text extraction. To explain these pre-processing steps, we are going to use the OpenCV and Pillow libraries.[/text]

[image lightbox=”” width=”” is_gallert_item=”” src=”15240″ alt=”” href=”” title=”” popup_content=”” id=”” class=”” style=””]

[clear by=”35px” id=”” class=””][custom_heading level=”h2″ id=”” class=”” style=”margin-bottom: 0px;”]

Installing required software for OCR pre-processing

[/custom_heading]

[text]Install the OpenCV and Pillow libraries:

Install the main module of OpenCV using the pip command:
pip install opencv-python
Or install the full package of OpenCV, which also includes the contrib modules:
pip install opencv-contrib-python
Install the Pillow library using the pip command:
pip install pillow
Import OpenCV in the code as given below:
import cv2

[/text]
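[text]Before moving on, it can help to confirm that the packages are importable. The sketch below uses only the standard library; the helper name missing_modules is hypothetical, and the module names assume that opencv-python provides cv2 and pillow provides PIL.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# cv2 is provided by opencv-python, PIL by pillow; numpy is used in the snippets below.
print(missing_modules(["cv2", "PIL", "numpy"]))
```

An empty list means all three modules were found; any name that is printed still needs to be installed.[/text]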

[clear by=”25px” id=”” class=””]

[clear by=”15px” id=”” class=””][custom_heading level=”h2″ id=”” class=”” style=”margin-bottom: 0px;”]

Seven steps to perform image pre-processing for OCR

[/custom_heading][clear by=”15px” id=”” class=””][image lightbox=”” width=”” is_gallert_item=”” src=”15234″ alt=”” href=”” title=”” popup_content=”” id=”” class=”” style=””][clear by=”35px” id=”” class=””]

[clear by=”15px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

1. Normalization

[/custom_heading][text]This process changes the range of pixel intensity values. The purpose of normalization is to bring the image into a range that is normal to the senses, for example by stretching a low-contrast image across the full 0 to 255 range. OpenCV provides the normalize() function for image normalization.[/text][text]

import numpy as np
# img is an image loaded earlier, for example with cv2.imread()
norm_img = np.zeros((img.shape[0], img.shape[1]))
img = cv2.normalize(img, norm_img, 0, 255, cv2.NORM_MINMAX)

[/text]
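[text]To see what min-max normalization does to the pixel values, here is a minimal NumPy sketch of the same linear mapping. The function name minmax_normalize and the tiny sample image are made up for illustration, and the handling of a completely flat image is an assumption of this sketch rather than a statement about OpenCV.

```python
import numpy as np


def minmax_normalize(img, new_min=0, new_max=255):
    """Linearly rescale pixel values so they span [new_min, new_max]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A flat image has no contrast to stretch; map everything to new_min.
        return np.full(img.shape, new_min, dtype=np.uint8)
    scaled = (img - lo) / (hi - lo) * (new_max - new_min) + new_min
    return np.round(scaled).astype(np.uint8)


# A low-contrast 2x3 "image" whose values only span 50..100.
low_contrast = np.array([[50, 60, 70], [80, 90, 100]], dtype=np.uint8)
stretched = minmax_normalize(low_contrast)
print(stretched)
```

The darkest pixel is mapped to 0 and the brightest to 255, with everything in between spread evenly.[/text]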

[clear by=”15px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

2. Skew Correction

[/custom_heading][text]While scanning or photographing a document, the captured image may end up slightly skewed. For better OCR performance, it is good to determine the skew in the image and correct it.[/text][text]

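def deskew(image):
    # Expects a binarized image whose text pixels are non-zero
    # minAreaRect needs int32 or float32 points, so convert the coordinates
    co_ords = np.column_stack(np.where(image > 0)).astype(np.float32)
    angle = cv2.minAreaRect(co_ords)[-1]
    # Older OpenCV versions report the angle in [-90, 0); OpenCV 4.5+ appears to use (0, 90]
    if angle < -45:
        angle = -(90 + angle)
    elif angle > 45:
        angle = 90 - angle
    else:
        angle = -angle
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)
    return rotated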

[/text]
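[text]The trickiest part of skew correction is turning the angle reported by cv2.minAreaRect into the rotation to apply. The sketch below isolates that mapping in plain Python so it can be checked without an image. The function name rotation_for_rect_angle is hypothetical, and the two angle ranges in the comments reflect my understanding of how the convention changed around OpenCV 4.5, so please verify them against your installed version.

```python
def rotation_for_rect_angle(angle):
    """Map a minAreaRect angle (degrees) to the rotation used to straighten the text."""
    if angle < -45:        # older convention, angle in [-90, 0)
        return -(90 + angle)
    if angle > 45:         # newer convention, angle in (0, 90]
        return 90 - angle
    return -angle


# The same small skews expressed in both conventions.
for reported in (-87.0, -3.0, 3.0, 87.0, 90.0):
    print(reported, "->", rotation_for_rect_angle(reported))
```

In every case the resulting rotation stays within 45 degrees, which is what we want for a page that is only slightly tilted.[/text]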

[clear by=”25px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

3. Image Scaling

[/custom_heading][text]To achieve better OCR performance, the image should have at least 300 PPI (pixels per inch). So, if the image is below 300 PPI, we need to resize it and save it at 300 DPI. We can use the Pillow library for this.[/text][text]

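import tempfile
from PIL import Image

def set_image_dpi(file_path):
    im = Image.open(file_path)
    length_x, width_y = im.size
    # factor is capped at 1: images wider than 1024 px are shrunk, smaller ones keep their size
    factor = min(1, float(1024.0 / length_x))
    size = int(factor * length_x), int(factor * width_y)
    # Image.LANCZOS replaces Image.ANTIALIAS, which was removed in Pillow 10
    im_resized = im.resize(size, Image.LANCZOS)
    temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.png')
    temp_filename = temp_file.name
    # Save the result with 300 DPI recorded in the file metadata
    im_resized.save(temp_filename, dpi=(300, 300))
    return temp_filename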

[/text]
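[text]If you know the resolution at which an image was scanned, the pixel size needed for 300 DPI is simple arithmetic. Here is a minimal sketch; the function name size_for_target_dpi and the sample page dimensions are assumptions made for illustration.

```python
def size_for_target_dpi(width_px, height_px, current_dpi, target_dpi=300):
    """Return the pixel size that represents the same physical page at target_dpi."""
    scale = target_dpi / current_dpi
    return round(width_px * scale), round(height_px * scale)


# A page scanned at 150 DPI needs twice as many pixels per side to reach 300 DPI.
print(size_for_target_dpi(1240, 1754, current_dpi=150))
```

The returned tuple can be passed to Pillow's im.resize() as the target size. Keep in mind that upscaling cannot recover detail that was never captured; it only gives the OCR engine larger glyphs to work with.[/text]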

[clear by=”25px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

4. Noise Removal

[/custom_heading][text]This step removes small dots or patches whose intensity stands out from the rest of the image, which smooths the image. OpenCV’s fastNlMeansDenoisingColored() function can do that easily.[/text][text]

def remove_noise(image):
    # Arguments: src, dst, h, hColor, templateWindowSize, searchWindowSize
    return cv2.fastNlMeansDenoisingColored(image, None, 10, 10, 7, 15)

[/text]
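[text]To build some intuition for how an isolated speck gets removed, here is a much simpler technique than non-local means: a 3x3 median filter written in NumPy. This is a swapped-in illustration, not what fastNlMeansDenoisingColored does internally, and the function name median_filter_3x3 is hypothetical.

```python
import numpy as np


def median_filter_3x3(gray):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    padded = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    # Stack the nine shifted views of the image, then take the median across them.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return np.median(windows, axis=0).astype(gray.dtype)


page = np.full((5, 5), 30, dtype=np.uint8)   # uniform dark background
page[2, 2] = 255                             # one isolated high-intensity speck
cleaned = median_filter_3x3(page)
print(cleaned[2, 2])
```

Because the speck is outnumbered by its neighbours in every 3x3 window, the median ignores it and the background value is restored.[/text]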

[clear by=”25px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

5. Thinning and Skeletonization

[/custom_heading][text]This step is performed for handwritten text, as different writers use different stroke widths. It makes the width of the strokes more uniform. This can be done in OpenCV using erosion:[/text][text]

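import cv2
import numpy as np

img = cv2.imread('j.png', 0)  # 0 loads the image as grayscale
kernel = np.ones((5, 5), np.uint8)
erosion = cv2.erode(img, kernel, iterations=1)  # thins bright strokes on a dark background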

[/text]
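[text]Erosion thins a stroke by keeping a pixel only when its whole neighbourhood belongs to the stroke. The NumPy sketch below shows this with a 3x3 neighbourhood on a tiny synthetic stroke. The function name erode_3x3 is hypothetical, and padding the border with the maximum value (so the image edge itself never erodes anything) is a choice made for this sketch.

```python
import numpy as np


def erode_3x3(binary):
    """Keep a pixel only if every pixel in its 3x3 neighbourhood is foreground."""
    padded = np.pad(binary, 1, mode="constant", constant_values=255)
    h, w = binary.shape
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    # The minimum over the window is 255 only when all nine pixels are 255.
    return windows.min(axis=0)


stroke = np.zeros((7, 7), dtype=np.uint8)
stroke[1:6, 2:5] = 255            # a vertical stroke, 3 pixels wide
thinned = erode_3x3(stroke)
print((thinned > 0).sum(axis=1))  # stroke width per row after erosion
```

The 3-pixel-wide stroke shrinks to a single pixel in width. Note that erosion also shortens the stroke at both ends, which is one reason dedicated skeletonization algorithms exist.[/text]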

[clear by=”10px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

6. Gray Scale image

[/custom_heading][text]This process converts an image from other color spaces to shades of gray, ranging from complete black to complete white. OpenCV’s cvtColor() function performs this task very easily.[/text][text]

def get_grayscale(image):
    # OpenCV loads color images in BGR channel order
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

[/text]
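[text]Under the hood, a grayscale conversion is a weighted sum of the color channels. The sketch below uses the common luma weights (0.299 for red, 0.587 for green, 0.114 for blue) in NumPy. The function name bgr_to_gray is hypothetical, and results may differ from OpenCV by a rounding step on some pixels.

```python
import numpy as np


def bgr_to_gray(bgr):
    """Weighted sum of the B, G, R channels using the common luma weights."""
    weights = np.array([0.114, 0.587, 0.299])   # order matches B, G, R
    gray = bgr.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)


# One row of three BGR pixels: white, black, and pure red.
pixels = np.array([[[255, 255, 255], [0, 0, 0], [0, 0, 255]]], dtype=np.uint8)
print(bgr_to_gray(pixels))
```

Green contributes the most to the result because the human eye is most sensitive to it, which is why pure red ends up as a fairly dark gray.[/text]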

[clear by=”25px” id=”” class=””][custom_heading level=”h3″ id=”” class=”” style=”margin-bottom: 0px;”]

7. Thresholding or Binarization

[/custom_heading][text]This step converts an image (typically one already converted to grayscale) into a binary image that contains only two colors, black and white. It is done by fixing a threshold (normally half of the pixel range 0-255, i.e., 127). A pixel whose value is greater than the threshold is converted into a white pixel; otherwise it becomes a black pixel. To determine the threshold value according to the image, Otsu’s binarization and adaptive binarization can be better choices. In OpenCV, this can be done as given.[/text][text]

def thresholding(image):
    # Expects a grayscale image; with THRESH_OTSU the threshold is computed automatically
    return cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

[/text]
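[text]For readers curious about how Otsu's method picks the threshold, here is a compact NumPy sketch. It tries every possible threshold and keeps the one that maximizes the variance between the two resulting classes. The function names otsu_threshold and binarize are hypothetical, and this is an illustration of the idea rather than OpenCV's implementation.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    weight_bg = np.cumsum(prob)               # share of pixels <= t
    weight_fg = 1.0 - weight_bg               # share of pixels > t
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    valid = (weight_bg > 0) & (weight_fg > 0)  # skip thresholds with an empty class
    between = np.zeros(256)
    between[valid] = ((total_mean * weight_bg[valid] - cum_mean[valid]) ** 2
                      / (weight_bg[valid] * weight_fg[valid]))
    return int(np.argmax(between))


def binarize(gray):
    """Pixels above the Otsu threshold become white (255), the rest black (0)."""
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8), t


# Dark "ink" pixels (40) mixed with light "paper" pixels (200).
gray = np.array([[40, 40, 200, 200], [40, 200, 40, 200]], dtype=np.uint8)
binary, t = binarize(gray)
print(t)
print(binary)
```

The chosen threshold falls between the two groups of pixel values, so the dark pixels become black and the light pixels become white without any hand-tuned constant.[/text]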

[clear by=”40px” id=”” class=””][custom_heading level=”h2″ id=”” class=”” style=”margin-bottom: 0px;”]

Conclusion:

[/custom_heading][text]OCR has a wide range of real-world applications, and improving the performance of OCR models is necessary to avoid costly mistakes. Image pre-processing can reduce the error by a significant margin and helps OCR perform better. The pre-processing steps should be chosen based on the images available for text extraction: depending on the image, some steps can be removed and others added as required. Pre-processing becomes more effective when it is applied with a good understanding of the input data (images) and the task to perform.[/text][text]If you like the article, please let us know via your comments. If you are looking for help in NLP projects then schedule a discussion using the link or send an email to [email protected]
https://calendly.com/nextgeninvent
Follow us for more information: https://www.linkedin.com/company/nextgen-invent-corporation/[/text]

[clear by=”40px” id=”” class=””]

[clear by=”20px” id=”” class=””][text]

Stay In the know

Get latest updates and industry insights every month.

[/text][clear by=”15px” id=”” class=””]

Machine Learning – Supervised, Unsupervised, & Reinforced Learning

Posted on April 1, 2022 - 9:32 pm by Bhaskar Sharma

[clear by=”20px” id=”” class=””][text]

Artificial Intelligence, Machine Learning

[/text][custom_heading id=”” class=”” style=””]

Machine Learning – Supervised, Unsupervised, & Reinforced Learning

[/custom_heading][clear by=”15px” id=”” class=””][text]Machine learning is a vast topic with so many intricacies that it can be confusing to know where to start. Machine learning is the force behind many of the algorithms that govern our lives, such as Amazon’s recommendation engine, fraud detection software, financial market tracking, and supply chain logistics management across industries.

The fundamental function of all these algorithms is their ability to learn. Artificial intelligence and machine learning models can learn in many ways. Each of these learning methods can sound complicated if you don’t have in-depth technical knowledge, so let’s dive in for a simple explainer on different learning models in machine learning.[/text]

[image lightbox=”” width=”” is_gallert_item=”” src=”14637″ alt=”” href=”” title=”” popup_content=”” id=”” class=”” style=””]

[custom_heading id=”” class=”” style=””]

Supervised Learning

[/custom_heading][text]

Supervised learning is the simplest to understand. You provide inputs for which you already know what the output should be. What you don’t know is how to get from the input to that output, and this is what the model learns and bases its predictions on.

Supervised learning uses existing, labeled data to train models. The input data is the training data. The algorithm works on it to produce a prediction, compares that prediction with the intended output, and adjusts itself to reduce the error.

[/text][clear by=”35px” id=”” class=””]
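[text]The predict, compare, and adjust loop described above can be sketched in a few lines of plain Python. This is a minimal illustration using gradient descent to fit a straight line; the function name fit_line, the learning rate, and the toy data are all assumptions chosen for the example.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Compare the current predictions with the known outputs...
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # ...and nudge the parameters in the direction that shrinks the error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # labeled outputs generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The model is never told the rule behind the data; it recovers a slope close to 2 and an intercept close to 1 purely from the labeled examples.[/text]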

[custom_heading id=”” class=”” style=”margin-bottom: 0px;”]

Unsupervised learning

[/custom_heading][text]

Unsupervised learning involves the use of unlabelled data to train the machine to identify patterns and cluster the data. It is best for situations where you have complex data and are not sure what your desired outcome is. This method of learning gives you a better understanding of the inner relationships within your data that you may process further.

The two most important types of unsupervised learning are clustering and association. Clustering involves the grouping of items based on similarities. Association learning is used when you want to find the link between data columns.

[/text][clear by=”35px” id=”” class=””]
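[text]Clustering can be illustrated with a tiny k-means loop on one-dimensional data. This is a minimal sketch in plain Python; the function name kmeans_1d, the way the initial centres are picked, and the sample values are assumptions made for the example.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group numbers into k clusters with a minimal k-means loop."""
    ordered = sorted(values)
    # Spread the initial centres across the sorted data (a simple, deterministic choice).
    centres = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its members; keep it if the cluster is empty.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters


values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]   # no labels are provided
centres, clusters = kmeans_1d(values, k=2)
print(clusters)
```

No labels are given, yet the small and large values end up in separate groups because each value is simply assigned to the nearest centre.[/text]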

[custom_heading id=”” class=”” style=”margin-bottom: 0px;”]

Reinforced learning

[/custom_heading][text]

In reinforced learning (more commonly called reinforcement learning), models are built around rewards and punishments. This encourages the model to chase rewards and minimize punishments, teaching it how to make decisions. The model learns to recognize relevant signals and decide the best action to maximize the reward. When a loop is completed, a reinforcement signal gives the model feedback on how to proceed.

Reinforced learning is a sort of middle ground between supervised and unsupervised learning. The model is not given labeled examples; instead, the reward signal provides feedback, somewhat like supervised learning, while the model makes judgments by itself, like unsupervised learning. Some recommendation algorithms are built on reinforced learning.

[/text][clear by=”35px” id=”” class=””]
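[text]One of the simplest settings for learning from rewards is the multi-armed bandit, where the model repeatedly picks one of several actions and only sees the reward of the action it chose. The sketch below uses an epsilon-greedy strategy in plain Python. The function name run_bandit, the reward values, and the noise level are made up for illustration.

```python
import random


def run_bandit(true_means, steps=500, epsilon=0.1, seed=0):
    """Estimate the average reward of each action with an epsilon-greedy policy."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    counts = [0] * n_actions
    estimates = [0.0] * n_actions
    for step in range(steps):
        if step < n_actions:
            action = step                        # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)    # explore a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[action], 0.05)   # noisy reward signal
        counts[action] += 1
        # Incremental mean: move the estimate towards the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit([0.2, 1.0, 0.5])
best = max(range(3), key=lambda a: estimates[a])
print(best, counts)
```

The model is never told which action is best. It discovers that the second action pays the most by trying actions and tracking the rewards, and it then chooses that action most of the time.[/text]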

[text]

Machine learning as a field is expanding by the day, with more accurate and complex algorithms. A solid understanding of the basic ways these algorithms learn will help you get a better understanding of what you can achieve with AI and machine learning systems. Much like how we learn, it takes an abundance of patience and effort to ensure your AI system learns well. Remember, it only gets better with every round of learning!

[/text]


AI Driving 2022 Future Business Trends

Posted on April 1, 2022 - 9:25 pm by Bhaskar Sharma

[clear by=”76px” id=”” class=””][text]

Data + AI + Analytics, Machine Learning

[/text][custom_heading id=”” class=”” style=””]

Artificial Intelligence, driving 2022 future business trends

[/custom_heading][clear by=”15px” id=”” class=””][text]The role of AI in work life and business is undisputed. As we move forward, this role will only expand to include more functionalities and use cases as artificial intelligence, machine learning, and data processing become more advanced and efficient. In the past few years alone, we’ve seen AI grow by leaps and bounds, and this growth shows no signs of slowing down just yet.

Looking to the future, how will AI impact work and business? It can be hard to predict but as an overview, we can expect the following changes globally:[/text]

[image lightbox=”” width=”” is_gallert_item=”” src=”14679″ alt=”” href=”” title=”” popup_content=”” id=”” class=”” style=””]

[custom_heading id=”” class=”” style=””]

1. Productivity will increase

[/custom_heading][text]One of the biggest advantages of integrating AI into your business is the major boost in productivity that you receive. An efficient and effective AI system can reduce human error, find new solutions to old problems, and help you avoid pitfalls in the future. As the technology behind artificial intelligence improves, we can expect this productivity boost to keep getting better. The goal is to reach a stage where less training is needed for your AI model in order to get more things done. This will directly result in enhanced productivity across industries.[/text][clear by=”35px” id=”” class=””]

[custom_heading id=”” class=”” style=”margin-bottom: 0px;”]

2. Consolidation

[/custom_heading][text]Companies that deliver services using the power of artificial intelligence will have lower operating costs and will be able to scale operations faster and more cheaply. That will accelerate consolidation in different industries. Think of the Chrome browser: it’s so ubiquitous in its market that many users forget that other browsers exist.[/text][clear by=”35px” id=”” class=””]

[custom_heading id=”” class=”” style=”margin-bottom: 0px;”]

3. Personalization

[/custom_heading][text]Consolidation can spell doom for smaller companies unless they pivot and adapt. Personalization of products and services should be the focus of small companies to stay competitive and profitable. This means personalization not only at a customer segment level but delving deeper so that each customer becomes a segment unto themselves. For example, personalized medicine is one such opportunity for hyper-personalization.

Integrating AI systems into their business models is the only way for small companies to make this shift towards personalization. The strength of a company’s AI capabilities will likely be one of the biggest factors for business success in the future.[/text][clear by=”35px” id=”” class=””]

[custom_heading id=”” class=”” style=”margin-bottom: 0px;”]

4. Less time from ideation to launch

[/custom_heading][text]The current turnaround time for product development clocks in at about 6 months from ideation to launch, although this window varies by industry and product. Overall, we can expect the time taken to ideate, develop, test, and launch new products and services to shrink. Artificial intelligence and machine learning can play vital roles in shortening the development process. Zara is known today for taking less time from idea to launch. But in the future, this duration will be measured in hours and days, not weeks or months. Further along, AI can speed up the testing process and help products launch more quickly and with fewer setbacks.[/text][clear by=”40px” id=”” class=””]

[custom_heading id=”” class=”” style=”margin-bottom: 0px;”]

5. Convergence of the physical and digital worlds

[/custom_heading][text]By 2030, about $5.5 trillion to $12.6 trillion of value is expected to be unlocked globally using IoT products and services. Virtual reality, mixed reality, and extended reality are continually challenging customer experience and rewriting the rules. The convergence of the digital and physical worlds is going to be a reality in the future.

AI and machine learning are now a mainstay of the business. As the field grows, it’s still not too late to get in on the action. Adopting AI in your business can help you grow and explore exciting new opportunities. AI can give your company a competitive edge while ensuring that your technological strategy stays updated. Now is a great time to invest in AI and see the wonders it can do for your business.[/text]
