Unable to Extract Numbers from Image Using Tesseract OCR in Python


I'm currently working on a project where I need to extract numbers from images using Tesseract OCR in Python. However, I'm facing difficulties in achieving accurate results.

Here is the image (a screenshot containing the two numbers I want to read):

And here is my code:

from PIL import Image, ImageEnhance, ImageOps
import pytesseract

# Load the screenshot
screenshot = Image.open("screenshot2.png")

# Crop the region containing the numbers
bbox = (970, 640, 1045, 675)
cropped_image = screenshot.crop(bbox)

# Enhance contrast
enhancer = ImageEnhance.Contrast(cropped_image)
enhanced_image = enhancer.enhance(2.0)

# Convert to grayscale
gray_image = enhanced_image.convert("L")

# Apply thresholding
thresholded_image = gray_image.point(lambda p: 255 if p > 150 else 0)

# Invert colors
inverted_image = ImageOps.invert(thresholded_image)

# Convert to binary
binary_image = inverted_image.convert("1")

# Save the processed image
binary_image.save("processed_image.png")

# Perform OCR
text = pytesseract.image_to_string(binary_image, config="--psm 13")

# Extract numbers from the OCR result
numbers = [int(num) for num in text.split() if num.isdigit()]

print(numbers)

The output is simply: []

What I want is to get only the two numbers. But if I can just retrieve all the text, I can then process the string myself to extract the two numbers.
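
One detail worth noting: str.isdigit() rejects tokens that contain a minus sign or any stray punctuation, so even when Tesseract does return text, the list comprehension above can still come back empty. A more forgiving way to pull numbers out of the raw OCR string is a regular expression; here is a minimal sketch (the sample string is only for illustration and would be replaced with the real pytesseract output):

import re

# Sample OCR output for illustration only; in practice this would be the
# string returned by pytesseract.image_to_string()
text = "-2 -1\n"

# -?\d+ matches an optional minus sign followed by digits, so negative
# values are kept (str.isdigit() would drop them)
numbers = [int(match) for match in re.findall(r"-?\d+", text)]
print(numbers)  # [-2, -1]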

Here's what I've tried so far:

- I've captured screenshots containing numeric values.
- I've cropped the screenshots to focus only on the region containing the numbers.
- I've enhanced the contrast and converted the images to grayscale to improve OCR accuracy.
- I've applied thresholding and inverted the colors to prepare the images for OCR.
- I've tried converting the images to binary format for better recognition.

Despite trying these preprocessing steps and adjusting the OCR configuration (e.g., using --psm 13), I'm still unable to accurately extract the numbers from the images. The OCR output either contains incorrect numbers or fails to detect any numbers at all.
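
On the configuration side, it may also be worth sweeping a few page segmentation modes and restricting the character set. The snippet below is only a rough sketch of that experiment: it reuses the processed_image.png saved above, and the whitelist option is something to try rather than a guaranteed fix, since which mode works depends on the image and the Tesseract version.

from PIL import Image
import pytesseract

# Load the preprocessed crop saved by the code above
image = Image.open("processed_image.png")

# A few configurations to compare; which one works depends on the image
configs = [
    "--psm 7",                                         # treat as a single text line
    "--psm 6",                                         # treat as a uniform block of text
    "--psm 7 -c tessedit_char_whitelist=0123456789-",  # digits and minus sign only
]

for config in configs:
    text = pytesseract.image_to_string(image, config=config)
    print(config, "->", repr(text.strip()))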

I appreciate any insights or suggestions on how to improve the accuracy of my OCR extraction process. Thank you!

There is 1 answer below:

Answered by felix st-aubin:

So I did some research and found this article: https://nanonets.com/blog/ocr-with-tesseract/

Note: the code below is based on the article written by Filip Zelic & Anuj Sable.

import cv2
import pytesseract

# Read the image
image = cv2.imread('screenshot2.png')

# Coordinates from PIL cropping (left, top, right, bottom)
left = 970
top = 640
right = 1045
bottom = 675

# Convert to OpenCV coordinates (x, y, width, height)
x = left
y = top
width = right - left
height = bottom - top

# Crop the image
cropped_image = image[y:y+height, x:x+width]

# Convert the cropped image to grayscale (cv2.imread loads images in BGR order)
gray = cv2.cvtColor(cropped_image, cv2.COLOR_BGR2GRAY)

# Perform OCR on the grayscale image
ocr_text = pytesseract.image_to_string(gray)

print(ocr_text)

With this approach, the numbers [-2, -1] are extracted from the provided image consistently.
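
If recognition is still inconsistent, one further tweak worth trying (not part of the article above, just a common suggestion) is to upscale the crop before OCR, since the region is only about 75x35 pixels. Here is a rough sketch on top of the same OpenCV pipeline; the 3x factor and --psm 7 are starting points to tune, not confirmed values:

import cv2
import pytesseract

# Read the screenshot and crop the same region (note the y, x index order)
image = cv2.imread("screenshot2.png")
crop = image[640:675, 970:1045]

# Grayscale, then upscale 3x; small text often recognizes better when enlarged
gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
big = cv2.resize(gray, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)

print(pytesseract.image_to_string(big, config="--psm 7"))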

Thank you for your patience and collaboration in finding an effective solution. If you have any further questions or need additional assistance, feel free to ask!