How to recognize the data on a cheque leaf

Posted on 2025-01-24 10:46:51

Good day. I'm trying to recognize both the printed and the handwritten text on the cheque leaf below.

Here is the image after preprocessing, produced with the code below:

import cv2 
import pytesseract
import numpy as np

img = cv2.imread('Images/cheque_leaf.jpg')

# Rescaling the image (it's recommended if you’re working with images that have a DPI of less than 300 dpi)
img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
h, w = img.shape[:2]

# OpenCV loads images in BGR order. Convert to a single-channel grayscale image,
# which is what the Otsu thresholding step below requires.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]  # Otsu threshold; the level is chosen automatically, so the 0 is ignored
thresh = cv2.rectangle(thresh, (0, 0), (w, h), (0, 0, 0), 2)  # draw a 2-pixel black border around the whole image

# 2x2 rectangular structuring element for the morphological step below
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))

# Erode: each pixel becomes the minimum over the kernel neighborhood.
# With dark text on a light background this thickens the character strokes.
# Note: `erode` is not used afterwards; Tesseract below still receives `thresh`.
erode = cv2.erode(thresh, kernel, iterations=1)

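# Extract the text: default OCR engine mode (--oem 3),
# page segmentation mode 6 (assume a single uniform block of text)
custom_config = r'--oem 3 --psm 6'
text = pytesseract.image_to_string(thresh, config=custom_config)
print(text)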

I then use pytesseract.image_to_string() to convert the image to text, but the output I get is irrelevant. From the image above I want to extract the date, branch, payee, the amount in both figures and words, and the digital signature name followed by the account number.

Are there any OCR techniques that can extract exactly the fields listed above? Thanks in advance.

找回味觉 2025-01-31 10:46:51

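The following is just one of several possible approaches.

I would suggest the Sauvola thresholding technique. Instead of one global threshold, a separate threshold is computed for each pixel in the image, using Sauvola's formula. It involves calculating the mean and standard deviation of the pixel values within a window around that pixel.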
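To make the per-pixel rule concrete, here is a minimal pure-Python sketch of the Sauvola formula, T = m * (1 + k * (s / r - 1)), where m and s are the mean and standard deviation inside the window. It is an illustration only, not the scikit-image implementation: the function names are made up for this example, k=0.2 and r=128 are values commonly quoted for 8-bit images and are assumptions here, and the window is simply clipped at the image border.

```python
from statistics import mean, pstdev


def sauvola_threshold(window_pixels, k=0.2, r=128.0):
    """Sauvola threshold T = m * (1 + k * (s / r - 1)) for one window of pixel values."""
    m = mean(window_pixels)    # local mean
    s = pstdev(window_pixels)  # local (population) standard deviation
    return m * (1 + k * (s / r - 1))


def binarize(image, window_size=3, k=0.2, r=128.0):
    """Binarize a grayscale image given as a list of rows of 0-255 values.

    Returns rows of booleans: True where the pixel is brighter than its local
    threshold (background), False where it is darker (ink).
    Windows are clipped at the image border (an assumption of this sketch).
    """
    half = window_size // 2
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # collect the pixel values in the window centered on (x, y)
            window = [
                image[yy][xx]
                for yy in range(max(0, y - half), min(height, y + half + 1))
                for xx in range(max(0, x - half), min(width, x + half + 1))
            ]
            row.append(image[y][x] > sauvola_threshold(window, k, r))
        out.append(row)
    return out


# A tiny made-up 5x5 patch: bright paper (200) with one dark ink pixel (40) in the middle.
patch = [[200] * 5 for _ in range(5)]
patch[2][2] = 40
mask = binarize(patch)
```

In this toy patch only the dark center pixel falls below its local threshold, so it is the only False entry in `mask`. On a real image the loop above would be very slow; it is meant for reading, not for production use.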

This functionality is available in the skimage library (also known as scikit-image).

The following is a working example for the given image:

import cv2
import numpy as np
from skimage.filters import threshold_sauvola

# load the cheque image as a single-channel grayscale array
img = cv2.imread('cheque.jpg', cv2.IMREAD_GRAYSCALE)

# choosing a window size of 13 (it must be odd; feel free to change it and visualize)
thresh_sauvola = threshold_sauvola(img, window_size=13)
# pixels brighter than their local threshold become True, darker ones False
binary_sauvola = img > thresh_sauvola

# converting the resulting Boolean array to an 8-bit unsigned integer array (0 - 255)
binary_sauvola_int = binary_sauvola.astype(np.uint8)
result = cv2.normalize(binary_sauvola_int, dst=None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)

Result:

Note: This result is only a starting point for trying other image processing techniques to get the desired output.
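One possible next step, which is not part of the answer above, is to OCR each field separately instead of the whole cheque at once. The sketch below only shows the bookkeeping: it turns fractional field boxes into pixel coordinates. The field names and every number in `FIELD_BOXES` are hypothetical placeholders, not measured from the cheque in the question, and `to_pixel_box` is a helper invented for this example.

```python
# Hypothetical field layout: each box is (left, top, right, bottom) expressed as
# fractions of the image width and height. Placeholder values only.
FIELD_BOXES = {
    "date": (0.70, 0.05, 0.98, 0.18),
    "payee": (0.10, 0.25, 0.90, 0.38),
    "amount_words": (0.10, 0.38, 0.70, 0.55),
    "amount_figures": (0.72, 0.40, 0.98, 0.55),
    "account_number": (0.05, 0.60, 0.45, 0.72),
}


def to_pixel_box(box, width, height):
    """Convert a fractional (left, top, right, bottom) box to integer pixel coordinates."""
    left, top, right, bottom = box
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError(f"invalid fractional box: {box}")
    return (
        round(left * width),
        round(top * height),
        round(right * width),
        round(bottom * height),
    )


# Pixel boxes for an assumed 1200x500 image
pixel_boxes = {name: to_pixel_box(box, 1200, 500) for name, box in FIELD_BOXES.items()}
```

With a NumPy image such as `result`, a box `(x0, y0, x1, y1)` corresponds to the crop `result[y0:y1, x0:x1]`. Each crop could then be passed to `pytesseract.image_to_string` with `--psm 7`, which tells Tesseract to treat the image as a single text line. Whether this helps depends on the cheque layout and on how legible the handwriting is.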
