pytesseract: improving OCR accuracy for blurred numbers in an image

Posted on 2025-01-10 10:23:49

[Image: example of numbers]

I am using the standard pytesseract image-to-text call. With the digits-only option it is perfect 90% of the time, but above is an example where it goes horribly wrong: this image produced no characters at all.

As you can see, there are no letters, so the language option is of no use. I did try adding some text to the grabbed image, but it still goes wrong.

I increased the contrast using cv2. The text has been blurred upstream of my capture.

Any ideas on increasing accuracy?

After many tests using the suggestions below, I found the sharpening filter gave unreliable results. Another tool you can use is contrast = cv2.convertScaleAbs(img2, alpha=2.5, beta=-200).
I used this because my supposedly black-and-white text ended up as light gray text on a gray background. With convertScaleAbs I was able to increase the contrast and get an almost black-and-white image.
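
To see why those alpha and beta values stretch the contrast, here is a rough NumPy model of the formula documented for cv2.convertScaleAbs, dst = saturate(|alpha * src + beta|). The function name scale_abs and the sample pixel values are made up for illustration; this is a sketch of the arithmetic, not the OpenCV implementation.

```python
import numpy as np

def scale_abs(pixels, alpha, beta):
    """NumPy model of the documented convertScaleAbs formula:
    scale, add the offset, take the absolute value, saturate to 0..255."""
    scaled = np.abs(alpha * pixels.astype(np.float64) + beta)
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# Hypothetical samples: black, gray background, light gray text, near white.
samples = np.array([0, 100, 140, 200], dtype=np.uint8)
result = scale_abs(samples, alpha=2.5, beta=-200)
print(result.tolist())  # [200, 50, 150, 255]
```

The gap between the background (100) and the text (140) grows from 40 to 100 gray levels. Note that with a negative beta, inputs below 80 go negative before the absolute value is taken, so very dark pixels come out bright (0 becomes 200 here); that is worth checking if the image also contains dark regions.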

Basic steps for OCR

  1. Convert to monochrome
  2. Crop image to your target text
  3. Filter image to get black and white
  4. Perform OCR

Comments (1)

扛刀软妹 2025-01-17 10:23:49

Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, then apply a sharpening kernel using cv2.filter2D() to enhance the blurred sections. A general sharpening kernel looks like this:

[[-1,-1,-1], [-1,9,-1], [-1,-1,-1]]

Other kernel variations exist, and depending on the image you can adjust the strength of the filter. From here we apply Otsu's threshold to obtain a binary image, then perform text extraction using the --psm 6 configuration option, which assumes a single uniform block of text. Tesseract offers other page segmentation modes and OCR configuration options as well.
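
One possible way to adjust the strength of the filter is to scale the surrounding weights and set the centre so the kernel still sums to 1, which leaves flat regions at their original brightness. The helper name sharpen_kernel and the strength parameter are hypothetical; strength=1.0 reproduces the kernel shown above.

```python
import numpy as np

def sharpen_kernel(strength=1.0):
    """Build a 3x3 sharpening kernel whose weights sum to 1.

    The eight neighbours get -strength and the centre gets 1 + 8 * strength,
    so strength=1.0 gives [[-1,-1,-1], [-1,9,-1], [-1,-1,-1]].
    """
    kernel = np.full((3, 3), -strength, dtype=np.float32)
    kernel[1, 1] = 1.0 + 8.0 * strength
    return kernel

print(sharpen_kernel(0.5))
```

The result can be passed to cv2.filter2D(gray, -1, sharpen_kernel(0.5)) in place of the fixed kernel. Lower strengths amplify noise less, which may help on images where the full-strength kernel gives unreliable results.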


Here's a visualization of the image processing pipeline:

[Image: input image]

[Image: converted to grayscale, sharpening filter applied]

[Image: Otsu's threshold]

Result from Pytesseract OCR

124,685

Code

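import cv2
import numpy as np
import pytesseract

# Windows only: point pytesseract at the Tesseract binary if it is not on PATH
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image, grayscale, apply sharpening filter, Otsu's threshold
image = cv2.imread('1.png')
if image is None:
    raise FileNotFoundError("Could not read '1.png'")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
# THRESH_BINARY_INV inverts the result, so light text on a dark background
# becomes black text on white; Otsu's method picks the threshold automatically
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# OCR; --psm 6 assumes a single uniform block of text
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

# Show the intermediate images until a key is pressed
cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()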
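
Since the target is a number, it may also help to restrict Tesseract to digits and tidy the output afterwards. The sketch below is an assumption layered on top of the answer, not part of it: the helper names are made up, --psm 7 (treat the image as a single text line) is only a guess at what suits a cropped number, and the tessedit_char_whitelist variable is not honoured by every Tesseract version or engine mode, so test it against your install.

```python
import re

def digits_config(psm=7, extra_chars=","):
    """Build a Tesseract config string limiting output to digits.

    extra_chars keeps separators such as the comma in '124,685'.
    """
    whitelist = "0123456789" + extra_chars
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"

def clean_number(ocr_text):
    """Drop everything except the digits 0-9 from raw OCR output."""
    return re.sub(r"[^0-9]", "", ocr_text)

def ocr_number(binary_image):
    """Run OCR with the digits-only config and return the cleaned digits."""
    import pytesseract  # imported here so the helpers above work without it
    raw = pytesseract.image_to_string(binary_image, config=digits_config())
    return clean_number(raw)

print(clean_number("124,685\n"))  # 124685
```

With the thresholded image from the code above, ocr_number(thresh) would return the digits as a string, ready for int().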