Pytesseract: improving OCR accuracy for blurred numbers in an image

Posted on 2025-01-10 10:23:49

Example of the digits

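I'm using pytesseract's standard image-to-text call. I tried the digits-only option, and 90% of the time it works perfectly, but the image above is an example where it goes badly wrong: it produced no characters at all.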

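As you can see, there are no letters, so the language option is of no use. I also tried adding some text to the captured image, but the result was still wrong.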

I increased the contrast using cv2; the text has already been blurred upstream of my capture.

Any ideas for improving accuracy?

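After many tests with the suggestions below, I found that the sharpening filter gave unreliable results. Another tool you can use is contrast = cv2.convertScaleAbs(img2, alpha=2.5, beta=-200). My black-and-white text had ended up as light-gray text on a gray background, and with convertScaleAbs I was able to increase the contrast enough to get an almost black-and-white image.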
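To make the arithmetic behind that call concrete, here is a minimal NumPy sketch of what cv2.convertScaleAbs computes per pixel: scale by alpha, add beta, take the absolute value, then saturate to 0..255. The helper name scale_abs and the sample pixel values are hypothetical; this mirrors OpenCV's documented formula for illustration and is not a replacement for the real call.

```python
import numpy as np

def scale_abs(pixels: np.ndarray, alpha: float, beta: float) -> np.ndarray:
    """Approximate cv2.convertScaleAbs: |alpha * src + beta|, saturated to uint8."""
    scaled = np.abs(pixels.astype(np.float64) * alpha + beta)
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# Assumed sample values: light-gray text (180) on a mid-gray background (100),
# plus a few darker and brighter pixels for comparison.
sample = np.array([0, 80, 100, 180, 255], dtype=np.uint8)
stretched = scale_abs(sample, alpha=2.5, beta=-200)
print(stretched.tolist())
```

With these settings the 100/180 pair is pushed apart to 50/250, which is the contrast gain described above. Because of the absolute value, inputs darker than 80 come back out brighter instead of clamping to black (0 maps to 200 here), so the alpha and beta values need to suit the gray levels actually present in the image.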

Basic steps for OCR

Convert to monochrome
Crop the image to the target text
Filter to get a black-and-white image
Perform OCR
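The cropping step in the list above can be sketched with plain NumPy slicing. This assumes the text is darker than its background after the monochrome conversion; the helper name crop_to_text and its dark_below and margin parameters are hypothetical choices for illustration.

```python
import numpy as np

def crop_to_text(gray: np.ndarray, dark_below: int = 128, margin: int = 2) -> np.ndarray:
    """Crop a grayscale image to the bounding box of its dark pixels."""
    rows, cols = np.nonzero(gray < dark_below)
    if rows.size == 0:
        return gray  # nothing dark enough: leave the image untouched
    top = max(int(rows.min()) - margin, 0)
    bottom = min(int(rows.max()) + margin + 1, gray.shape[0])
    left = max(int(cols.min()) - margin, 0)
    right = min(int(cols.max()) + margin + 1, gray.shape[1])
    return gray[top:bottom, left:right]

# Synthetic 20x40 white image with a dark 4x10 "text" patch.
page = np.full((20, 40), 255, dtype=np.uint8)
page[8:12, 15:25] = 0
cropped = crop_to_text(page)
print(cropped.shape)
```

Keeping a small margin around the text is a deliberate choice in this sketch, on the assumption that OCR tends to behave better when characters do not touch the image border.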


扛刀软妹 2025-01-17 10:23:49

Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, then apply a sharpening kernel using cv2.filter2D() to enhance the blurred sections. A general sharpening kernel looks like this:

[[-1,-1,-1], [-1,9,-1], [-1,-1,-1]]
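As a rough illustration of why this kernel sharpens, the sketch below applies it by hand to a tiny synthetic image. The weights sum to 1, so flat regions keep their value, while pixels on either side of an edge are pushed apart. The helper apply_at and the test image are hypothetical; cv2.filter2D does the same weighted sum for every pixel and, for 8-bit images, clamps the result to 0..255.

```python
import numpy as np

kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])

def apply_at(image: np.ndarray, row: int, col: int) -> int:
    """Weighted sum of the 3x3 neighborhood centered on (row, col)."""
    patch = image[row - 1:row + 2, col - 1:col + 2].astype(np.int64)
    return int((patch * kernel).sum())

# Left half mid-gray (100), right half lighter (150): a low-contrast vertical edge.
image = np.full((5, 6), 100, dtype=np.uint8)
image[:, 3:] = 150

flat = apply_at(image, 2, 1)        # fully inside the flat 100 region
dark_side = apply_at(image, 2, 2)   # last dark column next to the edge
light_side = apply_at(image, 2, 3)  # first light column next to the edge
print(int(kernel.sum()), flat, dark_side, light_side)
```

The flat pixel stays at 100, while the two pixels at the edge come out as -50 and 300, which an 8-bit filter2D result would clamp to 0 and 255. The same amplification applies to noise, which may be one reason a sharpening filter gives unreliable results on some inputs.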

Other kernel variations exist, and depending on the image you can adjust the strength of the filter. From here we apply Otsu's threshold to obtain a binary image, then perform text extraction using the --psm 6 configuration option, which tells Tesseract to assume a single uniform block of text. Other page segmentation modes and OCR configuration options are available as well.
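For readers wondering what Otsu's threshold actually chooses, here is a minimal NumPy sketch of the idea: try every cutoff and keep the one that maximizes the between-class variance of the two resulting pixel groups. The function name otsu_threshold and the synthetic image are hypothetical, and this is a teaching sketch, not a drop-in replacement for cv2.threshold with cv2.THRESH_OTSU.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the cutoff t that maximizes between-class variance (<= t vs > t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    weight_low = np.cumsum(prob)             # share of pixels with value <= t
    mean_low_sum = np.cumsum(prob * levels)  # running sum of value * probability
    total_mean = mean_low_sum[-1]
    denom = weight_low * (1.0 - weight_low)
    between = np.zeros(256)
    valid = denom > 0                        # skip cutoffs that leave one class empty
    between[valid] = (total_mean * weight_low[valid] - mean_low_sum[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Synthetic image: 12 dark background pixels (50) and 4 light text pixels (200).
gray = np.array([50] * 12 + [200] * 4, dtype=np.uint8).reshape(4, 4)
t = otsu_threshold(gray)
# Same mapping as THRESH_BINARY_INV: pixels above t become 0, the rest 255.
binary_inv = np.where(gray > t, 0, 255).astype(np.uint8)
print(t, binary_inv.ravel().tolist())
```

Because the cutoff is derived from the histogram, no manual threshold value has to be tuned per image, which is why the code passes 0 as the threshold argument when the Otsu flag is set.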

Here's a visualization of the image processing pipeline:

Input image

Convert to grayscale -> apply sharpening filter

Otsu's threshold

Result from Pytesseract OCR

124,685
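If the recognized text is needed as a number, a small post-processing step helps, since OCR output can carry separators and trailing whitespace, or be empty when recognition fails. The sketch below is one possible way to do that; parse_ocr_number is a hypothetical helper and the sample strings are assumed, not taken from a real run.

```python
import re
from typing import Optional

def parse_ocr_number(raw: str) -> Optional[int]:
    """Keep only the digits in raw OCR text and return them as an int, or None."""
    digits = re.sub(r"\D", "", raw)
    return int(digits) if digits else None

print(parse_ocr_number("124,685\n"))  # separators and whitespace are dropped
print(parse_ocr_number(""))           # empty OCR output yields None
```

Returning None for empty output makes the "no characters at all" case explicit, so the caller can retry with different preprocessing instead of silently treating it as zero.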

Code

import cv2
import numpy as np
import pytesseract

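# Windows only: point pytesseract at the Tesseract binary (adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load the image and convert it to grayscale
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Sharpen the blurred digits with a 3x3 kernel
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)

# Otsu's method picks the cutoff automatically; THRESH_BINARY_INV maps
# pixels above it to black and the rest to white
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Run OCR, treating the image as a single uniform block of text
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

# Show the intermediate images; press any key to close the windows
cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.waitKey()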
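Since the question is about digits only, one variation worth trying is restricting Tesseract's output characters through the config string. The sketch below only builds that string; digits_config is a hypothetical helper, tessedit_char_whitelist is a Tesseract configuration variable, and whether the whitelist is honored depends on the Tesseract version and engine mode, so treat this as something to test, not a guaranteed fix.

```python
def digits_config(psm: int = 6, extra_chars: str = ",") -> str:
    """Build a Tesseract config string restricting output to digits plus extra_chars."""
    whitelist = "0123456789" + extra_chars
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"

config = digits_config()
print(config)

# Hypothetical usage with the thresholded image from the code above:
# data = pytesseract.image_to_string(thresh, lang='eng', config=config)
```

The comma is included in the whitelist because the example number contains a thousands separator. For an image holding a single line of digits, passing psm=7 (treat the image as a single text line) may be worth comparing against psm=6.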
