pytesseract: improving OCR accuracy on blurred digits in an image
Example of numbers
I am using standard pytesseract image-to-text. I have tried the digits-only option; 90% of the time it is perfect, but above is an example where it goes horribly wrong! That example produced no characters at all.
As you can see there are no letters, so the language option is of no use. I did try adding some text to the grabbed image, but it still goes wrong.
I increased the contrast using CV2; the text had been blurred upstream of my capture.
Any ideas on increasing accuracy?
After many tests using the suggestions below, I found the sharpness filter gave unreliable results. Another tool you can use is contrast = cv2.convertScaleAbs(img2, alpha=2.5, beta=-200). I used this because my black-and-white text ended up as light gray text on a gray background; with convertScaleAbs I was able to increase the contrast to get an almost black-and-white image.
Basic steps for OCR
- Convert to monochrome
- Crop image to your target text
- Filter image to get black and white
- Perform OCR
1 Answer
Here's a simple approach using OpenCV and Pytesseract OCR. To perform OCR on an image, it's important to preprocess the image. The idea is to obtain a processed image where the text to extract is in black with the background in white. To do this, we can convert to grayscale, then apply a sharpening kernel using
cv2.filter2D()
to enhance the blurred sections. A general sharpening kernel looks like this (other kernel variations can be found here). Depending on the image, you can adjust the strength of the filter. From here we apply Otsu's threshold to obtain a binary image, then perform text extraction using the
--psm 6
configuration option to assume a single uniform block of text. Take a look here for more OCR configuration options. Here's a visualization of the image processing pipeline:
Input image -> Convert to grayscale -> Apply sharpening filter -> Otsu's threshold -> Result from Pytesseract OCR
Code