We don’t allow questions seeking recommendations for software libraries, tutorials, tools, books, or other off-site resources. You can edit the question so it can be answered with facts and citations.
Closed 9 years ago.
Comments (2)
Tesseract OCR is probably the best open-source OCR engine out there, and it is very flexible about what it can recognize. It can be trained on custom data, so essentially any language is possible as long as you're willing to put in the work (i.e. create the training set).
Tesseract provides tools (with a GUI) that help you create the data set: you specify the bounding box of each character and the corresponding transcription.
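To give a feel for what those bounding-box annotations look like on disk, here is a minimal Python sketch that parses one line of a Tesseract box file. It assumes the six-field layout (`symbol left bottom right top page`, pixel coordinates measured from the bottom-left corner of the image); `BoxEntry` and `parse_box_line` are names made up for this illustration and are not part of Tesseract.

```python
from typing import NamedTuple


class BoxEntry(NamedTuple):
    """One annotated character: its transcription plus its bounding box."""
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> BoxEntry:
    # Split from the right so the five numeric fields are peeled off first;
    # whatever remains on the left is the transcription.
    symbol, left, bottom, right, top, page = line.rstrip("\n").rsplit(" ", 5)
    return BoxEntry(symbol, int(left), int(bottom), int(right), int(top), int(page))


entry = parse_box_line("T 36 92 53 116 0")
width = entry.right - entry.left    # horizontal extent of the box in pixels
height = entry.top - entry.bottom   # vertical extent of the box in pixels
```

A check like this can be handy for spotting malformed lines before you feed a hand-edited box file back into training.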
EDIT: Noticed from another post (linked above) that a training set on Arabic has already been created for version 3.01. You'd just need to plug in the Arabic data and your problem is solved :).
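As a rough sketch of what "plugging in" the Arabic data might look like, the Python below calls the `tesseract` command-line tool with `-l ara`. It assumes the `tesseract` binary is on your PATH and that the Arabic data file (`ara.traineddata`) is already in its tessdata directory; the helper names and the `page.png` / `ocr_result` paths are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="ara"):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base="ocr_result", lang="ara"):
    """Run Tesseract and return the recognized text.

    Returns None if the tesseract binary cannot be found on PATH.
    """
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract writes its plain-text result to <output_base>.txt
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_image("page.png")` would then run the recognition with the Arabic data and hand back the text. If the language data is missing, Tesseract exits with an error and `check=True` turns that into an exception.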
You could also try ABBYY FineReader; it may support the language you are looking for.