PB, EZTwain and TOCR 3.0 fail to recognize Greek text in scanned PDF pages
I am using PB (PowerBuilder) 10.5.2 with EZTwain 3.30.0.28, XDefs 1.36b1, by Dosadi for scanning.
I am also using TOCR 3.0 as the OCR engine.
In one function we use, among other things, the following:
...
Long ll_acquire
(as_path_filename is a function argument)
...
...
TWAIN_SetAutoOCR(1)
ll_acquire = TWAIN_AcquireMultipageFile(0, as_path_filename)
The problem is that the scanned PDF pages contain both Latin (English) and Greek words.
Searching for English text works quite accurately, but searching for Greek text does not work at all.
Do you think this has to do with the TOCR software?
I just want to be able to search for Greek words as well.
Thanks in advance
The OCR step is most likely where this fails: the Greek words are not being converted into searchable text. It looks like you are using EZTwain for the OCR portion, and EZTwain in turn uses TOCR as its actual OCR engine. You may want to check the documentation for both products and see whether they mention any language or character-set settings that can be changed for multilingual use.
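One way to confirm that the OCR engine is the failing step is to copy the text out of one of the generated PDFs and count how many of its letters are actually Greek. The sketch below does this in Python, outside the PowerBuilder application. The function name script_counts is hypothetical, and how you obtain the text (copy and paste from a PDF viewer, or any text extractor) is left as an assumption. If a page that visibly contains Greek reports a Greek count of zero, the engine is probably emitting Latin look-alikes or nothing at all for those words.

```python
import unicodedata


def script_counts(text: str) -> dict:
    """Count letters by script using their Unicode character names.

    Hypothetical helper for illustration; not part of EZTwain or TOCR.
    """
    counts = {"latin": 0, "greek": 0, "other": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(ch, "")
        if name.startswith("GREEK"):
            counts["greek"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Replace this sample with text copied from the scanned PDF.
    sample = "Invoice \u03a4\u03b9\u03bc\u03bf\u03bb\u03cc\u03b3\u03b9\u03bf 2024"
    print(script_counts(sample))
```

A non-zero Greek count would instead suggest that the text layer is fine and that the problem lies in the search tool or in the fonts and encoding used by the PDF viewer.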