What other diagnostic methods can I use to solve this particular Perl problem?
After a lot of experiments, I still can't get the following script working. I need some guidance on how to diagnose this particular Perl problem. Thanks in advance.
This script is for testing the use of Office 2007 OCR API:
use warnings;
use strict;
use Win32::OLE;
use Win32::OLE::Const;
Win32::OLE::Const->Load("Microsoft Office Document Imaging 12\.0 Type Library")
or
die "Cannot use the Office 2007 OCR API";
my $miDoc = Win32::OLE->new('MODI.Document')
or die "Cannot create a MODI object";
#Loads an existing TIFF file
$miDoc->Create('OCR-test.tif');
#Performs OCR with the OCR language set to English
$miDoc->OCR(LangId => 'miLANG_ENGLISH');
#Get the OCR result
my $OCRresult = $miDoc->{Images}->Item(0)->{Layout}{Text};
print $OCRresult;
I ran a small test: I loaded an .MDI file that already contained OCR information, deleted the OCR method line, and ran the script. It printed the expected text. Otherwise, Perl emits this warning:
Use of uninitialized value $OCRresult in print at E:\OCR-test.pl line 15
I suspect that something is wrong with the line
$miDoc->OCR(LangId => 'miLANG_ENGLISH');
I tried leaving the parentheses empty and passing three parameters, like 'miLANG_ENGLISH',1,1, but without any luck.
I also opened the TIF in Microsoft Office Document Imaging to check whether its text was recognizable, and the result was positive.
So what other diagnostic methods do I have?
Or can someone who happens to have Office 2007 test my code with any JPG, BMP, or TIF picture that has text content and see if something's wrong?
Thanks in advance.
UPDATE
Haha, I've finally figured out where the problem is and how I can solve it. @hobbs, thank you for leaving the comment :) Things are interesting. When I was trying to respond to your comment, I added a link to the Office Document Imaging 2003 VBA Language Reference and took yet another look at the material there. The following information caught my eye:
LangId can be one of the following MiLANGUAGES constants.
miLANG_CHINESE_SIMPLIFIED (2052, &H804)
I changed the following OCR method line:
$miDoc->OCR('miLANG_ENGLISH',1,1);
to this:
$miDoc->OCR(2052,1,1);
A few notes:
1. I'm running ActivePerl 5.10.0 on Windows XP (Chinese version)
2. Before this, I had already tried $miDoc->OCR(9) but without luck
Suddenly, and kind of magically, that pesky warning "Use of uninitialized value $OCRresult in print at E:\OCR-test.pl line 15" disappeared completely and the OCRed text appeared on the screen. The OCR result was not satisfying, but the parameter 2052 refers to Chinese and the TIF image contains only English. So I changed the call to
$miDoc->OCR(9,1,1);
but this time without luck. Windows threw this error:
unknown software exception (0x0000000d)
I changed the TIF image to one that contains only Chinese characters and changed the call back to "$miDoc->OCR(2052,1,1);", and this time everything worked as expected. The OCR result was satisfying.
Now I think there's something odd about my Office 2007 OCR API. Someone running Windows XP (English version) with Office 2007 installed would probably not hit that exception with the parameter
$miDoc->OCR(9,1,1);
Anyway, I'm really happy that I've finally got things working :D
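For readers who hit the same thing: the update suggests the string 'miLANG_ENGLISH' was never turned into a number. Win32::OLE::Const->Load returns a reference to a hash of constants rather than importing them, so one way to avoid hard-coding 9 or 2052 is to look the name up in that hash. The following is only a sketch under that assumption. The helper lang_id is a hypothetical name, not part of Win32::OLE, and the Windows-only part is untested here. It passes the same numeric value as before, so it would not by itself avoid the 0x0000000d exception described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::OLE): turn a constant name into its
# numeric value using the hash reference returned by Win32::OLE::Const->Load.
sub lang_id {
    my ($constants, $name) = @_;
    defined $constants->{$name}
        or die "Constant $name not found in the type library\n";
    return $constants->{$name};
}

# The COM part only makes sense on Windows, so it is skipped elsewhere.
if ($^O eq 'MSWin32') {
    require Win32::OLE;
    require Win32::OLE::Const;

    # Croak on OLE errors instead of failing silently.
    Win32::OLE->Option(Warn => 3);

    my $constants = Win32::OLE::Const->Load(
        'Microsoft Office Document Imaging 12.0 Type Library')
        or die "Cannot load the MODI type library";

    my $miDoc = Win32::OLE->new('MODI.Document')
        or die "Cannot create a MODI object";

    # Load the TIFF and run OCR with a numeric language id.
    $miDoc->Create('OCR-test.tif');
    $miDoc->OCR(lang_id($constants, 'miLANG_ENGLISH'), 1, 1);

    my $text = $miDoc->{Images}->Item(0)->{Layout}{Text};
    print defined $text ? $text : "(no OCR text returned)\n";
}
```

With Warn set to 3, a failing OLE call stops the script with a message at the call site, which is usually easier to act on than an uninitialized-value warning several lines later.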
Comments (1)
For starters I would try dumping the value of
$miDoc->{Images}
Does it exist? If it exists and it's a collection, does it contain anything? If it contains anything, what is it? An error? Or maybe just a different structure than you're expecting? warn, Dumper, and a little exploration can go a long way.
Incidentally, if you want to do the "modern" thing and don't mind grabbing a nifty tool off of CPAN, try Devel::Dwarn. It makes dumping to stderr even more fun than it was already :)
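A rough sketch of that kind of exploration, using only warn and the core Data::Dumper module. The helper describe is a hypothetical name made up for this example, and the commented calls assume that the MODI Images collection exposes a Count property, which I haven't verified.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: describe a value in one line without assuming its shape.
sub describe {
    my ($label, $value) = @_;
    return "$label: undef" unless defined $value;
    return "$label: " . (ref($value) || 'plain scalar');
}

# With the object from the question (Windows only), the calls might look like:
#   warn describe('Images', $miDoc->{Images}), "\n";
#   warn describe('Count',  $miDoc->{Images}{Count}), "\n";  # assumes a Count property
#   warn 'Last OLE error: ', Win32::OLE->LastError(), "\n";

# Stand-in data so the sketch runs anywhere:
warn describe('missing', undef), "\n";
warn describe('images', { Count => 0 }), "\n";
warn Dumper({ Images => [] });
```

Checking each step of the chain separately shows whether it is Images, Item(0), Layout, or Text that comes back undefined, rather than only seeing the warning at the final print.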