Extracting raw data from Word tables using Perl
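I'm trying to extract data from multiple tables in a Word document. When I try to convert the data in the tables to text, I get an error. The ConvertToText method has two optional parameters (how to separate the data, and a boolean). Here is my current code: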
#!/usr/bin/perl
#OLEWord.pl
#Use strict and print warnings
use strict;use warnings;
#Using OLE + OLE constants for Variants and OLE enumeration for Enumerations
use Win32::OLE qw(in);
use Win32::OLE::Const 'Microsoft Word';
use Win32::OLE::Variant;
my $var1 = Win32::OLE::Variant->new(VT_BOOL, 'true');
$Win32::OLE::Warn = 3;
#set the file to be opened
my $file = 'C:\work\SCL_International Financial New Fund Setup Questionnaire V1.6.docx';
#Create a new instance of Win32::OLE for the Word application, die if could not open the application
my $MSWord = Win32::OLE->GetActiveObject('Excel.Application') or Win32::OLE->new('Word.Application','Quit');
#Set the screen to Visible, so that you can see what is going on
$MSWord->{'Visible'} = 1;
$MSWord->{'DisplayAlerts'} = 0; #Suppress alerts, such as 'Save As....'
#open the request file or die and print warning message
my $Doc = $MSWord->{'Documents'}->Open($file) or die "Could not open ", $file, " Error:", Win32::OLE->LastError();
#$MSWord->ActiveDocument->SaveAs({Filename => 'AlteredTest.docx',
#FileFormat => wdFormatDocument});
my $tables = $MSWord->ActiveDocument->{'Tables'};
for my $table (in $tables){
my $tableText = $table->ConverToText(wdSeparateByParagraphs,$var1);
print "Table: ", $tableText, "\n";
}
$MSWord->ActiveDocument->Close;
$MSWord->Quit;
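I'm getting this error:
Bareword "VT_BOOL" not allowed while "strict subs" in use at OLEWord.pl line 31
Bareword "true" not allowed while "strict subs" in use at OLEWord.pl line 31
Execution of OLEWord.pl aborted due to compilation errors.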
Comments (4)
When things like VT_BOOL are not defined as constants, Perl treats them as barewords. Others have already provided information on those.
The root cause of your problem is the missing constants that are exported by the Win32::OLE::Variant module. Add a use Win32::OLE::Variant; line to your script to remove the first error. The second one is a similar problem: true is not defined either. Replace it with 1, or define the constant yourself.
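As a minimal sketch of the second fix, assuming nothing beyond core Perl: the snippet below defines true with the constant pragma, which is one possible way to define the constant yourself, and then uses a string eval to show how strict rejects a bareword that nobody defined. The name undefined_word is made up for illustration.

```perl
use strict;
use warnings;

# One way to define the missing constant yourself; after this line the
# bareword "true" resolves to a constant and compiles under strict.
use constant true => 1;

my $flag = true;
print "flag is $flag\n";

# For comparison, compile a snippet that uses a bareword nobody defined.
# "undefined_word" is a made-up name; strict subs rejects it at compile
# time, so the string eval fails and the message ends up in $@.
my $ok = eval q{ my $x = undefined_word; 1 };
print "strict rejected it: $@" unless $ok;
```

The VT_BOOL case works the same way: once Win32::OLE::Variant is loaded with use, its VT_* constants are known at compile time and stop being barewords.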
Edit, on extracting the table text: your code has a typo in the method name, ConverToText (it should be ConvertToText). The method also returns a Range object, so you have to call its Text method to get the actual text.
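Putting both points together, a minimal sketch of the loop might look like this. It assumes Windows with Word installed. The path C:\work\example.docx is a placeholder, and the choice of wdSeparateByTabs and closing without saving are my assumptions rather than anything from your script. Note that ConvertToText rewrites the table inside the open document, which is why the sketch discards changes on close.

```perl
use strict;
use warnings;
use Win32::OLE qw(in);
use Win32::OLE::Const 'Microsoft Word';   # imports wdSeparateByTabs etc.

$Win32::OLE::Warn = 3;                    # die on any OLE error

# Placeholder path, not taken from the original script.
my $file = 'C:\work\example.docx';

# Start Word; 'Quit' is called automatically when the object is destroyed.
my $word = Win32::OLE->new('Word.Application', 'Quit')
    or die "Could not start Word: ", Win32::OLE->LastError();

my $doc = $word->Documents->Open($file)
    or die "Could not open $file: ", Win32::OLE->LastError();

my $tables = $doc->Tables;
for my $table (in $tables) {
    # ConvertToText returns a Range object, not a string.
    my $range = $table->ConvertToText(wdSeparateByTabs);
    print "Table: ", $range->Text, "\n";
}

# Discard the in-document conversion instead of saving it.
$doc->Close(wdDoNotSaveChanges);
```

If you would rather keep each cell on its own line, pass wdSeparateByParagraphs as in your original call.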
Not using strict will make the error go away. (But you should use strict for good code.)
Read about barewords so that you know what they are and can work out for yourself how to correct this error.
Here are some links for studying barewords:
1. perl.com
2. alumnus
Removing "use strict" will remove the "Bareword" errors.
Extract all the doc tables into a single xls file.