How to eliminate tag names in an XML file using Perl
I have multiple XML files in a folder, so I wrote a script like this to combine them into one XML file:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::LibXML;
    use Carp;
    use File::Find;
    use File::Spec::Functions qw( canonpath );
    use XML::LibXML::Reader;
    use Digest::MD5 'md5';

    # Fall back to a default directory when no path is given.
    if ( @ARGV == 0 ) {
        push @ARGV, "c:/main/work";
        warn "Using default path $ARGV[0]\n  Usage: $0 path ...\n";
    }

    open( my $allxml, '>', "all_xml_contents.combined.xml" )
        or die "can't open output xml file for writing: $!\n";
    print $allxml '<?xml version="1.0" encoding="UTF-8"?>',
        "\n<Shiporder xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\n";

    my %shipto_md5;    # MD5 digests of the <data> elements already written

    # Visit every *_stc.xml file below the given paths.
    find(
        sub {
            return unless ( /(_stc\.xml)$/ and -f );
            extract_information();
            return;
        },
        @ARGV
    );
    print $allxml "</Shiporder>\n";
    close($allxml) or die "can't close output xml file: $!\n";

    # Copy every not-yet-seen <data> element of the current file to the output.
    sub extract_information {
        my $path = $_;
        if ( my $reader = XML::LibXML::Reader->new( location => $path ) ) {
            while ( $reader->nextElement('data') ) {
                my $elem = $reader->readOuterXml();
                my $md5  = md5($elem);
                print $allxml $elem unless ( $shipto_md5{$md5}++ );
            }
        }
        return;
    }
It prints the contents of all the XML files into one XML file like this:
all_xml.combined.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <student specification xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <student>
            <name>johan</name>
        </student>
        <student>
            <name>benny</name>
        </student>
        <student>
            <name>kent</name>
        </student>
    </student specification>
But one of the XML files contains one more node of information. I tried to extract it inside the while loop like this:
    $reader->nextElement( 'details' );
    my $information = $reader->readInnerXml();
But how can I add this information to the output file? Please help me with this problem.
Comments (3)
Three obvious points.
Would it be possible for you to switch to XML::Twig? It provides an excellent way of handling tags.
You probably need something like:
You will need to modify the student specification part to make it work for you. Sorry, I don't have much time; otherwise I would have written the complete code.
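For illustration, here is a minimal sketch of how the XML::Twig suggestion might look. It is not the code from this reply. It assumes that `<details>` is a sibling of the `<data>` elements rather than nested inside them, that the extracted content should simply be written before the closing root tag, and it parses in-memory strings so the example is self-contained. The `combine_xml` helper and the sample documents are hypothetical.

```perl
use strict;
use warnings;
use XML::Twig;

# Combine several XML documents (passed as strings) into one string.
# With files on disk, parsefile($path) would replace parse($xml).
sub combine_xml {
    my @sources = @_;
    my ( %seen, @data, @details );

    for my $xml (@sources) {
        # A fresh twig per document keeps the parses independent.
        my $twig = XML::Twig->new(
            twig_handlers => {
                # Keep each distinct <data> element, tags included.
                data => sub {
                    my ( undef, $elt ) = @_;
                    my $outer = $elt->sprint;
                    push @data, $outer unless $seen{$outer}++;
                },
                # Keep only the content of <details>, without its own tags.
                details => sub {
                    my ( undef, $elt ) = @_;
                    push @details, $elt->inner_xml;
                },
            },
        );
        $twig->parse($xml);
    }

    # Root element name taken from the script in the question.
    return join "\n",
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<Shiporder>',
        @data, @details,
        '</Shiporder>', '';
}

# Hypothetical sample input: one duplicated <data> and one <details> node.
my $combined = combine_xml(
    '<doc><data><name>johan</name></data></doc>',
    '<doc><data><name>johan</name></data>'
      . '<details><note>extra</note></details></doc>',
);
print $combined;
```

`sprint` keeps the element's own tags, while `inner_xml` drops them, which corresponds to the `readOuterXml` / `readInnerXml` pair used in the question.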
Here's some code that does it using DOMDocument().
Overall:
1) Create a parent document from a string or similar
2) Load each file, import, and append
3) Save the results.
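The three steps above can be sketched in Perl with the DOM interface of XML::LibXML. This swaps in XML::LibXML for the DOMDocument() code the reply refers to, so treat it as an illustration under assumptions rather than the reply's own code: the `combine_dom` helper and the sample strings are hypothetical, the root name `Shiporder` is taken from the question's script, and `<details>` is assumed to appear outside the `<data>` elements.

```perl
use strict;
use warnings;
use XML::LibXML;

# Build one combined DOM document from several XML strings.
# With files on disk, load_xml( location => $path ) would replace
# load_xml( string => $xml ).
sub combine_dom {
    my @sources = @_;

    # 1) Create the parent document from a string.
    my $combined = XML::LibXML->load_xml( string => '<Shiporder/>' );
    my $root     = $combined->documentElement;
    my %seen;    # serialized <data> elements already appended

    # 2) Load each source, import the wanted nodes, and append them.
    for my $xml (@sources) {
        my $doc = XML::LibXML->load_xml( string => $xml );

        for my $node ( $doc->findnodes('//data') ) {
            next if $seen{ $node->toString }++;
            $root->appendChild( $combined->importNode($node) );
        }

        # Copy only the children of <details>, dropping the tag itself.
        for my $child ( $doc->findnodes('//details/node()') ) {
            $root->appendChild( $combined->importNode($child) );
        }
    }
    return $combined;
}

# Hypothetical sample input: one duplicated <data> and one <details> node.
my $doc = combine_dom(
    '<doc><data><name>johan</name></data></doc>',
    '<doc><data><name>johan</name></data>'
      . '<details><note>extra</note></details></doc>',
);

# 3) Save the result (printed here; toFile($path, 1) would write a file).
print $doc->toString(1);
```

`importNode` makes a deep copy that belongs to the combined document, so the source documents are left unchanged.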
It's usually better in XML programming to use XML parser functions, rather than string manipulation.
Good luck.