LibXML - How do I get the name of a tag?

Posted on 2024-11-29 07:51:54

I have the following:

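use strict;
use warnings;
use XML::LibXML;
use HTML::Entities;

my $string = '<entry><name>Bob</name><zip>90210</zip></entry>';

my $parser = XML::LibXML->new();
# Escape only '&' and "'" so the markup itself stays intact.
my $encodedXml = encode_entities($string, '&\'');

my $doc = $parser->parse_string($encodedXml);

# Select every text node and print its content.
foreach my $text ($doc->findnodes("//text()")) {
    print $text->to_literal, "\n";
}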

This prints out 'Bob' and '90210'.

How do I get the actual node names? I need a way to get the names of all the nodes in my XML tree, i.e. 'name' and 'zip'.

Comments (2)

似梦非梦 2024-12-06 07:51:54

Text nodes don't have names. Perhaps you want the name of the parent?

I think this will work:

for my $node ($doc->findnodes('//text()')) {
   print $node->parentNode()->nodeName(), ": ", $node->nodeValue(), "\n";
}

I would use this instead:

for my $node ($doc->findnodes('//*[text()]')) {
   print $node->nodeName(), ": ", $node->textContent(), "\n";
}

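Note: This latter version selects each element that has at least one text child and prints it once, with textContent() concatenating all the text beneath it (including text inside descendant elements). It is therefore not equivalent to the first version when an element has more than one text child. For your document, though, the two should produce the same output.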
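
If what you want is simply the name of every element in the tree, regardless of whether it has text, a minimal sketch could select all elements with '//*' and call nodeName() on each. The helper name element_names is made up for this illustration; findnodes(), nodeName() and parse_string() are the same XML::LibXML calls used above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the names of all element nodes
# in document order.
sub element_names {
    my ($xml) = @_;
    my $doc = XML::LibXML->new->parse_string($xml);
    # '//*' matches every element node, including the root.
    return map { $_->nodeName } $doc->findnodes('//*');
}

my $xml = '<entry><name>Bob</name><zip>90210</zip></entry>';
print "$_\n" for element_names($xml);
```

For the sample document this should list entry, name and zip. If the root element is not wanted, an XPath such as '/*//*' would restrict the match to its descendants.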

风蛊 2024-12-06 07:51:54

What your code does is select the text nodes, which exist as children of the nodes you are looking for. A text node is a separate entity, and it does not have a name. You need to navigate to the text node's parent and that node will contain the tag name.

Things get trickier with mixed-content nodes that contain both text and element nodes, such as

<p>Beginning of <i>sentence</i> and now the end</p>

In this case the structure is

<p>
 |
 +---text (Beginning of )
 |
 +---<i>
 |    |
 |    +---text (sentence)
 |
 +---text ( and now the end)
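
To see that structure for yourself, a rough sketch like the following walks the tree and prints one line per node. The describe subroutine and its indentation format are made up for this example; it relies on childNodes(), nodeType(), nodeName() and nodeValue() plus the node-type constants exported by the :libxml tag, and it ignores comments, attributes and other node types.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml);

# Hypothetical helper: describe a node and its descendants,
# returning one indented line per element or text node.
sub describe {
    my ($node, $depth) = @_;
    $depth //= 0;
    my $indent = '  ' x $depth;
    my @lines;
    if ($node->nodeType == XML_ELEMENT_NODE) {
        # Elements have a tag name and may have children.
        push @lines, $indent . '<' . $node->nodeName . '>';
        push @lines, describe($_, $depth + 1) for $node->childNodes;
    }
    elsif ($node->nodeType == XML_TEXT_NODE) {
        # Text nodes carry only a value, no tag name of their own.
        push @lines, $indent . 'text (' . $node->nodeValue . ')';
    }
    return @lines;
}

my $doc = XML::LibXML->new->parse_string(
    '<p>Beginning of <i>sentence</i> and now the end</p>');
print "$_\n" for describe($doc->documentElement);
```

The <p> element ends up with three children (text, <i>, text), which is why asking a text node for its parent gives 'p' twice and 'i' once in this case.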