PHP xpath query on XML with default namespace binding

Posted on 2024-11-17 03:19:51

I have one solution to the subject problem, but it’s a hack and I’m wondering if there’s a better way to do this.

Below is a sample XML file and a PHP CLI script that executes an xpath query given as an argument. For this test case, the command line is:

./xpeg "//MainType[@ID=123]"

What seems most strange is this line, without which my approach doesn’t work:

$domdoc->loadXML($domdoc->saveXML($domdoc));

As far as I know, this simply re-parses the modified XML, and it seems to me that this shouldn’t be necessary.

Is there a better way to perform xpath queries on this XML in PHP?

XML (note the binding of the default namespace):

<?xml version="1.0" encoding="utf-8"?>
<MyRoot
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.example.com/data http://www.example.com/data/MyRoot.xsd"
 xmlns="http://www.example.com/data">
  <MainType ID="192" comment="Bob's site">
    <Price>$0.20</Price>
    <TheUrl><![CDATA[http://www.example.com/path1/]]></TheUrl>
    <Validated>N</Validated>
  </MainType>
  <MainType ID="123" comment="Test site">
    <Price>$99.95</Price>
    <TheUrl><![CDATA[http://www.example.com/path2]]></TheUrl>
    <Validated>N</Validated>
  </MainType>
  <MainType ID="922" comment="Health Insurance">
    <Price>$600.00</Price>
    <TheUrl><![CDATA[http://www.example.com/eg/xyz.php]]></TheUrl>
    <Validated>N</Validated>
  </MainType>
  <MainType ID="389" comment="Used Cars">
    <Price>$5000.00</Price>
    <TheUrl><![CDATA[http://www.example.com/tata.php]]></TheUrl>
    <Validated>N</Validated>
  </MainType>
</MyRoot>

PHP CLI Script:

#!/usr/bin/php-cli
<?php

$xml = file_get_contents("xpeg.xml");

$domdoc = new DOMDocument();
$domdoc->loadXML($xml);

// remove the default namespace binding
$e = $domdoc->documentElement;
$e->removeAttributeNS($e->getAttributeNode("xmlns")->nodeValue,"");

// hack hack, cough cough, hack hack
$domdoc->loadXML($domdoc->saveXML($domdoc));

$xpath = new DOMXpath($domdoc);

$str = trim($argv[1]);
$result = $xpath->query($str);
if ($result !== FALSE) {
  dump_dom_levels($result);
}
else {
  echo "error\n";
}

// The following function isn't really part of the
// question. It simply provides a concise summary of
// the result.
function dump_dom_levels($node, $level = 0) {
  $class = get_class($node);
  if ($class == "DOMNodeList") {
    echo "Level $level ($class): $node->length items\n";
    foreach ($node as $child_node) {
      dump_dom_levels($child_node, $level+1);
    }
  }
  else {
    $nChildren = 0;
    foreach ($node->childNodes as $child_node) {
      if ($child_node->hasChildNodes()) {
        $nChildren++;
      }
    }
    if ($nChildren) {
      echo "Level $level ($class): $nChildren children\n";
    }
    foreach ($node->childNodes as $child_node) {
      if ($child_node->hasChildNodes()) {
        dump_dom_levels($child_node, $level+1);
      }
    }
  }
}
?>

行雁书 2024-11-24 03:19:51

The solution is using the namespace, not getting rid of it.

$result = new DOMDocument();
$result->loadXML($xml);

$xpath = new DOMXpath($result);
$xpath->registerNamespace("x", trim($argv[2]));

$str = trim($argv[1]);
$result = $xpath->query($str);

And call it like this on the command line (note the x: in the XPath expression):

./xpeg "//x:MainType[@ID=123]" "http://www.example.com/data"

You can make this more shiny by:

- finding out the default namespace yourself (by looking at the namespace property of the document element)
- supporting more than one namespace on the command line and registering them all before $xpath->query()
- supporting arguments in the form of xyz=http://namespace.uri/ to create custom namespace prefixes
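
The refinements listed above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the helper name register_namespaces, the fallback prefix "x", and the trimmed inline XML are all assumptions made for the example. Note that the document element's namespaceURI equals the default namespace only when the root element itself is unprefixed, as it is in the question's XML.

```php
<?php
// Hypothetical helper (not from the original answer): registers namespaces
// on a DOMXPath object before any query is run.
// $bindings is a list of "prefix=uri" strings, e.g. taken from $argv.
// The fallback prefix "x" is an assumption for this sketch.
function register_namespaces(DOMXPath $xpath, DOMDocument $doc, array $bindings, string $defaultPrefix = "x"): void
{
    // Bind the document element's namespace (if it has one) to the fallback prefix.
    $defaultUri = $doc->documentElement->namespaceURI;
    if ($defaultUri !== null && $defaultUri !== "") {
        $xpath->registerNamespace($defaultPrefix, $defaultUri);
    }
    // Bind each explicit "prefix=uri" argument; malformed entries are skipped.
    foreach ($bindings as $binding) {
        $parts = explode("=", $binding, 2);
        if (count($parts) === 2 && $parts[0] !== "") {
            $xpath->registerNamespace($parts[0], $parts[1]);
        }
    }
}

// Trimmed-down stand-in for the question's xpeg.xml.
$xml = '<MyRoot xmlns="http://www.example.com/data">'
     . '<MainType ID="123"><Price>$99.95</Price></MainType>'
     . '</MyRoot>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
register_namespaces($xpath, $doc, ["d=http://www.example.com/data"]);

// Both the discovered prefix (x) and the custom prefix (d) select the node.
$count_x = $xpath->query("//x:MainType[@ID=123]")->length;
$count_d = $xpath->query("//d:MainType[@ID=123]")->length;
echo "$count_x $count_d\n"; // prints "1 1"
```

In a CLI script like the one in the question, the bindings array would presumably come from array_slice($argv, 2), so that the first argument stays the XPath expression.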

Bottom line is: In XPath you can't query //foo when you really mean //namespace:foo. These are fundamentally different and therefore select different nodes. The fact that XML can have a default namespace defined (and thus can drop explicit namespace usage in the document) does not mean you can drop namespace usage in XPath.
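
A small sketch of that difference, assuming a cut-down version of the question's XML (the inline document and the variable names are illustrative only):

```php
<?php
// Cut-down stand-in for the question's document: one element in the default namespace.
$xml = '<MyRoot xmlns="http://www.example.com/data">'
     . '<MainType ID="123"/>'
     . '</MyRoot>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace("x", "http://www.example.com/data");

// An unprefixed name test in XPath 1.0 means "MainType in no namespace".
$unprefixed = $xpath->query("//MainType")->length;
// The prefixed name test means "MainType in the namespace bound to x".
$prefixed = $xpath->query("//x:MainType")->length;

echo "unprefixed: $unprefixed, prefixed: $prefixed\n"; // prints "unprefixed: 0, prefixed: 1"
```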

芯好空 2024-11-24 03:19:51

Just out of curiosity, what happens if you remove this line?

$e->removeAttributeNS($e->getAttributeNode("xmlns")->nodeValue,"");

That strikes me as the most likely to cause the need for your hack. You're basically removing the xmlns="http://www.example.com/data" part and then re-building the DOMDocument. Have you considered simply using string functions to remove that namespace?

$pieces = explode('xmlns="', $xml);
$xml = $pieces[0] . substr($pieces[1], strpos($pieces[1], '"') + 1);

Then continue on your way? It might even end up being faster.

不美如何 2024-11-24 03:19:51

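Given the current state of the XPath language, I feel that the best answer is the one provided by Tomalek: associate a prefix with the default namespace and prefix all tag names. That's the solution I intend to use in my current application.

When that's not possible or practical, a better solution than my hack is to call the method that does the same thing as re-parsing (hopefully more efficiently): DOMDocument::normalizeDocument(). This method acts "as if you saved and then loaded the document, putting the document in a 'normal' form."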

菩提树下叶撕阳。 2024-11-24 03:19:51

Also as a variant you may use a xpath mask:

//*[local-name(.) = 'MainType'][@ID='123']
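
For illustration, this expression can be run without registering any namespace at all. The sketch below assumes a shortened inline copy of the question's XML. Keep in mind that local-name() ignores the namespace entirely, so it would also match a MainType element from any other namespace.

```php
<?php
// Shortened stand-in for the question's xpeg.xml.
$xml = '<MyRoot xmlns="http://www.example.com/data">'
     . '<MainType ID="192" comment="Bob\'s site"/>'
     . '<MainType ID="123" comment="Test site"/>'
     . '</MyRoot>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// No registerNamespace() call: the predicate compares local names only.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query("//*[local-name(.) = 'MainType'][@ID='123']");

$matches = $nodes->length;
$comment = $nodes->item(0)->getAttribute("comment");
echo "$matches match: $comment\n"; // prints "1 match: Test site"
```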
