Why is XmlNamespaceManager necessary?

Posted on 2024-12-01 17:00:11

I've come up kinda dry as to why -- at least in the .Net Framework -- it is necessary to use an XmlNamespaceManager in order to handle namespaces (or the rather clunky and verbose [local-name()=... XPath predicate/function/whatever) when performing XPath queries. I do understand why namespaces are necessary or at least beneficial, but why is it so complex?

In order to query a simple XML Document (no namespaces)...

<?xml version="1.0" encoding="ISO-8859-1"?>
<rootNode>
   <nodeName>Some Text Here</nodeName>
</rootNode>

...one can use something like doc.SelectSingleNode("//nodeName") (which would match <nodeName>Some Text Here</nodeName>)

Mystery #1: My first annoyance -- If I understand correctly -- is that merely adding a namespace reference to the parent/root tag (whether used as part of a child node tag or not) like so:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rootNode xmlns="http://example.com/xmlns/foo">
   <nodeName>Some Text Here</nodeName>
</rootNode>

...requires several extra lines of code to get the same result:

Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("ab", "http://example.com/xmlns/foo")
Dim desiredNode As XmlNode = doc.SelectSingleNode("//ab:nodeName", nsmgr)

...essentially dreaming up a non-existent prefix ("ab") to find a node that doesn't even use a prefix. How does this make sense? What is wrong (conceptually) with doc.SelectSingleNode("//nodeName")?

Mystery #2: So, say you've got an XML document that uses prefixes:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rootNode xmlns:cde="http://example.com/xmlns/foo" xmlns:feg="http://example.com/xmlns/bar">
   <cde:nodeName>Some Text Here</cde:nodeName>
   <feg:nodeName>Some Other Value</feg:nodeName>
   <feg:otherName>Yet Another Value</feg:otherName>
</rootNode>

... If I understand correctly, you would have to add both namespaces to the XmlNamespaceManager, in order to make a query for a single node...

Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("cde", "http://example.com/xmlns/foo")
nsmgr.AddNamespace("feg", "http://example.com/xmlns/bar")
Dim desiredNode As XmlNode = doc.SelectSingleNode("//feg:nodeName", nsmgr)

... Why, in this case, do I need (conceptually) a namespace manager?

******REDACTED into comments below****

Edit Added:
My revised and refined question is based upon the apparent redundancy of the XmlNamespaceManager in what I believe to be the majority of cases and the use of the namespace manager to specify a mapping of prefix to URI:

When the direct mapping of the namespace prefix ("cde") to the namespace URI ("http://example.com/xmlns/foo") is explicitly stated in the source document:

...<rootNode xmlns:cde="http://example.com/xmlns/foo"...

what is the conceptual need for a programmer to recreate that mapping before making a query?

-柠檬树下少年和吉他 2024-12-08 17:00:11

The basic point (as pointed out by Kev, above) is that the namespace URI is the important part of the namespace, not the namespace prefix. The prefix is an "arbitrary convenience".

As for why you need a namespace manager, rather than there being some magic that works it out using the document, I can think of two reasons.

Reason 1

If it were permitted to only add namespace declarations to the documentElement, as in your examples, it would indeed be trivial for selectSingleNode to just use whatever is defined.

However, you can define namespace prefixes on any element in a document, and namespace prefixes are not uniquely bound to any given namespace in a document. Consider the following example

<w xmlns:a="mynamespace">
  <a:x>
    <y xmlns:a="myOthernamespace">
      <z xmlns="mynamespace" />
      <b:z xmlns:b="mynamespace" />
      <z xmlns="myOthernamespace" />
      <b:z xmlns:b="myOthernamespace" />
    </y>
  </a:x>
</w>

In this example, what would you want //z, //a:z and //b:z to return? How, without some kind of external namespace manager, would you express that?
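To make Reason 1 concrete, here is a minimal C# sketch run against a well-formed variant of that document (the `<z>` elements are self-closed). The class name `PrefixRebindingDemo`, the method `CountZ`, and the query prefix `q` are made up for illustration. The point it tries to show is that the mapping supplied from outside the document, not the prefixes scattered through it, decides which `z` elements a query selects.

```csharp
using System;
using System.Xml;

public static class PrefixRebindingDemo
{
    // Prefix "a" is rebound on <y>, and the four <z> elements reach two
    // namespaces either through a default namespace or through prefix "b".
    const string Xml = @"<w xmlns:a=""mynamespace"">
  <a:x>
    <y xmlns:a=""myOthernamespace"">
      <z xmlns=""mynamespace"" />
      <b:z xmlns:b=""mynamespace"" />
      <z xmlns=""myOthernamespace"" />
      <b:z xmlns:b=""myOthernamespace"" />
    </y>
  </a:x>
</w>";

    // Counts the <z> elements in the given namespace URI.
    // An empty string means "no namespace", i.e. the plain //z query.
    public static int CountZ(string namespaceUri)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);

        if (namespaceUri.Length == 0)
            return doc.SelectNodes("//z").Count;

        // "q" is an arbitrary query-side prefix; only the URI matters.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("q", namespaceUri);
        return doc.SelectNodes("//q:z", nsmgr).Count;
    }
}
```

Under these assumptions, `CountZ("mynamespace")` and `CountZ("myOthernamespace")` each find two elements regardless of how the document spelled them, while `CountZ("")` finds none because every `z` sits in some namespace.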

Reason 2

It allows you to reuse the same XPath expression for any equivalent document, without needing to know anything about the namespace prefixes in use.

myXPathExpression = "//z:y"
doc1.selectSingleNode(myXPathExpression);
doc2.selectSingleNode(myXPathExpression);

doc1:

<x>
  <z:y xmlns:z="mynamespace" />
</x>

doc2:

<x xmlns="mynamespace">
  <y />
</x>

In order to achieve this latter goal without a namespace manager, you would have to inspect each document, building a custom XPath expression for each one.
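The reuse described in Reason 2 can be sketched in C# roughly as follows. The class `ReusableXPathDemo` and the method `PrefixOfMatch` are hypothetical names, and the two documents are well-formed versions of doc1 and doc2 above. One mapping of `z` to `mynamespace` is registered, and the same expression is run against both documents.

```csharp
using System;
using System.Xml;

public static class ReusableXPathDemo
{
    // doc1 uses prefix "z"; doc2 uses a default namespace and no prefix.
    public const string Doc1 = "<x><z:y xmlns:z=\"mynamespace\" /></x>";
    public const string Doc2 = "<x xmlns=\"mynamespace\"><y /></x>";

    // Runs the same "//z:y" query against any document and returns the prefix
    // that the matched element carries in that document, or null when nothing
    // matched.
    public static string PrefixOfMatch(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // The query-side mapping is independent of the document's own prefixes.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("z", "mynamespace");

        XmlNode node = doc.SelectSingleNode("//z:y", nsmgr);
        return node == null ? null : node.Prefix;
    }
}
```

Both documents produce a match. The matched element reports prefix `z` in doc1 and an empty prefix in doc2, which is a sign that the query is resolved by URI and not by the literal prefix text.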

携君以终年 2024-12-08 17:00:11

As far as I can tell, there is no good reason that you should need to manually define an XmlNamespaceManager to get at abc-prefixed nodes if you have a document like this:

<itemContainer xmlns:abc="http://abc.com" xmlns:def="http://def.com">
    <abc:nodeA>...</abc:nodeA>
    <def:nodeB>...</def:nodeB>
    <abc:nodeC>...</abc:nodeC>
</itemContainer>

Microsoft simply couldn't be bothered to write something to detect that xmlns:abc had already been specified in a parent node. I could be wrong, and if so, I'd welcome comments on this answer so I can update it.

However, this blog post seems to confirm my suspicion. It basically says that you need to manually define an XmlNamespaceManager and manually iterate through the xmlns: attributes, adding each one to the namespace manager. Dunno why Microsoft couldn't do this automatically.

Here's a method I created based on that blog post to automatically generate an XmlNamespaceManager based on the xmlns: attributes of a source XmlDocument:

/// <summary>
/// Creates an XmlNamespaceManager based on a source XmlDocument's name table, and prepopulates its namespaces with any 'xmlns:' attributes of the root node.
/// </summary>
/// <param name="sourceDocument">The source XML document to create the XmlNamespaceManager for.</param>
/// <returns>The created XmlNamespaceManager.</returns>
private XmlNamespaceManager createNsMgrForDocument(XmlDocument sourceDocument)
{
    XmlNamespaceManager nsMgr = new XmlNamespaceManager(sourceDocument.NameTable);

    foreach (XmlAttribute attr in sourceDocument.SelectSingleNode("/*").Attributes)
    {
        if (attr.Prefix == "xmlns")
        {
            nsMgr.AddNamespace(attr.LocalName, attr.Value);
        }
    }

    return nsMgr;
}

And I use it like so:

XPathNavigator xNav = xmlDoc.CreateNavigator();
// XPath names are case-sensitive, so this must be nodeC as in the document above.
XPathNodeIterator xIter = xNav.Select("//abc:nodeC", createNsMgrForDocument(xmlDoc));

自由范儿 2024-12-08 17:00:11

The reason is simple. There is no required connection between the prefixes you use in your XPath query and the prefixes declared in the XML document. For example, the following two XML documents are semantically equivalent:

<aaa:root xmlns:aaa="http://someplace.org">
 <aaa:element>text</aaa:element>
</aaa:root>

  <bbb:root xmlns:bbb="http://someplace.org">
     <bbb:element>text</bbb:element>
  </bbb:root>

The "ccc:root/ccc:element" query will match both instances provided there is a mapping in the namespace manager for that.

nsmgr.AddNamespace("ccc", "http://someplace.org")

The .NET implementation does not care about the literal prefixes used in the XML. It only requires that each prefix in the query is defined in the namespace manager and that the namespace URI it maps to matches the one in the document. This is what lets a query expression stay constant even when prefixes vary between the documents you consume, and it is the correct implementation for the general case.

写下不归期 2024-12-08 17:00:11

To answer point 1:

Setting a default namespace for an XML document still means that the nodes, even without a namespace prefix, i.e.:

<rootNode xmlns="http://someplace.org">
   <nodeName>Some Text Here</nodeName>
</rootNode>

are no longer in the "empty" namespace. You still need some way to reference these nodes using XPath, so you create a prefix to reference them, even if it is "made up".
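A minimal C# sketch of point 1, assuming the document shown above. The class `DefaultNamespaceDemo`, the method `Query`, and the prefix `ab` are illustrative names only. It compares the unprefixed query with a query that goes through a made-up prefix.

```csharp
using System;
using System.Xml;

public static class DefaultNamespaceDemo
{
    const string Xml =
        "<rootNode xmlns=\"http://someplace.org\">" +
        "<nodeName>Some Text Here</nodeName>" +
        "</rootNode>";

    // Returns the text of the first node matched by xpath, or null when the
    // query matches nothing. The namespace manager is passed only on request.
    public static string Query(string xpath, bool withManager)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);

        // "ab" never appears in the document; it exists only for the query.
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("ab", "http://someplace.org");

        XmlNode node = withManager
            ? doc.SelectSingleNode(xpath, nsmgr)
            : doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

`Query("//nodeName", false)` returns null, because in XPath 1.0 an unprefixed name test only matches elements in no namespace. `Query("//ab:nodeName", true)` returns "Some Text Here".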

To answer point 2:

<rootNode xmlns:cde="http://someplace.org" xmlns:feg="http://otherplace.net">
   <cde:nodeName>Some Text Here</cde:nodeName>
   <feg:nodeName>Some Other Value</feg:nodeName>
   <feg:otherName>Yet Another Value</feg:otherName>
</rootNode>

Internally in the instance document, the nodes that reside in a namespace are stored with their node name and their long namespace name. In W3C parlance this pair is called an expanded name.

For example <cde:nodeName> is essentially stored as <http://someplace.org:nodeName>. A namespace prefix is an arbitrary convenience for humans so that when we type out XML or have to read it we don't have to do this:

<rootNode>
   <http://someplace.org:nodeName>Some Text Here</http://someplace.org:nodeName>
   <http://otherplace.net:nodeName>Some Other Value</http://otherplace.net:nodeName>
   <http://otherplace.net:otherName>Yet Another Value</http://otherplace.net:otherName>
</rootNode>

When an XML document is searched, it's not searched by the friendly prefix. The search is done by namespace URI, so you have to tell XPath about your namespaces via a namespace table passed in using XmlNamespaceManager.
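The expanded-name idea can be observed directly on the DOM. The C# sketch below uses hypothetical names (`ExpandedNameDemo`, `ExpandedNames`, `FindByExpandedName`) and a trimmed copy of the document above. It prints each child as a URI plus local name pair, then selects by that pair with the `local-name()` and `namespace-uri()` XPath functions, which is the verbose alternative to a namespace manager mentioned in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ExpandedNameDemo
{
    const string Xml =
        "<rootNode xmlns:cde=\"http://someplace.org\" xmlns:feg=\"http://otherplace.net\">" +
        "<cde:nodeName>Some Text Here</cde:nodeName>" +
        "<feg:nodeName>Some Other Value</feg:nodeName>" +
        "</rootNode>";

    // Lists each child of the root as "{namespace URI}local name".
    // The prefix plays no part in this identity.
    public static string[] ExpandedNames()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);

        var names = new List<string>();
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
            names.Add("{" + child.NamespaceURI + "}" + child.LocalName);
        return names.ToArray();
    }

    // Selects by expanded name without any namespace manager.
    // Returns the element text, or null when nothing matches.
    public static string FindByExpandedName(string namespaceUri, string localName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);

        string xpath = "//*[local-name()='" + localName +
                       "' and namespace-uri()='" + namespaceUri + "']";
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

Both children share the local name `nodeName` and differ only in namespace URI, so `FindByExpandedName("http://otherplace.net", "nodeName")` picks out "Some Other Value".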

超可爱的懒熊 2024-12-08 17:00:11

This thread has helped me understand the issue of namespaces much more clearly. Thanks. When I saw Jez's code, I tried it because it looked like a better solution than the one I had programmed. I discovered some shortcomings with it, though. As written, it looks only in the root node (but namespaces can be declared anywhere), and it doesn't handle default namespaces. I tried to address these issues by modifying his code, but to no avail.

Here is my version of that function. It uses regular expressions to find the namespace mappings throughout the file; works with default namespaces, giving them the arbitrary prefix 'ns'; and handles multiple occurrences of the same namespace.

private XmlNamespaceManager CreateNamespaceManagerForDocument(XmlDocument document)
{
    var nsMgr = new XmlNamespaceManager(document.NameTable);

    // Find and remember each xmlns attribute, assigning the 'ns' prefix to default namespaces.
    var nameSpaces = new Dictionary<string, string>();
    foreach (Match match in new Regex(@"xmlns:?(.*?)=([\x22\x27])(.+?)\2").Matches(document.OuterXml))
        nameSpaces[match.Groups[1].Value + ":" + match.Groups[3].Value] = match.Groups[1].Value == "" ? "ns" : match.Groups[1].Value;

    // Go through the dictionary, and number non-unique prefixes before adding them to the namespace manager.
    var prefixCounts = new Dictionary<string, int>();
    foreach (var namespaceItem in nameSpaces)
    {
        var prefix = namespaceItem.Value;
        // Split on the first ':' only; the URI itself contains colons (e.g. "http://...").
        var namespaceURI = namespaceItem.Key.Split(new[] { ':' }, 2)[1];
        if (prefixCounts.ContainsKey(prefix)) 
            prefixCounts[prefix]++; 
        else 
            prefixCounts[prefix] = 0;
        nsMgr.AddNamespace(prefix + prefixCounts[prefix].ToString("#;;"), namespaceURI);
    }
    return nsMgr;
}
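As an alternative to scanning `OuterXml` with a regular expression, the declarations can be read from the DOM itself. The following is only a sketch under stated assumptions. The class `NamespaceCollector` and the method `Build` are hypothetical names, the `ns` prefix for default namespaces follows the convention in the function above, and the policy for clashing prefixes (append a number, keep one prefix per URI) is an arbitrary choice. It relies on `XmlDocument` reporting every `xmlns` declaration as an attribute in the reserved namespace `http://www.w3.org/2000/xmlns/`.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class NamespaceCollector
{
    // Reserved namespace that every xmlns / xmlns:* attribute belongs to.
    const string XmlnsUri = "http://www.w3.org/2000/xmlns/";

    public static XmlNamespaceManager Build(XmlDocument doc)
    {
        var byPrefix = new Dictionary<string, string>(); // prefix -> URI

        // Visit every element, not just the root.
        foreach (XmlNode element in doc.SelectNodes("//*"))
        {
            foreach (XmlAttribute attr in element.Attributes)
            {
                if (attr.NamespaceURI != XmlnsUri) continue;

                // xmlns:abc="..." has Prefix "xmlns" and LocalName "abc";
                // xmlns="..." has no prefix, so it gets the made-up "ns".
                string prefix = attr.Prefix == "xmlns" ? attr.LocalName : "ns";
                string uri = attr.Value;

                if (uri.Length == 0) continue;             // xmlns="" undeclares
                if (prefix == "xml") continue;             // reserved, cannot be added
                if (byPrefix.ContainsValue(uri)) continue; // URI already has a prefix

                // Number the prefix if it is already bound to another URI.
                string candidate = prefix;
                for (int i = 1; byPrefix.ContainsKey(candidate); i++)
                    candidate = prefix + i;
                byPrefix[candidate] = uri;
            }
        }

        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        foreach (KeyValuePair<string, string> pair in byPrefix)
            nsmgr.AddNamespace(pair.Key, pair.Value);
        return nsmgr;
    }
}
```

Because the prefixes are generated, a caller still has to look up which prefix was assigned to the URI it cares about (for example with `LookupPrefix`) before writing the query. That is arguably the same bookkeeping the other answers describe, just automated.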

左秋 2024-12-08 17:00:11

You need to register the URI/prefix pairs to the XmlNamespaceManager instance to let SelectSingleNode() know which particular "nodeName" node you're referring to - the one from "http://someplace.org" or the one from "http://otherplace.net".

Please note that the concrete prefix name doesn't matter when you're doing the XPath query. I believe this works too:

Dim nsmgr As New XmlNamespaceManager(doc.NameTable)
nsmgr.AddNamespace("any", "http://someplace.org")
nsmgr.AddNamespace("thing", "http://otherplace.net")
Dim desiredNode As XmlNode = doc.SelectSingleNode("//thing:nodeName", nsmgr)

SelectSingleNode() just needs a connection between the prefix from your XPath expression and the namespace URI.
