How to get the XPath from an XmlNode instance
Could someone supply some code that would get the xpath of a System.Xml.XmlNode instance?
Thanks!
Comments (14)
Okay, I couldn't resist having a go at it. It'll only work for attributes and elements, but hey... what can you expect in 15 minutes :) Likewise there may very well be a cleaner way of doing it.
It is superfluous to include the index on every element (particularly the root one!) but it's easier than trying to work out whether there's any ambiguity otherwise.
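The code that accompanied this answer is not preserved here. As a rough sketch of the approach described (elements and attributes only, with an index on every element step), something like the following might work. The names `XPathFinder`, `FindXPath` and `FindElementIndex` are illustrative, and the sketch assumes a document without namespaces, since prefixed or default-namespace names would need a namespace manager when the path is evaluated.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XPathFinder
{
    // Builds a path such as /root[1]/item[2]/@id by walking up from the node.
    public static string FindXPath(XmlNode node)
    {
        var builder = new StringBuilder();
        while (node != null)
        {
            switch (node.NodeType)
            {
                case XmlNodeType.Attribute:
                    // Attributes are addressed by name and hang off OwnerElement.
                    builder.Insert(0, "/@" + node.Name);
                    node = ((XmlAttribute)node).OwnerElement;
                    break;
                case XmlNodeType.Element:
                    int index = FindElementIndex((XmlElement)node);
                    builder.Insert(0, "/" + node.Name + "[" + index + "]");
                    node = node.ParentNode;
                    break;
                case XmlNodeType.Document:
                    return builder.ToString();
                default:
                    throw new ArgumentException("Only elements and attributes are supported.");
            }
        }
        throw new ArgumentException("Node was not in a document.");
    }

    // 1-based position of the element among siblings that share its name.
    static int FindElementIndex(XmlElement element)
    {
        XmlNode parent = element.ParentNode;
        if (parent == null) return 1; // detached element: nothing to count
        int index = 1;
        foreach (XmlNode candidate in parent.ChildNodes)
        {
            if (candidate is XmlElement && candidate.Name == element.Name)
            {
                if (candidate == element) return index;
                index++;
            }
        }
        throw new ArgumentException("Couldn't find element within parent.");
    }
}
```

Passing the result back to `XmlDocument.SelectSingleNode` should return the original node, which is a convenient sanity check.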
Jon's correct that there are any number of XPath expressions that will yield the same node in an instance document. The simplest way to build an expression that unambiguously yields a specific node is a chain of node tests that use the node position in the predicate, i.e. an expression of the form /node()[i]/node()[j]/node()[k], with one step per level of the tree.
Obviously, this expression isn't using element names, but then if all you're trying to do is locate a node within a document, you don't need its name. It also can't be used to find attributes (because attributes aren't child nodes of their element and don't have a position; you can only find them by name), but it will find all other node types.
To build this expression, you need to write a method that returns a node's position in its parent's child nodes, because XmlNode doesn't expose that as a property. (There's probably a more elegant way to do that using LINQ, since XmlNodeList implements IEnumerable, but I'm going with what I know here.) Then you can write a recursive method that builds the path from those positions.
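The original listings for these two methods are missing from this copy, so the following is only a reconstruction of the idea under stated assumptions. `GetNodePosition` and `GetXPathToNode` are illustrative names. One caveat: DOM child positions may not line up with XPath positions when the document has an XML declaration, a DOCTYPE, or adjacent text/CDATA siblings, so treat this as a sketch for documents without those.

```csharp
using System;
using System.Xml;

public static class NodePositionPath
{
    // Position of a node among its parent's ChildNodes, 1-based as XPath expects.
    static int GetNodePosition(XmlNode child)
    {
        XmlNodeList siblings = child.ParentNode.ChildNodes;
        for (int i = 0; i < siblings.Count; i++)
        {
            if (siblings[i] == child) return i + 1;
        }
        throw new InvalidOperationException("Node not found among its parent's children.");
    }

    public static string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            // Attributes have an OwnerElement rather than a ParentNode
            // and are matched by name, not by position.
            return GetXPathToNode(((XmlAttribute)node).OwnerElement) + "/@" + node.Name;
        }
        if (node.ParentNode == null)
        {
            // The document node (or a detached node) contributes no step.
            return "";
        }
        // Path to the parent, plus "/node()[n]" for this node's position.
        return GetXPathToNode(node.ParentNode) + "/node()[" + GetNodePosition(node) + "]";
    }
}
```

Because every step is `node()[n]`, the same code handles text nodes, comments and processing instructions without special cases.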
As you can see, I hacked in a way for it to find attributes as well.
Jon slipped in with his version while I was writing mine. There's something about his code that's going to make me rant a bit now, and I apologize in advance if it sounds like I'm ragging on Jon. (I'm not. I'm pretty sure that the list of things Jon has to learn from me is exceedingly short.) But I think the point I'm going to make is a pretty important one for anyone who works with XML to think about.
I suspect that Jon's solution emerged from something I see a lot of developers do: thinking of XML documents as trees of elements and attributes. I think this largely comes from developers whose primary use of XML is as a serialization format, because all the XML they're used to using is structured this way. You can spot these developers because they're using the terms "node" and "element" interchangeably. This leads them to come up with solutions that treat all other node types as special cases. (I was one of these guys myself for a very long time.)
This feels like it's a simplifying assumption while you're making it. But it's not. It makes problems harder and code more complex. It leads you to bypass the pieces of XML technology (like the node() function in XPath) that are specifically designed to treat all node types generically.
There's a red flag in Jon's code that would make me query it in a code review even if I didn't know what the requirements are, and that's GetElementsByTagName. Whenever I see that method in use, the question that leaps to mind is always "why does it have to be an element?" And the answer is very often "oh, does this code need to handle text nodes too?"
I know, old post but the version I liked the most (the one with names) was flawed:
When a parent node has nodes with different names, it stopped counting the index after it found the first non-matching node-name.
Here is my fixed version of it:
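The fixed listing is not preserved in this copy. A minimal sketch of index counting that avoids the described flaw, by skipping siblings with a different name instead of stopping at the first one, might look like this. `SiblingIndex` and `GetSameNameIndex` are hypothetical names, and this is not necessarily how the poster wrote it.

```csharp
using System.Xml;

public static class SiblingIndex
{
    // 1-based position of an element among preceding siblings with the same name.
    public static int GetSameNameIndex(XmlElement element)
    {
        int index = 1;
        for (XmlNode sibling = element.PreviousSibling;
             sibling != null;
             sibling = sibling.PreviousSibling)
        {
            // Keep walking past non-matching siblings; only matches are counted.
            if (sibling.NodeType == XmlNodeType.Element && sibling.Name == element.Name)
            {
                index++;
            }
        }
        return index;
    }
}
```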
Here's a simple method that I've used; it worked for me.
My 10p worth is a hybrid of Robert and Corey's answers. I can only claim credit for the actual typing of the extra lines of code.
There's no such thing as "the" xpath of a node. For any given node there may well be many xpath expressions which will match it.
You can probably work up the tree to build up an expression which will match it, taking into account the index of particular elements etc, but it's not going to be terribly nice code.
Why do you need this? There may be a better solution.
If you do this, you will get a path with the names of the nodes and their positions (useful when several nodes share the same name), like this:
"/Service[1]/System[1]/Group[1]/Folder[2]/File[2]"
I found that none of the above worked with XDocument, so I wrote my own code to support XDocument and used recursion. I think this code handles multiple identical nodes better than some of the other code here because it first tries to go as deep into the XML path as it can and then backs up to build only what is needed. So if you have /home/white/bob and /home/white/mike and you want to create /home/white/bob/garage, the code will know how to create that. However, I didn't want to mess with predicates or wildcards, so I explicitly disallowed those; but it would be easy to add support for them.
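The poster's code is not included in this copy. The sketch below illustrates the same goal with a simpler, iterative walk rather than the recursive "go deep, then back up" strategy described: it follows existing elements as far as they go and creates only the missing tail. `XPathBuilder` and `EnsurePath` are made-up names, and predicates, wildcards and attributes are rejected, as in the answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XPathBuilder
{
    // Returns the element at the given simple path, creating missing elements.
    public static XElement EnsurePath(XDocument doc, string path)
    {
        if (doc == null) throw new ArgumentNullException(nameof(doc));
        if (string.IsNullOrEmpty(path) || path[0] != '/')
            throw new ArgumentException("Path must be absolute, e.g. /a/b.", nameof(path));
        if (path.IndexOfAny(new[] { '[', ']', '*', '@' }) >= 0)
            throw new NotSupportedException("Predicates, wildcards and attributes are not supported.");

        string[] names = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        if (names.Length == 0)
            throw new ArgumentException("Path must name at least one element.", nameof(path));

        XContainer current = doc;
        foreach (string name in names)
        {
            // Reuse the first existing child with this name, if any.
            XElement next = current.Elements(name).FirstOrDefault();
            if (next == null)
            {
                // Note: adding a second root element to an XDocument throws.
                next = new XElement(name);
                current.Add(next);
            }
            current = next;
        }
        return (XElement)current;
    }
}
```

With /home/white/bob and /home/white/mike already present, asking for /home/white/bob/garage reuses the existing home, white and bob elements and adds only garage.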
What about using extension methods? ;)
My version (building on others' work) uses the syntax name[index], with the index omitted if the element has no siblings of the same name.
The loop that finds the element index lives in a separate routine (also an extension method).
Just paste the following into any utility class (or into the main Program class):
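The listing itself is missing from this copy. As a hedged sketch of what such extension methods could look like (the class and method names `XmlNodeExtensions`, `GetXPath` and `GetElementIndex` are assumptions, and namespaces are ignored):

```csharp
using System;
using System.Xml;

public static class XmlNodeExtensions
{
    // Builds a path like /root/a[2]; the index is omitted for unique names.
    public static string GetXPath(this XmlNode node)
    {
        if (node == null) throw new ArgumentNullException(nameof(node));
        if (node.NodeType == XmlNodeType.Document) return "";
        if (node.NodeType == XmlNodeType.Attribute)
            return ((XmlAttribute)node).OwnerElement.GetXPath() + "/@" + node.Name;
        if (node.NodeType != XmlNodeType.Element)
            throw new NotSupportedException("Only elements and attributes are supported.");

        int index = node.GetElementIndex();
        string step = index == 0 ? node.Name : node.Name + "[" + index + "]";
        string parentPath = node.ParentNode == null ? "" : node.ParentNode.GetXPath();
        return parentPath + "/" + step;
    }

    // Returns 0 when the element has no same-name siblings,
    // otherwise its 1-based position among them.
    public static int GetElementIndex(this XmlNode element)
    {
        XmlNode parent = element.ParentNode;
        if (parent == null) return 0;
        int position = 0, total = 0;
        foreach (XmlNode child in parent.ChildNodes)
        {
            if (child.NodeType != XmlNodeType.Element || child.Name != element.Name) continue;
            total++;
            if (child == element) position = total;
        }
        return total > 1 ? position : 0;
    }
}
```

Usage would then read naturally at the call site, e.g. `string path = someNode.GetXPath();`.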
I produced VBA for Excel to do this for a work project. It outputs tuples of an XPath and the associated text from an element or attribute. The purpose was to allow business analysts to identify and map some XML. I appreciate that this is a C# forum, but thought this may be of interest.
It manages the counting of elements using:
Another solution to your problem might be to 'mark' the xmlnodes which you will want to later identify with a custom attribute:
which you can store in a Dictionary for example.
And you can later identify the node with an xpath query:
I know this is not a direct answer to your question, but it can help if the reason you wish to know the xpath of a node is to have a way of 'reaching' the node later after you have lost the reference to it in code.
This also overcomes problems when the document gets elements added/moved, which can mess up the xpath (or indexes, as suggested in other answers).
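The snippets from this answer are not preserved, so here is a small sketch of the marking idea under my own assumptions: the attribute name `nodeId` and the helper names `NodeMarker`, `Mark` and `Find` are hypothetical, and only elements can be marked this way because only elements carry attributes.

```csharp
using System;
using System.Xml;

public static class NodeMarker
{
    // Hypothetical marker attribute; pick a name that cannot clash with real data.
    public const string MarkerAttribute = "nodeId";

    // Tags the element with a fresh GUID and returns it so the caller
    // can keep it, for example as a key in a Dictionary.
    public static string Mark(XmlElement element)
    {
        string id = Guid.NewGuid().ToString();
        element.SetAttribute(MarkerAttribute, id);
        return id;
    }

    // Finds the marked element again, wherever it now sits in the document.
    public static XmlElement Find(XmlDocument doc, string id)
    {
        // A GUID string contains no quotes, so it is safe inside the literal.
        string query = "//*[@" + MarkerAttribute + "='" + id + "']";
        return (XmlElement)doc.SelectSingleNode(query);
    }
}
```

The trade-off is that the document itself is modified, so the marker attributes may need to be stripped before the XML is saved or passed on.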
This is even easier
I had to do this recently. Only elements needed to be considered. This is what I came up with: