How do I get all descendants in XPath/XQuery?

Posted 2024-11-17 12:53:39 · 211 characters · 2 views · 0 comments

I am trying to navigate a document to learn about its structure. The document is being served to me so I don't have access to the raw document, but I can exercise queries against the server. I believe it is schema-less. I am accessing the document through the CQ web application which is part of MarkLogic.

I would basically like to get a fully populated tree returned to me. This seems really easy, but has not proven to be. I've looked through the W3C and a couple other sites and nothing seems to work.

Thanks in advance,

Guido

Comments (3)

匿名的好友 2024-11-24 12:53:39

Maybe the document is too big to return - if you're using MarkLogic, maybe you're trying to query a "forest" of thousands or millions of subdocuments?

A good way to learn about the structure of a document without trying to return all of it would be to use successive XPath queries that give you the names of elements. E.g.

name(/*)

This will tell you the name of the outermost element. Then,

name(/*/*[1]) (: name of first child of outermost element :)
name(/*/*[2])

/*/text()[1]  (: content of first text node under outermost element :)

count(/*/*)   (: number of children of outermost element :)

name(/*/@*[1]) (: name of first attribute of outermost element (untested) :)

etc.

Since you can use XQuery, you could do a loop that prints out, say, all the above data for the first three elements at the top three levels of the document.
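The loop described above could look something like the following. This is only a rough sketch, not tested against MarkLogic: the URI "/myuri.xml" and the function name local:summarize are placeholders made up for illustration, and it assumes the document can be fetched with fn:doc. It prints the name, child count, and attribute count of up to the first three child elements at each of the top three levels.

```xquery
(: Hypothetical URI; substitute one that exists in your database. :)
declare variable $uri := "/myuri.xml";

(: Hypothetical helper: one summary line for $e, then recurse into
   at most its first three child elements, down to depth 3. :)
declare function local:summarize($e as element(), $depth as xs:integer)
  as xs:string*
{
  fn:concat(
    (: two spaces of indentation per level below the first :)
    fn:string-join(for $i in 1 to $depth - 1 return "  ", ""),
    fn:name($e),
    " (children: ", fn:count($e/*),
    ", attributes: ", fn:count($e/@*), ")"
  ),
  if ($depth lt 3)
  then
    for $c in $e/*[position() le 3]
    return local:summarize($c, $depth + 1)
  else ()
};

(: Start from the outermost element of the document. :)
for $root in fn:doc($uri)/*
return local:summarize($root, 1)
```

Raising the two limits (3 levels, 3 children) widens the view once you know the document is small enough to handle it.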

Alternatively, / may return nothing, because in XPath it means "the root node of the document containing the context node", and when you run XQuery against a database of XML documents there may not yet be a context node (caveat: I'm not very fluent in XQuery, so check your references). Instead, you may have to start your XPath expression with doc('...')/ (document() is the XSLT function; the XQuery equivalent is fn:doc). Hopefully you know the name of a document?

Also, this screenshot shows some potentially useful queries. I think.

や莫失莫忘 2024-11-24 12:53:39

@LarsH recommended a useful exploration strategy.

An alternative is to get the whole XML document, for example applying the XSLT identity transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

</xsl:stylesheet>

This transformation produces an XML document that in most cases is identical to the source XML document (any XML document) to which it is applied; with xsl:strip-space and indent="yes" as above, whitespace-only text nodes and indentation may differ.

Another way of seeing the exact XML document is to use a debugger and set a breakpoint at a place in the code where the XML document has already been received. Then use the debugger's visualization capabilities to get the "outerxml" or "innerxml" property of the XMLDocument object.

Of course, nothing prevents the server from returning different XML documents on different requests.

抱猫软卧 2024-11-24 12:53:39

Since you are using CQ, you can click the "explore" link (towards the upper left of the query pane). This will give you a list of documents in the database you have selected. You can then use the URI of one of the documents and do an fn:doc of it:

fn:doc("/myuri.xml")

That will return that one document. Then you can add XPath steps to navigate down it.
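To get at the descendants themselves, fn:doc("/myuri.xml")//* should select every element in the document, and //node() every node. If the full tree is too much to read, one option is to list each distinct element path once instead. The query below is an untested sketch along those lines; the URI is the same placeholder as above, and the variable name $uri is just for illustration.

```xquery
(: Hypothetical URI; substitute one from the "explore" list. :)
declare variable $uri := "/myuri.xml";

(: Build a path such as /a/b/c for every element in the document,
   keep each distinct path once, and return them in sorted order. :)
for $path in fn:distinct-values(
  for $e in fn:doc($uri)//*
  return fn:string-join(
    ("", for $a in $e/ancestor-or-self::* return fn:name($a)),
    "/")
)
order by $path
return $path
```

The result is a flat outline of the document's structure rather than its content, which may be easier to scan when the document is large.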
