When is it a good thing to use a DOM tree for XML parsing?

Posted on 2024-12-16 13:13:59

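I'm confused by the idea of looking at XML through DOM glasses.

Probably like most people, my first approach to XML was to build a parser with a DOM interface. Years later, however, I'm no longer sure why I should use a DOM-based parser.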

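I usually come across the following scenarios:

1. I only have to read an XML file. I can only think of XPath and XQuery.
2. I only have to write. Any template engine of the latest generation can do it better.
3. I have to read and write XML. This can be:
3.1 A symptom of pure laziness: not wanting to learn SQL and tools like SQLite.
3.2 A case where the application data is modelled as a tree. XML databases appeared for this reason.
3.3 A case where the structure must be transformed. Here XSLT rules.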

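Some ORMs can also help with points 2 and 3. For point 3, I have seen many home-made implementations at different levels of maturity:

1. Read and dump into memory. Read objects and dump them back to XML. Both as independent processes.
2. Read, shape your tree, and provide a self-supporting interface to your objects. Keep pointers to DOM nodes and reuse the parsed tree to write back modifications.
3. Like level 2, but written by people who know that some good patterns exist, such as active record.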

But even at level 3, I'm not sure why DOM must be used.

In which cases is implementing a DOM-based parser the best approach to the problem?


淡淡的优雅 2024-12-23 13:13:59

You'll want a document object model if:

1. Access to any part of the model at any moment should be possible.
2. The actual markup structure should be retained, rather than some abstraction of it.

If you just need some values from the XML, you'd use XPath or XQuery. If you have strict input-to-output processing requirements you'd find XSLT to be more suitable. If you just need to parse the XML once and act on its contents then SAX would be a better choice. If the XML represents structured data and the data itself is what you're after, some abstraction (like what JAXB offers for Java) is easier to handle.

But sometimes, what we're really interested in is the actual markup + content and we must have a full model in memory. Like in a browser. Without a DOM representation of (X)HTML pages, JavaScript manipulation would be much harder and probably not nearly as performant.

Also keep in mind that DOM was often used in cases where my first point above applied, back before there were enough tools to provide an abstraction layer. In Java, people tended to use DOM or similar solutions like JDOM and DOM4J frequently, which would sometimes be problematic for large documents. But with the advent and maturation of JAXB, the preferred method for dealing with data encapsulated in XML has become turning it into a much more natural Java representation.

So some use of DOM is actually legacy stuff, but there are legitimate uses. In my personal opinion, it's usually the last thing you would (and should) look at for XML processing, what with the huge amount of specific tools for any conceivable task out there.

EDIT: in response to the further inquiry. Do keep in mind that while I'll try to be as accurate as possible in my explanations, I'm not a specialist in some of these things, nor have I read all relevant specifications. So I'll try to keep assumptions to a minimum.

Let's conjure up some scenario for my first point. I've got an XML document that maintains a large amount of settings relevant to some application. The application will at various points wish to retrieve some of these. This could be at any time and the required setting could be at any place in the XML document. How do I get them from the XML?

XPath could definitely be the answer, as you stated in your comment. But the question is, what are we going to evaluate the XPath expression on?

One thing you could do is embed the XPath expression in an XSLT stylesheet and run your XML through that. Or if the XPath API that's available offers the possibility, just send a SAX stream through there. The problem is that either way we're parsing all or at least part of the XML document each and every time. Suppose you needed 100 pieces of data from the XML and 30 of those are located near the end of the document. That's 30 times you're sending the entire document through a parser, 30 times every piece of markup must be properly interpreted. In a bad case, the XML document is a file on disk and you're trashing your performance with excessive disk access.

Now imagine we used DOM instead. The downside is, the entire XML document resides in memory and eats up some RAM. The upside is... the entire XML document resides in memory! And in a structured way that doesn't require markup to be properly parsed any more, since we've got "element" and "attribute" objects. Everything's been chewed into edible chunks. If we unleash our XPath on that a hundred times, that's 100 fast queries to memory. Particularly if the DOM implementation is smart enough to do some indexing regarding element and attribute names and values. Think of the difference like this: having memorized a book's structure and being capable of quickly leafing to the right chapter for looking something up, versus having to start reading the book from the beginning every time simply because you want some piece of info that turns out to be in chapter 10.
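A rough sketch of the parse-once approach, again in Python. I'm assuming `xml.etree.ElementTree` here, which is not the W3C DOM API but plays the same role of an in-memory tree, and which supports only a subset of XPath. The document and the `lookup` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings document.
SETTINGS_XML = """<settings>
  <group name="ui">
    <setting name="theme">dark</setting>
    <setting name="language">en</setting>
  </group>
  <group name="network">
    <setting name="timeout">30</setting>
  </group>
</settings>"""

# Parse once; from here on the tree lives in memory.
ROOT = ET.fromstring(SETTINGS_XML)


def lookup(name):
    """Query the already-parsed tree; no markup is read again."""
    node = ROOT.find(f".//setting[@name='{name}']")
    return None if node is None else node.text


values = [lookup(n) for n in ("timeout", "theme", "language", "missing")]

# The "indexing" idea: walk the tree once and keep a name -> element map,
# so later lookups become plain dictionary access.
INDEX = {s.get("name"): s for s in ROOT.iter("setting")}
indexed_timeout = INDEX["timeout"].text
```

However many times `lookup` runs, the parser ran exactly once, at the cost of keeping the whole tree in RAM.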

Then for the second point. Your question there is how that relates to the rest of my explanation. I think the best use-case to look at here is DOM in web browsers.

Suppose you wanted a dynamic web page that can run some JavaScript to let you alter it in the browser, without page reloads. All nice clean Web 2.0 stuff. The question is, how are we going to do that? Imagine we went back to the approach of simply having a large string in memory that represents the (X)HTML. In order to alter it, we'd have to parse through possibly the entire document, make the changes accordingly and then the browser would have to parse through the entire thing and re-render the page entirely, because it has no idea what actually changed. Maybe it can be smart and do some diff with the original document to only update relevant portions of its presentation. Either way, it's unwieldy.

But any modern browser presents a DOM interface to the underlying page document to JavaScript. Now, the browser isn't required to actually directly use this DOM for its rendering, but it should keep whatever's in there consistent with what you see on screen. If we think of this in the model-view-controller pattern that's prevalent for GUIs, then the DOM is the model, the browser's presentation is the view and now with JavaScript we actually also have a controller. If the controller (script) changes something in the underlying model (DOM), then the view (rendered page) gets updated accordingly. Since we're forced to go through the DOM, which is maintained by the browser, it'll notice whenever something gets changed. If the implementation is somewhat smart it'd have a listener that's notified whenever something goes through the DOM interface for an actual change.

You see, thanks to DOM, our scripts have direct access to any portion of the page, and since the model corresponds to the presentation, any changes can be quickly and efficiently propagated to the rendered stuff where it counts.

Now, the fact remains that DOM is general-purpose and models the actual markup. It doesn't really attribute meaning to it. We won't have some "hyperlink" objects to work with, we'll have "elements" with the name "a". And we won't have a "table" object with "tablerow" entries, we'll have elements with names "table" and in there elements with name "tr". Everything's some node like element, attribute or text without any meaning. It'd be nicer if there were any semantics involved instead of having us make sense of it. Like how JAXB does it in Java versus using DOM/JDOM/DOM4J or any similar solution.
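A small illustration of that difference, sketched in Python with the standard library's `xml.dom.minidom`. The `Hyperlink` class and `to_hyperlink` function are hypothetical; they stand in for the kind of typed binding that something like JAXB would generate for you, and the markup and URLs are made up.

```python
from dataclasses import dataclass
from xml.dom import minidom

# Made-up markup with placeholder URLs.
PAGE = (
    '<p>See <a href="https://example.com/docs">the docs</a> and '
    '<a href="https://example.com/faq">the FAQ</a>.</p>'
)

doc = minidom.parseString(PAGE)

# DOM view: generic Element nodes that merely happen to be named "a".
anchors = doc.getElementsByTagName("a")
generic = [(a.tagName, a.getAttribute("href")) for a in anchors]


@dataclass
class Hyperlink:
    """Hypothetical semantic object; the DOM itself offers nothing like it."""
    target: str
    label: str


def to_hyperlink(element):
    # We have to supply the meaning ourselves: href is the target,
    # the text children are the label.
    label = "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == minidom.Node.TEXT_NODE
    )
    return Hyperlink(target=element.getAttribute("href"), label=label)


links = [to_hyperlink(a) for a in anchors]
```

The DOM half would work unchanged for any element name; the `Hyperlink` half only makes sense for `a`, which is exactly the trade-off between generality and semantics.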

But that's also its strength. Its generality lets us approach it in a uniform manner. It lets us apply stylesheets and write queries based on element names. It lets us introduce new elements in new versions of HTML without having to change DOM, since it's already got everything that's needed to represent those elements. DOM has been standardized by the W3C so we don't lock ourselves into browser-specific representations of documents. And the browser doesn't have to deal with yet another layer of abstraction and indirection that could harm performance.

These days, such abstraction is usually left to frameworks like jQuery which take some of the low-level details out of our hands. But at the basis it's still DOM that lets us do such stuff. It was essential in making dynamic HTML a reality, even before "ajax" became a word. If we wanted to abstract things further, we'd be looking at a new browser war regarding who gets to decide what model is best. We'd slow down HTML development (even more) because new stuff would need to be incorporated in the model, while DOM just models markup and thus has everything that's needed. <canvas> may be a big new thing that HTML5 introduced which browsers now need to add support for, but for DOM it's just another element which happens to be called "canvas". Our JavaScript still works, so does our CSS.

So you see, DOM still has its place and its uses. But where once people were often quick to resort to it for any XML-related task, the available tools have become so numerous and good that these days you'll be using it as a last resort. But it's not on the way out yet. Maybe in the Java landscape, but not on lower levels.

Finally, regarding XQuery and FLWOR expressions... I've never used XQuery, only XPath. So I can't really draw any decent conclusions there without guessing.

Long rant, but in the evening I tend to get into this rambling flow of consciousness mode. What's good about Stack Overflow is that now I can actually use it to explain something to people who aren't imaginary. Beats pacing up and down the office while talking to walls.

独留℉清风醉 2024-12-23 13:13:59

I rarely decide that DOM is what I need. But it does happen from time to time.

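I spent some time working on web testing frameworks. Essentially, these slurp in some (X)HTML and then let you make assertions about it. The natural way to do this is to parse the (X)HTML into a DOM tree and then examine that. A lot of the examination is done with XPath, but the XPath is run against the DOM. There's also a certain amount of direct manipulation: for example, to check that some images on a web page were in the right order, I'd write an XPath expression to select the image elements, then loop over them, looking at the src and alt attributes of each one.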
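The image-order check might look roughly like this in Python. This is only a sketch: it assumes `xml.etree.ElementTree` and its limited XPath support rather than a full DOM, and the page, the `gallery` id and the file names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented page; the XHTML namespace is left out to keep the paths short.
XHTML = """<html><body>
  <div id="gallery">
    <img src="one.png" alt="First"/>
    <img src="two.png" alt="Second"/>
    <img src="three.png" alt="Third"/>
  </div>
</body></html>"""

root = ET.fromstring(XHTML)

# Select the image elements with a path expression (returned in document order)...
images = root.findall(".//div[@id='gallery']/img")

# ...then loop over them, looking at the src and alt attributes of each one.
actual = [(img.get("src"), img.get("alt")) for img in images]

expected = [
    ("one.png", "First"),
    ("two.png", "Second"),
    ("three.png", "Third"),
]
order_is_correct = actual == expected
```

The path expression does the selecting, but the comparison itself is ordinary code walking over nodes of the in-memory tree.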

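I once wrote a JAX-WS message handler to fiddle with the namespaces on a message before sending it to an uncooperative web service (it didn't like default namespaces; I had to rewrite them all as prefixed namespaces). The simplest way to do that was to get hold of the message as a DOM tree, then walk over it, manipulating the namespace on each element as necessary.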
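The original handler was Java and JAX-WS, which I won't try to reproduce. As a rough Python analogue of the same tree walk, here is a sketch using `xml.dom.minidom`. The namespace URI, the `ord` prefix, the message and the helper names are all assumptions made up for the example.

```python
from xml.dom import minidom

NS = "urn:example:orders"  # hypothetical namespace URI
PREFIX = "ord"             # hypothetical prefix to use instead of the default

MESSAGE = (
    f'<order xmlns="{NS}">'
    '<item sku="A1">Widget</item><note>Rush</note>'
    "</order>"
)


def iter_elements(node):
    """Yield every element at or below node, in document order."""
    if node.nodeType == minidom.Node.ELEMENT_NODE:
        yield node
    for child in node.childNodes:
        yield from iter_elements(child)


def prefix_default_namespace(doc, namespace, prefix):
    """Rewrite elements in the default namespace to use an explicit prefix."""
    for element in list(iter_elements(doc)):
        if element.namespaceURI == namespace and not element.prefix:
            doc.renameNode(element, namespace, f"{prefix}:{element.localName}")
        # Drop the default-namespace declaration wherever it appears.
        if element.hasAttribute("xmlns"):
            element.removeAttribute("xmlns")
    # Declare the prefix once, on the root element.
    doc.documentElement.setAttribute(f"xmlns:{prefix}", namespace)
    return doc


doc = prefix_default_namespace(minidom.parseString(MESSAGE), NS, PREFIX)
result = doc.documentElement.toxml()
```

Each element stays in the same namespace; only the way that namespace is spelled in the markup changes, which is the sort of per-node fiddling a tree API makes straightforward.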

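I recently wrote a script to customise a JBoss configuration file (the web deployer's web.xml in particular, I think). The idea was that we'd read the file supplied with JBoss, apply some modifications, then write the result out. We wanted to express the modifications in a simple and portable way, so we wrote a Groovy script that manipulated the parsed XML. That wasn't quite DOM, but it was close. In the end, this proved very verbose and awkward, so we ended up using XPath and a command-line XML processor (XMLStarlet) instead.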

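The common factor in these cases is that I needed to read in some XML, operate on it in some moderately complex, data-dependent way, and then often write it out again. That's DOM's sweet spot.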
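A minimal sketch of that read, modify, write-back cycle in Python with `xml.dom.minidom`. The configuration snippet is only loosely modelled on a web.xml, and the 60-minute threshold, the `environment` parameter and the helper functions are invented for illustration.

```python
from xml.dom import minidom

# Illustrative configuration, not a real JBoss file.
CONFIG = (
    "<web-app>"
    "<session-config><session-timeout>30</session-timeout></session-config>"
    "</web-app>"
)


def set_text(element, text):
    """Replace all children of element with a single text node."""
    while element.firstChild is not None:
        element.removeChild(element.firstChild)
    element.appendChild(element.ownerDocument.createTextNode(text))


def add_child(parent, tag, text):
    """Append <tag>text</tag> to parent and return the new element."""
    child = parent.ownerDocument.createElement(tag)
    set_text(child, text)
    parent.appendChild(child)
    return child


def customise(xml_text):
    doc = minidom.parseString(xml_text)

    # Data-dependent change: only raise timeouts that are below 60.
    for timeout in doc.getElementsByTagName("session-timeout"):
        if int(timeout.firstChild.data) < 60:
            set_text(timeout, "60")

    # Structural change: add a parameter only if it is not there yet.
    names = [n.firstChild.data for n in doc.getElementsByTagName("param-name")]
    if "environment" not in names:
        param = doc.createElement("context-param")
        add_child(param, "param-name", "environment")
        add_child(param, "param-value", "staging")
        doc.documentElement.appendChild(param)

    # Write the modified tree back out as markup.
    return doc.documentElement.toxml()


result = customise(CONFIG)
```

Because the decisions depend on what is already in the document, running `customise` on its own output changes nothing further.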

You can also use XSLT for this, but frankly, I'd rather drill a hole in my head.

梦中楼上月下 2024-12-23 13:13:59

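I would add a qualification to the previous answers. There are applications where it makes sense to use a DOM-like navigational tree interface. But don't make the mistake of thinking that DOM is the only or the best API in this category. In the Java world, DOM is pretty horrible because of all its legacy (HTML, pre-namespaces) and all the baggage that has been added over the years. Other similar models such as JDOM and XOM are much cleaner and easier to use (and usually faster).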
