Which is faster to parse in XML: elements or attributes?

Posted 2024-10-04 00:23:27

I am writing code that parses XML.

I would like to know which is faster to parse: elements or attributes.

This will have a direct effect on my XML design.

Please focus your answers on C# and on the differences between LINQ to XML and XmlReader.

Thanks.

Comments (3)

桃酥萝莉 2024-10-11 00:23:28

Design your XML schema so that the representation of the information actually makes sense. Usually, the choice between an attribute and an element will not affect performance.

In most cases, performance problems with XML stem from large amounts of data represented in a very verbose XML dialect. A typical countermeasure is to compress (zip) the XML data when storing it or transmitting it over the wire.

If that is not sufficient, then switching to another format such as JSON, ASN.1, or a custom binary format might be the way to go.

Addressing the second part of your question: the main difference between XDocument (LINQ to XML) and XmlReader is that XDocument builds a full document object model in memory, which can be an expensive operation for large inputs, whereas XmlReader gives you a forward-only, tokenized stream over the input document.
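To make the XDocument versus XmlReader distinction concrete, here is a minimal sketch that reads the same ids from an attribute layout and an element layout with both APIs. The sample XML, the class name PersonIdReader, and the method names are assumptions made up for illustration; they do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class PersonIdReader
{
    // Hypothetical sample data: the same ids stored as attributes and as child elements.
    public const string AttributeForm =
        "<people><person id=\"123\" /><person id=\"456\" /></people>";
    public const string ElementForm =
        "<people><person><id>123</id></person><person><id>456</id></person></people>";

    // LINQ to XML: the whole document is materialized as an in-memory tree, then queried.
    public static int[] ReadWithXDocument(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("person")
                  // Use the id attribute if present, otherwise the id child element.
                  .Select(p => (int?)p.Attribute("id") ?? (int)p.Element("id"))
                  .ToArray();
    }

    // XmlReader: nodes are visited one at a time, forward-only; no tree is built.
    public static int[] ReadWithXmlReader(string xml)
    {
        var ids = new List<int>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element) continue;

                if (reader.Name == "person")
                {
                    // GetAttribute returns null when the attribute is absent.
                    string id = reader.GetAttribute("id");
                    if (id != null) ids.Add(XmlConvert.ToInt32(id));
                }
                else if (reader.Name == "id" && !reader.IsEmptyElement)
                {
                    // Assumes the element holds a single text node, as in the sample.
                    reader.Read();
                    ids.Add(XmlConvert.ToInt32(reader.Value));
                }
            }
        }
        return ids.ToArray();
    }
}
```

Calling either method on PersonIdReader.AttributeForm or PersonIdReader.ElementForm yields the same ids. The difference is in how they get there: the XDocument version holds the entire tree in memory while it runs, and the XmlReader version keeps only the current node.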

最舍不得你 2024-10-11 00:23:28

With XML, speed depends on a lot of factors.

With regard to attributes versus elements, pick the one that more closely matches the data. As a guideline, we use attributes for, well, attributes of an object, and elements for contained sub-objects.

Depending on the amount of data you are talking about, using attributes can save you a bit on the size of your XML streams. For example, <person id="123" /> is smaller than <person><id>123</id></person>. This doesn't really impact the parsing, but it will impact the speed of sending the data across a network or loading it from disk. If we are talking about thousands of such records, then it may make a difference to your application.
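The size difference is easy to check. The sketch below builds the two layouts from the example above for a given number of records and compares their UTF-8 byte counts; the class name XmlSizeComparison, the method names, and the people wrapper element are assumptions for illustration only.

```csharp
using System;
using System.Linq;
using System.Text;

public static class XmlSizeComparison
{
    // Builds records in attribute form: <person id="N" />
    public static string BuildAttributeForm(int count) =>
        "<people>"
        + string.Concat(Enumerable.Range(1, count).Select(i => $"<person id=\"{i}\" />"))
        + "</people>";

    // Builds the same records in element form: <person><id>N</id></person>
    public static string BuildElementForm(int count) =>
        "<people>"
        + string.Concat(Enumerable.Range(1, count).Select(i => $"<person><id>{i}</id></person>"))
        + "</people>";

    // Size of the document as it would be written in UTF-8, without compression.
    public static int Utf8Size(string xml) => Encoding.UTF8.GetByteCount(xml);
}
```

For this particular layout the element form costs 10 more bytes per record than the attribute form (26 bytes of markup versus 16), so 1000 records differ by 10000 bytes before compression. Compression will usually shrink that gap, since the repeated tag names compress well.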

Of course, if that actually does make a difference then using JSON or some binary representation is probably a better way to go.

The first question you need to ask is whether XML is even required. If it doesn't need to be human-readable, then binary is probably better. Heck, a CSV or even a fixed-width file might be better.

With regard to LINQ vs. XmlReader, this is going to boil down to what you do with the data as you are parsing it. Do you need to instantiate a bunch of objects and handle them that way, or do you just need to read the stream as it comes in? You might even find that doing basic string manipulation on the data is the easiest or best way to go.

The point is, you will probably need to examine the strengths of each approach beyond just "what parses faster".

狼亦尘 2024-10-11 00:23:28

I don't have any hard numbers to prove it, but I know that the WCF team at Microsoft chose to make DataContractSerializer their standard for WCF. It's limited in that it doesn't support XML attributes, but it is up to 10-15% faster than XmlSerializer.

From that information, I would assume that XML attributes are slower to parse than XML elements alone.
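A speed difference between two serializers does not isolate the cost of attributes versus elements, so it may be worth measuring directly on your own data. The following is a rough sketch of such a measurement with XmlReader and Stopwatch; the class name ParseTiming, the generated sample documents, and the iteration scheme are all assumptions, and a proper benchmark harness would give more trustworthy numbers.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Xml;

public static class ParseTiming
{
    // Generates a hypothetical document with `count` records in either layout.
    public static string BuildXml(int count, bool useAttributes) =>
        "<people>"
        + string.Concat(Enumerable.Range(1, count).Select(i =>
              useAttributes ? $"<person id=\"{i}\" />" : $"<person><id>{i}</id></person>"))
        + "</people>";

    // Walks the whole document and counts attribute values plus text nodes.
    public static int CountValues(string xml)
    {
        int values = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    values += reader.AttributeCount;
                else if (reader.NodeType == XmlNodeType.Text)
                    values++;
            }
        }
        return values;
    }

    // Times repeated parses of the same document.
    public static TimeSpan Measure(string xml, int iterations)
    {
        CountValues(xml); // warm-up pass so JIT compilation is not timed
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            CountValues(xml);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

A typical use would be to call ParseTiming.Measure on ParseTiming.BuildXml(100000, true) and on ParseTiming.BuildXml(100000, false) with the same iteration count and compare the two elapsed times. Results will vary with the machine, the runtime, and the shape of the real data.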
