Performance: XmlReader or LINQ to XML
I have a 150 MB XML file that is used as a database in my project. Currently I'm using XmlReader
to read content from it. I want to know whether it is better to use XmlReader
or LINQ to XML for this scenario.
Note that I'm searching for an item in this XML and displaying the search results, so a search can take a long time or just a moment.
Comments (4)
If you want performance, use XmlReader. It doesn't read the whole file and build a DOM tree in memory. Instead, it reads the file from disk and hands you each node as it encounters it.
With a quick Google search I found a performance comparison of XmlReader, LINQ to XML and XmlDocument.Load.
https://web.archive.org/web/20130517114458/http://www.nearinfinity.com/blogs/joe_ferner/performance_linq_to_sql_vs.html
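To illustrate the forward-only reading described above, here is a minimal sketch of a search with XmlReader. The question does not show the file's schema, so the `item` element and its `id` and `name` attributes are assumptions made up for this example, as is the helper name `FindItemName`.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlReaderSearch
{
    // Hypothetical helper: returns the "name" attribute of the first <item>
    // whose "id" attribute equals the given id, or null if there is no match.
    // The element and attribute names are assumptions, not from the question.
    public static string FindItemName(TextReader input, string id)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            // ReadToFollowing advances node by node until the next <item>
            // start tag; no tree of the document is built in memory.
            while (reader.ReadToFollowing("item"))
            {
                if (reader.GetAttribute("id") == id)
                {
                    // Stop as soon as the match is found.
                    return reader.GetAttribute("name");
                }
            }
        }
        return null;
    }

    public static void Main()
    {
        string xml = "<items><item id=\"1\" name=\"apple\"/><item id=\"2\" name=\"pear\"/></items>";
        Console.WriteLine(FindItemName(new StringReader(xml), "2"));
    }
}
```

For a file on disk, `XmlReader.Create` also accepts a path or a `Stream`. Because the loop returns on the first match, a hit near the start of the file is quick, while a miss still has to scan the whole file, which fits the "a long time or just a moment" behavior described in the question.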
Personally, I would look at using LINQ to XML with the streaming techniques outlined in the Microsoft help file:
http://msdn.microsoft.com/en-us/library/system.xml.linq.xstreamingelement.aspx#Y1392
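As a rough idea of what streaming with LINQ to XML can look like, here is a minimal sketch. It is not the benchmark code from this answer. It uses a common pattern that pairs an XmlReader with `XNode.ReadFrom`, so that only one element at a time is materialized as an `XElement`. The `item` element, its `id` and `name` attributes, and the helper name `StreamElements` are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class StreamingLinq
{
    // Hypothetical helper: lazily yields each element with the given name.
    // Only the element currently being yielded is held as an XElement.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node after it, so Read() must not be called here,
                    // otherwise an adjacent sibling element would be skipped.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    public static void Main()
    {
        string xml = "<items><item id=\"1\" name=\"apple\"/><item id=\"2\" name=\"pear\"/></items>";

        // An ordinary LINQ query over the stream; enumeration stops at the first match.
        string name = StreamElements(new StringReader(xml), "item")
            .Where(e => (string)e.Attribute("id") == "2")
            .Select(e => (string)e.Attribute("name"))
            .FirstOrDefault();

        Console.WriteLine(name);
    }
}
```

The trade-off is that the query syntax stays close to regular LINQ to XML while memory use is bounded by the size of one element rather than the whole document. How it compares with a hand-written XmlReader loop on a given file is something to measure rather than assume.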
Here's a quick benchmark test reading from a 200 MB XML file with a simple filter:
And here's the processing time and memory usage on my machine:
Write a few benchmark tests to establish exactly what the situation is for you, and take it from there... LINQ to XML introduces a lot of flexibility...
You definitely need to write your own simple benchmarks to find the answer, as it varies from case to case. The easiest way to do this is to use BenchmarkDotNet as a NuGet package in one of the following apps. BenchmarkDotNet lets you measure both the memory allocation and the timing of each method. It is FAR more advanced than any trivial solution using a Stopwatch, and all of the .NET teams use it to measure the code that goes into .NET!
In all cases, you MUST build and run the test program in Release/Optimized mode (and preferably from outside any debugger).
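As a starting point, a BenchmarkDotNet comparison might look like the sketch below. It assumes the BenchmarkDotNet NuGet package is referenced, and it generates a small synthetic file with a made-up `item`/`id`/`name` schema, so the class name, file name, and element counts are all placeholders to be replaced with your real data and queries.

```csharp
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // report allocations alongside timings
public class XmlSearchBenchmarks
{
    private const string TargetId = "49999"; // last item: worst case for a linear scan
    private string _path;

    [GlobalSetup]
    public void Setup()
    {
        // Synthetic test file with an assumed schema; swap in the real file.
        _path = Path.Combine(Path.GetTempPath(), "benchmark-items.xml");
        using (XmlWriter writer = XmlWriter.Create(_path))
        {
            writer.WriteStartElement("items");
            for (int i = 0; i < 50000; i++)
            {
                writer.WriteStartElement("item");
                writer.WriteAttributeString("id", i.ToString());
                writer.WriteAttributeString("name", "name" + i);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }
    }

    [GlobalCleanup]
    public void Cleanup() => File.Delete(_path);

    [Benchmark(Baseline = true)]
    public string WithXmlReader()
    {
        using (XmlReader reader = XmlReader.Create(_path))
        {
            while (reader.ReadToFollowing("item"))
            {
                if (reader.GetAttribute("id") == TargetId)
                    return reader.GetAttribute("name");
            }
        }
        return null;
    }

    [Benchmark]
    public string WithXDocument()
    {
        // Loads the whole document into memory before querying it.
        return XDocument.Load(_path)
            .Descendants("item")
            .Where(e => (string)e.Attribute("id") == TargetId)
            .Select(e => (string)e.Attribute("name"))
            .FirstOrDefault();
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<XmlSearchBenchmarks>();
}
```

Run it with `dotnet run -c Release`. The summary table lists mean time and allocated memory per method, with XmlReader as the baseline row.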
Please post back your findings.