Performance: XmlReader or LINQ to XML

Posted 2024-08-30 16:33:29 | Word count: 187 | Views: 5 | Comments: 0

I have a 150 MB XML file that is used as a database in my project. Currently I'm using XmlReader to read content from it. I want to know whether XmlReader or LINQ to XML is the better choice for this scenario.

Note that I'm searching for an item in this XML and displaying the search results, so a search can take a long time or just a moment.

Comments (4)

冷夜 2024-09-06 16:33:29

If you want performance, use XmlReader. It doesn't read the whole file and build a DOM tree in memory. Instead, it reads the file from disk and hands you each node as it encounters it.
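To illustrate the forward-only style described above, here is a minimal sketch of a search with XmlReader. The element name `item`, the attributes `id` and `name`, and the helper `FindNames` are assumptions made up for this example, not the asker's actual schema.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlSearch
{
    // Hypothetical helper: yields the "name" attribute of every <item>
    // whose "id" attribute equals the requested id. The reader only moves
    // forward, so memory use stays small regardless of file size.
    public static IEnumerable<string> FindNames(TextReader input, string id)
    {
        using (var reader = XmlReader.Create(input))
        {
            // ReadToFollowing advances to the next element with this name
            // and returns false once the end of the document is reached.
            while (reader.ReadToFollowing("item"))
            {
                if (reader.GetAttribute("id") == id)
                    yield return reader.GetAttribute("name");
            }
        }
    }

    public static void Main()
    {
        // Small in-memory document standing in for the 150 MB file.
        const string xml =
            "<items>" +
            "<item id=\"1\" name=\"alpha\"/>" +
            "<item id=\"2\" name=\"beta\"/>" +
            "<item id=\"1\" name=\"gamma\"/>" +
            "</items>";

        foreach (var name in FindNames(new StringReader(xml), "1"))
            Console.WriteLine(name); // prints "alpha", then "gamma"
    }
}
```

Because the method is an iterator, a caller that only needs the first match can stop enumerating early, and the rest of the file is never read.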

With a quick Google search I found a performance comparison of XmlReader, LINQ to XML and XmlDocument.Load.

https://web.archive.org/web/20130517114458/http://www.nearinfinity.com/blogs/joe_ferner/performance_linq_to_sql_vs.html

旧情别恋 2024-09-06 16:33:29

I would personally look at using LINQ to XML with the streaming techniques outlined in the Microsoft help file:
http://msdn.microsoft.com/en-us/library/system.xml.linq.xstreamingelement.aspx#Y1392

Here's a quick benchmark that reads from a 200 MB XML file with a simple filter:

var xmlFilename = "test.xml";

//create test xml file
var initMemoryUsage = GC.GetTotalMemory(true);
var timer = System.Diagnostics.Stopwatch.StartNew();
var rand = new Random();
var testDoc = new XStreamingElement("root", //in order to stream xml output XStreamingElement needs to be used for all parent elements of collection so no XDocument
    Enumerable.Range(1, 10000000).Select(idx => new XElement("child", new XAttribute("id", rand.Next(0, 1000))))
);
testDoc.Save(xmlFilename);
var outStat = String.Format("{0:f2} sec {1:n0} kb //linq to xml output streamed", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//linq to xml not streamed
initMemoryUsage = GC.GetTotalMemory(true);
timer.Restart();
var col1 = XDocument.Load(xmlFilename).Root.Elements("child").Where(e => (int)e.Attribute("id") < 10).Select(e => (int)e.Attribute("id")).ToArray();
var stat1 = String.Format("{0:f2} sec {1:n0} kb //linq to xml input not streamed", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//xmlreader
initMemoryUsage = GC.GetTotalMemory(true);
timer.Restart();
var col2 = new List<int>();
using (var reader = new XmlTextReader(xmlFilename))
{
    while (reader.ReadToFollowing("child"))
    {
        reader.MoveToAttribute("id");
        int value = Convert.ToInt32(reader.Value);
        if (value < 10)
            col2.Add(value); // was res2, which is never declared; the list is col2
    }
}
var stat2 = String.Format("{0:f2} sec {1:n0} kb //xmlreader", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//linq to xml streamed
initMemoryUsage = GC.GetTotalMemory(true);
timer.Restart();
var col3 = StreamElements(xmlFilename, "child").Where(e => (int)e.Attribute("id") < 10).Select(e => (int)e.Attribute("id")).ToArray();
var stat3 = String.Format("{0:f2} sec {1:n0} kb //linq to xml input streamed", timer.Elapsed.TotalSeconds, (GC.GetTotalMemory(false) - initMemoryUsage) / 1024);

//util method
public static IEnumerable<XElement> StreamElements(string filename, string elementName)
{
    using (var reader = XmlReader.Create(filename)) // Create is a static method defined on XmlReader
    {
        while (reader.Name == elementName || reader.ReadToFollowing(elementName))
            yield return (XElement)XElement.ReadFrom(reader);
    }
}

And here's the processing time and memory usage on my machine:

11.49 sec 225 kb      // linq to xml output streamed

17.36 sec 782,312 kb  // linq to xml input not streamed
6.52 sec 1,825 kb     // xmlreader
11.74 sec 2,238 kb    // linq to xml input streamed

梦纸 2024-09-06 16:33:29

Write a few benchmark tests to establish exactly what the situation is for you, and take it from there... Linq2XML introduces a lot of flexibility...

落墨 2024-09-06 16:33:29

You definitely need to write your own simple benchmarks to find the answer, as it varies from case to case. The easiest way to do this is to use BenchmarkDotNet as a NuGet package in one of the following apps. BenchmarkDotNet lets you measure both the memory allocation and the timing of each method. It is far more advanced than any trivial solution using a Stopwatch, and the .NET teams use it to measure the code that goes into .NET!

  • A simple Console app
  • LINQPad (my favorite). There is a free version and a reasonably priced Premium version!
  • DotNetFiddle

In all cases, you MUST build and run the test program in Release/Optimized mode (and preferably from outside any debugger).
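As a rough sketch of what such a benchmark could look like in a console app, the class below compares a forward-only XmlReader scan with a full XDocument.Load. It assumes the BenchmarkDotNet NuGet package is referenced; the class name `XmlSearchBenchmarks`, the temp file name, and the generated test data (100,000 `child` elements with an `id` attribute) are made up for illustration and should be replaced with your real file and query.

```csharp
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // report allocations alongside timings
public class XmlSearchBenchmarks
{
    private string _path;

    [GlobalSetup]
    public void Setup()
    {
        // Hypothetical test data: <child id="0..999"/> repeated 100,000 times.
        _path = Path.Combine(Path.GetTempPath(), "xml-benchmark.xml");
        new XElement("root",
            Enumerable.Range(0, 100_000)
                .Select(i => new XElement("child", new XAttribute("id", i % 1000)))
        ).Save(_path);
    }

    [Benchmark(Baseline = true)]
    public int WithXmlReader()
    {
        // Forward-only scan: count elements whose id is below 10.
        int hits = 0;
        using (var reader = XmlReader.Create(_path))
        {
            while (reader.ReadToFollowing("child"))
            {
                if (XmlConvert.ToInt32(reader.GetAttribute("id")) < 10)
                    hits++;
            }
        }
        return hits;
    }

    [Benchmark]
    public int WithXDocument()
    {
        // Loads the whole tree into memory before filtering.
        return XDocument.Load(_path).Root
            .Elements("child")
            .Count(e => (int)e.Attribute("id") < 10);
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<XmlSearchBenchmarks>();
}
```

Run it with `dotnet run -c Release`; BenchmarkDotNet prints a summary table with mean time and allocated memory per method.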

Please post back your findings.
