Finding specific attributes in a large XML document
I have a large XML document of around 100 MB. I need to find the attributes of two tags in this document. I can do this with code similar to the following:
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.Load("C:\\myxml.xml");
XmlNode node1 = xmlDocument.SelectSingleNode("/data/objects[@type='data type 1']");
if (null != node1)
{
    // Read the Version attribute; node1["Version"] would look up a child element instead.
    result = node1.Attributes["Version"].Value;
}
But doing so loads the entire XML document into memory, which seems to take around 200 MB. Is there any way I can make this more efficient?
Edit: Lots of nice answers using XmlTextReader, which I have now rewritten my code to use. (It will be more memory efficient, but ugly :).
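For performance, a SAX-style streaming parser is much better than DOM here, since you actually need only one value. The closest counterpart to SAX in the .NET Framework is XmlTextReader (strictly a pull-model reader rather than a push-model SAX parser).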
You should try using an XmlReader.
From MSDN:
Like the SAX reader, the XmlReader is a forward-only, read-only cursor. It provides fast, non-cached stream access to the input. It can read a stream or a document. It allows the user to pull data, and skip records of no interest to the application. The big difference lies in the fact that the SAX model is a "push" model, where the parser pushes events to the application, notifying the application every time a new node has been read, while applications using XmlReader can pull nodes from the reader at will.
An example here.
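To illustrate the pull model described in the MSDN excerpt, here is a minimal sketch in which the application decides when to advance the reader and which subtrees to skip. The class and method names (`PullModelDemo`, `TopLevelChildNames`) are hypothetical and chosen only for illustration; only `XmlReader.Create`, `MoveToContent`, `Read`, and `Skip` come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class PullModelDemo
{
    // Collects the names of the root element's direct children.
    // Each child's subtree is skipped rather than visited node by node.
    public static List<string> TopLevelChildNames(TextReader input)
    {
        var names = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();              // position on the root element
            if (reader.IsEmptyElement)
            {
                return names;                    // <root/> has no children
            }
            reader.Read();                       // step to the first node inside the root

            // Stop at the root's end tag (or at end of input).
            while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.Name);
                    reader.Skip();               // jumps past the whole subtree to the next sibling node
                }
                else
                {
                    reader.Read();               // whitespace, comments, text: just move on
                }
            }
        }
        return names;
    }
}
```

Note that `Skip()` already leaves the reader on the node after the skipped element, which is why the loop does not call `Read()` again in that branch.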
You can use the XmlReader class to do this. A simple but working example that does the same as your code above looks like this:
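A minimal sketch along these lines, assuming the structure implied by the XPath in the question: a `data` root element whose direct `objects` children carry `type` and `Version` attributes. The class and method names (`VersionFinder`, `FindVersion`) are hypothetical, and this is one possible way to write the loop rather than a definitive implementation.

```csharp
using System;
using System.Xml;

public static class VersionFinder
{
    // Returns the Version attribute of the first /data/objects element whose
    // type attribute equals typeValue, or null if no such element exists.
    public static string FindVersion(XmlReader reader, string typeValue)
    {
        reader.MoveToContent();                  // position on the root element
        if (reader.NodeType != XmlNodeType.Element || reader.Name != "data")
        {
            return null;                         // assumption: the root must be <data>
        }

        while (reader.Read())
        {
            // Depth 1 means a direct child of the root (the root itself is depth 0).
            if (reader.NodeType == XmlNodeType.Element
                && reader.Depth == 1
                && reader.Name == "objects"
                && reader.GetAttribute("type") == typeValue)
            {
                return reader.GetAttribute("Version");   // stop at the first match
            }
        }
        return null;
    }
}
```

It could be called as `using (XmlReader reader = XmlReader.Create("C:\\myxml.xml")) { result = VersionFinder.FindVersion(reader, "data type 1"); }`. Because the reader only moves forward and returns at the first match, memory use should stay roughly constant instead of growing with the size of the file.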