XmlTextReader in .NET 1.1
I have a process that reads an XML file. It uses the XmlTextReader class because that is supposed to be a fast, forward-only XML parser/reader.
It works fine with a 1 MB test file but comes to a complete halt on a 12 MB file in the live system.
Are there any solutions to this other than writing my own XML reader? That would not be the end of the world, but I would prefer to use available standard components if possible.
Comments (6)
SAXExpat used to be really good. Expat is the XML parser, almost a reference implementation. I remember using it to read synchronization XML files sent over a TCP connection, some of them really big (around 50 MB), without any problems. And that was three or four years ago, on .NET 1.1 and really slow computers.
I hate to recommend this, but if the software isn't sold or distributed externally, you could try bringing in the reader from Mono and see if that fixes your woes.
I have had similar performance issues in the past. I traced them back to the reader trying to resolve a remote DTD/schema. Are you doing this? Try setting XmlTextReader.XmlResolver to null if possible.
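A minimal sketch of that suggestion, assuming the input is available as a TextReader. The class and method names (ResolverSketch, OpenWithoutResolver, CountNodes) are made up for illustration; only XmlTextReader, its XmlResolver property, Read() and Close() come from the framework. The code avoids generics and other later language features so it should also compile against .NET 1.1.

```csharp
using System;
using System.IO;
using System.Xml;

public class ResolverSketch
{
    // Creates a reader that has no resolver, so it will not go out to the
    // network (or disk) to fetch an external DTD or external entities.
    public static XmlTextReader OpenWithoutResolver(TextReader input)
    {
        XmlTextReader reader = new XmlTextReader(input);
        reader.XmlResolver = null;
        return reader;
    }

    // Walks the whole document and returns how many nodes were read.
    public static int CountNodes(TextReader input)
    {
        XmlTextReader reader = OpenWithoutResolver(input);
        int count = 0;
        try
        {
            while (reader.Read())
            {
                count++;
            }
        }
        finally
        {
            reader.Close();
        }
        return count;
    }
}
```

If the slow run and the fast run differ only in this one property, the remote DTD/schema lookup is the likely culprit. Note that documents which depend on entities declared in an external DTD may no longer parse the same way with the resolver removed.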
It depends on what you do with what you get out of the reader. Are you putting it into an XML DOM, or any other object model for that matter? That would cause a big memory hit no matter what language or library you use.
Maybe it is flawed in 1.1; have you thought about trying 2.0? I never used XmlTextReader in my 1.1 days, so I can't vouch for it, but since 2.0 it has worked perfectly for me.
Just one thought. Are you keeping a database transaction open for the length of the entire process? If so, try it without the transaction, or at least commit more often during the process.
I would be very surprised if the problem were in the XmlTextReader.
If you spend a few minutes writing a test program that creates an XmlTextReader and simply uses Read() to read through each node in the file until it gets to the end of the document, I bet you'll find that it zooms through your 12 MB file like a hot knife through butter. That's the first thing I'd try if I were experiencing this problem.
Once you've eliminated XmlTextReader as the source of the problem, you can focus your attention on what's actually causing it, which is very probably the code that processes the nodes you're reading, not the code that reads them.
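Such a test program could look roughly like the sketch below. The class name, method names and the path parameter are placeholders chosen for illustration, and the timing uses DateTime rather than Stopwatch so it should also build on .NET 1.1; treat it as a starting point, not a benchmark harness.

```csharp
using System;
using System.IO;
using System.Xml;

public class ReadOnlyTimingSketch
{
    // Reads every node and does nothing with it; returns the node count.
    public static long ReadAllNodes(XmlTextReader reader)
    {
        long nodes = 0;
        while (reader.Read())
        {
            nodes++;
        }
        return nodes;
    }

    // Times a bare read of the file at 'path' (a hypothetical argument,
    // e.g. the 12 MB file from the live system).
    public static void TimeFile(string path)
    {
        DateTime start = DateTime.UtcNow;
        XmlTextReader reader = new XmlTextReader(path);
        try
        {
            long nodes = ReadAllNodes(reader);
            TimeSpan elapsed = DateTime.UtcNow - start;
            Console.WriteLine("{0} nodes in {1} ms", nodes, elapsed.TotalMilliseconds);
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Calling TimeFile on the problem file from a small console program gives a baseline. If the bare read finishes quickly, the time is being spent in whatever the real process does with each node.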