Improving the performance of deserializing a large XML string
I am parsing a big XML file (1 MB) in my WP7 app. The file is part of the project, so it's not loaded over the web. Unfortunately it takes very long, about 3 seconds, to get the content I need. I have read that the problem is the XML serialization, and that it's better to go for binary serialization.
But I have my XML file now. Is there any way to change the format of the file so that parsing goes faster? I have already split it into many parts, but it's not significantly faster.
Comments (3)
1 megabyte isn't particularly big.
A binary format will be more compact and faster, especially if you write your own rather than using the .NET serialisation support, which adds a lot of overhead to the data.
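As a rough illustration of a hand-written binary format, here is a minimal sketch using BinaryWriter and BinaryReader. The Item class, its fields, and the ItemBinaryFormat name are made up for the example; your real data will have a different shape, and a real format would probably want a version number at the start.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical record type; the fields are illustrative only.
public class Item
{
    public int Id;
    public string Name;
    public double Price;
}

public static class ItemBinaryFormat
{
    // Layout: [int count], then per item [int Id][string Name][double Price].
    public static void Write(Stream output, IList<Item> items)
    {
        BinaryWriter writer = new BinaryWriter(output);
        writer.Write(items.Count);
        foreach (Item item in items)
        {
            writer.Write(item.Id);
            writer.Write(item.Name);   // written as a length-prefixed string
            writer.Write(item.Price);
        }
        writer.Flush();                // the stream is left open for the caller
    }

    public static List<Item> Read(Stream input)
    {
        BinaryReader reader = new BinaryReader(input);
        int count = reader.ReadInt32();
        List<Item> items = new List<Item>(count);
        for (int i = 0; i < count; i++)
        {
            Item item = new Item();
            item.Id = reader.ReadInt32();
            item.Name = reader.ReadString();
            item.Price = reader.ReadDouble();
            items.Add(item);
        }
        return items;
    }
}
```

Since the file ships with the project, one option would be to convert the XML to this format once at build time and only call Read on the phone.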
If you want to stick with XML, you can usually improve performance significantly by using a brief, compact format:
Profile and optimise your loading code - you may find bottlenecks that have nothing to do with XML. You may be able to defer some work, or do some data conversion processing on another thread, but beware of introducing big complexity for small gains.
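If you have no profiler to hand, timing each stage separately already tells you a lot. A minimal sketch using Stopwatch from System.Diagnostics; LoadTimer is a made-up helper name, not something from a library.

```csharp
using System;
using System.Diagnostics;

public static class LoadTimer
{
    // Runs one stage of the loading code and returns the elapsed milliseconds.
    public static long Measure(Action stage)
    {
        Stopwatch watch = Stopwatch.StartNew();
        stage();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

You would wrap each step (reading the file, parsing, building your objects) in its own Measure call and compare the numbers. If most of the 3 seconds turns out to be outside the parser, changing the file format will not help much.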
Finally, try different approaches - XmlDocument rather than XmlReader, or a different library, or pre-loading the data into a MemoryStream. You may find improvements can be made there too.
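For example, pre-loading the file into memory and walking it once with a forward-only XmlReader might look roughly like this. It is only a sketch: the element name "name" and the StreamingLoader class are assumptions for illustration, so substitute whatever your file actually contains.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingLoader
{
    // Collects the text of every <name> element ("name" is a hypothetical
    // element chosen for illustration).
    public static List<string> ReadNames(byte[] xmlBytes)
    {
        List<string> names = new List<string>();
        using (MemoryStream buffer = new MemoryStream(xmlBytes))
        using (XmlReader reader = XmlReader.Create(buffer))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "name")
                {
                    // This call already moves the reader past the end tag,
                    // so no extra Read() is done in this branch.
                    names.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return names;
    }
}
```

The loop only calls Read() when the current node was not consumed, otherwise adjacent elements with no whitespace between them would be skipped.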
Or just tell your boss it's because you don't have an eight-core Xeon with a terabyte of fast SSDs... :-)
If you don't need all the data at once, one way to handle it is to asynchronously load chunks of data manually (you might need to parse the data manually) and update the UI in chunks as it loads.
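A rough sketch of the chunking part, leaving the threading out. ChunkedLoader and its parameters are hypothetical names. The assumption is that you call Load on a background thread and that your onChunk callback hands each chunk over to the UI thread however your app normally does that.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkedLoader
{
    // Pulls items lazily from 'source' and passes them to 'onChunk'
    // in groups of at most 'chunkSize' items.
    public static void Load<T>(IEnumerable<T> source, int chunkSize, Action<List<T>> onChunk)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        List<T> chunk = new List<T>(chunkSize);
        foreach (T item in source)
        {
            chunk.Add(item);
            if (chunk.Count == chunkSize)
            {
                onChunk(chunk);
                chunk = new List<T>(chunkSize);   // fresh list, the callback keeps the old one
            }
        }
        if (chunk.Count > 0) onChunk(chunk);       // deliver the final partial chunk
    }
}
```

If 'source' is an iterator that parses items on demand, the first chunk can reach the UI long before the whole file has been read.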
Also, if there is any extra data in the serialization, you could always come up with your own XML schema that is less verbose and only contains the bare information that you need.
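To give an idea of what a less verbose schema could look like, here is the same made-up record in a verbose and a compact layout, plus a reader for the compact one. The element and attribute names (Products, p, i, n) are invented for the example and are not taken from the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class CompactXml
{
    // The same hypothetical record in a verbose and a compact layout.
    public const string Verbose =
        "<Products><Product><Identifier>1</Identifier>" +
        "<DisplayName>Tea</DisplayName></Product></Products>";
    public const string Compact = "<ps><p i=\"1\" n=\"Tea\"/></ps>";

    // Reads the compact layout: one <p> element per record, data in attributes.
    public static List<KeyValuePair<int, string>> Read(string xml)
    {
        List<KeyValuePair<int, string>> records = new List<KeyValuePair<int, string>>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "p")
                {
                    int id = XmlConvert.ToInt32(reader.GetAttribute("i"));
                    string name = reader.GetAttribute("n");
                    records.Add(new KeyValuePair<int, string>(id, name));
                }
            }
        }
        return records;
    }
}
```

Short names and attributes give the parser fewer characters and fewer nodes to walk through per record. Whether that makes a noticeable difference for your file is something you would have to measure.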
You have at least four options: