Efficient XSLT processor
I was previously using the free version of Saxon 8.9 to transform XML with some XSLT stylesheets. The problem with that version was that on large XML files of 260 MB and above, Saxon threw "out of memory" exceptions. I then got the free version of Saxon 9.2, but the problem is still the same. The machine has 2 GB of RAM. Does anybody know of a better version of Saxon, or some other efficient XSLT processor, that could solve the issue (it has to be free)? If no free software is available, a commercial product could be suggested as well, with preference given to a Saxon product.
Comments (3)
So I tried with an XML file of a bit over 300 MB:
And on the command line, boosted the memory settings a bit:
And the transformation went fine.
Note that
So, from there on, two questions:
With huge documents, you usually want to avoid loading the whole document into memory at one time. Unfortunately, XSLT isn't really designed to deal with this case (although it looks like XSLT 2.1 has some provisions for streaming, I'm not sure whether there are any implementations yet).
Can you investigate the use of Streaming Transformations for XML (STX)?
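To make the idea of not loading the whole document concrete, here is a minimal sketch in Java using the standard StAX pull parser (`javax.xml.stream`). It is neither XSLT nor STX; it only illustrates event-by-event processing, where memory use does not grow with the size of the input. The class name `StreamingRecordCounter` and the element name passed to it are hypothetical, chosen for illustration.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: counts elements by name without building a document tree.
class StreamingRecordCounter {

    // Reads the input one parse event at a time and counts start tags
    // whose local name matches. Only the current event is held in memory.
    static long countElements(InputStream in, String localName) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(in);
        long count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    // Usage (assumed arguments): java StreamingRecordCounter big.xml record
    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(countElements(in, args[1]));
        }
    }
}
```

The same loop shape works for any per-element processing that does not need to look at other parts of the document, which is the restriction streaming approaches generally impose.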
Traditionally, XSLT has been designed such that it requires the whole XML document to be loaded into memory. So, on average, the memory required to apply the XSL is usually two to three times the size of the input XML, and in the worst case it may be up to 10 times the input size. Saxon 9.3 provides streamed transformation of XML. In that case, memory consumption stays roughly constant regardless of input size. But it requires changes to the XSL, and the nodes processed one after another must be independent of each other. Streamed transformation doesn't load the entire document into memory, so it requires less memory and can ideally handle XML documents of any size.
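As a rough check of the multipliers quoted above against the numbers in the question (a 260 MB input on a machine with 2 GB of RAM), here is a small Java sketch. The class `HeapEstimate` and its methods are hypothetical, and the multipliers are only the rules of thumb from this answer, not measured values.

```java
// Hypothetical helper for a back-of-the-envelope heap estimate.
class HeapEstimate {
    static final long MB = 1024L * 1024L;

    // Estimated memory for a tree-based transformation:
    // input size times a rule-of-thumb multiplier.
    static long estimateBytes(long inputBytes, int multiplier) {
        return Math.multiplyExact(inputBytes, (long) multiplier);
    }

    // True if the estimate does not exceed the memory available.
    static boolean fits(long inputBytes, int multiplier, long availableBytes) {
        return estimateBytes(inputBytes, multiplier) <= availableBytes;
    }

    public static void main(String[] args) {
        long input = 260 * MB;   // input size from the question
        long ram = 2048 * MB;    // physical RAM from the question
        for (int m : new int[] {2, 3, 10}) {
            System.out.printf("x%d: %d MB, fits in 2 GB: %b%n",
                    m, estimateBytes(input, m) / MB, fits(input, m, ram));
        }
        // The heap actually available to this JVM, which may be well below physical RAM.
        System.out.println("JVM max heap: " + Runtime.getRuntime().maxMemory() / MB + " MB");
    }
}
```

With these assumptions, the 2x and 3x cases (520 MB and 780 MB) fit within 2 GB, while the 10x worst case (2600 MB) does not. Note that the JVM's maximum heap is often smaller than physical RAM unless it is raised with the `-Xmx` option, so the last line printed is the figure that matters in practice.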