
Xalan XSLT - Out of Memory Heap Space

Posted on 2024-12-29 22:23:08


My project has a reporting module that gathers data from the database in the form of XML and runs an XSLT on it to generate the user's desired format of report. Options at this point are HTML and CSV.

We use Java and Xalan to do all interaction with the data.

The bad part is that one of these reports that the user can request is 143MB (about 430,000 records) for just the XML portion. When this is transformed into HTML, I run out of heap space with a maximum of 4096G reserved for heap. This is unacceptable.

It seems that the problem is simply too much data, but I can't help but think there is a better way to deal with this than limiting the customer and not being able to meet functional requirements.

I am glad to give more information as needed, but I cannot disclose too much about the project as I'm sure most of you understand. Also, the answer is yes; I need all of the data at the same time: I cannot paginate it.

Thanks

EDIT

All the transformation classes I am using are in the javax.xml.transform package. The implementation looks like this:

final Transformer transformer = 
  TransformerFactory.newInstance().newTransformer(
    new StreamSource(new StringReader(xsl)));
final StringWriter outWriter = new StringWriter();
transformer.transform(
  new StreamSource(new StringReader(xml)), new StreamResult(outWriter));
return outWriter.toString();

If possible, I would like to leave the XSLT the way it is. The StreamSource method of doing things should allow me to GC some of the data as it is processed, but I'm not sure what limitations on XSLT (functions, etc) this might require for it to do proper cleanup. If someone could point me at a resource detailing those limitations, it would be very helpful.


压抑⊿情绪 2025-01-05 22:23:08

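The problem with XSLT is that you need a DOM representation of the whole source document (as well as the result document) in memory while the transformation runs. For large XML files, this is a serious problem.

What you want is a system that allows a streaming transformation, where the complete document does not have to reside in memory. Maybe STX is an option:
http://www.xml.com/pub/a/2003/02/26/stx.html
http://stx.sourceforge.net/
It is quite similar to XSLT, so if your XSLT stylesheet is applied to the XML in a straightforward way, rewriting it in STX could be quite simple.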
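
Separately from switching to STX, a smaller step that leaves the stylesheet untouched may be worth trying first. The snippet in the question holds the whole XML input in a String and collects the whole output in a StringWriter before copying it again with toString(). The sketch below uses only the standard javax.xml.transform classes and assumes the XML can be read from an InputStream and the report written straight to an OutputStream. The class name StreamingReport and the method name transform are made up for illustration. This does not remove the processor's in-memory tree of the source described above; it only avoids the extra full-size copies around it, so it may or may not be enough for a 143MB input.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: same transformation as in the question, but the XML is
// read from a stream and the result is written to a stream, so neither is
// held as one large String by the calling code.
final class StreamingReport {
    private StreamingReport() {}

    static void transform(InputStream xsl, InputStream xml, OutputStream out)
            throws TransformerException, IOException {
        // Compile the stylesheet exactly as before, just from a stream.
        Transformer transformer =
            TransformerFactory.newInstance().newTransformer(new StreamSource(xsl));

        // Buffer the output so the serializer's small writes are batched.
        OutputStream buffered = new BufferedOutputStream(out);

        // The processor may still build its own tree of the source document;
        // this only removes the String/StringWriter copies held by the caller.
        transformer.transform(
            new StreamSource(new BufferedInputStream(xml)),
            new StreamResult(buffered));

        // Flush the buffer; closing the underlying stream is left to the caller.
        buffered.flush();
    }
}
```

In a web application, out would typically be the response's output stream, so the generated HTML or CSV goes to the client as it is serialized instead of being returned as one String.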
