Programmatic HTML document generation using Java
Does anyone know how to generate an HTMLDocument object programmatically in Java without resorting to generating a String externally and then using HTMLEditorKit#read to parse it? Two reasons I ask:
Firstly, my HTML generation routine needs to be very fast, and I assume that parsing a string into an internal model is more costly than constructing that model directly.
Secondly, an object-oriented approach would likely result in cleaner code.
I should also mention that, for licensing reasons, I can't resort to using any libraries other than those shipped with the JVM.
One object-oriented approach is to use a library called ECS.
It is quite a simple library, and it has not changed for ages. Then again, the HTML 4.01 spec has not changed either ;) I've used ECS and consider it far better than generating large HTML fragments with just Strings or StringBuffers/StringBuilders.
Small example: optionElement.toString() would yield the markup for the option element.
The library supports both HTML 4.0 and XHTML. The only thing that initially bothered me a lot was that the names of classes related to the XHTML version start with a lowercase letter: option, input, a, tr, and so on, which goes against the most basic Java conventions. But that's something you can get used to if you want to use XHTML; at least I did, surprisingly fast.
I'd look into how JSPs work: they compile down into a servlet that is basically one long series of StringBuffer appends. The tags also compile down into Java code snippets. This is messy but very fast, and you never see this code unless you delve into Tomcat's work directory. Maybe what you want is to write your HTML generation from an HTML-centric view like a JSP, with added tags for loops and so on, and use a similar code generation engine and compiler internally within your project.
Alternatively, just deal with the StringBuilder yourself in a utility class that has methods for "openTag", "closeTag", "openTagWithAttributes", "startTable", and so on... it could use a Builder pattern, and your code would look like:
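A minimal sketch of such a utility class, using only the JDK, might look like the following. The class name HtmlBuilder and the exact method shapes (openTag with name/value pairs, text, closeTag, build) are assumptions made for illustration, not something the answer specifies.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical fluent builder over a StringBuilder; all names are illustrative.
final class HtmlBuilder {
    private final StringBuilder out = new StringBuilder();
    private final Deque<String> open = new ArrayDeque<>(); // stack of unclosed tags

    /** Opens a tag; attrs are name/value pairs, e.g. openTag("a", "href", "/x"). */
    HtmlBuilder openTag(String name, String... attrs) {
        if (attrs.length % 2 != 0) {
            throw new IllegalArgumentException("attributes must be name/value pairs");
        }
        out.append('<').append(name);
        for (int i = 0; i < attrs.length; i += 2) {
            out.append(' ').append(attrs[i]).append("=\"");
            escape(attrs[i + 1]);
            out.append('"');
        }
        out.append('>');
        open.push(name);
        return this;
    }

    /** Appends escaped character data. */
    HtmlBuilder text(String s) {
        escape(s);
        return this;
    }

    /** Closes the most recently opened tag. */
    HtmlBuilder closeTag() {
        out.append("</").append(open.pop()).append('>');
        return this;
    }

    String build() {
        if (!open.isEmpty()) {
            throw new IllegalStateException("unclosed tag: " + open.peek());
        }
        return out.toString();
    }

    // Escapes the characters that would otherwise break the markup.
    private void escape(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
    }

    public static void main(String[] args) {
        String html = new HtmlBuilder()
            .openTag("table", "class", "report")
            .openTag("tr")
            .openTag("td").text("a < b").closeTag()
            .closeTag()
            .closeTag()
            .build();
        // prints <table class="report"><tr><td>a &lt; b</td></tr></table>
        System.out.println(html);
    }
}
```

Keeping the stack of open tags inside the builder means a forgotten closeTag() fails loudly in build() instead of silently producing malformed output.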
When dealing with XHTML, I have had much success using Java 6's XMLStreamWriter interface.
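As a rough illustration of that approach, the sketch below writes a small XHTML page with XMLStreamWriter from javax.xml.stream. The class name XhtmlStreamDemo, the render method, and the page content are assumptions for the example; checked exceptions are wrapped only to keep the call site short.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; only the javax.xml.stream calls come from the JDK.
final class XhtmlStreamDemo {
    static String render(String title, String bodyText) {
        StringWriter sink = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(sink);
            w.writeStartElement("html");
            w.writeDefaultNamespace("http://www.w3.org/1999/xhtml");
            w.writeStartElement("head");
            w.writeStartElement("title");
            w.writeCharacters(title);      // character data is escaped by the writer
            w.writeEndElement();           // </title>
            w.writeEndElement();           // </head>
            w.writeStartElement("body");
            w.writeStartElement("p");
            w.writeAttribute("class", "intro");
            w.writeCharacters(bodyText);
            w.writeEndElement();           // </p>
            w.writeEndElement();           // </body>
            w.writeEndElement();           // </html>
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Report", "a < b"));
    }
}
```

The writer streams straight to the underlying Writer, so no intermediate object tree is built, and escaping of text and attribute values is handled for you.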
I think manually generating your HTML via something like a StringBuilder (or directly to a stream) is going to be your best option, especially if you cannot use any external libraries.
Since you cannot use any external libraries, you will suffer more in speed of development than in performance.
javax.swing.text.html has HTMLWriter and HTMLDocument classes, among others. I have not used them. I have used the HtmlWriter in .NET and it does exactly what you want, but the Java version may not work out to be the same.
Here is the doc: http://java.sun.com/j2se/1.5.0/docs/api/javax/swing/text/html/HTMLWriter.html
Also, I can't imagine a StringBuilder being slower than building with an object layer. It seems to me that any object-oriented approach would have to build the object graph AND then produce the string. The main reason not to use raw strings for this stuff is that you are sure to get encoding errors as well as other mistakes that produce malformed documents.
Option 2: You could use your favorite XML APIs and produce XHTML.
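For the XML API option, one possibility that stays within the JDK is the built-in DOM plus an identity Transformer for serialization. The sketch below is only an illustration under that assumption; the class name DomXhtmlDemo and its render method are made up for the example. Note that it does exactly what the paragraph above describes: it builds an object graph first and then produces the string.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class built on the JDK's DOM and TrAX APIs.
final class DomXhtmlDemo {
    static String render(String paragraphText) {
        try {
            // Step 1: build the object graph.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent(paragraphText); // escaped at serialization time
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);

            // Step 2: serialize the graph to a string.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("a < b"));
    }
}
```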
You may want to build some Element objects with a render() method and assemble them into a tree structure; with a visiting algorithm you can then set the values and render the whole thing.
PS: have you considered a templating engine like FreeMarker?
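A minimal sketch of that idea, assuming hypothetical types named HtmlNode, HtmlText, and HtmlElement (none of them JDK classes), could look like this. The recursive render() call serves as the depth-first visit, and values can be set on text nodes after the tree has been assembled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node types for illustration; none of these are JDK classes.
interface HtmlNode {
    void render(StringBuilder out);
}

final class HtmlText implements HtmlNode {
    private String value;

    HtmlText(String value) { this.value = value; }

    /** Lets callers fill in the value after the tree is assembled. */
    void setValue(String value) { this.value = value; }

    @Override
    public void render(StringBuilder out) {
        // Escape '&' first so the other replacements are not double-escaped.
        out.append(value.replace("&", "&amp;")
                        .replace("<", "&lt;")
                        .replace(">", "&gt;"));
    }
}

final class HtmlElement implements HtmlNode {
    private final String tag;
    private final List<HtmlNode> children = new ArrayList<>();

    HtmlElement(String tag) { this.tag = tag; }

    HtmlElement add(HtmlNode child) {
        children.add(child);
        return this;
    }

    @Override
    public void render(StringBuilder out) {
        out.append('<').append(tag).append('>');
        for (HtmlNode child : children) {
            child.render(out); // depth-first visit of the subtree
        }
        out.append("</").append(tag).append('>');
    }
}

final class ElementTreeDemo {
    public static void main(String[] args) {
        HtmlText cell = new HtmlText("");
        HtmlElement row = new HtmlElement("tr").add(new HtmlElement("td").add(cell));
        cell.setValue("1 < 2"); // set the value once the structure exists
        StringBuilder out = new StringBuilder();
        row.render(out);
        System.out.println(out); // prints <tr><td>1 &lt; 2</td></tr>
    }
}
```

Rendering into a shared StringBuilder, rather than having each node return its own String, avoids creating intermediate strings at every level of the tree.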
It appears that you can accomplish what you are attempting by directly constructing HTMLDocument.BlockElement and HTMLDocument.RunElement objects. At the least, these constructors have signatures that suggest direct use is possible.
I would suggest examining the Swing sources in OpenJDK to see how the parser handles this, and deriving your logic from there.
I would also suggest that this optimization may be premature, and perhaps this should be a speed-optimized replacement for a simpler approach (i.e. generating the HTML text) only introduced if this really does become a performance hotspot in the application.
You can use any decent XML library like JDOM, XOM, or XStream. XHTML is just a special case of XML.
Or, you can use one of the existing templating engines for server-side Java, like JSP or Velocity.
Basically, you can insert HTML into your HTMLDocument using one of the insert methods: insertBeforeEnd(), insertAfterEnd(), insertBeforeStart(), or insertAfterStart().
You supply the method with the HTML you want to insert and the element in the document tree that it should be inserted relative to.
For example:
doc.insertBeforeEnd(element, html);
The HTMLDocument class also provides methods for traversing the document tree.
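A minimal sketch of this approach follows. It assumes the document comes from HTMLEditorKit.createDefaultDocument(), so that a parser is already attached, and it looks up the body element with HTMLDocument.getElement. The class name InsertHtmlDemo and its helper methods are made up for the example. Note that these insert methods still parse the supplied HTML fragment internally, so they do not avoid the parsing cost the question is concerned about.

```java
import java.io.IOException;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical demo class; the Swing text calls are from the JDK.
final class InsertHtmlDemo {
    static HTMLDocument build() {
        // The kit creates a document with a parser and a default html/body skeleton.
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();

        // Traverse the tree to find the <body> element.
        Element body = doc.getElement(doc.getDefaultRootElement(),
                StyleConstants.NameAttribute, HTML.Tag.BODY);
        try {
            // Each call appends the fragment as the last children of <body>.
            doc.insertBeforeEnd(body, "<h1>Report</h1>");
            doc.insertBeforeEnd(body, "<p>First paragraph</p>");
        } catch (BadLocationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return doc;
    }

    /** Returns the document's text content without markup. */
    static String plainText(HTMLDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(plainText(build()));
    }
}
```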