What library to use for building an HTML document?
Could anybody please recommend libraries that do the opposite of what these libraries do?
HtmlCleaner, TagSoup, HtmlParser, HtmlUnit, jSoup, jTidy, nekoHtml, WebHarvest or Jericho.
I need to build HTML pages, that is, build the DOM model from String content.
EDIT: I need it for testing purposes. I have various types of input strings that might appear in various places in the HTML page, so I need to build the page dynamically. I then process the HTML page based on various criteria that must or must not be fulfilled.
To show why I am asking, consider HtmlCleaner for this job:
List<String> paragraphs = getParagraphs(entity.getFile());
List<TagNode> pNodes = new ArrayList<TagNode>();
TagNode html = cleaner.clean("<html/>");
for (String paragraph : paragraphs) {
    TagNode p = new TagNode("p");
    pNodes.add(p);
    // CANNOT setText() ?
}
html.addChildren(pNodes);
The problem is that TagNode has a getText() method, but no setText() method.
Please add more comments about how vague this question is ... The best thing you can do
Comments (4)
Jsoup, Jsoup, Jsoup! I've used all of those, and it's my favorite by a long shot. You can use it to build documents, plus it brings a lot of the magic of jQuery-style traversal alongside the best HTML document parsing I've seen to date in a Java library. I'm so happy with it that I don't mind shamelessly promoting it. ;)
There are a lot of template libraries for Java, from JSP to FreeMarker, from specific implementations in various frameworks (Spring?) to generic libraries like StringTemplate.
The most difficult task is... to make a choice.
In general, these libraries let you write a skeleton of a web page with "holes" that are filled in with variables. It is the simplest approach, and it often works well with tools.
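As a rough illustration of the skeleton-with-holes idea, here is a minimal sketch in plain Java. It does not use FreeMarker, StringTemplate, or any other real template library; the class name HtmlTemplateSketch, the ${name} hole syntax, and the escape helper are all assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any template library.
class HtmlTemplateSketch {
    // A "hole" is written as ${name} in the skeleton (an assumed syntax).
    private static final Pattern HOLE = Pattern.compile("\\$\\{(\\w+)\\}");

    // Escape the characters that are significant in HTML text and attribute values.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Fill every hole in a single pass, so substituted values are never re-scanned.
    static String render(String skeleton, Map<String, String> values) {
        Matcher matcher = HOLE.matcher(skeleton);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for hole: " + matcher.group(1));
            }
            matcher.appendReplacement(out, Matcher.quoteReplacement(escape(value)));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String skeleton = "<html><head><title>${title}</title></head>"
                        + "<body><p>${paragraph}</p></body></html>";
        Map<String, String> values = new LinkedHashMap<>();
        values.put("title", "Test page");
        values.put("paragraph", "1 < 2 & 3 > 2");
        System.out.println(render(skeleton, values));
    }
}
```

For test fixtures this keeps the page layout in one string and lets each test vary only the values. A real template library adds loops, conditionals, and template files on top of the same idea.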
If you really want to build from a DOM, you can just use an XML library and generate XHTML.
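A minimal sketch of the DOM route, using only the XML APIs that ship with the JDK (javax.xml.parsers and javax.xml.transform). The class name XhtmlDomSketch and the buildPage helper are hypothetical, and the XHTML namespace and DOCTYPE are left out for brevity.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper built on the JDK's own DOM and serialization classes.
class XhtmlDomSketch {
    // Build <html><body><p>...</p>...</body></html> with one <p> per input string.
    static String buildPage(List<String> paragraphs) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element html = doc.createElement("html");
            doc.appendChild(html);
            Element body = doc.createElement("body");
            html.appendChild(body);

            for (String paragraph : paragraphs) {
                Element p = doc.createElement("p");
                // Stored as a text node; special characters are escaped on output.
                p.setTextContent(paragraph);
                body.appendChild(p);
            }

            // Serialize the DOM as XML, without the <?xml ...?> declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the page", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildPage(Arrays.asList("first paragraph", "a < b")));
    }
}
```

Because the text goes in through setTextContent, the test strings never need manual escaping, which is handy when the inputs are arbitrary.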
If you are interested in HtmlCleaner in particular, it is actually a very convenient choice for building HTML documents.
But note that to set content on a TagNode, you append a child ContentNode element :-)
jwebutils - A library for creating HTML 5 markup using Java. It also contains support for creating JSON and CSS 3 markup.
Jakarta Element Construction Set (ECS) - A Java API for generating elements for various markup languages; it directly supports HTML 4.0 and XML. Now retired, but some folks really like it.