JTidy Node.findBody(): how to use it?

Posted on 2024-07-08 15:48:47

I'm trying to do XHTML DOM parsing with JTidy, and it is turning out to be a rather counterintuitive task. In particular, there's a method to parse HTML:

Node Tidy.parse(Reader, Writer)

To get the <body /> of that Node, I assume I should use

Node Node.findBody(TagTable)

Where should I get an instance of that TagTable? (The constructor is protected, and I haven't found a factory that produces one.)

I use JTidy 8.0-SNAPSHOT.

一笑百媚生 2024-07-15 15:48:47

I found there's a much simpler way to extract the body:

Tidy tidy = new Tidy();
tidy.setXHTML(true);          // emit XHTML
tidy.setPrintBodyOnly(true);  // write only the contents of <body>

Then call tidy.parse(reader, writer) on the Reader-Writer pair.

As simple as it should be.

酷遇一生 2024-07-15 15:48:47

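You could use the parseDOM method instead, which gives you an org.w3c.dom.Document back. Note that parseDOM is an instance method, so call it on a Tidy object:

Tidy tidy = new Tidy();
Document document = tidy.parseDOM(reader, writer);
// item(0) is null if the document contains no <body> element
Node body = document.getElementsByTagName("body").item(0);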
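
Because parseDOM returns a standard org.w3c.dom.Document, the body lookup is ordinary DOM code and can be tried without JTidy. The following is a minimal sketch under that assumption: the class name BodyLookup and the helper parseWellFormed are hypothetical, and the JDK's own XML parser stands in for tidy.parseDOM(reader, writer), so it only accepts well-formed input (unlike JTidy, it does not repair broken HTML).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BodyLookup {

    // Returns the first <body> element of the document, or null if there is none.
    static Node findBody(Document document) {
        NodeList bodies = document.getElementsByTagName("body");
        return bodies.getLength() > 0 ? bodies.item(0) : null;
    }

    // Hypothetical stand-in for tidy.parseDOM(reader, writer):
    // parses well-formed XHTML with the JDK's built-in parser.
    static Document parseWellFormed(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document document = parseWellFormed(
                "<html><head><title>t</title></head><body><p>Hello</p></body></html>");
        Node body = findBody(document);
        System.out.println(body.getNodeName());    // prints "body"
        System.out.println(body.getTextContent()); // prints "Hello"
    }
}
```

In real use, the Document passed to findBody would come from the Tidy instance rather than from parseWellFormed; the null check matters because a document without a body element yields an empty NodeList.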
