Why does JDOM's getChild() method return null?

Posted on 2024-10-21 02:59:21

I'm working on a project that manipulates HTML documents. I want to take the body content of an existing HTML document and modify it to produce a new HTML document. I'm using JDOM, and I need to get hold of the body element, so I call getChild("body") on the root element. It returns null, even though my HTML document does have a body element. I'm a student; could anybody help me understand what is going wrong?

I would appreciate any pointers.

Code:

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;

public static void getBody() throws Exception {
    // Parse the page with TagSoup as the underlying SAX parser
    SAXBuilder builder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", true);
    Document jdomDocument = builder.build("http://www......com");
    Element root = jdomDocument.getRootElement();
    // It returns null
    System.out.println(root.getChild("body"));
}

For reference, here are my document's root element and its children as printed to the console:

root.getName():html

SIZE:2

[Element: <head [Namespace: http://www.w3.org/1999/xhtml]/>]

[Element: <body [Namespace: http://www.w3.org/1999/xhtml]/>]


Comments (3)

鹿童谣 2024-10-28 02:59:21

I've found some problems in your code:
1) To load a remote XML document over the network, prefer the build method that takes a URL as input. build(String) interprets its argument as a system ID (a URI), so passing a java.net.URL makes it explicit that you mean a remote resource:

Document jdomDocument = builder.build(new URL("http://www........com"));

2) If you parse an HTML page with a standard XML parser, the page has to be well-formed XHTML; otherwise it cannot be parsed as XML. (TagSoup, which you are using, is designed to tolerate malformed HTML.)

3) As I already said in another answer, root.getChild("body") returns the child of root whose name is "body" and which is in no namespace. Check the namespace of the element you are looking for; if the element is namespace-qualified, you have to pass the namespace like this:

root.getChild("body", Namespace.getNamespace("your_namespace_uri"));

An easy way to find out which namespace your element is in is to print all of root's children using the getChildren method:

for (Object element : root.getChildren()) {
    System.out.println(element.toString());
}

If you are parsing XHTML, the namespace URI is probably http://www.w3.org/1999/xhtml, which is also what your console output shows. So you should do this:

root.getChild("body", Namespace.getNamespace("http://www.w3.org/1999/xhtml"));
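The namespace point can be tried out without JDOM or TagSoup on the classpath. The following is only a sketch that swaps in the JDK's built-in DOM parser for JDOM, with a hard-coded XHTML string standing in for the real page; NamespaceLookupDemo, parseRoot and findChild are hypothetical names made up for this illustration. It shows the same behaviour as in the question: looking up "body" with no namespace finds nothing, while looking it up with the XHTML namespace URI succeeds.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Illustration only: uses the JDK DOM API, not JDOM.
class NamespaceLookupDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical helper: parse an XML string with namespace awareness enabled.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    // Hypothetical helper: first direct child element matching both the local
    // name and the namespace URI. A null nsUri means "in no namespace".
    static Element findChild(Element parent, String localName, String nsUri) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && localName.equals(n.getLocalName())
                    && Objects.equals(nsUri, n.getNamespaceURI())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<html xmlns=\"" + XHTML_NS + "\"><head/><body/></html>";
        Element root = parseRoot(xml);
        // Lookup without a namespace misses the namespaced <body>.
        System.out.println("no namespace:    " + (findChild(root, "body", null) != null));
        // Lookup with the XHTML namespace URI finds it.
        System.out.println("XHTML namespace: " + (findChild(root, "body", XHTML_NS) != null));
    }
}
```

In JDOM the equivalent of the second lookup is the two-argument getChild call shown above, with the URI taken from the console output.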
风苍溪 2024-10-28 02:59:21


What makes you feel that you need org.ccil.cowan.tagsoup.Parser? What does it give you that the parser built into the JDK does not?

I'd try another SAXBuilder constructor, one that uses the parser built into the JDK, and see if that helps.

Start by printing out the entire tree using XMLOutputter.

import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;

public static void getBody() throws Exception  // handle parse and I/O exceptions as appropriate
{
    SAXBuilder builder = new SAXBuilder(true);  // default JDK parser, validation enabled
    Document document = builder.build("http://www......com");
    XMLOutputter outputter = new XMLOutputter();
    outputter.output(document, System.out);  // write the whole tree to the console
}
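When dumping the tree to debug this kind of problem, the useful detail is each element's namespace, not just its name. Below is a rough sketch of such a dump, again using the JDK's built-in DOM API instead of JDOM's XMLOutputter so that it runs without extra libraries. TreeDumpDemo, parseRoot and dump are hypothetical names, and the inline XHTML string is a stand-in for the real page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Illustration only: uses the JDK DOM API, not JDOM.
class TreeDumpDemo {
    // Hypothetical helper: parse an XML string with namespace awareness enabled.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    // Appends one line per element: indentation, local name, namespace URI.
    static void dump(Element element, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(element.getLocalName())
           .append(" [").append(element.getNamespaceURI()).append("]\n");
        // Recurse into child elements only; text and comments are skipped.
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                dump((Element) n, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head/><body/></html>";
        StringBuilder out = new StringBuilder();
        dump(parseRoot(xml), 0, out);
        System.out.print(out);
    }
}
```

If every element in the dump carries the same namespace URI, that URI is the one to pass to a namespace-qualified lookup.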
幻想少年梦 2024-10-28 02:59:21
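import org.jdom.Document;
import org.jdom.Element;
import org.jdom.Namespace;
import org.jdom.input.SAXBuilder;

public static void getBody() throws Exception {
    SAXBuilder builder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", true);
    Document jdomDocument = builder.build("http://www......com");
    Element root = jdomDocument.getRootElement();
    // It returns null: "my_name_space" is not the namespace URI shown in the
    // console output (http://www.w3.org/1999/xhtml), so no child matches
    System.out.println(root.getChild("body", Namespace.getNamespace("my_name_space")));
}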