How to retrieve an element with mixed children as text (JDOM)

Posted on 2024-11-04 06:31:15 · 1113 characters · 6 views · 0 comments

I have an XML like the following:

<documentation>
    This value must be <i>bigger</i> than the other.
</documentation>

Using JDOM, I can get the following text structures:

Document d = new SAXBuilder().build( new StringReader( s ) );
System.out.printf( "getText:          '%s'%n", d.getRootElement().getText() );
System.out.printf( "getTextNormalize: '%s'%n", d.getRootElement().getTextNormalize() );
System.out.printf( "getTextTrim:      '%s'%n", d.getRootElement().getTextTrim() );
System.out.printf( "getValue:         '%s'%n", d.getRootElement().getValue() );

which give me the following outputs:

getText:          '
    This value must be  than the other.
'
getTextNormalize: 'This value must be than the other.'
getTextTrim:      'This value must be  than the other.'
getValue:         '
    This value must be bigger than the other.
'

What I really wanted was to get the content of the element as a string, namely, "This value must be <i>bigger</i> than the other.". getValue() comes close but removes the <i> tag. I guess I wanted something like innerHTML for XML documents...

Should I just use an XMLOutputter on the contents? Or is there a better alternative?

已下线请稍等 2024-11-11 06:31:15

In JDOM pseudocode:

for Content c in d.getRootElement().getContent()
   if c instanceof Element
      print "<" + c.getName() + ">" + c.getText() + "</" + c.getName() + ">"
   else // it's a Text node
      print c.getText()

However, as Prashant Bhate wrote, getText() returns only an element's immediate text, so this works only when the child elements are leaves containing nothing but text.
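For children that may themselves contain markup, serializing each child node avoids the leaf-only limitation of the loop above. The following is a minimal sketch under stated assumptions: it swaps in the JDK's built-in DOM and an identity Transformer instead of JDOM so that it runs without extra dependencies, and the class InnerXml and the helper innerXml are hypothetical names. With JDOM, the analogous route would be to pass d.getRootElement().getContent() to XMLOutputter.outputString(List), which is what the question proposes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InnerXml {

    // Hypothetical helper: serializes every child node of the element,
    // markup included, and concatenates the results.
    static String innerXml(Element element) throws Exception {
        // An identity transformer copies each node to the output unchanged.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Suppress the <?xml ...?> header that would otherwise precede each child.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            // Text nodes and element nodes are both written as they appear.
            t.transform(new DOMSource(children.item(i)), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String s = "<documentation>\n"
                 + "    This value must be <i>bigger</i> than the other.\n"
                 + "</documentation>";
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(s)))
                .getDocumentElement();
        // trim() drops the leading and trailing whitespace of the mixed content.
        System.out.println("'" + innerXml(root).trim() + "'");
    }
}
```

For the sample document this should print 'This value must be <i>bigger</i> than the other.' with the <i> tag intact. Because each child is serialized rather than read with getText(), nested elements at any depth are preserved.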

一百个冬季 2024-11-11 06:31:15

Jericho HTML is great for this sort of task. You can accomplish exactly what you're trying to do with a code block like this:

String snippet = new Source(html).getFirstElement().getContent().toString();

It's also great for working with HTML in general because it doesn't try to force the markup into being XML; it handles it much more leniently.

小猫一只 2024-11-11 06:31:15

I'd say you should change your document to

<documentation>
  <![CDATA[This value must be <i>bigger</i> than the other.]]>
</documentation>

in order to adhere to the XML specification. Otherwise <i> would be considered a child element of <documentation> and not content.
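To illustrate what the CDATA variant changes, here is a small sketch, again assuming the JDK's built-in DOM in place of JDOM; the class CdataText and the helper textOf are hypothetical names. Inside a CDATA section the <i> markup is plain character data, so an ordinary text accessor returns it verbatim. In JDOM, getText() likewise includes CDATA sections. The trade-off is that <i> is no longer an element that can be navigated or queried.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CdataText {

    // Hypothetical helper: parses the XML and returns the root element's
    // text content, which includes the characters of any CDATA sections.
    static String textOf(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        // trim() removes the whitespace surrounding the CDATA section.
        return root.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String s = "<documentation>\n"
                 + "  <![CDATA[This value must be <i>bigger</i> than the other.]]>\n"
                 + "</documentation>";
        // The tag characters come back as literal text, not as a child element.
        System.out.println(textOf(s));
    }
}
```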
