Getting raw text from a JTextPane
In my application, I use a JTextPane to display some log information. As I want to highlight some specific lines in this text (for example the error messages), I set the contentType to "text/html". This way, I can format my text.
Now, I create a JButton that copies the content of this JTextPane into the clipboard. That part is easy, but my problem is that when I call myTextPane.getText(), I get the HTML code, such as:
<html>
<head>
</head>
<body>
blabla<br>
<font color="#FFCC66"><b>foobar</b></font><br>
blabla
</body>
</html>
instead of getting only the raw content:
blabla
foobar
blabla
Is there a way to get only the content of my JTextPane as plain text? Or do I need to transform the HTML into raw text myself?
Comments (4)
No need to use the ParserCallback. Just use:
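The snippet that followed this sentence is not preserved here. A minimal sketch of what reading the text without a ParserCallback could look like, assuming the idea is to ask the pane's Document model for its text instead of calling getText() on the pane. The class name DocumentText and the parse helper are hypothetical and exist only for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

class DocumentText {
    /** Returns the text stored in the document model, without any markup. */
    static String plainText(Document doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            // Not expected for the range [0, length), but the API declares it.
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical helper: builds an HTML document without needing a GUI. */
    static Document parse(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return doc;
    }
}
```

With the pane from the question, the call would be DocumentText.plainText(myTextPane.getDocument()). The exact whitespace and line breaks in the result depend on how the HTML was parsed into the model, so a trim() may still be useful before copying to the clipboard.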
Based on the accepted answer to: Removing HTML from a Java String
A slightly modified version of the Html2Text class found in the answer I linked to. If you need more fine-grained handling, consider overriding more of the callback methods defined by HTMLEditorKit.ParserCallback.
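The Html2Text code itself is not reproduced here. A minimal sketch of such a class, assuming it extends HTMLEditorKit.ParserCallback and is driven by ParserDelegator. The newline emitted for each <br> and the static convert helper are additions made for this question's log output, not necessarily part of the linked answer:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class Html2Text extends HTMLEditorKit.ParserCallback {
    private final StringBuilder text = new StringBuilder();

    /** Called by the parser for every run of character data between tags. */
    @Override
    public void handleText(char[] data, int pos) {
        text.append(data);
    }

    /** Called for empty elements; a <br> is turned into a line break. */
    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.BR) {
            text.append('\n');
        }
    }

    /** Feeds the HTML from the reader through the Swing HTML parser. */
    void parse(Reader in) throws IOException {
        new ParserDelegator().parse(in, this, true);
    }

    String getText() {
        return text.toString();
    }

    /** Hypothetical convenience wrapper for converting an in-memory string. */
    static String convert(String html) {
        Html2Text parser = new Html2Text();
        try {
            parser.parse(new StringReader(html));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parser.getText();
    }
}
```

For the question's use case this would be called as Html2Text.convert(myTextPane.getText()). Other elements, such as images with alt text, could be handled by overriding further callbacks like handleStartTag in the same way.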
Unfortunately, you need to do it yourself. Imagine if some of the content were HTML-specific, e.g. images: the text representation is unclear. Should the alt text be included or not, for instance?
(Is RegExp allowed? This isn't parsing, is it?)
Take the getText() result and use String.replaceAll() to filter out all tags. Then call trim() to remove leading and trailing whitespace. For the whitespace between your first and your last 'blabla' I don't see a general solution. Maybe you can split the rest around CRLF and trim all the strings again.
(I'm no regexp expert - maybe someone can provide the regexp and earn some reputation ;) )
Edit
.. I just assumed that you don't use < and > in your text - otherwise it's.. say, a challenge.
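A minimal sketch of the regexp approach described above, under the same assumption that the log text itself contains no < or > characters. The class name TagStripper and the choice to drop blank lines are illustrative assumptions; note also that character entities such as &lt; are left undecoded, and that line breaks are only preserved where the markup already has a newline after each <br>, as in the question's output:

```java
import java.util.ArrayList;
import java.util.List;

class TagStripper {
    /** Removes anything that looks like a tag, then trims each remaining line. */
    static String stripTags(String html) {
        // Delete every "<...>" run; this is pattern matching, not HTML parsing.
        String noTags = html.replaceAll("<[^>]*>", "");

        List<String> lines = new ArrayList<>();
        // \R matches any line break sequence (LF, CR, CRLF).
        for (String line : noTags.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                // Lines that only held tags are now empty and are skipped.
                lines.add(trimmed);
            }
        }
        return String.join("\n", lines);
    }
}
```

Applied to the HTML shown in the question, TagStripper.stripTags(myTextPane.getText()) yields the three lines blabla, foobar and blabla.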