jsoup: removing iframe tags
I am using jsoup 1.6.1 and have run into a problem when removing iframe tags from HTML. When the iframe has no body (i.e. <iframe pro=value />), the remove() method also removes all the content after that tag. Here is my sample code:
String html = "<p> This is start.</p><iframe frameborder=\"0\" marginheight=\"0\" /><p> This is end</p>";
// The second String argument of Jsoup.parse(String, String) is a base URI, not a charset, so it is omitted here.
Document doc = Jsoup.parse(html);
doc.select("iframe").remove();
System.out.println(doc.text());
It prints:
This is start.
But I am expecting:
This is start. This is end
Thanks in advance.
Comments (1)
It appears the closing tag for iframe is required. You can't use a self-closing tag:
http://msdn.microsoft.com/en-us/library/ie/ms535258(v=vs.85).aspx
http://stackoverflow.com/questions/923328/line-after-iframe-is-not-visible
http://www.w3resource.com/html/iframe/HTML-iframe-tag-and-element.php
So jsoup is following the spec: the trailing slash on a non-void element such as iframe is ignored, so everything that follows the iframe start tag is taken as its body. When you remove the iframe, "This is end" gets removed along with it.
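If the input HTML cannot be corrected at its source, one possible workaround is to rewrite self-closing iframe tags into explicit open/close pairs before handing the string to the parser. The sketch below uses only the Java standard library; the class name IframeFix and the method expandSelfClosingIframes are hypothetical names chosen for illustration, and the regular expression assumes that no attribute value contains a ">" character. It is a rough pre-processing step, not a general HTML fixer.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup.
class IframeFix {
    // Matches <iframe ... /> case-insensitively; group 1 captures the attributes.
    // Assumption: attribute values never contain '>'.
    private static final Pattern SELF_CLOSING_IFRAME =
            Pattern.compile("(?i)<iframe\\b([^>]*?)\\s*/>");

    // Rewrites every <iframe ... /> as <iframe ...></iframe>.
    static String expandSelfClosingIframes(String html) {
        return SELF_CLOSING_IFRAME.matcher(html).replaceAll("<iframe$1></iframe>");
    }

    public static void main(String[] args) {
        String html = "<p> This is start.</p>"
                + "<iframe frameborder=\"0\" marginheight=\"0\" />"
                + "<p> This is end</p>";
        // Prints the markup with an explicit </iframe> closing tag.
        System.out.println(expandSelfClosingIframes(html));
    }
}
```

The normalized string can then be passed to Jsoup.parse(...) as in the question; with an explicit closing tag, the paragraph after the iframe is no longer part of the iframe's body, so doc.select("iframe").remove() should leave it in place.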