Jsoup Element.text() intermittent?
In the following snippet of code:
    String linkHref = "";
    String linkText = "";
    Elements links = div.getElementsByTag("a");
    for (Element link : links) {
        linkHref = link.attr("href");
        linkText += link.text();
        break;
    }
linkText is sometimes empty, even when I can see clearly on the WebView that the link text is there!
On the other hand, linkHref always ends up with the correct value.
What could possibly explain this seemingly intermittent behavior?
Is this a bug in Jsoup? Something else that I may be missing?
Update, answering @BalusC's questions below: the Jsoup version is jsoup-1.5.2, and div.html() returns:
<div class="d2 dl">
<a href="nextp.html" class="cO"><img src="images/no001.jpg" alt="" vspace="0" width="69" border="0" height="69" hspace="0" /></a>
<span class="bc">2.</span>
<a accesskey="2" href="nextp.html"> Subject line </a>
</div>
<p class="aG">Human resource policies are viewed as a valuable to understand the companies.</p>
<div>
</div>
Comments (1)
The first link doesn't contain text at all. It contains an image. So Jsoup is doing its job perfectly fine.
You probably want to make use of the
Element#hasText()
first to check if the link has text.
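As a minimal sketch of that suggestion, assuming markup shaped like the div.html() output in the question, the loop can skip any link that has no text instead of stopping at the first `<a>`. The class name `FirstTextLink` and the helper `firstLinkWithText` are hypothetical names made up for this illustration; the HTML string is an abbreviated copy of the question's markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class FirstTextLink {

    // Hypothetical helper: returns the first <a> under root that has
    // non-blank text, or null if no such link exists.
    static Element firstLinkWithText(Element root) {
        for (Element link : root.getElementsByTag("a")) {
            if (link.hasText()) {
                return link;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Abbreviated version of the markup from the question: the first
        // link wraps only an image, the second one carries the text.
        String html = "<div class=\"d2 dl\">"
                + "<a href=\"nextp.html\" class=\"cO\"><img src=\"images/no001.jpg\" alt=\"\" /></a>"
                + "<span class=\"bc\">2.</span>"
                + "<a accesskey=\"2\" href=\"nextp.html\"> Subject line </a>"
                + "</div>";

        Document doc = Jsoup.parse(html);
        Element div = doc.select("div.d2").first();

        Element link = firstLinkWithText(div);
        if (link != null) {
            String linkHref = link.attr("href");
            String linkText = link.text();
            System.out.println(linkHref + " -> " + linkText);
        } else {
            System.out.println("No link with text found");
        }
    }
}
```

With the original loop, the `break` stops at the image-only link, which is why linkHref is filled in but linkText stays empty whenever the first `<a>` wraps an image.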