Extracting a div class from within another div class using jsoup
I am trying to extract an href from a div class nested within another div class. One example of a snippet I am trying to parse is:
<div class="productData">
<div class="productTitle">
<a href="https://rads.stackoverflow.com/amzn/click/com/0786866020" rel="nofollow noreferrer"> Fish! A Remarkable Way to Boost Morale and Improve Results</a>
<span class="ptBrand">by <a href="/Stephen-C.-Lundin/e/B001H6UE16">Stephen C. Lundin</a>, <a href="/Harry-Paul/e/B001H9XQJA">Harry Paul</a>, <a href="/John- Christensen/e/B003VKXJ04">John Christensen</a> and Ken Blanchard</span>
<span class="binding"> (<span class="format">Hardcover</span> - Mar. 8, 2000) </span>
</div>
I am trying to extract the inner class productTitle from this example. However, using the code:
Document doc = Jsoup.connect(fullST).timeout(10*1000).get();
Element title = doc.getElementById("div.productTitle");
System.out.println(title);
I get null. Trying to extract higher elements such as:
Element title = doc.getElementById("div.productData");
I also get null. I have tried many code combinations but cannot figure out the syntax to extract from inner div classes or inner ids.
Any help would be appreciated.
Comments (1)
You're trying to select the element by ID using getElementById(). This is wrong: those divs don't have an ID. Instead, they have a class name, so you should use the select() method with a CSS selector.
Note that a class name selector doesn't necessarily return a single element; there can be multiple matches in the document. I assume that you need the first and only Element, so I added a first() call to the example.
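A minimal sketch of this approach, assuming jsoup is on the classpath. The class name ProductTitleExample and the helper firstTitleHref are hypothetical names introduced for illustration, and the HTML is parsed from a string rather than fetched so the example is self-contained. The same select() call should apply to a Document obtained from Jsoup.connect(fullST).get().

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ProductTitleExample {

    // Hypothetical helper: returns the href of the first link found inside
    // div.productTitle nested in div.productData, or null if nothing matches.
    static String firstTitleHref(Document doc) {
        // "div.productData div.productTitle" is a CSS descendant selector:
        // it matches by class name, which getElementById() cannot do.
        Element link = doc.select("div.productData div.productTitle a[href]").first();
        return link == null ? null : link.attr("href");
    }

    public static void main(String[] args) {
        // Trimmed-down version of the markup from the question.
        String html = "<div class=\"productData\">"
                + "<div class=\"productTitle\">"
                + "<a href=\"https://rads.stackoverflow.com/amzn/click/com/0786866020\">Fish!</a>"
                + "</div></div>";

        Document doc = Jsoup.parse(html);
        System.out.println(firstTitleHref(doc));
    }
}
```

Because first() returns null when the selection is empty, the null check avoids a NullPointerException on pages that lack the expected markup. For relative links such as the author URLs in the snippet, absUrl("href") can be used instead of attr("href"), provided the document was loaded with a base URI.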