Extracting a div by class from inside another div with jsoup
I am trying to extract an href from a div that is nested inside another div, where both divs are identified by class. An example of the HTML I am working with:
<div class="productData">
<div class="productTitle">
<a href="https://rads.stackoverflow.com/amzn/click/com/0786866020" rel="nofollow noreferrer"> Fish! A Remarkable Way to Boost Morale and Improve Results</a>
<span class="ptBrand">by <a href="/Stephen-C.-Lundin/e/B001H6UE16">Stephen C. Lundin</a>, <a href="/Harry-Paul/e/B001H9XQJA">Harry Paul</a>, <a href="/John- Christensen/e/B003VKXJ04">John Christensen</a> and Ken Blanchard</span>
<span class="binding"> (<span class="format">Hardcover</span> - Mar. 8, 2000) </span>
</div>
I am trying to extract the inner div with class productTitle from this example. However, using the code:
Document doc = Jsoup.connect(fullST).timeout(10*1000).get();
Element title = doc.getElementById("div.productTitle");
System.out.println(title);
I get null. When I try to extract the outer element instead:
Element title = doc.getElementById("div.productData");
I also get null. I have tried many combinations but cannot figure out the syntax for extracting nested divs by class or by id.
Any help would be appreciated.
Comments (1)
You're trying to select the element by ID using getElementById(). This is wrong: those divs don't have an ID. They have a class name instead, and a string like "div.productTitle" is a CSS selector, not an ID. You should use the select() method, which accepts CSS selectors.

Note that a class selector doesn't necessarily return a single element. There can be several matching elements in the document. I assume that you need the first and only Element, so I added a first() call to the example.
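
A minimal sketch of that approach, using a shortened copy of the HTML from the question. The class name ProductTitleExample and the helper firstTitleHref are hypothetical names chosen for illustration, and the markup is parsed from a string with Jsoup.parse() rather than fetched with Jsoup.connect(), so the snippet runs without network access:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ProductTitleExample {

    // Hypothetical helper: returns the href of the first link inside
    // div.productData > ... > div.productTitle, or null if nothing matches.
    static String firstTitleHref(String html) {
        Document doc = Jsoup.parse(html);

        // CSS descendant selector: a div with class productTitle
        // anywhere inside a div with class productData.
        Element title = doc.select("div.productData div.productTitle").first();
        if (title == null) {
            return null; // first() returns null when the selection is empty
        }

        // The first anchor in the title div is the product link.
        Element link = title.select("a[href]").first();
        return link == null ? null : link.attr("href");
    }

    public static void main(String[] args) {
        // Shortened version of the markup from the question (assumed structure).
        String html =
            "<div class=\"productData\">"
          + "<div class=\"productTitle\">"
          + "<a href=\"https://rads.stackoverflow.com/amzn/click/com/0786866020\">Fish!</a>"
          + "<span class=\"ptBrand\">by "
          + "<a href=\"/Stephen-C.-Lundin/e/B001H6UE16\">Stephen C. Lundin</a></span>"
          + "</div></div>";

        System.out.println(firstTitleHref(html));
    }
}
```

If the page contains several products, iterate over the Elements returned by select() instead of calling first(). With a live page, replace Jsoup.parse(html) with the Jsoup.connect(...).get() call from the question.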