What is the Jsoup selector for TD elements with no child elements and no COLSPAN attribute?
So I'm trying to parse through a web page that is relatively messy. It contains several key-value pairs that I would like to extract. The unifying theme of these pairs is that they are non-empty, they have no children, and they do not have a COLSPAN attribute. Here's what I've tried, which seems to make sense logically but does not yield any results.
Elements tds = document.select("td:not([colspan]):not(:has(*))");
So I want TDs that:
- Do not have a COLSPAN attribute
- Do not have any children
Seems like I must be close, but just not having any luck. Any thoughts?
Comments (1)
I came up with an answer that uses a loop to filter out the elements you don't want to select. The selector syntax is documented here:
http://jsoup.org/apidocs/org/jsoup/select/Selector.html
I mocked up a table that contains the two cases you are trying to keep out of your select: a TD with a COLSPAN attribute and a TD with a child element.
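A minimal sketch of such a loop, not the code originally posted. The class name `LeafCellLoop`, the method `leafCellTexts`, and the mock table in `SAMPLE_HTML` are all made up for illustration. It selects every TD and skips the ones that have a COLSPAN attribute, have child elements, or contain no text.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class LeafCellLoop {
    // Hypothetical mock table: one plain key-value row, one COLSPAN cell,
    // one cell with a child element, and one empty cell.
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><td>Name</td><td>Alice</td></tr>"
        + "<tr><td colspan=\"2\">Section header</td></tr>"
        + "<tr><td><b>Bold key</b></td><td></td></tr>"
        + "</table>";

    // Returns the text of every TD that has no COLSPAN attribute,
    // no child elements, and some text of its own.
    static List<String> leafCellTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> kept = new ArrayList<>();
        for (Element td : doc.select("td")) {
            if (td.hasAttr("colspan")) continue;      // skip cells with COLSPAN
            if (!td.children().isEmpty()) continue;   // skip cells with child elements
            if (!td.hasText()) continue;              // skip empty cells
            kept.add(td.text());
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(leafCellTexts(SAMPLE_HTML));
    }
}
```

Collecting the kept cells into a new list, rather than removing entries from the selected `Elements`, leaves the parsed document untouched.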
+++++++++++++++++++++++
UPDATE
+++++++++++++++++++++++
I played around with it a little more and came up with this, which shows that your select does work. Your HTML must contain some other little thing that my mock-up doesn't reproduce.
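A sketch of that kind of check, again on a made-up table and not the code originally posted. `LeafCellSelector`, `selectLeafCells`, `nonEmptyTexts`, and `SAMPLE_HTML` are hypothetical names. It runs the selector from the question unchanged, then drops cells without text, since the selector on its own also matches empty TDs.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class LeafCellSelector {
    // Hypothetical mock table with a COLSPAN cell, a cell with a child, and an empty cell.
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><td>Name</td><td>Alice</td></tr>"
        + "<tr><td colspan=\"2\">Section header</td></tr>"
        + "<tr><td><b>Bold key</b></td><td></td></tr>"
        + "</table>";

    // The selector from the question: TDs without COLSPAN and without descendant elements.
    static Elements selectLeafCells(String html) {
        return Jsoup.parse(html).select("td:not([colspan]):not(:has(*))");
    }

    // The selector does not test for text, so empty cells are filtered out here.
    static List<String> nonEmptyTexts(Elements cells) {
        List<String> texts = new ArrayList<>();
        for (Element td : cells) {
            if (td.hasText()) {
                texts.add(td.text());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        Elements cells = selectLeafCells(SAMPLE_HTML);
        System.out.println(cells.size() + " cells matched the selector");
        System.out.println(nonEmptyTexts(cells));
    }
}
```

If the same selector returns nothing on the real page, one thing worth checking is whether the cell text is wrapped in another element such as `<b>` or `<font>`, because any child element makes `:not(:has(*))` reject the cell.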