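Extract data from a div class with Python Selenium
I'm trying to extract a specific number from a div class using Python Selenium, but I can't figure out how. I want to get the "post_parent" ID 947630 whenever it belongs to a "post_name" number that starts with 09007.
I'd like to do this across multiple "post_name" classes, so I would feed it something like search_text = "0900766b80090cb6". There will be more of these in the future, so it has to read the "post_name" first and then pull the "post_parent", if that makes sense.
Thanks for any advice.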
<div class="hidden" id="inline_947631">
<div class="post_title">Interface Converter</div>
<div class="post_name">0900766b80090cb6</div>
<div class="post_author">28</div>
<div class="comment_status">closed</div>
<div class="ping_status">closed</div>
<div class="_status">inherit</div>
<div class="jj">06</div>
<div class="mm">07</div>
<div class="aa">2001</div>
<div class="hh">15</div>
<div class="mn">44</div>
<div class="ss">17</div>
<div class="post_password"></div>
<div class="post_parent">947630</div>
<div class="page_template">default</div>
<div class="tags_input" id="rs-language-code_947631">de</div>
</div>
Comments (3)
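If you look at the HTML,
<div class="post_name">0900766b80090cb6</div>
and
<div class="post_parent">947630</div>
are sibling nodes of each other. You can use the XPath following-sibling axis to get from one to the other, either with a plain find_element call or with an explicit wait (WebDriverWait together with expected_conditions and By).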
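The code that originally accompanied this answer did not survive on this page, so here is a minimal sketch of the sibling lookup rather than the answerer's exact code. The function names are hypothetical, and it assumes the post_name values contain no single quotes. It waits for presence instead of visibility and reads textContent, on the assumption that class="hidden" really hides the container. It also uses a single slash before following-sibling, which is enough for this markup.

```python
def post_parent_xpath(post_name):
    """Build an XPath that selects the post_parent sibling of an exact post_name."""
    return (
        f"//div[@class='post_name' and text()='{post_name}']"
        "/following-sibling::div[@class='post_parent']"
    )


def get_post_parent(driver, post_name, timeout=20):
    """Wait for the post_parent div to be in the DOM and return its text."""
    # Selenium is imported here so the XPath helper above can be used
    # (and checked) without a browser or Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.XPATH, post_parent_xpath(post_name)))
    )
    # .text can be empty for elements that are not displayed, so read
    # textContent instead (assumption: the hidden class hides the container).
    return element.get_attribute("textContent").strip()
```

With a live driver already on the page, get_post_parent(driver, "0900766b80090cb6") should return the post_parent text for that record.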
Update:
Please check in the dev tools (Google Chrome) whether the XPath has a unique entry in the HTML DOM. The XPath you should check:
//div[@class='post_name' and text()='0900766b80090cb6']//following-sibling::div[@class='post_parent']
Steps to check:
Press F12 in Chrome -> go to the Elements section -> press CTRL + F -> paste the XPath and see whether your desired element is highlighted with 1/1 matching node.
If it is unique, then you need to check the conditions below as well.
Check if it's in any iframe/frame/frameset. Solution: switch to the iframe/frame/frameset first and then interact with this web element.
Check if it's in any shadow-root. Solution: use
driver.execute_script('return document.querySelector
to have a web element returned, and then operate on it accordingly.
Make sure that the element is rendered properly before interacting with it. Add a hardcoded delay or an explicit wait and try again. Solution:
time.sleep(5)
or
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='post_name' and text()='0900766b80090cb6']//following-sibling::div[@class='post_parent']"))).text
If you have been redirected to a new tab or new window and have not switched to it, you will likely get a NoSuchElement exception. Solution: switch to the relevant window/tab first.
If you have switched to an iframe and the desired element is not in the same iframe context, first switch to default content and then interact with it. Solution: switch to default content and then switch to the respective iframe.
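The iframe checks above can be sketched as a small helper. This is only an illustration under assumptions: the function name is hypothetical, it looks only at the top-level document and its direct iframes/frames (no nested frames, no shadow roots, no window switching), and the caller passes the locator as a (by, value) tuple such as (By.XPATH, "...").

```python
def find_in_page_or_frames(driver, locator):
    """Return the first element matching `locator` in the top-level document
    or in one of its direct iframes/frames, or None if nothing matches.

    On success the driver stays switched into the context that holds the
    element, so the caller can keep interacting with it.
    """
    driver.switch_to.default_content()
    matches = driver.find_elements(*locator)
    if matches:
        return matches[0]

    # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
    frames = driver.find_elements("css selector", "iframe, frame")
    for frame in frames:
        driver.switch_to.frame(frame)
        matches = driver.find_elements(*locator)
        if matches:
            return matches[0]
        # Return to the top-level document before trying the next frame.
        driver.switch_to.default_content()
    return None
```

If the helper returns None, the driver is back at the default content, which matches the last condition in the list: leave the current frame before looking elsewhere.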
I don't see any specific relation between the "post_parent" ID 947630 and the "post_name" number starting with 09007. Moreover, the parent <div> has class="hidden".
However, to pull the specific number you can locate the element with either a css_selector or an xpath.
Ideally, you should induce WebDriverWait for presence_of_element_located(), again with either a CSS_SELECTOR or an XPATH locator.
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
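The selectors from this answer were lost on this page, so the following is a hedged sketch of the CSS route rather than the original code. CSS selectors cannot match on element text, so the sketch reads each record with CSS and filters by prefix in Python. It assumes every record sits in a container whose id starts with inline_ (only inline_947631 is shown in the question) and that each container has both a post_name and a post_parent child. The function names are hypothetical.

```python
def pick_post_parents(rows, prefix="09007"):
    """Map post_name -> post_parent for (name, parent) pairs whose name starts with prefix."""
    return {
        name.strip(): parent.strip()
        for name, parent in rows
        if name.strip().startswith(prefix)
    }


def collect_post_parents(driver, prefix="09007", timeout=20):
    """Read every inline_* container on the page and filter by post_name prefix."""
    # Imported here so pick_post_parents can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    containers = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div[id^='inline_']"))
    )
    rows = []
    for box in containers:
        # textContent is used because the containers have class="hidden".
        name = box.find_element(By.CSS_SELECTOR, "div.post_name").get_attribute("textContent")
        parent = box.find_element(By.CSS_SELECTOR, "div.post_parent").get_attribute("textContent")
        rows.append((name, parent))
    return pick_post_parents(rows, prefix)
```

This handles the "multiple post_name" case from the question in one pass, at the cost of reading every container on the page.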
You can create a method and use an xpath to get the post_parent text based on the post_name text. It returns a value if the post_name text matches
starts-with('09007')
It seems the parent class is hidden, so you need to use
textContent
to get the value.
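The XPath from this answer is missing here, so this is one possible way to write such a method, not necessarily the answerer's version. The function names are hypothetical and the prefix is assumed to contain no single quotes.

```python
def post_parent_by_prefix_xpath(prefix):
    """XPath for the post_parent siblings of every post_name starting with prefix."""
    return (
        f"//div[@class='post_name' and starts-with(text(), '{prefix}')]"
        "/following-sibling::div[@class='post_parent']"
    )


def get_post_parents_by_prefix(driver, prefix="09007"):
    """Return the post_parent text of every record whose post_name starts with prefix."""
    # Imported here so the XPath helper works without Selenium installed.
    from selenium.webdriver.common.by import By

    elements = driver.find_elements(By.XPATH, post_parent_by_prefix_xpath(prefix))
    # The parent container is hidden, so read textContent rather than .text.
    return [element.get_attribute("textContent").strip() for element in elements]
```

find_elements returns an empty list when nothing matches, so the method gives back [] instead of raising an exception.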