Finding an XPath with get_attribute() in Python Selenium
This is a somewhat backwards approach to web scraping. I need to get the XPath of a web element after I have already found it with a text()= match.
Because the XPath values differ depending on what information shows up, I need to use predictable labels inside the row to locate the span text next to the found element. A simple and reliable way I found is to locate the keyword label and then increase the td index by one inside the XPath.
def x_label(self, contains):
    mls_data_xpath = f"//span[text()='{contains}']"
    string = self.driver.find_element_by_xpath(mls_data_xpath).get_attribute("xpath")
    digits = string.split("td[")[1]
    num = int(re.findall(r'(\d+)', digits)[0]) + 1
    labeled_data = f'{string.split("td[")[0]}td[{num}]/span'
    print(labeled_data)
    labeled_text = self.driver.find_element_by_xpath(labeled_data).text
    return labeled_text
I cannot find much information on .get_attribute() and .get_property(), so I am hoping there is something like .get_attribute("xpath"), but I haven't been able to find it.
Basically, I take in a string like "ApprxTotalLivArea" that I can rely on, and then increase the integer in the td[] step by 1 to find the span data in the cell next door. I am hoping for something like get_attribute("xpath") that returns the XPath string of the element I locate through my text()='{contains}' search.
This function iteratively gets the parent until it reaches the html element at the top.
Hope this helps!
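The function itself is not shown in this answer, so the following is only a minimal sketch of the idea it describes, not the answer's original code. It assumes a Selenium WebElement (anything with tag_name and find_element); the name xpath_of is made up for illustration, and the locator strategy is passed as the plain string "xpath", which is the value of Selenium's By.XPATH.

```python
def xpath_of(element):
    """Build an absolute XPath by walking from element up to <html>.

    Sketch only: no positional indices are added, so the result may
    match more than one node when siblings share a tag name.
    """
    parts = []
    current = element
    while current.tag_name.lower() != "html":
        parts.append(current.tag_name.lower())
        # "xpath" is the string value of By.XPATH; ".." selects the parent.
        current = current.find_element("xpath", "..")
    parts.append("html")
    # Steps were collected bottom-up, so reverse them for the final path.
    return "/" + "/".join(reversed(parts))
```

Calling xpath_of(driver.find_element("xpath", "//span[text()='ApprxTotalLivArea']")) would then return a path such as /html/body/.../td/span, depending on the page.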
I was able to find a Python version of the execute script in this post, which was based on a JavaScript answer in another forum. I had to make a lot of .replace() calls on the string this function creates, but I was able to reliably find the label string I need and increment the td/span XPath by 1 to find the column data and retrieve it, regardless of differences in XPath values between page listings.
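The increment step can be sketched without a browser, since it only rewrites a string. This assumes the generated XPath contains an indexed td[n] step for the label's cell and that the value sits in a span inside the next td; the name next_cell_xpath is hypothetical. Unlike the code in the question, it uses the last td[n] in the path rather than the first, which is a guess at what is wanted when tables are nested.

```python
import re


def next_cell_xpath(label_xpath):
    """Return the XPath of the span in the td right after the label's td."""
    # The last "td[n]" step in the path is taken to be the label's cell.
    matches = list(re.finditer(r"td\[(\d+)\]", label_xpath))
    if not matches:
        raise ValueError(f"no indexed td step in {label_xpath!r}")
    last = matches[-1]
    next_index = int(last.group(1)) + 1
    # Keep everything before the label's td and point at the neighbouring cell.
    return f"{label_xpath[:last.start()]}td[{next_index}]/span"
```

For example, next_cell_xpath("/html/body/table/tbody/tr[3]/td[1]/span") gives "/html/body/table/tbody/tr[3]/td[2]/span".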
An upgrade of Tom Fuller's function. The following helps to find the correct XPath when the parent element contains several elements with the same tag_name (and, for example, the same class):
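The upgraded function is not shown here either, so this is a sketch of one way to do what the answer describes, not its original code. It assumes a Selenium WebElement and counts preceding siblings with the same tag to produce a positional index for every step; indexed_xpath_of is a made-up name, and "xpath" is again the string value of By.XPATH.

```python
def indexed_xpath_of(element):
    """Build an absolute XPath with a positional index on every step."""
    parts = []
    current = element
    while current.tag_name.lower() != "html":
        tag = current.tag_name.lower()
        # Same-tag siblings that come before this element in the parent.
        preceding = current.find_elements("xpath", f"preceding-sibling::{tag}")
        # XPath positions are 1-based.
        parts.append(f"{tag}[{len(preceding) + 1}]")
        current = current.find_element("xpath", "..")
    parts.append("html")
    return "/" + "/".join(reversed(parts))
```

Because every td step carries an index, the output has the td[n] form that the increment trick from the question relies on.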
The Remote WebElement does include a number of methods, but xpath isn't a valid attribute or property of a WebElement, so get_attribute("xpath") will always return None.
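Since the element's own XPath cannot be read back, one alternative is to let XPath find the neighbouring cell directly with the following-sibling axis. This is a sketch under assumptions taken from the question: the label span is a direct child of a td, the value is a span in the next td, and the label contains no single quote. The function names are hypothetical, and "xpath" is the string value of Selenium's By.XPATH.

```python
def adjacent_cell_xpath(label):
    """XPath for the span in the td that follows the td holding the label."""
    # Assumes label has no single quote, which would break the string literal.
    return f"//td[span[text()='{label}']]/following-sibling::td[1]/span"


def adjacent_cell_text(driver, label):
    """Return the text of the cell next to the given label."""
    return driver.find_element("xpath", adjacent_cell_xpath(label)).text
```

With this, adjacent_cell_text(driver, "ApprxTotalLivArea") would replace the lookup, split and increment steps in x_label, provided the page matches the assumed layout.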