How to write CSS/XPath for dynamically changing elements?
I am using Beautiful Soup, and below is the selector I use to scrape the href.
from bs4 import BeautifulSoup

html = '''<a data-testid="Link" class="sc-pciXn eUevWj JobTile___StyledJobLink-sc-1nulpkp-0 gkKKqP JobTile___StyledJobLink-sc-1nulpkp-0 gkKKqP" href="https://join.com/companies/talpasolutions/4978529-project-customer-success-manager-heavy-industries-d-f-m">'''
soup = BeautifulSoup(html, "lxml")
# Matches the full, exact class string, so it breaks whenever the generated class names change
jobs = soup.find_all("a", class_="sc-pciXn eUevWj JobTile___StyledJobLink-sc-1nulpkp-0 gkKKqP JobTile___StyledJobLink-sc-1nulpkp-0 gkKKqP")
for job in jobs:
    job_url = job.get("href")
I am using find_all because there are three elements with hrefs in total.
The method above works, but the website changes the class names on a daily basis. I need a different way to write the CSS selector or XPath.
Comments (1)
Try:
Prints:
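One way to avoid depending on the full generated class string is to match only the parts of the element that look stable. The sketch below is an illustration under assumptions, not a confirmed solution for this site: it assumes the `JobTile___StyledJobLink` prefix and the `data-testid="Link"` attribute stay constant while the hashed suffixes change, and that job links always contain `/companies/` in the href. The helper names `job_urls_by_class_prefix` and `job_urls_by_attributes` are hypothetical, and `html.parser` is used only to keep the example free of extra dependencies.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question (closing tag added for completeness)
html = '''<a data-testid="Link" class="sc-pciXn eUevWj JobTile___StyledJobLink-sc-1nulpkp-0 gkKKqP" href="https://join.com/companies/talpasolutions/4978529-project-customer-success-manager-heavy-industries-d-f-m">Job</a>'''

soup = BeautifulSoup(html, "html.parser")


def job_urls_by_class_prefix(soup):
    # Substring match on the class attribute: assumes only the hashed
    # suffix changes, not the "JobTile___StyledJobLink" component name.
    links = soup.select('a[class*="JobTile___StyledJobLink"][href]')
    return [a["href"] for a in links]


def job_urls_by_attributes(soup):
    # Ignores classes entirely: assumes data-testid and the
    # "/companies/" URL pattern are stable on the real page.
    links = soup.select('a[data-testid="Link"][href*="/companies/"]')
    return [a["href"] for a in links]


print(job_urls_by_class_prefix(soup))
print(job_urls_by_attributes(soup))
```

Both functions return a list of href strings, so they can replace the `find_all` loop directly. If lxml is preferred, an equivalent XPath for the second approach would be `//a[@data-testid="Link"][contains(@href, "/companies/")]`.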