Error: 'WebElement' object has no attribute 'startsWith'

Posted 2025-02-09 07:19:12

import time
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()

# options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1920x1080")
options.add_argument("--disable-extensions")

chrome_driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=options
)


def supplyvan_scraper():
    with chrome_driver as driver:
        driver.implicitly_wait(15)
        URL = 'https://www.ifep.ro/justice/lawyers/lawyerspanel.aspx'
        driver.get(URL)
        time.sleep(3)
        
        link=driver.find_elements(By.XPATH, "//div[@class='list-group']//a")
        for links in link:
            if(links.startsWith("https://www.ifep.ro/")):
                print(links.get_attribute("href"))

I get the error below on these lines. The page has some unwanted links that I want to filter out. This is the page: https://www.ifep.ro/justice/lawyers/lawyerspanel.aspx

'WebElement' object has no attribute 'startsWith'


Comments (3)

九八野马 2025-02-16 07:19:12

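This is because a WebElement is not a string. You first have to read the href attribute from the WebElement, and then call startswith (all lowercase in Python) on the resulting string.

Here is the complete code: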

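import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_options = Options()
# chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--window-size=1920x1080")
chrome_options.add_argument("--disable-extensions")

# Selenium 4 takes the driver path through a Service object;
# the executable_path argument has been removed in recent releases.
driver = webdriver.Chrome(service=Service("./chromedriver"), options=chrome_options)
driver.get("https://www.ifep.ro/justice/lawyers/lawyerspanel.aspx")
driver.maximize_window()
time.sleep(3)

# find_elements_by_xpath was removed in Selenium 4; use find_elements(By.XPATH, ...).
links = driver.find_elements(By.XPATH, "//div[@class='list-group']//a")
for link in links:
    # get_attribute returns a string, or None when the anchor has no href.
    link_href = link.get_attribute("href")
    if link_href and link_href.startswith("https://www.ifep.ro/"):
        print(link_href)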

If you only want to change the loop, this is the modified part:

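# By is already imported in the question's code.
links = driver.find_elements(By.XPATH, "//div[@class='list-group']//a")
for link in links:
    # get_attribute returns a string, or None when the anchor has no href.
    link_href = link.get_attribute("href")
    if link_href and link_href.startswith("https://www.ifep.ro/"):
        print(link_href)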

Output:

https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33353&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=34493&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=15868&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33526&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33459&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=9100&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=27125&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=24811&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=1932&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=7746&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=18864&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=23966&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=3840&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=16192&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=16350&Signature=387599
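
The prefix check itself works on plain strings, so it can be tried without a browser. The following is a minimal sketch, not part of the original answer: the helper name filter_hrefs and the sample list are made up for illustration, and the None entry stands in for an anchor that has no href.

```python
from typing import Iterable, List, Optional


def filter_hrefs(hrefs: Iterable[Optional[str]], prefix: str) -> List[str]:
    """Return only the hrefs that are strings starting with prefix."""
    # Skip None first: calling startswith on None would raise AttributeError.
    return [h for h in hrefs if h is not None and h.startswith(prefix)]


# Hypothetical sample values, standing in for link.get_attribute("href") results.
sample = [
    "https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33353&Signature=387599",
    None,
    "javascript:void(0)",
    "https://example.com/",
]

kept = filter_hrefs(sample, "https://www.ifep.ro/")
print(kept)
```

With Selenium, the same helper could be fed a generator such as (a.get_attribute("href") for a in links), which keeps the string handling separate from the browser calls.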

桃扇骨 2025-02-16 07:19:12

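By filtering the links on a partial @href you are trying to solve an XY problem. There is no need to filter the links; just use the correct XPath to select the ones you need: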

links = driver.find_elements(By.XPATH, "//td/div[@class='list-group']/a")
for link in links:
    print(link.get_attribute("href"))
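
To see why the narrower expression returns fewer links, the two XPath queries can be compared on a small document. This is only a sketch under stated assumptions: the markup below is invented for illustration (the real page's structure is not shown in this thread), and it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath, rather than the browser's XPath engine that Selenium relies on.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE = """
<html><body>
  <div class="list-group"><a href="#top">Back to top</a></div>
  <table><tr><td>
    <div class="list-group">
      <a href="https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=1">Lawyer 1</a>
      <span><a href="javascript:void(0)">Details</a></span>
    </div>
  </td></tr></table>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())

# Broad query: every <a> anywhere below any list-group div.
broad = [a.get("href") for a in root.findall(".//div[@class='list-group']//a")]

# Narrow query: only <a> elements that are direct children of a
# list-group div that itself sits directly inside a <td>.
narrow = [a.get("href") for a in root.findall(".//td/div[@class='list-group']/a")]

print(broad)
print(narrow)
```

In this sample the broad query also picks up the navigation link and the nested javascript: link, while the narrow query keeps only the lawyer link, so no string filtering is needed afterwards.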

箜明 2025-02-16 07:19:12

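Links have several properties: target, location, text, and so on.

You most likely want the text. In Selenium's Python bindings that is the text property (getText() is the Java method):

 links.text

should work if the visible link text is what you want to test. For the URL itself, read the href attribute with get_attribute("href") instead.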
