Error: 'WebElement' object has no attribute 'startsWith'

Posted on 2025-02-09 07:19:12

import time
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()

# options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1920x1080")
options.add_argument("--disable-extensions")

chrome_driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=options
)


def supplyvan_scraper():
    with chrome_driver as driver:
        driver.implicitly_wait(15)
        URL = 'https://www.ifep.ro/justice/lawyers/lawyerspanel.aspx'
        driver.get(URL)
        time.sleep(3)
        
        link=driver.find_elements(By.XPATH, "//div[@class='list-group']//a")
        for links in link:
            if(links.startsWith("https://www.ifep.ro/")):
                print(links.get_attribute("href"))

This code fails with the error below. The page (https://www.ifep.ro/justice/lawyers/lawyerspanel.aspx) contains some unwanted links that I want to filter out:

'WebElement' object has no attribute 'startsWith'

九八野马 2025-02-16 07:19:12

This is because a WebElement is not a string, so it has no string methods. You first have to read the href attribute from the WebElement and then call startswith (all lowercase in Python) on the resulting string.

Here is the complete code:

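import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_options = Options()
# chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--window-size=1920x1080")
chrome_options.add_argument("--disable-extensions")

# Selenium 4 takes the driver path through a Service object instead of executable_path.
driver = webdriver.Chrome(service=Service(executable_path="./chromedriver"), options=chrome_options)
driver.get("https://www.ifep.ro/justice/lawyers/lawyerspanel.aspx")
driver.maximize_window()
time.sleep(3)

# The find_elements_by_* helpers were removed in Selenium 4; use find_elements(By.XPATH, ...).
links = driver.find_elements(By.XPATH, "//div[@class='list-group']//a")
for link in links:
    # get_attribute returns a string, or None when the anchor has no href.
    link_href = link.get_attribute("href")
    if link_href and link_href.startswith("https://www.ifep.ro/"):
        print(link_href)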

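If you only want to change your existing script, replace the loop with this:

links = driver.find_elements(By.XPATH, "//div[@class='list-group']//a")
for link in links:
    # Read the href string first; the WebElement itself has no startswith method.
    link_href = link.get_attribute("href")
    if link_href and link_href.startswith("https://www.ifep.ro/"):
        print(link_href)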

Output:

https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33353&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=34493&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=15868&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33526&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33459&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=9100&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=27125&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=24811&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=1932&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=7746&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=18864&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=23966&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=3840&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=16192&Signature=387599
https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=16350&Signature=387599
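
The filtering step in this answer works on plain strings, so it can be pulled out into a small helper and checked without starting a browser. The sketch below is only an illustration: the name `filter_hrefs`, the `PREFIX` constant, and the sample list are made up for this example and are not part of Selenium or of the original code.

```python
from typing import Iterable, List, Optional

# Assumed prefix, taken from the question; adjust as needed.
PREFIX = "https://www.ifep.ro/"


def filter_hrefs(hrefs: Iterable[Optional[str]], prefix: str = PREFIX) -> List[str]:
    """Keep only the href strings that start with the given prefix."""
    # An anchor without an href gives None, so falsy values are skipped
    # before startswith is called.
    return [h for h in hrefs if h and h.startswith(prefix)]


# Hypothetical sample values standing in for link.get_attribute("href") results.
sample = [
    "https://www.ifep.ro/justice/lawyers/LawyerFile.aspx?RecordId=33353&Signature=387599",
    "https://example.com/elsewhere",
    None,
    "javascript:void(0)",
]
kept = filter_hrefs(sample)
print(kept)
```

With a live driver, the same helper could be fed directly from the elements, for example `filter_hrefs(link.get_attribute("href") for link in links)`.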

桃扇骨 2025-02-16 07:19:12

Filtering the links by a partial @href is an XY problem. There is no need to filter the links afterwards: just use an XPath that selects only the required links in the first place:

links = driver.find_elements(By.XPATH, "//td/div[@class='list-group']/a")
for link in links:
    print(link.get_attribute("href"))
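
To see why a narrower XPath can replace the href filter, here is a rough, browser-free sketch using Python's standard `xml.etree.ElementTree`. The markup in `SAMPLE` is invented for illustration and is not the real structure of the ifep.ro page. ElementTree also supports only a subset of XPath, so the expressions start with `.` here, whereas Selenium hands the full expression to the browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT the real page: one list-group outside the table
# (unwanted navigation link) and one inside a table cell (wanted links).
SAMPLE = """
<html><body>
  <div class="list-group">
    <a href="/nav/home">Home</a>
  </div>
  <table><tr><td>
    <div class="list-group">
      <a href="/justice/lawyers/LawyerFile.aspx?RecordId=1">Lawyer 1</a>
      <a href="/justice/lawyers/LawyerFile.aspx?RecordId=2">Lawyer 2</a>
    </div>
  </td></tr></table>
</body></html>
"""

root = ET.fromstring(SAMPLE.strip())

# Broad expression: every anchor under any list-group div.
broad = [a.get("href") for a in root.findall(".//div[@class='list-group']//a")]

# Narrow expression: only anchors directly inside a list-group div that sits in a td.
narrow = [a.get("href") for a in root.findall(".//td/div[@class='list-group']/a")]

print(len(broad), len(narrow))
```

In this made-up document the broad expression also picks up the navigation link, while the narrow one returns only the two links inside the table cell, so no post-filtering is needed.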
