Python WebDriver: find_element() cannot find the element

Posted on 2025-02-07 14:23:15

I'm learning the very basics of web scraping by following Chapter 12 of Automate the Boring Stuff with Python, but I'm having an issue with the find_element() method. When I use the method to look for an element with the class name 'card-img-top cover-thumb', it doesn't return any matches. However, the code does work for URLs other than the example in the book.

I have had to make quite a few changes to the code as written in order to get it to do anything. I've posted the full code on GitHub HERE, but to summarise:

  • The book says to use 'find_element_by_*' methods, but these were producing deprecation messages that directed me to use find_element() instead.

  • To use this other method, I import 'By'.

  • I also import 'Service' from 'selenium.webdriver.chrome.service' because ChromeDriver doesn't work otherwise.

  • I also define options with webdriver.ChromeOptions() that hide certain error messages about a faulty device, which apparently you're just supposed to ignore?

  • I put the code from the book into a function with 'url' and 'classname' arguments so I can test different URLs without having to edit the code repeatedly.

Here is the 'business-part' of the code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service  
from selenium.webdriver.common.by import By

s=Service(r'C:\Users\antse\AppData\Local\Chrome_WebDriver\chromedriver.exe')

op = webdriver.ChromeOptions()
op.add_experimental_option('excludeSwitches', ['enable-logging'])

def FNC_GET_CLASS_ELEMENT_FROM_PAGE(URL, CLASSNAME):       
    browser = webdriver.Chrome(service = s, options = op)
    browser.get(URL)
    try:  
        elem = browser.find_element(By.CLASS_NAME, CLASSNAME)
        print('Found <%s> element with that class name!' % (elem.tag_name))
    except:
        print('Was not able to find an element with that name.')

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')

Expected output: Found <img> element with that class name!

Since the code does work when I look at a site like Wikipedia, I wonder if there have been changes to the HTML of the page that prevent the scrape from working properly?

Link to the book chapter HERE.

I appreciate any advice you can give me!

Comments (1)

花伊自在美 2025-02-14 14:23:15

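You can't pass multiple classes to find_element() with By.CLASS_NAME. It accepts a single class name, so a space-separated value such as 'card-img-top cover-thumb' won't match. So replace this: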

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')

with this:

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top')

If you really want to use both classes, then take a look at this answer which explains things in detail.
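For what it's worth, one common way to match on both classes is a CSS selector that chains them, e.g. '.card-img-top.cover-thumb'. Here is a minimal sketch of that idea. The helper names classes_to_css_selector and find_element_with_all_classes are made up for illustration (they are not part of Selenium), and the sketch assumes plain class names with no characters that need CSS escaping:

```python
def classes_to_css_selector(class_attr, tag=''):
    """Turn a space-separated class attribute into a chained CSS selector.

    Hypothetical helper, not part of Selenium. Assumes simple class names
    that do not need CSS escaping.
    """
    class_names = class_attr.split()
    if not class_names:
        raise ValueError('class_attr must contain at least one class name')
    # '.a.b' matches elements that carry class a AND class b.
    return tag + ''.join('.' + name for name in class_names)


def find_element_with_all_classes(browser, class_attr, tag=''):
    """Find the first element that carries every class in class_attr."""
    # Imported here so the selector helper works without Selenium installed.
    from selenium.webdriver.common.by import By
    selector = classes_to_css_selector(class_attr, tag)
    return browser.find_element(By.CSS_SELECTOR, selector)
```

Inside your function you could then call find_element_with_all_classes(browser, CLASSNAME) instead of browser.find_element(By.CLASS_NAME, CLASSNAME). Passing tag='img' narrows the match to <img> elements, assuming that is the tag you expect.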
