Web scraper does not scrape the entire page

Posted on 2025-02-08 02:53:40

I want to scrape and display the names of all the cars from this dealership webpage:

https://www.herbchambers.com/used-inventory/index.htm?geoZip=02108&geoRadius=0

I located the corresponding XPath and figured out the pattern within it, so that I can build the XPath of every single car name on the page.

x = 1
while True:
    the_xpath = f"/html/body/div[2]/div/div/div[8]/div/div[2]/div[1]/div/ul/li[{x}]/div[1]/div[2]/h2/a"
    car_name = driver.find_element(By.XPATH, the_xpath)
    car_name.location_once_scrolled_into_view
    print(car_name.text)
    x += 1

It works perfectly fine and prints the names of the first 7-9 cars (the number varies every time). However, it then always terminates with a NoSuchElementException without finishing the entire page.

I was wondering if anyone could help me solve this issue and figure out why it only works halfway.

这样的小城市 2025-02-15 02:53:40

In general, defining the path by absolute position is risky when working with Selenium: if any element inside the box that contains a car is missing, the positions shift and Selenium returns that error. For example, suppose each division represents a car and holds three elements (generated by JavaScript when the data is present): one for the name, one for the color, and one for the price. If you collect the price by taking the third element of each division, then a car whose color is not present has only two elements, the third one does not exist, and the script crashes.
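The failure mode described above can be reproduced without a browser. The sketch below is only an illustration: it uses Python's standard `xml.etree.ElementTree` instead of Selenium, and the markup, class names, and prices are made up for the example, not taken from the dealership page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second car has no color element.
PAGE = """
<ul>
  <li><div class="name">Car A</div><div class="color">Red</div><div class="price">$10,000</div></li>
  <li><div class="name">Car B</div><div class="price">$12,000</div></li>
</ul>
"""

root = ET.fromstring(PAGE)


def text_or_none(element):
    """Return the element's text, or None when the lookup found nothing."""
    return element.text if element is not None else None


# Positional lookup: "the third div of each car" breaks when a div is missing.
by_position = [text_or_none(li.find("div[3]")) for li in root.findall("li")]

# Attribute lookup: "the div whose class is price" does not depend on siblings.
by_class = [text_or_none(li.find("div[@class='price']")) for li in root.findall("li")]

print(by_position)  # ['$10,000', None]
print(by_class)     # ['$10,000', '$12,000']
```

In Selenium the positional lookup would raise NoSuchElementException instead of returning None, but the cause is the same: the third div is simply not there.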

A better approach is to locate elements by attributes whose values describe the information they hold. In your case, first find the div whose class contains 'vehicle-card-details-container'; this div contains an h2, which in turn contains the a, like this:

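from selenium.webdriver.common.by import By

# find_elements returns every matching link currently in the DOM, so no index counter is needed
car_names = driver.find_elements(By.XPATH, '//div[contains(@class,"vehicle-card-details-container")]//h2/a')
for car_name in car_names:
    car_name.location_once_scrolled_into_view  # accessing this property scrolls the link into view
    print(car_name.text)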
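If the page adds more cars to the DOM only as you scroll (the varying count of 7-9 in the question hints at this, but it is an assumption and not confirmed in this thread), a single lookup sees only the cards rendered so far. A rough sketch of one way to handle that is to re-query after scrolling until no new links appear. The function name `collect_car_names`, the `max_rounds` and `pause` parameters, and the fixed sleep are illustrative choices; an explicit wait may be more reliable on a real page.

```python
import time

# Class-based XPath from the answer above.
CAR_LINK_XPATH = '//div[contains(@class,"vehicle-card-details-container")]//h2/a'


def collect_car_names(driver, max_rounds=20, pause=1.0):
    """Scroll through the listing and return the text of every car link found.

    Hypothetical helper: `driver` is assumed to be a Selenium WebDriver that
    has already loaded the inventory page.
    """
    names = []
    seen = 0
    for _ in range(max_rounds):
        # "xpath" is the string value of By.XPATH; passing By.XPATH works the same way.
        links = driver.find_elements("xpath", CAR_LINK_XPATH)
        if len(links) == seen:
            break  # nothing new appeared since the last round, assume we are done
        for link in links[seen:]:
            # Scrolling to each new link gives the page a chance to load more cards.
            driver.execute_script("arguments[0].scrollIntoView();", link)
            names.append(link.text)
        seen = len(links)
        time.sleep(pause)  # crude pause so newly requested cards can render
    return names
```

With a live session this would be called as `names = collect_car_names(driver)` after `driver.get(...)`, followed by printing each name.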
