Python/Selenium - Unable to access elements inside a section tag

Posted 2025-01-11 15:45:05 · 8681 characters · 0 views · 0 comments

I'm using Selenium to scrape a web page for product model numbers. The page has two sections, each a grid of products, with a card between them. I can grab the model numbers from the first section, "browse-search-pods-1", but I can't access the elements in the second section, "browse-search-pods-2", on the bottom half of the page; that section is ignored. There are 24 products, but only the first 12, from the first section, are scraped. How can I access both sections?

Here's the website:
https://www.homedepot.com/b/Building-Materials-Drywall/N-5yc1vZar3d?catStyle=ShowProducts

Here's a sample of the html for one product:

<div class="grid">
   <section id="browse-search-pods-1" class="grid">
      <div class="browse-search__pod col__true-12 col__6-12--xs col__4-12--sm col__3-12--md col__3-12--lg" data-lg-name="Product Pod: 0">
         <div class="desktop product-pod" data-automation-id="podnode" data-type="product">
            <div class="product-pod--padding">
               <a href="/p/USG-Sheetrock-Brand-1-2-in-x-4-ft-x-8-ft-UltraLight-Drywall-14113411708/202530243" target="_blank" rel="noopener noreferrer" class="super-sku__inline-swatch__mini-swatch__more-options">More Options</a>
               <div class="product-pod__title product-pod__title__product">
                  <a href="/p/USG-Sheetrock-Brand-1-2-in-x-4-ft-x-8-ft-UltraLight-Drywall-14113411708/202530243" class="header product-pod--ie-fix">
                     <div class="product-pod--ie-fix product-pod__title-control">
                        <h2 class="product-pod__title product-pod__title__product"><span class="product-pod__title__brand--bold">USG Sheetrock Brand</span><span class="product-pod__title__product">1/2 in. x 4 ft. x 8 ft. UltraLight Drywall</span></h2>
                     </div>
                  </a>
               </div>
               <div class="ratings-and-model-number-container">
                  <div class="product-pod-list__identifiers">
                     <div class="product-identifier product-identifier__model">Model# 14113411708</div>
                  </div>
                  <a href="/p/USG-Sheetrock-Brand-1-2-in-x-4-ft-x-8-ft-UltraLight-Drywall-14113411708/202530243#ratings-and-reviews" data-testid="product-pod__ratings-link">
                     <div class="ratings--6r7g3">
                        <div class="reviews--c43xm reviews--no-margin--c43xm" title=""><span class="stars--c43xm" style="width:89.80600000000001%"></span></div>
                        <span class="ratings__count--6r7g3">
                           (<!-- -->3753<!-- -->)
                        </span>
                     </div>
                  </a>
               </div>
               
            </div>
         </div>
      </div>
   </section>
   <section id="browse-search-pods-2" class="grid">
      <div class="category-cards col__12-12" data-lg-name="Product Pod: 0">
         <div class="category-cards__zone-wrapper category-cards__zone-card">
            <section class="zone-card__zone1">
               <div class="zone-card__header-wrapper">
                  <h2 class="zone-card__header u__bold">Project Guide</h2>
                  <p class="zone-card__header-text">Installing Drywall Project Guide</p>
               </div>
               <div class="zone-card-details">
                  <div class="zone-card-details__image"><img src="https://www.homedepot.com/hdus/en_US/DTCCOMNEW/fetch/FetchRules/FetchPN/how-to-install-drywall-professional-steps-HT-PG-BM.jpg" alt="" class="stretchy" height="1" width="1" loading="lazy"></div>
                  <div class="zone-card-details__description">
                     <div class="zone-card-details__text category-cards-details__text--truncate">Hanging drywall is not difficult if you have patience, the right tools and a friend to help. Follow our instructions to learn more</div>
                     <div class="zone-card-details__actions"><a class="bttn-outline bttn-outline--primary bttn--inline zone-card-details__btn" href="//www.homedepot.com/c/how_to_install_drywall_professional_steps_HT_PG_BM"><span class="bttn__content">Read Our Guide</span></a></div>
                  </div>
               </div>
            </section>
            <section class="zone-card__zone2">
               <div class="zone-card__header-wrapper">
                  <h2 class="u__truncate zone-card__header u__bold">Buying Guide</h2>
                  <p class="zone-card__header-text">Types of Drywall</p>
               </div>
               <div class="zone-card__video-wrapper">
                  <a class="zone-card__vidcap-link" href="//www.homedepot.com/c/ab/types-of-drywall/9ba683603be9fa5395fab90c24feaae">
                     <div class="zone-card-details__image zone-card-details__image--vidcap" style="background-image: url(&quot;https://i3.ytimg.com/vi/4hF9_z3IqaA/mqdefault.jpg&quot;);"></div>
                  </a>
               </div>
               <a class="zone-card__video-link" href="//www.homedepot.com/c/ab/types-of-drywall/9ba683603be9fa5395fab90c24feaae">See Our Tips</a>
            </section>
         </div>
      </div>
      <div class="browse-search__pod col__true-12 col__6-12--xs col__4-12--sm col__3-12--md col__3-12--lg">
         <div class="desktop product-pod" data-automation-id="podnode" data-type="product">
            <div class="product-pod--padding">
               <a href="/p/Westpac-Materials-18-lb-Fast-Set-20-Lite-Setting-Type-Joint-Compound-22165H/100320411" target="_blank" rel="noopener noreferrer" class="super-sku__inline-swatch__mini-swatch__more-options">More Options</a>
               <div class="product-pod__title product-pod__title__product">
                  <a href="/p/Westpac-Materials-18-lb-Fast-Set-20-Lite-Setting-Type-Joint-Compound-22165H/100320411" class="header product-pod--ie-fix">
                     <div class="product-pod--ie-fix product-pod__title-control">
                        <h2 class="product-pod__title product-pod__title__product"><span class="product-pod__title__brand--bold">Westpac Materials</span><span class="product-pod__title__product">18 lb. Fast Set 20 Lite Setting-Type Joint Compound</span></h2>
                     </div>
                  </a>
               </div>
               <div class="ratings-and-model-number-container">
                  <div class="product-pod-list__identifiers">
                     <div class="product-identifier product-identifier__model">Model# 22165H</div>
                  </div>
                  <a href="/p/Westpac-Materials-18-lb-Fast-Set-20-Lite-Setting-Type-Joint-Compound-22165H/100320411#ratings-and-reviews" data-testid="product-pod__ratings-link">
                     <div class="ratings--6r7g3">
                        <div class="reviews--c43xm reviews--no-margin--c43xm" title=""><span class="stars--c43xm" style="width: 94.16%;"></span></div>
                        <span class="ratings__count--6r7g3">(226)</span>
                     </div>
                  </a>
               </div>
            </div>
         </div>
      </div>
   </section>
</div>

Here's the code I've tried to access the second section but I get the model numbers from the first:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.by import By 

options = Options()
options.add_experimental_option('excludeSwitches', ['enable-logging'])

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)

driver.get('https://www.homedepot.com/b/Building-Materials-Drywall/N-5yc1vZar3d?catStyle=ShowProducts')
    
section_two = driver.find_element(By.XPATH, "//section[contains(@id, 'browse-search-pods-2')]")

product_model = section_two.find_elements(By.XPATH, "//div[contains(@class, 'product-identifier product-identifier__model')]")
for model in product_model:
    print(model.text)
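
A note on the second lookup in the code above: in XPath, an expression that starts with "//" is evaluated from the document root, even when it is passed to find_elements on an element such as section_two, which would explain why models from the first section come back. Starting the expression with ".//" restricts it to descendants of that element. The scoping can be tried offline with the standard library. The sketch below uses xml.etree.ElementTree on a trimmed, hypothetical stand-in for the markup (it is not Selenium itself, and ElementTree supports only a subset of XPath):

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the page markup shown above (hypothetical).
PAGE = """
<div class="grid">
  <section id="browse-search-pods-1">
    <div class="product-identifier product-identifier__model">Model# 14113411708</div>
  </section>
  <section id="browse-search-pods-2">
    <div class="product-identifier product-identifier__model">Model# 22165H</div>
  </section>
</div>
"""

# ElementTree has no contains(), so the class attribute is matched exactly.
MODEL = "div[@class='product-identifier product-identifier__model']"

root = ET.fromstring(PAGE)
section_two = root.find(".//section[@id='browse-search-pods-2']")

# Searching from the document root matches the pods of both sections.
whole_page = [div.text for div in root.findall(f".//{MODEL}")]

# Searching relative to section_two matches only its own descendants.
only_section_two = [div.text for div in section_two.findall(f".//{MODEL}")]

print(whole_page)
print(only_section_two)
```

In the Selenium code, the corresponding change would be to begin the second XPath with ".//div[...]" instead of "//div[...]".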

Comments (1)

森林很绿却致人迷途 2025-01-18 15:45:05

Try scrolling to the element browse-search-pods-2 and then do

section_two = driver.find_element(By.XPATH, "//section[contains(@id, 'browse-search-pods-2')]")

For scrolling you can try:

The Java class org.openqa.selenium.interactions.Actions corresponds to the ActionChains class in Python:

from selenium.webdriver.common.action_chains import ActionChains

element = driver.find_element(By.XPATH, "//section[contains(@id, 'browse-search-pods-2')]")

actions = ActionChains(driver)
actions.move_to_element(element).perform()

Or, you can also "scroll into view" via scrollIntoView():

driver.execute_script("arguments[0].scrollIntoView();", element)
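
Putting the scrolling suggestion together with a wait, a minimal sketch might look like the following. It assumes the second section is filled in only after it is scrolled into view, as this answer suggests, and that the section ids are the ones from the question's HTML. The helper names, the 10-second timeout, and the choice to strip the "Model#" prefix are illustrative assumptions, not part of the original code. The locator strings "id" and "xpath" are the values of Selenium's By.ID and By.XPATH constants.

```python
SECTION_IDS = ("browse-search-pods-1", "browse-search-pods-2")
# The leading "." keeps the search inside the element it is called on.
MODEL_XPATH = ".//div[contains(@class, 'product-identifier__model')]"


def model_numbers(section):
    """Return the model numbers shown inside one section element."""
    cells = section.find_elements("xpath", MODEL_XPATH)
    # Each cell reads like "Model# 22165H"; keep only the number part.
    return [cell.text.replace("Model#", "").strip() for cell in cells]


def scrape_models(driver, wait):
    """Scroll to each section in SECTION_IDS and collect its model numbers."""
    models = []
    for section_id in SECTION_IDS:
        section = driver.find_element("id", section_id)
        driver.execute_script("arguments[0].scrollIntoView();", section)
        # Block until at least one model cell is present in this section.
        wait.until(lambda _driver: section.find_elements("xpath", MODEL_XPATH))
        models.extend(model_numbers(section))
    return models


def main():
    # Selenium is imported here so the helpers above can be exercised
    # without a browser; assumes Selenium can locate a Chrome driver.
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(
            "https://www.homedepot.com/b/Building-Materials-Drywall/"
            "N-5yc1vZar3d?catStyle=ShowProducts"
        )
        for model in scrape_models(driver, WebDriverWait(driver, 10)):
            print(model)
    finally:
        driver.quit()
```

Calling main() runs the sketch against the page from the question. If a section never shows any model cells, wait.until raises Selenium's TimeoutException rather than returning an empty list.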