Headless mode doesn't work with Playwright and BeautifulSoup 4
This code is working:
from playwright.sync_api import sync_playwright
from bs4 import BeautifulSoup
from datetime import datetime
import time

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga")
    html = page.content()
    soup = BeautifulSoup(html, 'html.parser')
    valorAppleStore = soup.select("span.as-price-installments")[-2].get_text().replace(" à vista (10% de desconto)", '')
    print(valorAppleStore)
    browser.close()
But if I set headless=True, the code returns an error:
Traceback (most recent call last):
  File "c:/Users/ANDERSONCARVALHODELI/Documents/py/AirpodsPW.py", line 19, in <module>
    valorAppleStore = soup.select("span.as-price-installments")[-2].get_text().replace(" à vista (10% de desconto)", '')
IndexError: list index out of range
I fixed this using:
from playwright.sync_api import sync_playwright
from bs4 import BeautifulSoup
from datetime import datetime
import time

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga")
    time.sleep(1)
    browser.close()
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://www.apple.com/br/shop/product/MV7N2BE/A/airpods-com-estojo-de-recarga")
    html = page.content()
    soup = BeautifulSoup(html, 'html.parser')
    valorAppleStore = soup.select("span.as-price-installments")[-2].get_text().replace(" à vista (10% de desconto)", '')
    print(valorAppleStore)
But I don't think this is the best approach. How do I fix this without first opening the browser with headless=False, and stick to headless=True?
When I print(html) before soup = ..., I see:
<!DOCTYPE html><html><head> <title>Page Not Found - Apple</title> <link rel="stylesheet" href="https://www.apple.com/wss/fonts?families=SF+Pro,v1|SF+Pro+Icons,v1"> <link rel="stylesheet" href="https://www.apple.com/v/errors/c/built/styles/main.built.css" type="text/css"> <link rel="stylesheet" href="https://www.apple.com/v/errors/c/built/styles/overview.built.css" type="text/css"> <link rel="stylesheet" href="https://store.storeimages.cdn-apple.com/4982/store.apple.com/shop/rs-external/rel/us/external.css"> <link rel="stylesheet" href="https://store.storeimages.cdn-apple.com/4982/store.apple.com/shop/rs-globalelements/dist/us/globalelements.css"> <style>.more::after{content: "";}a.pointer, a.more, a.block span.more, button.unbutton.more{padding-right: .7em; background-image: url(https://store.storeimages.cdn-apple.com/4982/store.apple.com/shop/rs-web/2/dist/assets/as-legacy/base/link/res/more.svg); background-repeat: no-repeat; background-position: 100% 50%; background-size: 5px 9px; zoom: 1;}.as-globalfooter-directory-column-section-list a{margin-bottom: .8em; display: block}.as-globalfooter-directory-column-section-list a:last-child{margin-bottom: 0;}.as-globalfooter-mini .as-globalfooter-mini-shop a{color: #06c;}.as-globalfooter .as-globalfooter-mini-legal-copyright, .as-footnotes .as-globalfooter-mini-legal-copyright, .as-globalfooter .as-globalfooter-mini-legal-link, .as-footnotes .as-globalfooter-mini-legal-link{top: -3px; position: relative; z-index: 1;}.as-globalfooter .as-globalfooter-directory+.as-globalfooter-mini, .as-footnotes .as-globalfooter-directory+.as-globalfooter-mini{padding-bottom: 26px;}.container{position: relative;}hr{display: inline-block; border: 0px; border-top: 0.1em solid #CCD2D9; width: 100%}</style></head><body class="page-overview"> <nav data-store-api="/shop/bag/status" id="ac-globalnav"> <div class="ac-gn-content"> <ul class="ac-gn-list"> <a href="/" class="ac-gn-link ac-gn-link-apple"> <p class="ac-gn-link-text">Apple</p></a> <a 
href="/us/shop/goto/store" class="ac-gn-link ac-gn-link-store"> <p class="ac-gn-link-text">Store</p></a> <a href="/mac/" class="ac-gn-link ac-gn-link-mac"> <p class="ac-gn-link-text">Mac</p></a> <a href="/ipad/" class="ac-gn-link ac-gn-link-ipad"> <p class="ac-gn-link-text">iPad</p></a> <a href="/iphone/" class="ac-gn-link ac-gn-link-iphone"> <p class="ac-gn-link-text">iPhone</p></a> <a href="/watch/" class="ac-gn-link ac-gn-link-watch"> <p class="ac-gn-link-text">Watch</p></a> <a href="/airpods/" class="ac-gn-link ac-gn-link-airpods"> <p class="ac-gn-link-text">AirPods</p></a> <a href="/tv-home/" class="ac-gn-link ac-gn-link-tvhome"> <p class="ac-gn-link-text">TV & Home</p></a>
<a href="/services/" class="ac-gn-link ac-gn-link-onlyonapple"> <p class="ac-gn-link-text">Only on Apple</p></a> <a href="/us/shop/goto/buy_accessories" class="ac-gn-link ac-gn-link-accessories"> <p class="ac-gn-link-text">Accessories</p></a> <a href="https://support.apple.com" class="ac-gn-link ac-gn-link-support"> <p class="ac-gn-link-text">Support</p></a> <li class="ac-gn-item ac-gn-item-menu ac-gn-search"> <a id="ac-gn-link-search" class="ac-gn-link ac-gn-link-search" href="/us/search" data-analytics-title="search" data-analytics-intrapage-link="" aria-label="Search apple.com" role="button" aria-haspopup="true"></a> </li><a href="/us/shop/goto/bag" class="ac-gn-link ac-gn-link-bag"> <p class="ac-gn-link-text">Shopping Bag</p></a> </ul> </div></nav> <div id="ac-gn-placeholder"> </div><main id="main" class="main" role="main" data-page-type="overview"> <h1 class="section-headline typography-headline">The page you’re looking for can’t be found.</h1> <aside id="search-wrapper" role="search" data-analytics-region="search" aria-hidden="false"> <form id="searchform-form" class="searchform" action="/us/search" method="get" data-suggestions-url="/search-services/suggestions/"><input id="searchform-input" type="text" class="form-textbox form-textbox-text form-icon-left" aria-labelledby="textbox_label" required="" aria-required="true" data-placeholder-long="Search for Products, Stores, and Help" autocorrect="off" autocapitalize="off" autocomplete="off"><span class="form-label" id="textbox_label" aria-hidden="true">Search apple.com</span> <div id="searchform-submit" class="form-icons-wrapper form-icons-wrapper-left form-icons-focusable" type="submit" aria-label="Submit"><button class="form-icons form-icons-search15"></button></div><div id="searchform-reset" class="button-reset form-icons-wrapper form-icons-focusable" type="reset" disabled="" aria-label="Clear Search"><button class="form-icons form-icons-small form-icons-clearsolid15 form-icon-reset"></button></div></form> 
</aside> <div class="cta-sitemap"> <div class="cta-sitemap"> <a href="/sitemap/" class="more" style="top: bottom">Or see our site map</a> </div></div></main> <footer class="as-globalfooter as-globalfooter-contained"> <div class="as-globalfooter-content"> <div class="as-globalfooter-breadcrumbs"> <a href="/" class="as-globalfooter-breadcrumbs-home"> <p class="as-globalfooter-breadcrumbs-home-icon"></p><p class="as-globalfooter-breadcrumbs-home-label">Apple</p></a> <div class="as-globalfooter-breadcrumbs-path"> <ol class="as-globalfooter-breadcrumbs-list"> <li class="as-globalfooter-breadcrumbs-item breadcrumbs-title"> Page Not Found</li></ol> </div></div><nav class="as-globalfooter-directory with-5-columns"> <div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Shop and Learn</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/us/shop/goto/store">Store</a> <a href="/mac/">Mac</a> <a href="/ipad/">iPad</a> <a href="/iphone/">iPhone</a> <a href="/watch/">Watch</a> <a href="/airpods/">AirPods</a> <a href="/tv-home/">TV & Home</a> <a href="/ipod-touch/">iPod touch</a> <a href="/airtag/">AirTag</a> <a href="/us/shop/goto/buy_accessories">Accessories</a> <a href="/us/shop/goto/giftcards">Gift Cards</a> </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Services</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/apple-music/">Apple Music</a> <a href="/apple-tv-plus/">Apple TV+</a> <a href="/apple-fitness-plus/">Apple Fitness+</a> <a href="/apple-news/">Apple News+</a> <a href="/apple-arcade/">Apple Arcade</a> <a href="/icloud/">iCloud</a> <a href="/apple-one/">Apple One</a> <a href="/apple-card/">Apple Card</a> <a href="/apple-books/">Apple Books</a> <a href="/apple-podcasts/">Apple Podcasts</a> <a 
href="/app-store/">App Store</a> </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Account</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="https://appleid.apple.com/us/">Manage Your Apple ID</a> <a href="/us/shop/goto/account">Apple Store Account</a> <a href="https://www.icloud.com">iCloud.com</a> </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Apple Store</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/retail/">Find a Store</a> <a href="/retail/geniusbar/">Genius Bar</a> <a href="/today/">Today at Apple</a> <a href="/today/camp/">Apple Camp</a> <a href="https://itunes.apple.com/app/apple-store/id375380948">Apple Store App</a> <a href="/us/shop/goto/special_deals">Refurbished and Clearance</a> <a href="/us/shop/goto/payment_plan">Financing</a> <a href="/us/shop/goto/trade_in">Apple Trade In</a> <a href="/us/shop/goto/order/list">Order Status</a> <a href="/us/shop/goto/help">Shopping Help</a> </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Business</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/business/">Apple and Business</a> <a href="/retail/business/">Shop for Business</a> </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Education</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/education/">Apple and Education</a> <a href="/education/k12/how-to-buy/">Shop for K-12</a> <a href="/us/shop/goto/educationrouting">Shop for College</a> </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 
class="as-globalfooter-directory-column-section-title">For Healthcare</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/healthcare/">Apple in Healthcare</a> <a href="/healthcare/apple-watch/">Health on Apple Watch</a> <a href="/healthcare/health-records/">Health Records on iPhone</a> </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">For Government</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/r/store/government/">Shop for Government</a> <a href="/us/shop/goto/eppstore/veteransandmilitary">Shop for Veterans and Military</a> </ul> </div></div><div class="as-globalfooter-directory-column"> <div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">Apple Values</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/accessibility/">Accessibility</a> <a href="/education/connectED/">Education</a> <a href="/environment/">Environment</a> <a href="/diversity/">Inclusion and Diversity</a> <a href="/privacy/">Privacy</a> <a href="/racial-equity-justice-initiative/">Racial Equity
and Justice</a> <a href="/supplier-responsibility/">Supplier Responsibility</a> </ul> </div><div class="as-globalfooter-directory-column-section"> <h3 class="as-globalfooter-directory-column-section-title">About Apple</h3> <ul class="as-globalfooter-directory-column-section-list"> <a href="/newsroom/">Newsroom</a> <a href="/leadership/">Apple Leadership</a> <a href="/careers/us/">Career Opportunities</a> <a href="https://investor.apple.com">Investors</a> <a href="/compliance/">Ethics & Compliance</a> <a href="/apple-events/">Events</a> <a href="/contact/">Contact Apple</a> </ul> </div></div></nav> <div class="as-globalfooter-mini"> <div class="as-globalfooter-mini-shop">More ways to shop:
<a href="/retail/">Find an Apple Store</a> or <a href="https://locate.apple.com/">other retailer</a> near you. <span>Or call 1-800-MY-APPLE.</span> </div><div class="as-globalfooter-mini-locale"> <a class="as-globalfooter-mini-locale-link" href="/choose-country-region/" title="Choose your country or region" aria-label="United States. Choose your country or region" data-analytics-title="choose your country">United States</a> </div><p class="as-globalfooter-mini-legal-copyright">Copyright © 2022 Apple Inc. All rights reserved. </p><a class="as-globalfooter-mini-legal-link" href="/legal/privacy/">Privacy Policy </a> <a class="as-globalfooter-mini-legal-link" href="/legal/internet-services/terms/site.html">Terms of Use </a> <a class="as-globalfooter-mini-legal-link" href="/us/shop/goto/help/sales_refunds">Sales
and Refunds </a> <a class="as-globalfooter-mini-legal-link" href="/legal/">Legal </a> <a class="as-globalfooter-mini-legal-link" href="/sitemap/">Site Map </a> </div></div></footer> <script src="https://www.apple.com/v/errors/c/built/scripts/main.built.js" type="text/javascript" charset="utf-8"></script></body></html>
First of all, Playwright already has a full suite of selectors that work on the live page. To eliminate a dependency, speed up your scrape, use less code and avoid weird errors when the static HTML snapshot gets out of sync with the live page, I suggest skipping BeautifulSoup (this blog post of mine is oriented to Puppeteer/Node, but applies equally to Playwright/Python).
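Reading the price with Playwright's own locators could look roughly like the sketch below. It assumes the `span.as-price-installments` selector and the second-to-last match from the question. `pick_price` and `fetch_price` are hypothetical helper names, and the Playwright import sits inside the function only so the text helper can be exercised without a browser installed.

```python
PRICE_SELECTOR = "span.as-price-installments"  # selector taken from the question
SUFFIX = " à vista (10% de desconto)"          # note the question strips from the price


def pick_price(texts):
    """Return the second-to-last price text without the discount note.

    Returns None when fewer than two elements matched, which is what
    happens when a block page is served instead of the product page.
    """
    if len(texts) < 2:
        return None
    return texts[-2].replace(SUFFIX, "").strip()


def fetch_price(url, headless=True):
    # Imported lazily so pick_price() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        page = browser.new_page()
        page.goto(url)
        # Read the text of every match straight from the live page;
        # no page.content() snapshot and no BeautifulSoup parse.
        texts = page.locator(PRICE_SELECTOR).all_text_contents()
        browser.close()
    return pick_price(texts)
```

Calling `fetch_price(...)` with the product URL from the question returns `None` rather than raising `IndexError` when the selector matches nothing, which makes the blocked case easier to spot.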
On to the main problem: you did well to print the HTML to see what sort of response you're dealing with. The 404 page indicates you've been detected as a bot when running headlessly. The same detection often shows up instead as a captcha, a Cloudflare browser check page, or another "are you a robot?" notice.
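One way to catch this earlier than an `IndexError` is to look at the HTTP status and page title right after navigation. This is only a heuristic sketch: `looks_blocked` and `check_page` are hypothetical names, and apart from "page not found" (which appears in the HTML you printed) the title markers are my own guesses at common block pages.

```python
# Lower-cased title fragments that suggest a block page. Only the first
# comes from the question's output; the rest are assumptions.
BLOCK_MARKERS = ("page not found", "captcha", "just a moment", "access denied")


def looks_blocked(status, title):
    """Heuristic: an HTTP error status or a known block-page title."""
    lowered = title.lower()
    return status >= 400 or any(marker in lowered for marker in BLOCK_MARKERS)


def check_page(url, headless=True):
    # Imported lazily so looks_blocked() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        page = browser.new_page()
        response = page.goto(url)  # can be None for some navigations
        status = response.status if response else 0
        title = page.title()
        browser.close()
    return status, title, looks_blocked(status, title)
```

Comparing the result of `check_page(url, headless=False)` with `check_page(url, headless=True)` shows quickly whether only the headless run is being turned away.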
As with everything in scraping, there's no one-size-fits-all solution, but one typical approach is to set a custom user agent string.
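A minimal sketch of setting a user agent in Playwright, assuming the block is triggered by the default headless user agent (headless Chromium typically advertises itself as `HeadlessChrome`). The `USER_AGENT` value is just an example desktop Chrome string, and `is_headless_ua` and `fetch_html` are hypothetical helper names.

```python
# Example desktop Chrome user agent: an assumption, swap in a current one.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def is_headless_ua(user_agent):
    """True if the user agent string gives away a headless browser."""
    return "headless" in user_agent.lower()


def fetch_html(url, user_agent=USER_AGENT):
    # Imported lazily so is_headless_ua() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        # The override applies to every page opened in this context.
        context = browser.new_context(user_agent=user_agent)
        page = context.new_page()
        page.goto(url)
        # Ask the page which user agent it sees, to confirm the override took effect.
        seen = page.evaluate("navigator.userAgent")
        html = page.content()
        browser.close()
    return seen, html
```

`fetch_html(url)` returns the user agent the page reports together with the HTML, so you can check both that the override was applied and whether the product page came back.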
If using a human user agent doesn't unblock you, you can experiment with other means of changing the browser fingerprint, for example with an off-the-shelf library like this one (note: I have not tried this specific library). There are also various cloud services that can run automated browsers with optimized fingerprints on rotating residential proxies.
(Note that the site has changed since this answer was posted, so the problem is no longer reproducible. The fundamental ideas still hold true, though.)