Python (BeautifulSoup) returns only 1 result

Posted on 2025-02-08 15:02:32

I know there are similar questions that have already been answered; I tried applying those answers and they didn't fix my problem.

My problem is that this page, http://books.toscrape.com/catalogue/page-1.html, lists 20 prices, but when I try to scrape them I only get the first price and not the other 19.

Here's the code:

from bs4 import BeautifulSoup
import requests
URL = 'http://books.toscrape.com/catalogue/page-1.html'
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find_all("div", class_ = "col-sm-8 col-md-9")

for i in results :
    prices = i.find("p", class_ = "price_color")
    print(prices.text.strip())
    print()

Comments (3)

憧憬巴黎街头的黎明 2025-02-15 15:02:32

You are searching for the items in the wrong way.

There is only one div with the class col-sm-8 col-md-9, and it contains many prices, but your code expects many divs with a single price in each. That is what causes the problem.

With find() you get only a single price from this div; you should use find_all() to get all the prices in this single div.

div = soup.find("div", class_="col-sm-8 col-md-9")

prices = div.find_all("p", class_="price_color")

for i in prices:
    print(i.text.strip())
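
The difference can also be reproduced offline. The following is a minimal sketch, assuming a made-up HTML fragment that only imitates the page layout (one wrapper div holding several prices); the fragment and the variable names are illustrative and not taken from the site.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: a single wrapper div that holds two books.
HTML = """
<div class="col-sm-8 col-md-9">
  <article class="product_pod"><p class="price_color">£51.77</p></article>
  <article class="product_pod"><p class="price_color">£53.74</p></article>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# There is only one wrapper, so a loop over it runs exactly once.
wrappers = soup.find_all("div", class_="col-sm-8 col-md-9")

# find() returns only the first matching element inside each wrapper.
first_only = [w.find("p", class_="price_color").text.strip() for w in wrappers]

# find_all() returns every matching element inside the wrapper.
all_prices = [
    p.text.strip() for p in wrappers[0].find_all("p", class_="price_color")
]

print(len(wrappers))
print(first_only)
print(all_prices)
```

Looping over the single wrapper and calling find() inside it is what the code in the question does, which is why only the first price is printed.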

You could even search for the prices directly:

prices = soup.find_all("p", class_="price_color")

for i in prices:
    print(i.text.strip())

Minimal working example:

from bs4 import BeautifulSoup
import requests

url = 'http://books.toscrape.com/catalogue/page-1.html'
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")

# Search the whole document directly; the wrapper div is not needed here.
prices = soup.find_all("p", class_="price_color")

for i in prices:
    print(i.text.strip())

Using find() to search for the price works only if you first find all the regions that contain a single price each, such as article.

Every book is in a separate article, so there are many articles and every article has a single price (and a single title, a single image, etc.).

from bs4 import BeautifulSoup
import requests

url = 'http://books.toscrape.com/catalogue/page-1.html'
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")

results = soup.find_all("article")

for i in results:
    title = i.find("h3")
    print('title:', title.text.strip())

    price = i.find("p", class_="price_color")
    print('price:', price.text.strip())

    print('---')

Result:

title: A Light in the ...
price: £51.77
---
title: Tipping the Velvet
price: £53.74
---
title: Soumission
price: £50.10
---
title: Sharp Objects
price: £47.82
---
title: Sapiens: A Brief History ...
price: £54.23
---
title: The Requiem Red
price: £22.65
---
title: The Dirty Little Secrets ...
price: £33.34
---
title: The Coming Woman: A ...
price: £17.93
---
title: The Boys in the ...
price: £22.60
---
title: The Black Maria
price: £52.15
---
title: Starving Hearts (Triangular Trade ...
price: £13.99
---
title: Shakespeare's Sonnets
price: £20.66
---
title: Set Me Free
price: £17.46
---
title: Scott Pilgrim's Precious Little ...
price: £52.29
---
title: Rip it Up and ...
price: £35.02
---
title: Our Band Could Be ...
price: £57.25
---
title: Olio
price: £23.88
---
title: Mesaerion: The Best Science ...
price: £37.59
---
title: Libertarianism for Beginners
price: £51.33
---
title: It's Only the Himalayas
price: £45.17
---
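
Some titles in the result above are shortened with "...", because the visible link text is truncated. In the article markup the a tag inside h3 also carries a title attribute, so the full title can probably be read from there. The following is a hedged sketch on a made-up fragment: the book, its href and the helper name parse_books are hypothetical, and select() / select_one() are used here as an alternative to find_all() / find().

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on one <article> of the page;
# the book itself is invented for illustration.
HTML = """
<article class="product_pod">
  <h3><a href="example-book_1/index.html"
         title="An Example Book With a Long Title">An Example Book With ...</a></h3>
  <div class="product_price"><p class="price_color">£12.34</p></div>
</article>
"""


def parse_books(html):
    """Return one dict per <article>, skipping articles with missing parts."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for article in soup.select("article.product_pod"):
        link = article.select_one("h3 a")
        price = article.select_one("p.price_color")
        if link is None or price is None:
            # Avoid AttributeError on None when the layout differs.
            continue
        books.append({
            # Prefer the title attribute; fall back to the visible text.
            "title": link.get("title", link.text.strip()),
            "price": price.text.strip(),
        })
    return books


books = parse_books(HTML)
print(books)
```

To run it against the real page, pass response.content to parse_books instead of the inline fragment.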
睫毛上残留的泪 2025-02-15 15:02:32

This code should work:

import requests
from bs4 import BeautifulSoup


URL = 'http://books.toscrape.com/catalogue/page-1.html'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')

list_of_books = soup.select(
    # selector copied from Chrome
    '#default > div > div > div > div > section > div:nth-child(2) > ol > li'
)

for book in list_of_books:
    price = book.find('p', {'class': 'price_color'})
    print(price.text.strip())

I just used the selector copied from Chrome.

You are using find and find_all in the wrong places.
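
A selector copied from the browser encodes the whole path from the page root, so it may stop matching if the layout changes. A shorter selector based on the class names is one possible alternative. This is only a sketch on a made-up fragment that assumes the ol > li > article nesting implied by the copied selector.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the list of books on the page.
HTML = """
<ol>
  <li><article class="product_pod"><p class="price_color">£51.77</p></article></li>
  <li><article class="product_pod"><p class="price_color">£53.74</p></article></li>
  <li><article class="product_pod"><p class="price_color">£50.10</p></article></li>
</ol>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Descendant selector: every price paragraph inside a product article.
prices = [
    p.get_text(strip=True)
    for p in soup.select("article.product_pod p.price_color")
]

print(prices)
```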

心碎的声音 2025-02-15 15:02:32

Wrong class.
@ihonestlydontKnow, if you change this line to search for "article", your code will work:

results = soup.find_all("article")

(as furas mentioned in his reply)

Use print(results) (or https://codebeautify.org/htmlviewer) to check the structure:

....

<article class="product_pod">
<div class="image_container">
<a href="libertarianism-for-beginners_982/index.html"><img alt="Libertarianism for Beginners" class="thumbnail" src="../media/cache/0b/bc/0bbcd0a6f4bcd81ccb1049a52736406e.jpg"/></a>
</div>
<p class="star-rating Two">
<i class="icon-star"></i>
<i class="icon-star"></i>
<i class="icon-star"></i>
<i class="icon-star"></i>
<i class="icon-star"></i>
</p>
<h3><a href="libertarianism-for-beginners_982/index.html" title="Libertarianism for Beginners">Libertarianism for Beginners</a></h3>
<div class="product_price">
<p class="price_color">£51.33</p>
<p class="instock availability">
<i class="icon-ok"></i>

        In stock

</p>
<form>
<button class="btn btn-primary btn-block" data-loading-text="Adding..." type="submit">Add to basket</button>
</form>
</div>
</article>

Output:

£51.77

£53.74

£50.10

£47.82

£54.23

£22.65

£33.34

£17.93

...

(vwebtuan) tng@rack-dff0:~$ cat a.py

from bs4 import BeautifulSoup
import requests
URL = 'http://books.toscrape.com/catalogue/page-1.html'
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find_all("article")
#print(results)
for i in results :
    prices = i.find("p", class_ = "price_color")
    print(prices.text.strip())

(vwebtuan) tng@rack-dff0:~$
