Can't figure out how to scrape by ID with Beautiful Soup

Posted 2025-02-06 12:14:14, 477 characters, 1 view, 0 comments


I'm trying to scrape a site by element ID, but I can't figure out how to fix my code:

from bs4 import BeautifulSoup
import requests

url= "Website"
page= requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')
lists = soup.find_all ('div', class_="position-relative")

for list in lists:
    Value = list.find('h5', id_= "player_value")
print (Value)

With that, it just prints:

None

Here is what the website inspect mode looks like:


情话墙 2025-02-13 12:14:14

Remove the trailing _ from the id keyword argument:

.find('h5', id= "player_value")
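A minimal sketch of why the original call prints None, assuming a one-line stand-in for the real page (the markup below is hypothetical). Beautiful Soup treats extra keyword arguments as attribute filters, so id_ looks for an attribute literally named id_, which the tag does not have:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page
html = '<h5 id="player_value">1</h5>'
soup = BeautifulSoup(html, "html.parser")

# id_ is not special-cased, so this filters on an attribute literally named "id_"
missing = soup.find("h5", id_="player_value")

# id is not a Python keyword, so it can be passed as-is
found = soup.find("h5", id="player_value")

print(missing)     # None
print(found.text)  # 1
```

Only class gets the trailing-underscore spelling, because class is the one attribute name here that collides with a Python keyword.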

Why the trailing _ is needed for class, from the docs:

“class”, is a reserved word in Python. Using class as a keyword
argument will give you a syntax error. As of Beautiful Soup 4.1.2, you
can search by CSS class using the keyword argument class_

Example

Assuming the id is unique, you can get the value directly:

from bs4 import BeautifulSoup

html='''
<h5 id="player_value">1</h5>
'''
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning

player_value = soup.find('h5', id= "player_value").text
print(player_value)

If the id of your <h5> is not unique and you want all matches, use find_all. Also avoid shadowing built-in names such as list (it is not a reserved word, but reusing it as a variable hides the built-in):

from bs4 import BeautifulSoup

html='''
<h5 id="player_value">1</h5>
<h5 id="player_value">2</h5>
<h5 id="player_value">3</h5>
<h5 id="player_value">4</h5>
'''

soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning

for l in soup.find_all('h5', id = "player_value"):
    print (l.text)
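Applied to the loop from the question, the fix might look like the sketch below. The HTML is a made-up stand-in (the real page is not shown in the thread), and the names container, value_tag and values are illustrative only. It also guards against containers without a matching <h5>, since find returns None in that case:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure is an assumption
html = """
<div class="position-relative"><h5 id="player_value">10</h5></div>
<div class="position-relative"><p>no value here</p></div>
<div class="position-relative"><h5 id="player_value">25</h5></div>
"""

soup = BeautifulSoup(html, "html.parser")

values = []
for container in soup.find_all("div", class_="position-relative"):
    value_tag = container.find("h5", id="player_value")
    if value_tag is None:
        # find() returns None when nothing matches; skip this container
        continue
    values.append(value_tag.get_text(strip=True))

print(values)  # ['10', '25']
```

Collecting the text inside the loop avoids the situation in the question, where only the last value was printed after the loop had finished.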


不回头走下去 2025-02-13 12:14:14

You can also pass the class in an attribute dict, which is equivalent to class_="position-relative"; try this:

lists = soup.find_all ('div', {'class': 'position-relative'})
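For comparison, a small sketch suggesting that the dict form and the class_ keyword select the same elements, so the dict is an alternative spelling rather than a requirement. The markup is hypothetical, and the variable names are illustrative:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page
html = """
<div class="position-relative"><h5 id="player_value">1</h5></div>
<div class="other"><h5 id="player_value">2</h5></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Keyword form: class_ because class is a Python keyword
by_keyword = soup.find_all("div", class_="position-relative")

# Dict form: attribute names are plain strings, so no underscore is needed
by_dict = soup.find_all("div", {"class": "position-relative"})

# The dict form works for id as well
by_id_dict = soup.find_all("h5", {"id": "player_value"})

print(len(by_keyword), len(by_dict), len(by_id_dict))  # 1 1 2
```

The dict form is handy when an attribute name is not a valid Python identifier, such as data-* attributes.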
