
HTML Python beautifulsoup web-scraping

How to scrape meta tag content - Python web scraping question

Posted 2025-01-09 09:34:06, 521 characters, 0 views, 0 comments

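I only want to scrape the word "Automobile", not the entire meta tag with its angle brackets.

Desired output: "Automobile"

Can you please tell me how to do this? Thanks!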

from bs4 import BeautifulSoup
import requests
import csv

URL = 'https://www.electrive.com/2022/02/13/skoda-reveals-uk-pricing-for-enyaq-coupe-iv-vrs/'

(response := requests.get(URL)).raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')
category2 = soup.find('meta', property='article:section')
print(category2)

Output:

<meta content="Automobile" property="article:section"/>



Comments (1)

佼人 2025-01-16 09:34:06

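Just add ['content'] to the tag returned by soup.find(). The result of find() is a Tag object, and indexing it with an attribute name returns that attribute's value.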

import requests
from bs4 import BeautifulSoup

URL = 'https://www.electrive.com/2022/02/13/skoda-reveals-uk-pricing-for-enyaq-coupe-iv-vrs/'

(response := requests.get(URL)).raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')
category2 = soup.find('meta', property='article:section')['content']
print(category2)

Output:

Automobile
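
One caveat with the snippet above: if the page has no matching meta tag, soup.find() returns None and the ['content'] lookup raises a TypeError. A minimal sketch of a more defensive variant follows. The helper name meta_content and the inline HTML string are hypothetical, used here only so the example runs without a network request; the built-in 'html.parser' is swapped in for 'lxml' to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Inline HTML standing in for the downloaded page
# (assumption: the real page contains a meta tag of this shape).
HTML = """
<html><head>
<meta content="Automobile" property="article:section"/>
</head><body></body></html>
"""


def meta_content(soup, prop):
    """Return the content attribute of <meta property=prop>, or None if absent."""
    tag = soup.find('meta', attrs={'property': prop})
    # find() returns None when nothing matches, so guard before reading attributes.
    if tag is None:
        return None
    # Tag.get() returns None instead of raising KeyError when the attribute is missing.
    return tag.get('content')


soup = BeautifulSoup(HTML, 'html.parser')
category = meta_content(soup, 'article:section')
missing = meta_content(soup, 'article:author')
print(category)  # Automobile
print(missing)   # None
```

With the real page, the same helper could be called on the soup built from response.text; whether the site keeps serving this meta tag is not something the example can guarantee.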

