How to make a list from BS4 output

Posted on 2025-01-11 20:12:48, word count 474, views 0, comments 0

I have this code right now:

from bs4 import BeautifulSoup
import requests

get = requests.get("https://solmfers-minting-site.netlify.app/")
soup = BeautifulSoup(get.text, 'html.parser')
for i in soup.find_all('script'):
    print(i.get('src'))

I need to collect the output into a list and remove the None values from it, since the loop currently prints this:

jquery.js
nicepage.js
None
None
/static/js/2.c20455e8.chunk.js
/static/js/main.87864e1d.chunk.js

Comments (2)

别想她 2025-01-18 20:12:48

Just append the extracted values to a list, skipping the None entries.

result = []
for i in soup.find_all('script'):
    elem = i.get('src')
    if elem is not None:
        result.append(elem)

Or using a list comprehension:

result = [x['src'] for x in soup.find_all('script') if x.get('src') is not None]
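
To see the filtering step on its own, without a network request, here is a minimal sketch. The `printed` list is an assumption: it is hand-written to mirror the values that `i.get('src')` returns in the question, with None standing for `<script>` tags that have no `src` attribute.

```python
# Hand-written stand-in (assumption) for the values the question's loop prints.
printed = [
    "jquery.js",
    "nicepage.js",
    None,  # inline <script> without a src attribute
    None,
    "/static/js/2.c20455e8.chunk.js",
    "/static/js/main.87864e1d.chunk.js",
]

# Keep every entry that is not None; the original order is preserved.
result = [src for src in printed if src is not None]
print(result)
```

Testing with `is not None` rather than plain truthiness keeps an empty string such as `src=""` in the list, which may or may not be what you want.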

内心旳酸楚 2025-01-18 20:12:48

You're close to your goal, but select the elements more specifically and append each src to a list while iterating over the ResultSet:

data = []

for i in soup.find_all('script', src=True):
    data.append(i.get('src'))

Alternatively, with CSS selectors:

for i in soup.select('script[src]'):
    data.append(i.get('src'))

And, as already mentioned, with a list comprehension:

[i.get('src') for i in soup.select('script[src]')]

Output

['jquery.js', 'nicepage.js', '/static/js/2.c20455e8.chunk.js', '/static/js/main.87864e1d.chunk.js']
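
The approaches from both answers can be compared offline. The sketch below parses an inline HTML string instead of fetching the page. That markup is a hypothetical stand-in for the real site, assumed only to contain four scripts with `src` and two inline scripts without it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (assumption): mimics the script tags implied by the question's output.
html = """
<html><head>
<script src="jquery.js"></script>
<script src="nicepage.js"></script>
<script>console.log("inline one");</script>
<script>console.log("inline two");</script>
</head><body>
<script src="/static/js/2.c20455e8.chunk.js"></script>
<script src="/static/js/main.87864e1d.chunk.js"></script>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# 1. Match only <script> tags that carry a src attribute.
via_find_all = [tag["src"] for tag in soup.find_all("script", src=True)]

# 2. Same idea expressed as a CSS attribute selector.
via_select = [tag["src"] for tag in soup.select("script[src]")]

# 3. Collect every <script>, then drop the None values afterwards.
via_filter = [
    tag.get("src") for tag in soup.find_all("script") if tag.get("src") is not None
]

print(via_find_all)
```

On this sample all three lists are identical. Filtering in the query (`src=True` or `script[src]`) avoids calling `get('src')` twice per tag.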