python - BeautifulSoup - how to target the nth child and print the text

Posted on 2025-02-06 05:25:26


I'm trying to scrape the "Biggest Gainers" list of coins on https://coinmarketcap.com/

How do I access the nth child ("Biggest Gainers") in the div with class_='sc-1rmt1nr-0 sc-1rmt1nr-2 iMyvIy'?

I managed to get the data from the "Trending" section, but I'm having trouble targeting the top 3 text items in the "Biggest Gainers" section.

I get AttributeError: 'NoneType' object has no attribute 'p'

from bs4 import BeautifulSoup
import requests


source = requests.get('https://coinmarketcap.com/').text

soup = BeautifulSoup(source, 'lxml')

section = soup.find(class_='sc-1rmt1nr-0 sc-1rmt1nr-2 iMyvIy')

#List the top 3 Gainers 
for top_gainers in section.find_all(class_='sc-16r8icm-0 sc-1uagfi2-0 bdEGog sc-1rmt1nr-1 eCWTbV')[1]:
    top_gainers = top_gainers.find(class_='sc-1eb5slv-0 iworPT')
    top_coins = top_gainers.p.text
    print(top_coins)
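For reference, a minimal sketch of the nth-child approach being asked about, assuming the page structure implied by the question: BeautifulSoup's .select_one()/.select() (backed by soupsieve) accept :nth-of-type, so the second card inside the container can be addressed positionally. The class names are copied from the question and answers; they are auto-generated and may well have changed.

from bs4 import BeautifulSoup
import requests

soup = BeautifulSoup(requests.get('https://coinmarketcap.com/').text, 'lxml')

# Second direct child div of the container should hold the "Biggest Gainers" card
# (dynamic class names copied from the question; treat them as assumptions).
card = soup.select_one('div.sc-1rmt1nr-0.sc-1rmt1nr-2.iMyvIy > div:nth-of-type(2)')

if card is not None:  # guard against the NoneType error from the question
    for name in card.select('p.sc-1eb5slv-0.iworPT'):
        print(name.text)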


Comments (3)

辞旧 2025-02-13 05:25:26


I would avoid those dynamic classes and instead use :-soup-contains() and CSS combinators: first locate the desired block via its text, then use the combinators to specify the relationship of the final elements to extract info from.

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

soup = bs(requests.get("https://coinmarketcap.com/").text, "lxml")
biggest_gainers = []

for i in soup.select(
    'div[color=text]:has(span:-soup-contains("Biggest Gainers")) > div ~ div'
):
    biggest_gainers.append(
        {
            "rank": int(i.select_one(".rank").text),
            "currency": i.select_one(".alias").text,
            "% change": f"{i.select_one('.icon-Caret-up').next_sibling}",
        }
    )

gainers = pd.DataFrame(biggest_gainers)
gainers
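As a usage note, assuming the snippet is run as a plain script rather than in a notebook: the bare trailing gainers will not display anything on its own, so printing the frame explicitly works, e.g.:

print(gainers.to_string(index=False))  # print the DataFrame without the index column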
未央 2025-02-13 05:25:26


As mentioned by @QHarr, you should avoid dynamic identifiers. Similar to his approach, the selection here comes via :-soup-contains() and the known text of the element:

soup.select('div:has(>div>span:-soup-contains("Biggest Gainers")) ~ div')

To extract the texts I used stripped_strings and zipped them with the keys into a dict:

dict(zip(['rank','name','alias','change'],e.stripped_strings))
Example
from bs4 import BeautifulSoup
import requests

url = 'https://coinmarketcap.com/'
soup = BeautifulSoup(requests.get(url).content)
data = []
for e in soup.select('div:has(>div>span:-soup-contains("Biggest Gainers")) ~ div'):
    data.append(dict(zip(['rank','name','alias','change'],e.stripped_strings)))
Output
[{'rank': '1', 'name': 'Tenset', 'alias': '10SET', 'change': '1406.99'},
 {'rank': '2', 'name': 'Burn To Earn', 'alias': 'BTE', 'change': '348.89'},
 {'rank': '3', 'name': 'MetaCars', 'alias': 'MTC', 'change': '332.05'}]
热鲨 2025-02-13 05:25:26


You can use :nth-of-type to locate the "Biggest Gainers" parent div:

import requests
from bs4 import BeautifulSoup as soup
d = soup(requests.get('https://coinmarketcap.com/').text, 'html.parser')
bg = d.select_one('div:nth-of-type(2).sc-16r8icm-0.sc-1uagfi2-0.bdEGog.sc-1rmt1nr-1.eCWTbV')
data = [{'rank': i.select_one('span.rank').text,
         'name': i.select_one('p.sc-1eb5slv-0.iworPT').text,
         'change': i.select_one('span.sc-27sy12-0.gLZJFn').text}
        for i in bg.select('div.sc-1rmt1nr-0.sc-1rmt1nr-4.eQRTPY')]

Output:

[{'rank': '1', 'name': 'Tenset', 'change': '1308.72%'}, {'rank': '2', 'name': 'Burn To Earn', 'change': '421.82%'}, {'rank': '3', 'name': 'Aigang', 'change': '329.63%'}]