Simple Web Scraper Not Returning Data

Posted on 2025-01-11 14:33:58

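I'm trying to scrape data from a web page, but it returns ["F"] ["F"], which is what it should do only when no data has been retrieved. Please see the code below.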

```python
import pandas as pd
import datetime
import requests
from requests.exceptions import ConnectionError
from bs4 import BeautifulSoup

def web_content_div(web_content, class_path):
    web_content_div = web_content.find_all('div', {"class": class_path})
    try:
        spans = web_content_div[0].find_all('span')
        texts = [span.get_text() for span in spans]
    except IndexError:
        texts = []
        return texts

def real_time_price(stock_code):
    url = 'https://finance.yahoo.com/quote/' + stock_code + '?p=' + stock_code + '%27&.tsrc=fin-srch'
    # 'https://finance.yahoo.com/quote/' + stock_code + '?p=' + stock_code + '&.tsrc=fin-srch'
    try:
        r = requests.get(url)
        web_content = BeautifulSoup(r.text, 'lxml')
        texts = web_content_div(web_content, 'My(6px) Pos(r) smarthphone_Mt(6px) W(100&%')
        if texts != []:
            price, change = texts[0], texts[1]
        else:
            price, change = ["F"], ["F"]
    except ConnectionError:
        price, change = [""], [""]

    return price, change


Stock = ["BRK-B"]

print(real_time_price("BRK-B"))
```

Comments (1)

梦幻之岛 2025-01-18 14:33:58

Your class_path doesn't match anything because of two typos: "smarthphone_Mt(6px)" should be "smartphone_Mt(6px)", and "W(100&%" should be "W(100%)". The website in question references "My(6px) Pos(r) smartphone_Mt(6px) W(100%)", which I believe is what you're targeting.
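To see why the typos produce an empty result, here is a minimal offline sketch. It assumes the class string quoted above and runs against a small hypothetical HTML snippet (SAMPLE_HTML and its placeholder values are made up for illustration; the live page's markup may differ or change). When the class value contains spaces, BeautifulSoup compares it against the full class attribute string, so any misspelling yields no matches. The sketch also returns the list on every path, because in the code as posted `return texts` sits inside the `except IndexError` branch, so a successful match would return None.

```python
from bs4 import BeautifulSoup

# Class string quoted in the answer, with both typos corrected.
PRICE_DIV_CLASS = "My(6px) Pos(r) smartphone_Mt(6px) W(100%)"

# Hypothetical markup with placeholder values, for illustration only.
SAMPLE_HTML = """
<div class="My(6px) Pos(r) smartphone_Mt(6px) W(100%)">
  <span>100.00</span>
  <span>+1.00 (+1.00%)</span>
</div>
"""

def web_content_div(web_content, class_path):
    """Return the text of each <span> in the first <div> whose class
    attribute equals class_path, or an empty list if no <div> matches."""
    divs = web_content.find_all("div", {"class": class_path})
    if not divs:
        return []
    # Return on the success path too, not only when nothing was found.
    return [span.get_text() for span in divs[0].find_all("span")]

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

fixed = web_content_div(soup, PRICE_DIV_CLASS)
typo = web_content_div(soup, "My(6px) Pos(r) smarthphone_Mt(6px) W(100&%")

print(fixed)  # ['100.00', '+1.00 (+1.00%)']
print(typo)   # []
```

With the empty list coming back, real_time_price falls into its else branch and reports ["F"], ["F"], which matches the behaviour described in the question.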
