How to fix TypeError, parsing, and all other errors in a Python Yahoo Finance web scraper

Posted on 2025-01-12 07:02:43

How do I fix the TypeError, the parsing errors, and all the other errors in my Python Yahoo Finance web scraper? I cannot get my code to pull data from Yahoo Finance. Are there any fixes? It looks like the span classes are the problem, since they were removed and replaced by fin-streamer elements.

Error:
error

Code:

import requests
from bs4 import BeautifulSoup

def create_url():
    symbol = str(input('Enter Stock Symbol: '))
    url = f'https://finance.yahoo.com/quote/{symbol}'
    return url

def get_html(url):
    header = {"User Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'}
    response = requests.get(url, headers = header)

    if response.status_code == 200:
        return response.text
    else:
        return None


def parse_data(html):

    soup = BeautifulSoup(html,'html.parser')
    name = soup.find('h1', {'class': 'D(ib) Fz(18px)'}).text
    price = soup.find('fin-streamer', {'class': 'D(ib) Mend(20px)'}).find_all('fin-streamer')[0].text
    change = soup.find('fin-streamer', {'class': 'D(ib) Mend(20px)'}).find_all('fin-streamer')[1].text
    previous_close = soup.find('fin-streamer', {'class': 'Trsdu(0.3s)'}).text
    open_price = soup.find('td',{'class':'Ta(end) Fw(600) Lh(14px)'}).text
    print(f'|Stock Name: {name}|', f'|Stock Price: ${price}|', f'|Change: {change}|', f'|Previous Close: ${previous_close}|', f'|Open Price: ${open_price}|')
    # print(f'Stock Price: ${price}')
    # print(f'Change: {change}')
    # print(f'Previous Close: ${previous_close}')
    # print(f'Open Price: ${open_price}')
    stock_data = {
        'name':name,
        'price':price,
        'change':change ,
        'previous_close': previous_close,
        'open_price': open_price
    }

    return stock_data

def main():
    # get users input
    url = create_url()
    # get html
    html = get_html(url)
    # while loop
    i = True
    while i:
        # parse data
        data = parse_data(html)


if __name__ == '__main__':
    main()

装纯掩盖桑 2025-01-19 07:02:43

def create_url():
    symbol = str(input('Enter Stock Symbol: '))
    url = f'https://finance.yahoo.com/quote/%7Bsymbol%7D'
    return url

This function contains an error: to interpolate your symbol into the URL, you need to do something like this:

url = f'https://finance.yahoo.com/quote/{symbol}'

As a result, you were requesting the wrong URL, and your get_html() function returns None whenever it does not get a status of 200. That None is then passed to BeautifulSoup as the HTML, which produces your error.

It's good that your get_html() function checks the status, but it should fail when the status indicates failure.

Update: the URL issue was a copy-and-paste error.
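For context on why the pasted URL looked wrong: %7B and %7D are the percent-encoded forms of { and }, so the placeholder had been encoded as literal text at some point during the copy and paste. A minimal check with the standard library's urllib.parse follows; the symbol AAPL is only an illustrative value and does not come from the original post.

```python
from urllib.parse import quote, unquote

# Decoding the pasted path segment gives back the literal placeholder text.
encoded = "%7Bsymbol%7D"
decoded = unquote(encoded)
print(decoded)

# Encoding the literal braces again reproduces the pasted form.
print(quote(decoded))

# With a real f-string placeholder, the symbol is substituted
# before any request is made.
symbol = "AAPL"  # illustrative value, not from the original post
url = f"https://finance.yahoo.com/quote/{symbol}"
print(url)
```

If a URL built in your own code ever contains %7B or %7D, that is a hint that the braces were treated as plain text rather than as an f-string placeholder.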

Your error is caused by passing None to BeautifulSoup. You can confirm this by running:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(None, 'html.parser')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/private/tmp/venv/lib/python3.9/site-packages/bs4/__init__.py", line 312, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()

You are passing None because that is what your get_html() function may return:

if response.status_code == 200:
  return response.text
else:
  return None

If your function does not get a 200 status code, it needs to fail rather than return None.

Try replacing the entire get_html() function with this:

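def get_html(url):
    # The HTTP header name is "User-Agent", with a hyphen.
    header = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'}
    response = requests.get(url, headers=header)
    # Raise requests.HTTPError on a 4xx or 5xx status instead of returning None.
    response.raise_for_status()
    return response.text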

This function raises an exception if the HTTP request fails; otherwise, it returns the HTML, which you can then feed to BeautifulSoup.
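The difference between returning None and raising can be seen without any network access. The sketch below is only an illustration: FakeResponse is a hypothetical stand-in that loosely mimics the three members of a requests response used in this thread, it raises RuntimeError where the real raise_for_status() raises requests.HTTPError, and the helper names are made up for this example.

```python
class FakeResponse:
    """Hypothetical stand-in for a requests response (no network needed)."""

    def __init__(self, status_code, text=""):
        self.status_code = status_code
        self.text = text

    def raise_for_status(self):
        # Loosely mimics the real method: raise on 4xx/5xx status codes.
        if 400 <= self.status_code < 600:
            raise RuntimeError(f"HTTP {self.status_code}")


def html_or_none(response):
    # Pattern from the question: a failure is silently turned into None.
    if response.status_code == 200:
        return response.text
    return None


def html_or_raise(response):
    # Pattern from the answer: a failure stops the program at its source.
    response.raise_for_status()
    return response.text


def describe(fetch, response):
    """Run one fetch style and report where, if anywhere, it fails."""
    try:
        html = fetch(response)
        len(html)  # stands in for the parser touching the markup
        return "ok"
    except TypeError:
        return "late TypeError in the parser"
    except RuntimeError as exc:
        return f"early failure: {exc}"


not_found = FakeResponse(404)
print(describe(html_or_none, not_found))
print(describe(html_or_raise, not_found))
print(describe(html_or_raise, FakeResponse(200, "<html></html>")))
```

With the None-returning style, the failure only surfaces later as a TypeError that says nothing about the HTTP status. With the raising style, the error message names the status code at the point where the request failed.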
