'NoneType' object has no attribute 'find_all' with Beautiful Soup

Asked 2025-02-08 21:18:53


I have tried this code I found, however it gives me the error message AttributeError: 'NoneType' object has no attribute 'find_all'.
I am not familiar with BeautifulSoup and don't know how to fix this. I tried to find a solution where I ignore the tabpane part, but could not figure it out.
Do you have any suggestions?

import datetime
import pandas as pd  # pip install pandas
import requests  # pip install requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0',
}
url = 'https://www.marketwatch.com/tools/earningscalendar'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

tabpane = soup.find('div', 'tabpane')
# The AttributeError is raised on the next line, because tabpane is None
earning_tables = tabpane.find_all('div', {'id': True})

dfs = {}
current_datetime = datetime.datetime.now().strftime('%m-%d-%y %H_%M_%S')
xlsxwriter = pd.ExcelWriter('Earning Calendar ({0}).xlsx'.format(current_datetime), index=False)

for earning_table in earning_tables:
    if not 'Sorry, this date currently does not have any earnings announcements scheduled' in earning_table.text:
        earning_date = earning_table['id'].replace('page', '')
        earning_date = earning_date[:3] + '_' + earning_date[3:]
        print(earning_date)
        dfs[earning_date] = pd.read_html(str(earning_table.table))[0]
        dfs[earning_date].to_excel(xlsxwriter, sheet_name=earning_date, index=False)

xlsxwriter.save()
print('earning tables Excel file exported')
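
The error means that soup.find('div', 'tabpane') returned None: the downloaded HTML contains no <div class="tabpane"> (the page markup may have changed, or the request may have been redirected or blocked), and calling .find_all() on None raises the AttributeError. A minimal diagnostic sketch, assuming the same URL and headers as above, that guards against the None before chaining:

import requests
from bs4 import BeautifulSoup

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0',
}
url = 'https://www.marketwatch.com/tools/earningscalendar'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

# soup.find() returns None when no matching tag exists, so check the
# result before chaining .find_all() on it.
tabpane = soup.find('div', 'tabpane')
if tabpane is None:
    # Diagnostics: the request may have been redirected or blocked, or
    # the markup may have changed since the script was written.
    print('No <div class="tabpane"> found in the response.')
    print('Status code:', response.status_code)
    print('Final URL:', response.url)
else:
    earning_tables = tabpane.find_all('div', {'id': True})
    print('Found {0} table container(s).'.format(len(earning_tables)))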

Answered by 微凉, 2025-02-15 21:18:53:


To grab all tables on the page:

tables = pd.read_html("https://www.marketwatch.com/tools/earnings-calendar")

To look at just the first one:

print(tables[0].head())

If you are sure all tables have the same columns, you can concatenate them into a single dataframe:

df = pd.concat(pd.read_html("https://www.marketwatch.com/tools/earnings-calendar"))
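
If the goal is still one Excel sheet per table, here is a minimal sketch building on this read_html approach (the index-based sheet names are an assumption, since read_html alone does not expose the per-date ids the original script relied on; writing .xlsx also needs an engine such as openpyxl installed):

import datetime
import pandas as pd  # pip install pandas openpyxl

# Grab every table on the page, as shown above.
tables = pd.read_html("https://www.marketwatch.com/tools/earnings-calendar")

current_datetime = datetime.datetime.now().strftime('%m-%d-%y %H_%M_%S')

# Write each table to its own sheet; the context manager saves the file.
with pd.ExcelWriter('Earning Calendar ({0}).xlsx'.format(current_datetime)) as writer:
    for i, table in enumerate(tables):
        # Plain index-based sheet names (Excel caps sheet names at 31 chars).
        table.to_excel(writer, sheet_name='table_{0}'.format(i), index=False)

print('earning tables Excel file exported')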