baseball-almanac.com API returns a 403 error

Posted on 2025-01-12 20:06:42


I am a complete noob to APIs and web scraping. I am trying to recreate the example in chapter 5 of 'Learn to Code with Baseball'.

I am using Spyder (Python 3.8). First I import the following libraries:

from bs4 import BeautifulSoup as Soup
import pandas as pd
import requests
from pandas import DataFrame

Then I type these next two statements:

bal_response = requests.get('http://baseball-almanac.com/opening_day/odschedule.php?t=BAL')

print(bal_response.text)

This returns the following 403 error message:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /opening_day/odschedule.php
on this server.</p>
</body></html>

Can someone please point out what I'm doing wrong here? I am literally following the book's steps.

Thanks in advance.
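For context on what the server is objecting to: by default, requests identifies itself with a python-requests User-Agent string, and some sites block that outright. You can inspect the default locally, no network needed:

```python
import requests

# requests' default User-Agent identifies the client as a script,
# which some servers (evidently including baseball-almanac.com) reject.
default_ua = requests.utils.default_user_agent()
print(default_ua)  # something like "python-requests/2.31.0"
```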


Comments (1)

剩余の解释 2025-01-19 20:06:42


Add a browser User-Agent in the headers parameter:

from bs4 import BeautifulSoup as Soup
import pandas as pd
import requests
from pandas import DataFrame

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
bal_response = requests.get('http://baseball-almanac.com/opening_day/odschedule.php?t=BAL', headers=headers)
print(bal_response.text)
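Once the request succeeds, the book's chapter presumably goes on to parse the schedule table out of the page. A minimal offline sketch of that step with BeautifulSoup and pandas (the inline HTML here is a made-up stand-in for the real table, not the actual page markup):

```python
from bs4 import BeautifulSoup as Soup
import pandas as pd

# Made-up stand-in for the schedule table on the real page
html = """
<table>
  <tr><th>Date</th><th>Opponent</th></tr>
  <tr><td>04-13-1954</td><td>Detroit Tigers</td></tr>
  <tr><td>04-11-1955</td><td>Washington Senators</td></tr>
</table>
"""

soup = Soup(html, 'html.parser')
rows = []
for tr in soup.find_all('tr')[1:]:  # skip the header row
    rows.append([td.get_text() for td in tr.find_all('td')])

df = pd.DataFrame(rows, columns=['Date', 'Opponent'])
print(df.shape)  # (2, 2)
```

With the real response you would pass `bal_response.text` to Soup instead of the inline string; it's also worth checking `bal_response.status_code == 200` (or calling `bal_response.raise_for_status()`) before parsing.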