baseball-almanac.com API returns a 403 error

Posted on 2025-01-12 20:06:42


I am a complete noob to APIs and web scraping. I am trying to recreate the example in chapter 5 of 'Learn to Code with Baseball'.

I am using Spyder (Python 3.8). First I import the following libraries:

from bs4 import BeautifulSoup as Soup
import pandas as pd
import requests
from pandas import DataFrame

Then I type these next two statements:

bal_response = requests.get('http://baseball-almanac.com/opening_day/odschedule.php?t=BAL')

print(bal_response.text)

This returns the following 403 error message:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /opening_day/odschedule.php
on this server.</p>
</body></html>

Can someone please point out what I'm doing wrong here? I am literally following the book's steps.

Thanks in advance.
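For context on what the server is objecting to: by default, requests identifies itself with a python-requests User-Agent string, and some sites block that outright. You can inspect the default locally, no network needed:

```python
import requests

# requests' default User-Agent identifies the client as a script,
# which some servers (evidently including baseball-almanac.com) reject.
default_ua = requests.utils.default_user_agent()
print(default_ua)  # something like "python-requests/2.31.0"
```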


Comments (1)

剩余の解释 2025-01-19 20:06:42


Add a browser User-Agent in the headers parameter:

from bs4 import BeautifulSoup as Soup
import pandas as pd
import requests
from pandas import DataFrame

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
bal_response = requests.get('http://baseball-almanac.com/opening_day/odschedule.php?t=BAL', headers=headers)
print(bal_response.text)
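Once the request succeeds, the book's chapter presumably goes on to parse the schedule table out of the page. A minimal offline sketch of that step with BeautifulSoup and pandas (the inline HTML here is a made-up stand-in for the real table, not the actual page markup):

```python
from bs4 import BeautifulSoup as Soup
import pandas as pd

# Made-up stand-in for the schedule table on the real page
html = """
<table>
  <tr><th>Date</th><th>Opponent</th></tr>
  <tr><td>04-13-1954</td><td>Detroit Tigers</td></tr>
  <tr><td>04-11-1955</td><td>Washington Senators</td></tr>
</table>
"""

soup = Soup(html, 'html.parser')
rows = []
for tr in soup.find_all('tr')[1:]:  # skip the header row
    rows.append([td.get_text() for td in tr.find_all('td')])

df = pd.DataFrame(rows, columns=['Date', 'Opponent'])
print(df.shape)  # (2, 2)
```

With the real response you would pass `bal_response.text` to Soup instead of the inline string; it's also worth checking `bal_response.status_code == 200` (or calling `bal_response.raise_for_status()`) before parsing.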