ValueError: File is not a recognized Excel file. How do I read an .xls file?
I am trying to read an Excel file from a URL. The link and the code are below:
excel = 'https://www.marketnews.usda.gov/mnp/fv-report?&commAbr=AVOC&step3date=true&locAbr=HX&repType=termPriceDaily&refine=false&Run=Run&type=termPrice&repTypeChanger=termPriceDaily&environment=&_environment=1&locAbrPass=CHICAGO%7C%7CHX&locChoose=commodity&commodityClass=allcommodity&locAbrlength=1&organic=&repDate=01%2F01%2F2022&endDate=03%2F17%2F2022&format=excel&rebuild=false'
data = pd.read_excel(excel, engine='openpyxl')
I tried using the openpyxl engine and got the following error:
File is not a zip file
I even tried pd.read_csv, but the data comes back as HTML, which isn't easily readable:
df = pd.read_csv('https://www.marketnews.usda.gov/mnp/fv-report?&commAbr=AVOC&step3date=true&locAbr=HX&repType=termPriceDaily&refine=false&Run=Run&type=termPrice&repTypeChanger=termPriceDaily&environment=&_environment=1&locAbrPass=CHICAGO%7C%7CHX&locChoose=commodity&commodityClass=allcommodity&locAbrlength=1&organic=&repDate=01%2F01%2F2022&endDate=03%2F17%2F2022&format=excel&rebuild=false',
sep='</tr><tr>'
)
Please help!
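A quick way to check what a URL like this actually returns is to look at the first few bytes of the response. An .xlsx file is a ZIP container, which is why openpyxl complains with "File is not a zip file" when it is handed anything else. The sketch below is only illustrative: `sniff_format` and `sniff_url` are hypothetical helper names, not part of pandas or openpyxl, and the list of HTML prefixes is a rough assumption rather than a complete detector.

```python
import urllib.request


def sniff_format(head: bytes) -> str:
    """Guess the file type from the leading bytes of a download."""
    # .xlsx files are ZIP archives, which start with the bytes "PK\x03\x04".
    if head.startswith(b"PK\x03\x04"):
        return "xlsx"
    # Legacy .xls files use the OLE2 compound-document signature.
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "xls"
    # Anything that starts with markup is most likely an HTML page.
    text = head.lstrip().lower()
    if text.startswith((b"<!doctype", b"<html", b"<table")):
        return "html"
    return "unknown"


def sniff_url(url: str) -> str:
    """Fetch only the first bytes of a URL and classify them (needs network)."""
    with urllib.request.urlopen(url) as response:
        return sniff_format(response.read(64))
```

Calling `sniff_url(excel)` with the URL from the question would show which case applies. If it reports "html", neither openpyxl nor any other Excel engine can parse the response.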
You won't be able to download or read the file with pd.read_excel, because the URL is not a direct link to an Excel file. Use something other than pd.read_excel for this URL.
The URL returns an HTML page, not an Excel file.
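Since the response is HTML, one option is `pandas.read_html`, which returns one DataFrame for every `<table>` element it finds. The code that originally accompanied these answers is not shown here, so the following is a minimal sketch under assumptions rather than the answerers' exact solution: the helper name `tables_from_html` and the sample table with its column names are made up for illustration, and `read_html` needs an HTML parser such as lxml (or beautifulsoup4 with html5lib) to be installed.

```python
import io

import pandas as pd


def tables_from_html(html):
    """Parse every <table> in an HTML string into a list of DataFrames."""
    # Wrapping the text in StringIO makes pandas treat it as markup
    # rather than as a file path or URL.
    return pd.read_html(io.StringIO(html))


# Made-up sample standing in for the report page; the real column names differ.
sample = """
<table>
  <tr><th>Date</th><th>Low Price</th></tr>
  <tr><td>01/03/2022</td><td>40</td></tr>
</table>
"""

tables = tables_from_html(sample)
df = tables[0]  # the first (and here only) table on the page
print(df)

# With the real report, the URL can be passed directly (requires network):
# tables = pd.read_html(excel)
```

If the page holds several tables, inspect `len(tables)` and pick the one with the data. The `match` argument of `read_html` can also narrow the result to tables containing a given string.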