Error when indexing a date column in Pandas
I'm trying to make pandas recognise the first column as a date.
import pandas as pd
import plotly.express as px
cl = pd.read_csv('CL.csv', parse_dates=['Date'], index_col=['Date'])
cl.info()
Then to visualise the price:
fig = px.line(cl, y="Adj Close", title='Crude Oil Price', labels = {'Adj Close':'Crude Oil Price(in USD)'})
But it gives back a ruined chart:
If I comment out the parse_dates=['Date'], index_col=['Date'] arguments and leave just cl = pd.read_csv('CL.csv'), the chart looks just fine.
What am I doing wrong here?
Comments (2)
If you print cl and the dates look fine, then the likely reason for the broken graph is that cl isn't sorted by Date. Sort it by Date before visualizing it.
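A minimal sketch of the sorting step suggested in the first comment. The inline rows stand in for CL.csv and are made up for illustration, as is the assumption that the file's rows are out of date order:

```python
import io

import pandas as pd

# Hypothetical stand-in for CL.csv, with rows deliberately out of date order.
sample_csv = io.StringIO(
    "Date,Adj Close\n"
    "2021-01-05,49.93\n"
    "2021-01-04,47.62\n"
    "2021-01-06,50.63\n"
)

cl = pd.read_csv(sample_csv, parse_dates=["Date"], index_col="Date")

# A line chart connects points in row order, so an unsorted
# DatetimeIndex makes the line jump back and forth along the x-axis.
was_sorted = cl.index.is_monotonic_increasing
print("sorted before:", was_sorted)

# Sort by the Date index before plotting.
cl = cl.sort_index()
print("sorted after:", cl.index.is_monotonic_increasing)
```

After sorting, the frame can be passed to px.line exactly as in the question.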
I think this problem may be caused by the date format in the 'Date' column. The documentation says: "For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See Parsing a CSV with mixed timezones for more." So you could replace
cl = pd.read_csv('CL.csv', parse_dates=['Date'], index_col=['Date'])
with
cl = pd.read_csv('CL.csv', parse_dates=['Date'], date_parser=lambda col: pd.to_datetime(col, utc=True))
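The first option in the quoted documentation, calling pd.to_datetime after pd.read_csv, can be sketched as follows. The inline rows with mixed UTC offsets are invented for illustration and may not match the real CL.csv. This sketch leaves date_parser out because that argument is deprecated as of pandas 2.0:

```python
import io

import pandas as pd

# Hypothetical stand-in for CL.csv with two different UTC offsets.
sample_csv = io.StringIO(
    "Date,Adj Close\n"
    "2021-06-01 00:00:00-04:00,67.72\n"
    "2021-01-04 00:00:00-05:00,47.62\n"
)

# Read the Date column as plain strings first.
cl = pd.read_csv(sample_csv)

# Convert explicitly; utc=True maps the mixed offsets onto one UTC timeline.
cl["Date"] = pd.to_datetime(cl["Date"], utc=True)

# Use the parsed dates as a sorted index, ready for plotting.
cl = cl.set_index("Date").sort_index()
cl.info()
```

Doing the conversion as a separate step also makes it easy to inspect cl["Date"] before it becomes the index.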