Resampling sales data by day - now unable to use polyfit - Python
I have a DataFrame of sales data like this:
sales_df
Date Sales
01/04/2020 00:03 1
01/04/2020 02:26 4
01/05/2020 02:28 3
01/05/2020 05:09 5
01/06/2020 05:16 6
01/06/2020 05:17 7
01/07/2020 05:18 3
sales_df.info() reports these columns:
0 Date datetime64[ns]
1 Sales float64
I can run the following and get a result:
line_coef = np.polyfit(sales_df.index,sales_df['Sales'],1)
print(line_coef)
I want to do the same, but aggregated by day, so I resampled the data like this:
sales_by_day_df = sales_df.resample('D',on='Date').agg({'Sales':'sum'})
which results in a DataFrame like this:
sales_by_day_df
Date Sales
01/04/2020 5
01/05/2020 8
01/06/2020 13
01/07/2020 3
But when I try to run the same fit on the resampled data
line_coef = np.polyfit(sales_by_day_df.index,sales_by_day_df['Sales'],1)
print(line_coef)
I get an error
UFuncTypeError: ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('float64')
I notice that my DataFrame now has only one column, plus a DatetimeIndex. Is this the cause? Do I need to create a new column for the date when I resample the data?
sales_by_day_df.info()
DatetimeIndex: 30 entries, 2020-04-01 to 2020-04-30
Freq: D
Data columns (total 1 columns):
# Column Dtype
--- ------ -----
0 Sales float64
Comments (1)
It's because when you resample, the index of the result has to be a datetime index. In your code, resample(..., on='Date') makes the Date column the index. Your first try worked because the index was still numeric (not the datetime type mentioned in the error message).
Your second try didn't work because the index is now a DatetimeIndex, and np.polyfit cannot do arithmetic on datetime values.
My advice is to reset the index and then run your polyfit again.
You can reset the index with
DataFrame.reset_index()
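A minimal sketch of that suggestion, in case it helps. The sample rows mirror the question, but reading the dates as month/day/year (4 to 7 January 2020) is an assumption, as is the name `daily` for the reset frame. After `reset_index()`, `Date` becomes an ordinary column again and the index is a plain 0, 1, 2, ... range, so the fit is against the day number rather than the timestamp.

```python
import numpy as np
import pandas as pd

# Sample rows mirroring the question; the month/day/year reading is an assumption.
sales_df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-01-04 00:03", "2020-01-04 02:26",
        "2020-01-05 02:28", "2020-01-05 05:09",
        "2020-01-06 05:16", "2020-01-06 05:17",
        "2020-01-07 05:18",
    ]),
    "Sales": [1.0, 4.0, 3.0, 5.0, 6.0, 7.0, 3.0],
})

# Resampling on the Date column makes Date the (datetime) index of the result.
sales_by_day_df = sales_df.resample("D", on="Date").agg({"Sales": "sum"})

# Move Date back into a column; the new index is 0, 1, 2, ... (one step per day).
daily = sales_by_day_df.reset_index()

# Fit a straight line of daily sales against the day number.
line_coef = np.polyfit(daily.index.to_numpy(), daily["Sales"], 1)
print(line_coef)  # [slope, intercept]
```

Because the resampled frame has one row per calendar day, the slope here reads as "change in sales per day". Days with no sales still get a row (with a sum of 0), so they count in the fit.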