pandas - day of week and week number - progressive integer for selecting values
I have this dataframe:
import numpy as np
import pandas as pd
rng = pd.date_range(start='2018-01-01', end='2018-01-14')
rnd_values = np.random.rand(len(rng))+3
dfc = pd.DataFrame({'time':rng.to_list(),'value':rnd_values})
dfc = dfc.set_index('time')
dfc.index = pd.to_datetime(dfc.index)
Let's say that I have to split each week into defined intervals of days and work with the values in each interval, for example Monday to Wednesday and Thursday to Sunday. I define this list to set up the intervals:
days=['Monday','Thursday']
So far, I have learned how to group the values by these intervals, thanks to this post:
def groupby_daysspan_week(dfc, days):
    df = dfc.copy()
    day_to_dayofweek = {'Monday':0,'Tuesday':1,'Wednesday':2,
                        'Thursday':3,'Friday':4,'Saturday':5,'Sunday':6}
    short_dict = {0:'Mn',1:'Tu',2:'Wd',3:'Th',4:'Fr',5:'St',6:'Sn'}
    # Day-of-week numbers at which a new span starts (assumed to be in ascending order)
    day_split = [day_to_dayofweek[d] for d in days]
    df['wkno'] = df.index.isocalendar().week
    df['dow'] = df.index.day_of_week
    # span = number of split days at or before each row's day of week
    df['span'] = np.searchsorted(day_split, df['dow'], side='right')
    # Label each span by its start day and the next split day (Sunday for the last span)
    span_name_dict = {i+1: short_dict[day_split[i]]+'-'+short_dict[(day_split+[6])[i+1]]
                      for i in range(len(day_split))}
    # Mean of 'value' per (week, span)
    df_agg = df.groupby(['wkno','span'])['value'].mean()
    df_agg = df_agg.rename(index=span_name_dict, level=1)
    return df_agg
Thanks to arnau.
As you can see, I now have 4 intervals (two per week over two weeks). I would like to label each value with the interval it belongs to, and end up with something like:
               value  wkno  dow  span  int
time
2018-01-01  3.548814     1    0     1    0
2018-01-02  3.715189     1    1     1    0
2018-01-03  3.602763     1    2     1    0
2018-01-04  3.544883     1    3     2    1
2018-01-05  3.423655     1    4     2    1
2018-01-06  3.645894     1    5     2    1
2018-01-07  3.437587     1    6     2    1
2018-01-08  3.891773     2    0     1    2
2018-01-09  3.963663     2    1     1    2
2018-01-10  3.383442     2    2     1    2
2018-01-11  3.791725     2    3     2    3
2018-01-12  3.528895     2    4     2    3
2018-01-13  3.568045     2    5     2    3
2018-01-14  3.925597     2    6     2    3
This would allow me to select "value" by interval. The additional column "int" identifies the interval, and I would then use it with np.where. The problem is that I cannot work out how to build the "int" column.
Thanks for any help.
Diedro
Here is the solution that I have set up.
I do not know if it is elegant enough.
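As a minimal sketch of one possible way to build such a column (not necessarily the code this answer refers to), every distinct combination of ISO year, week number and span can be numbered with GroupBy.ngroup(). The helper name add_interval_id and the fixed random seed are assumptions made for illustration only. The sketch also assumes a DatetimeIndex and pandas 1.1 or later for isocalendar().

```python
import numpy as np
import pandas as pd

DAY_TO_DAYOFWEEK = {
    "Monday": 0, "Tuesday": 1, "Wednesday": 2, "Thursday": 3,
    "Friday": 4, "Saturday": 5, "Sunday": 6,
}


def add_interval_id(dfc, days):
    """Hypothetical helper: return a copy of dfc with wkno, dow, span and int columns."""
    df = dfc.copy()
    # Day-of-week numbers at which a new span starts, in ascending order
    day_split = sorted(DAY_TO_DAYOFWEEK[d] for d in days)

    iso = df.index.isocalendar()
    df["wkno"] = iso["week"].astype(int)
    df["dow"] = df.index.dayofweek
    # span = number of split days at or before each row's day of week
    df["span"] = np.searchsorted(day_split, df["dow"], side="right")

    # Number each distinct (ISO year, week, span) combination from 0 upward,
    # in sorted key order. The ISO year is included so that week numbers
    # repeating in a later year do not fall into the same interval.
    iso_year = iso["year"].astype(int)
    df["int"] = df.groupby([iso_year, df["wkno"], df["span"]]).ngroup()
    return df


# Same setup as in the question, with a fixed seed (an assumption) for repeatability
rng = pd.date_range(start="2018-01-01", end="2018-01-14")
values = np.random.default_rng(0).random(len(rng)) + 3
dfc = pd.DataFrame({"value": values}, index=rng)
dfc.index.name = "time"

df = add_interval_id(dfc, ["Monday", "Thursday"])

# Keep 'value' only for interval 1 (Thursday to Sunday of the first week)
selected = np.where(df["int"] == 1, df["value"], np.nan)
print(df)
```

Because ngroup() numbers the groups in sorted key order, the result is a running counter only when the rows are in chronological order, as they are here.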