pandas: grouping by selected days of the week
I have this dataframe:
import numpy as np
import pandas as pd
rng = pd.date_range(start='2018-01-01', end='2018-01-21')
rnd_values = np.random.rand(len(rng)) + 3
df = pd.DataFrame({'time': rng.to_list(), 'value': rnd_values})
Let's say that I want to group it according to the day of the week and compute the mean:
# day_of_week: Monday=0 ... Sunday=6, so <= 2 selects Monday to Wednesday
df['span'] = np.where(df['time'].dt.day_of_week <= 2, 'Mn-Wd', 'Th-Sn')
df['wkno'] = df['time'].dt.isocalendar().week.shift(fill_value=0)
df.groupby(['wkno','span']).mean()
However, I would like to make this procedure more general.
Let's say that I define the following days of the week:
days=['Monday','Thursday']
Is there any option that allows me to do what I have done above by using days? I imagine that I have to compute the number of days between 'Monday' and 'Thursday' and then use that number. What about the case when
days=['Monday','Thursday','Friday']
I was thinking of setting up a dictionary such as:
days={'Monday':0,'Thursday':3,'Friday':4}
then
idays = list(days.values())[:]
How can I now use idays inside np.where? In this case I have three intervals.
Thanks
Comments (1)
If you want to use more than one threshold, you need np.searchsorted. The resulting function would look something like
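The code that followed is not included above. As a minimal sketch of how np.searchsorted could be applied here (not the answerer's original function): the version below assumes the first start day is Monday (0), labels each span by the name of its start day, uses deterministic values instead of random ones so the means are easy to check, and takes the plain ISO week number without the shift used in the question. The names `names`, `dow` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

# Three full weeks starting on Monday 2018-01-01, as in the question.
# Assumption: deterministic values 0..20 replace the random ones.
rng = pd.date_range(start='2018-01-01', end='2018-01-21')
df = pd.DataFrame({'time': rng, 'value': np.arange(len(rng), dtype=float)})

# Start day of each span (Monday=0 ... Sunday=6), as in the question.
# Assumption: the earliest start day is Monday, so no day falls before it.
days = {'Monday': 0, 'Thursday': 3, 'Friday': 4}

# Thresholds must be sorted for searchsorted; keep the names in the same order.
idays = np.sort(np.array(list(days.values())))
names = np.array(sorted(days, key=days.get))

# side='right' places a day equal to a threshold in the span that it starts;
# subtracting 1 turns the insertion position into a span index.
dow = df['time'].dt.day_of_week.to_numpy()
df['span'] = names[np.searchsorted(idays, dow, side='right') - 1]

# Plain ISO week number (the question's shift is left out in this sketch).
df['wkno'] = df['time'].dt.isocalendar().week

# sort=False keeps the spans in order of appearance rather than alphabetical.
result = df.groupby(['wkno', 'span'], sort=False)['value'].mean()
print(result)
```

With days = {'Monday': 0, 'Thursday': 3}, the same lines give the two-span split from the question, with the spans labelled 'Monday' and 'Thursday' instead of 'Mn-Wd' and 'Th-Sn'.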