Diff by group in pandas
How could I create the new_group column? Groups are based on 10-minute fruit gaps if the row above is fruit, and on 2-minute fruit gaps if the row above is other. The DataFrame is already sorted.
person  time_bought  product  new_group
abby    2:21         fruit    1
abby    2:25         fruit    1          (2:25 is within 10 minutes of 2:21, so part of the same group)
abby    10:35        fruit    2
abby    10:40        other
abby    10:42        fruit    2          (10:42 is within 2 minutes of 10:35)
abby    10:53        fruit    3          (10:53 is not within 10 minutes of 10:42)
barry   12:00        fruit    1
...
I tried:
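# True where consecutive fruit purchases by the same person are more than 10 minutes apart
m1 = df.loc[df['product'].eq('fruit'), 'time_bought'].groupby(df['person']).diff().gt('10min')
# True where the row above is "other" (df.product would resolve to the DataFrame.product method)
m2 = df['product'].shift(1).eq('other')
# Same as m1, but with a 2-minute threshold
m3 = df.loc[df['product'].eq('fruit'), 'time_bought'].groupby(df['person']).diff().gt('2min')
df['new_group'] = m1.cumsum().mask(m2, m3)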
Comments (1)
IIUC, you can use a dictionary to hold the reference gap for each product, then use a variation of the same code.
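A minimal sketch of the dictionary idea follows. It is one possible reading, not the answerer's original code. It assumes the gap is measured to the row directly above for the same person, and that the allowed gap is looked up from that row's product; this reading reproduces the new_group column in the question's example. The names thresholds, gap, limit and new_start are illustrative, and the times are assumed to fall on a single day.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "person": ["abby"] * 6 + ["barry"],
    "time_bought": ["2:21", "2:25", "10:35", "10:40", "10:42", "10:53", "12:00"],
    "product": ["fruit", "fruit", "fruit", "other", "fruit", "fruit", "fruit"],
})
# Assumption: all times are on the same day, so parse them as offsets from midnight.
df["time_bought"] = pd.to_timedelta(df["time_bought"] + ":00")

# Allowed gap, keyed by the product of the row above (hypothetical name).
thresholds = {"fruit": pd.Timedelta("10min"), "other": pd.Timedelta("2min")}

by_person = df.groupby("person")
gap = by_person["time_bought"].diff()                 # time since the row above
limit = by_person["product"].shift().map(thresholds)  # limit chosen by the row above

is_fruit = df["product"].eq("fruit")
# A fruit row opens a new group unless its gap is within the limit.
# The first row of each person has a missing gap, so it always opens a group.
new_start = is_fruit & ~(gap <= limit)

# Number the groups per person and leave non-fruit rows blank.
df["new_group"] = (
    new_start.astype(int).groupby(df["person"]).cumsum().where(is_fruit)
)
print(df)
```

In this sketch an "other" row never opens a group by itself; it only changes the limit applied to the fruit row that follows it. If the gap should instead be measured between consecutive fruit rows, gap would need to be computed on the fruit-only subset, as in the question's m1 and m3.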