Adding a column with a specific value when doing a groupby
I have a DataFrame, df, that looks something like this:
date price bool
---------------------------------------------
2022-01-03 22:00:00+01:00 109.65 False
2022-01-03 22:00:00+01:00 80.00 False
2022-01-03 22:00:00+01:00 65.79 True
2022-01-03 22:00:00+01:00 50.00 True
2022-01-03 23:00:00+01:00 47.00 False
2022-01-03 23:00:00+01:00 39.95 True
2022-01-03 23:00:00+01:00 39.47 False
2022-01-03 23:00:00+01:00 29.96 False
2022-01-03 23:00:00+01:00 22.47 True
If I do df.groupby("date"), I get two groups, one per date. This is fine. But what I would like is to add a new column to each group that holds, in every row, the max price among the rows of that group where bool == True. Hence, the resulting data frames would become:
df_groupby_object1:
date price bool max_price
-----------------------------------------------------------
2022-01-03 22:00:00+01:00 109.65 False 65.79
2022-01-03 22:00:00+01:00 80.00 False 65.79
2022-01-03 22:00:00+01:00 65.79 True 65.79
2022-01-03 22:00:00+01:00 50.00 True 65.79
df_groupby_object2:
date price bool max_price
-----------------------------------------------------------
2022-01-03 23:00:00+01:00 47.00 False 39.95
2022-01-03 23:00:00+01:00 39.95 True 39.95
2022-01-03 23:00:00+01:00 39.47 False 39.95
2022-01-03 23:00:00+01:00 29.96 False 39.95
2022-01-03 23:00:00+01:00 22.47 True 39.95
I could probably just iterate through the groupby objects and create an extra column that way, but I was wondering whether this could be done directly in the groupby call?
Comments (1)
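Use GroupBy.transform to get the maximal price per group, taking into account only the rows where bool is True. In detail: Series.where keeps price where bool is True and replaces it with NaN where it is not, and the grouped max then skips those NaN values.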
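The code that accompanied this answer did not survive on this page, so the following is only a minimal sketch of the approach it describes, not the answerer's original snippet. It assumes the column names date, price and bool from the question; the sample frame is rebuilt from the question's table, and the intermediate name masked is introduced here purely for illustration.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2022-01-03 22:00:00+01:00"] * 4 + ["2022-01-03 23:00:00+01:00"] * 5
    ),
    "price": [109.65, 80.00, 65.79, 50.00, 47.00, 39.95, 39.47, 29.96, 22.47],
    "bool": [False, False, True, True, False, True, False, False, True],
})

# Series.where keeps price where bool is True; all other rows become NaN.
# "masked" is a hypothetical helper name, not part of the original answer.
masked = df["price"].where(df["bool"])

# Group the masked prices by date and broadcast each group's max
# back onto every row of that group (NaN values are skipped by max).
df["max_price"] = masked.groupby(df["date"]).transform("max")

print(df)
```

Because transform returns a result aligned to the original index, the new column can be assigned directly without iterating over the groups. A later df.groupby("date") then yields groups that already carry max_price; a date with no True rows would get NaN in that column.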