Create a rank column based on multiple column-value criteria
Suppose I have the following dataframe:
Num1 Num2 Num3
123 75 43
123 72 32
123 72 37
123 73 41
456 72 23
456 75 25
456 73 21
456 73 27
I need to create another column called rank. The expected output would be:
Num1 Num2 Num3 rank
123 75 43 1
123 72 32 3
123 72 37 2
123 73 41 4
456 72 23 6
456 75 25 5
456 73 21 8
456 73 27 7
The logic is: for each Num1, check Num2. If it is 75, give it 1st priority; if it is 72, 2nd; and if it is 73, 3rd. To break ties, check Num3: the larger number gets priority.
My thought was to sort in descending order, but that would work on the Num3 column, not on Num2.
I have created a helper column:
df['tcolun'] = df.apply(lambda row: 1 if row['Num2'] == 75 else (2 if row['Num2'] == 72 else 3), axis = 1)
But I am unable to use it properly.
IIUC, craft a mapping dictionary for Num2, then apply your sorting logic and use numpy.argsort on the index of the sorted frame to turn the sort order into ranks.
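The answer's own code is not shown here, so the following is only a minimal sketch of how the mapping-plus-argsort idea might look. The names priority, prio and order are illustrative assumptions, and the sketch assumes the frame has the default 0..n-1 index, so that index labels double as row positions.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Num1": [123, 123, 123, 123, 456, 456, 456, 456],
    "Num2": [75, 72, 72, 73, 72, 75, 73, 73],
    "Num3": [43, 32, 37, 41, 23, 25, 21, 27],
})

# Assumed name: maps each Num2 value to its priority (lower is better).
priority = {75: 1, 72: 2, 73: 3}

# Sort by Num1, then by priority, then by Num3 descending,
# and keep only the resulting order of the original index labels.
order = (
    df.assign(prio=df["Num2"].map(priority))
      .sort_values(by=["Num1", "prio", "Num3"], ascending=[True, True, False])
      .index
      .to_numpy()
)

# order[i] is the row that receives rank i + 1. argsort inverts that
# permutation, giving the sorted position of each original row.
# This relies on the default RangeIndex (labels equal positions).
df["rank"] = np.argsort(order) + 1

print(df)
```

On the sample data this gives the rank column the question asks for. If the frame carries a non-default index, resetting it first (for example with reset_index(drop=True)) would be needed for the argsort step to be meaningful.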
Alternative with groupby + ngroup, taking advantage of the fact that groupby sorts the groups by default.
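As a rough sketch of the groupby + ngroup variant (again not the answer's original code): grouping on Num1, the mapped priority, and the negated Num3 lets the sorted group order stand in for the rank. The names priority, prio and neg_num3 are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Num1": [123, 123, 123, 123, 456, 456, 456, 456],
    "Num2": [75, 72, 72, 73, 72, 75, 73, 73],
    "Num3": [43, 32, 37, 41, 23, 25, 21, 27],
})

# Assumed name: maps each Num2 value to its priority (lower is better).
priority = {75: 1, 72: 2, 73: 3}

# groupby sorts group keys in ascending order by default (sort=True),
# so ngroup() numbers the groups in that sorted order. Negating Num3
# makes larger Num3 values sort first within the same priority.
df["rank"] = (
    df.assign(prio=df["Num2"].map(priority), neg_num3=-df["Num3"])
      .groupby(["Num1", "prio", "neg_num3"])
      .ngroup()
      .add(1)
)

print(df)
```

One difference from the argsort approach: rows with identical Num1, Num2 and Num3 fall into the same group and therefore share a rank, which may or may not be what is wanted.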