Updating a pandas column based on conditions
As the title says, this is my situation.
Creating the DataFrame:
import pandas as pd

df = pd.DataFrame({'a': ['one', 'one', 'three', 'two', 'eleven', 'two'],
                   'b': [45, 34, 556, 32, 97, 33],
                   'c': [234, 66, 12, 44, 99, 3],
                   'd': [123, 45, 55, 98, 17, 22]})
df
Output:
        a    b    c    d
0     one   45  234  123
1     one   34   66   45
2   three  556   12   55
3     two   32   44   98
4  eleven   97   99   17
5     two   33    3   22
Let's say I want to add a column 'e' which is the sum of the columns 'b', 'c' and 'd'. It's simple:
df['e'] = df.b + df.c + df.d
df
Output:
        a    b    c    d    e
0     one   45  234  123  402
1     one   34   66   45  145
2   three  556   12   55  623
3     two   32   44   98  174
4  eleven   97   99   17  213
5     two   33    3   22   58
Now I want one more column, 'f', whose value depends on the following conditions:
if df.a == 'one' and df.b < 50:
    df['f'] = 0
elif df.a == 'two' and df.d > 50:
    df['f'] = 1
else:
    df['f'] = 2
But of course this code does not work.
Output:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can these conditions be implemented correctly?
You can use np.select for this:
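The answer's original snippet is not preserved here, so the following is only a minimal sketch of how np.select could be applied to the question's DataFrame. It assumes the conditions are meant element-wise, which is why each comparison is parenthesised and combined with & instead of `and` (the use of `and` on whole Series is what raises the ValueError in the question).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': ['one', 'one', 'three', 'two', 'eleven', 'two'],
    'b': [45, 34, 556, 32, 97, 33],
    'c': [234, 66, 12, 44, 99, 3],
    'd': [123, 45, 55, 98, 17, 22],
})

# Boolean masks, one per branch of the if/elif chain.
# & works element-wise on Series; each comparison needs its own parentheses.
conditions = [
    (df['a'] == 'one') & (df['b'] < 50),
    (df['a'] == 'two') & (df['d'] > 50),
]

# Value assigned where the condition at the same position is True.
choices = [0, 1]

# Rows that match none of the conditions get the default (the `else` branch).
df['f'] = np.select(conditions, choices, default=2)

print(df['f'].tolist())  # [0, 0, 2, 1, 2, 2]
```

When several conditions are True for the same row, np.select takes the first matching one, which mirrors the order of an if/elif chain.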
You can use nested np.where calls:
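The code for this answer did not survive either. One way the nested np.where approach might look on the question's data is sketched below; the variable names is_one and is_two are illustrative and not taken from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': ['one', 'one', 'three', 'two', 'eleven', 'two'],
    'b': [45, 34, 556, 32, 97, 33],
    'c': [234, 66, 12, 44, 99, 3],
    'd': [123, 45, 55, 98, 17, 22],
})

# Illustrative names for the two element-wise conditions.
is_one = (df['a'] == 'one') & (df['b'] < 50)
is_two = (df['a'] == 'two') & (df['d'] > 50)

# np.where(cond, x, y) picks x where cond is True and y elsewhere.
# Nesting a second np.where in the "else" slot plays the role of `elif`.
df['f'] = np.where(is_one, 0, np.where(is_two, 1, 2))

print(df['f'].tolist())  # [0, 0, 2, 1, 2, 2]
```

Nesting stays readable for two or three branches; with more branches, a flat list of conditions as in np.select is usually easier to follow.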
One option is case_when from pyjanitor; it is a wrapper around pd.Series.mask and, as far as possible, leaves the hard work of dtypes and the like to pandas:
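The pyjanitor snippet is missing from this copy of the answer, and its exact call is not reconstructed here. Since case_when is described as a wrapper around pd.Series.mask, the underlying idea can be sketched with pandas alone. This is an assumption-laden illustration of the mask technique, not pyjanitor's implementation.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['one', 'one', 'three', 'two', 'eleven', 'two'],
    'b': [45, 34, 556, 32, 97, 33],
    'c': [234, 66, 12, 44, 99, 3],
    'd': [123, 45, 55, 98, 17, 22],
})

cond_one = (df['a'] == 'one') & (df['b'] < 50)
cond_two = (df['a'] == 'two') & (df['d'] > 50)

# Start from the default value (the `else` branch) for every row.
f = pd.Series(2, index=df.index)

# Series.mask(cond, other) replaces values where cond is True.
# Later conditions are applied first so that earlier ones win on overlap,
# matching the priority of an if/elif chain.
f = f.mask(cond_two, 1)
f = f.mask(cond_one, 0)

df['f'] = f
print(df['f'].tolist())  # [0, 0, 2, 1, 2, 2]
```

Because the replacement values are plain integers, the resulting column keeps an integer dtype without any manual casting.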