How to create a new column based on conditions in another column

Posted on 2025-01-12 16:01:52

In pandas, how can I create a new column B from column A of df, such that:

B=1 if A_(i+1)-A_(i) > 5 or A_(i) <= 10
B=0 if A_(i+1)-A_(i) <= 5

However, the first B_i value is always 1.

Example:

A	B
5	1 (the first B_i)
12	1
14	0
22	1
20	0
33	1

铜锣湾横着走 2025-01-19 16:01:52

Use diff, compare the result to your threshold with le, and convert the inverted boolean result to int:

N = 5
df['B'] = (~df['A'].diff().le(N)).astype(int)

NB. Using an le(5) comparison and then inverting it gives 1 for the first value, since the first difference is NaN and NaN <= 5 is False.
output (column B for the sample data): 1, 1, 0, 1, 0, 1

Updated answer: simply combine a second condition with OR (|):

df['B'] = (~df['A'].diff().le(5) | df['A'].lt(10)).astype(int)

output: same as above with the provided data
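
The note about inversion can be checked directly. The following is a minimal sketch using the sample data from the question; the variable names direct and inverted are only illustrative. It shows that a comparison against NaN is always False, so a plain gt(N) test leaves 0 on the first row, while inverting le(N) yields 1 there.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 12, 14, 22, 20, 33]})
N = 5

diff = df["A"].diff()  # first element is NaN: row 0 has no previous row

# Any comparison with NaN is False, so a direct "> N" test gives 0 on row 0 ...
direct = diff.gt(N).astype(int)

# ... while inverting "<= N" turns that False into True, i.e. 1 on row 0.
inverted = (~diff.le(N)).astype(int)

print(direct.tolist())    # [0, 1, 0, 1, 0, 1]
print(inverted.tolist())  # [1, 1, 0, 1, 0, 1]
```

On every row other than the first the two expressions agree, because for non-missing values "not <= N" and "> N" are the same test.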

橪书 2025-01-19 16:01:52

I was a little confused by your row numbering: if B_i is computed from the condition on A_(i+1)-A_(i), the missing value should fall on the last row rather than the first (the first row has both A_(i) and A_(i+1), while the last row has no A_(i+1)).

Anyway, based on your example I assumed that we are computing B_(i+1).

import pandas as pd
df = pd.DataFrame(columns=["A"], data=[5, 12, 14, 22, 20, 33])
df['shifted_A'] = df['A'].shift(1)  # Optional: added only to show what shift does in the final DataFrame
df['B'] = ''
df.loc[((df['A'] - df['A'].shift(1)) > 5) + (df['A'].shift(1) <= 10), 'B'] = 1  # Rows that meet either condition get 1
df.loc[(df['A'] - df['A'].shift(1)) <= 5, 'B'] = 0  # Rows that meet this condition get 0
df.loc[df.index == 0, 'B'] = 1  # The first row of B is always 1
print(df)

That prints:

    A  shifted_A  B
0   5        NaN  1
1  12        5.0  1
2  14       12.0  0
3  22       14.0  1
4  20       22.0  0
5  33       20.0  1

I am not sure this is the fastest way, but it should be one of the easier ones to understand.

A short explanation:

df.loc[mask, column_name] = new_value updates the values in the given column on the rows where the condition (mask) is True.

((df['A']-df['A'].shift(1))>5) + (df['A'].shift(1)<=10)
Each condition here returns a boolean Series (True or False per row). Adding them gives True wherever either one is True, which is simply an element-wise OR. If we need AND, we can multiply the conditions instead. The | and & operators are the more usual way to write the same thing.
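
As a rough sketch of that last point, here are the same masks written with the element-wise operators | and &, again on the sample data from the question. The names prev, big_jump and prev_small are made up for illustration and do not come from the answer.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 12, 14, 22, 20, 33]})
prev = df["A"].shift(1)            # previous row's A; NaN on the first row

big_jump = (df["A"] - prev) > 5    # False on row 0, because NaN > 5 is False
prev_small = prev <= 10            # False on row 0 for the same reason

either = big_jump | prev_small     # element-wise OR
both = big_jump & prev_small       # element-wise AND

# For boolean Series, + and * give the same results as | and &.
assert (big_jump + prev_small).tolist() == either.tolist()
assert (big_jump * prev_small).tolist() == both.tolist()

df["B"] = either.astype(int)
df.loc[df.index[0], "B"] = 1       # the first row is always 1
print(df["B"].tolist())            # [1, 1, 0, 1, 0, 1]
```

Using | and & makes the intent explicit and avoids relying on arithmetic behaving like logic for boolean data.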

堇年纸鸢 2025-01-19 16:01:52

Use Series.diff, fill the first missing value with N, and compare for greater or equal with Series.ge, so that the first row gets 1:

N = 5
df['B'] = (df.A.diff().fillna(N).ge(N) | df.A.lt(10)).astype(int)
print(df)
    A  B
0   5  1
1  12  1
2  14  0
3  22  1
4  20  0
5  33  1
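
Note that ge(N) also marks a difference of exactly N as 1, and lt(10) excludes A == 10, whereas the question asks for a difference > 5 and A <= 10. Below is a sketch of a variant that follows the question's inequalities literally. It assumes the first row should still be 1 and that A has no missing values; the helper name flag_jumps and the boundary test data are hypothetical.

```python
import numpy as np
import pandas as pd

def flag_jumps(a: pd.Series, n: float = 5, low: float = 10) -> pd.Series:
    """Return 1 where the step from the previous row exceeds n or the value is <= low."""
    # Filling the first (missing) difference with +inf makes "> n" True on row 0.
    step = a.diff().fillna(np.inf)
    return (step.gt(n) | a.le(low)).astype(int)

df = pd.DataFrame({"A": [5, 12, 14, 22, 20, 33]})
df["B"] = flag_jumps(df["A"])
print(df["B"].tolist())            # [1, 1, 0, 1, 0, 1]

# Boundary cases: a step of exactly 5 gives 0, a value of exactly 10 gives 1.
edge = pd.Series([20, 25, 10])
print(flag_jumps(edge).tolist())   # [1, 0, 1]
```

With the data from the question both versions produce the same column; they differ only when a difference equals the threshold or A equals 10.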
