How can I create and fill a new column based on conditions in two other columns?

Posted 2025-02-04 13:45:46 · 1225 characters · 3 views · 0 comments

How can I create a new column and fill it with values based on the condition of two other columns?

Input:

    import pandas as pd
    import numpy as np

    list1 = ['no','no','yes','yes','no','no','no','yes','no','yes','yes','no','no','no']
    list2 = ['no','no','no','no','no','yes','yes','no','no','no','no','no','yes','no']

    df = pd.DataFrame({'A': list1, 'B': list2}, columns=['A', 'B'])

    df['C'] = np.where((df['A'] == 'yes') & (df['A'].shift(1) == 'no'), 'X', np.nan)
    df['D'] = 'nan','nan','X','X','X','X','nan','X','X','X','X','X','X','nan'

    print(df)

Output:

          A    B    C    D
    0    no   no  nan  nan
    1    no   no  nan  nan
    2   yes   no    X    X
    3   yes   no  nan    X
    4    no   no  nan    X
    5    no  yes  nan    X
    6    no  yes  nan  nan
    7   yes   no    X    X
    8    no   no  nan    X
    9   yes   no    X    X
    10  yes   no  nan    X
    11   no   no  nan    X
    12   no  yes  nan    X
    13   no   no  nan  nan

Columns A and B are given and contain only 'yes' or 'no' values. Only three pairs are possible ('no'-'no', 'yes'-'no', or 'no'-'yes'). A 'yes'-'yes' pair never occurs.

The goal is to place an 'X' in the new column when a 'yes'-'no' pair is encountered, and then to keep filling in 'X's up to and including the row with the next 'no'-'yes' pair. A run like this may span a few rows or several hundred rows.

Column D shows the desired output.

Column C is my current, failing attempt: it marks only the rows where A changes from 'no' to 'yes'.

Can anyone help? Thanks in advance.

Comments (4)

大姐,你呐 2025-02-11 13:45:46

Try this:

    # Object dtype so that string values can be assigned later
    df["E"] = pd.Series(np.nan, index=df.index, dtype=object)

    # Use boolean indexing to set the no-yes rows to a placeholder value
    df.loc[(df["A"] == "no") & (df["B"] == "yes"), "E"] = "PL"

    # Shift the placeholder down by one row, since your example
    # shows that X should still appear on the no-yes "stopping" row
    df["E"] = df["E"].shift(1)

    # Then set X on the yes-no rows and fill forward
    df.loc[(df["A"] == "yes") & (df["B"] == "no"), "E"] = "X"
    df["E"] = df["E"].ffill()

    # Replace the placeholders with NaN
    df.loc[df["E"] == "PL", "E"] = np.nan

Results:

          A    B    C    D    E
    0    no   no  nan  nan  NaN
    1    no   no  nan  nan  NaN
    2   yes   no    X    X    X
    3   yes   no  nan    X    X
    4    no   no  nan    X    X
    5    no  yes  nan    X    X
    6    no  yes  nan  nan  NaN
    7   yes   no    X    X    X
    8    no   no  nan    X    X
    9   yes   no    X    X    X
    10  yes   no  nan    X    X
    11   no   no  nan    X    X
    12   no  yes  nan    X    X
    13   no   no  nan  nan  NaN

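
The same forward-fill idea can probably be written without a string placeholder. The following is only a sketch, assuming the question's rule that a run starts on a 'yes'-'no' row and ends on (and includes) the next 'no'-'yes' row. The names start, stop, state and active are made up for illustration and do not come from the answer above.

```python
import numpy as np
import pandas as pd

list1 = ['no','no','yes','yes','no','no','no','yes','no','yes','yes','no','no','no']
list2 = ['no','no','no','no','no','yes','yes','no','no','no','no','no','yes','no']
df = pd.DataFrame({'A': list1, 'B': list2})

start = (df['A'] == 'yes') & (df['B'] == 'no')  # rows that open a run
stop = (df['A'] == 'no') & (df['B'] == 'yes')   # rows that close a run

# 1.0 on start rows, 0.0 on the row after a stop row, NaN everywhere else
state = pd.Series(np.nan, index=df.index)
state[stop.shift(1, fill_value=False)] = 0.0
state[start] = 1.0  # assigned last, so a start row right after a stop row wins

# Carry the most recent state forward; rows before the first start stay inactive
active = state.ffill().fillna(0.0).astype(bool)

df['E'] = active.map({True: 'X', False: np.nan})
print(df)
```

Shifting the stop mask by one row is what keeps the 'X' on the stopping row itself, mirroring the shift(1) step in the answer.
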
本王不退位尔等都是臣 2025-02-11 13:45:46

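You can use apply() to do that:

    df['C'] = df[['A', 'B']].apply(yourfunction, axis=1)

where your function could be:

    def yourfunction(cols):
        col_A = cols['A']
        col_B = cols['B']
        # Example condition only; substitute your own logic here
        if col_A == 'yes' and col_B == 'no':
            return 'X'
        return np.nan
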
三月梨花 2025-02-11 13:45:46

You can try it this way. Here I use iterrows() to loop over the rows:

    import pandas as pd
    import numpy as np

    list1 = ['no','no','yes','yes','no','no','no','yes','no','yes','yes','no','no','no']
    list2 = ['no','no','no','no','no','yes','yes','no','no','no','no','no','yes','no']

    df = pd.DataFrame({'A': list1, 'B': list2}, columns=['A', 'B'])

    # Object dtype so that string values can be assigned later
    df['C'] = pd.Series(np.nan, index=df.index, dtype=object)
    to_check = 0  # 1 while inside a run that began with a yes-no row
    for ind, row in df.iterrows():
        # A yes-no row starts (or continues) a run
        if (row['A'] == 'yes') and (row['B'] == 'no'):
            to_check = 1
            df.loc[ind, 'C'] = 'X'
            continue

        # A no-yes row ends the run but is still marked
        if (row['A'] == 'no') and (row['B'] == 'yes'):
            if to_check == 1:
                df.loc[ind, 'C'] = 'X'
                to_check = 0
            continue

        # Any other row is marked only while a run is active
        if to_check == 1:
            df.loc[ind, 'C'] = 'X'

    df['D'] = 'nan','nan','X','X','X','X','nan','X','X','X','X','X','X','nan'

    print(df)

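
For frames with many rows, the per-row df.loc writes in a loop like this can become slow. One possible variant, shown here only as a sketch, walks the two columns with zip and collects the result in a plain list before assigning it once. The helper name fill_between and its mark parameter are hypothetical and not part of the answer above.

```python
import numpy as np
import pandas as pd

def fill_between(a_values, b_values, mark='X'):
    """Return mark from each yes-no row through the next no-yes row, NaN elsewhere."""
    out = []
    active = False
    for a, b in zip(a_values, b_values):
        if a == 'yes' and b == 'no':
            active = True           # a yes-no row opens (or continues) a run
        out.append(mark if active else np.nan)
        if a == 'no' and b == 'yes':
            active = False          # a no-yes row closes the run after being marked
    return out

list1 = ['no','no','yes','yes','no','no','no','yes','no','yes','yes','no','no','no']
list2 = ['no','no','no','no','no','yes','yes','no','no','no','no','no','yes','no']
df = pd.DataFrame({'A': list1, 'B': list2})

df['C'] = fill_between(df['A'], df['B'])
print(df)
```

The state handling is the same as in the loop above; only the way the values are read and written changes.
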
晨光如昨 2025-02-11 13:45:46

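This would get the job done:

    def needed_in():
        # Yield the index labels of the rows that should receive an 'X'
        active = False
        for index in df.index:
            pair = df.loc[index, ["A", "B"]].tolist()
            if pair == ["yes", "no"]:
                active = True

            if active:
                yield index

            if pair == ["no", "yes"]:
                active = False

    # Object dtype so that string values can be assigned later
    df["C"] = pd.Series(np.nan, index=df.index, dtype=object)
    df.loc[list(needed_in()), "C"] = "X"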

Output:

          A    B    C
    0    no   no  nan
    1    no   no  nan
    2   yes   no    X
    3   yes   no    X
    4    no   no    X
    5    no  yes    X
    6    no  yes  nan
    7   yes   no    X
    8    no   no    X
    9   yes   no    X
    10  yes   no    X
    11   no   no    X
    12   no  yes    X
    13   no   no  nan