Delete elements from a list of lists in one column using the lists in another column, then replace with the new values (Python pandas)

Posted on 2025-01-25 18:58:53

I have a dataframe (the del_lst column has bool type):

import pandas as pd

df = pd.DataFrame({'col1': [[['a1']], [['b1'], ['b2']], [['b1'], ['b2']], [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']]],
'col2': [['a1'], ['b1'], ['b2'], ['c1'], ['c2'], ['c3']],
'day': [18, 19, 19, 20, 20, 20],
'del_lst': [True, True, True , True, False, False]})
df

Output:

   col1                col2  day  del_lst
0  [[a1]]              [a1]   18     True
1  [[b1], [b2]]        [b1]   19     True
2  [[b1], [b2]]        [b2]   19     True
3  [[c1], [c2], [c3]]  [c1]   20     True
4  [[c1], [c2], [c3]]  [c2]   20    False
5  [[c1], [c2], [c3]]  [c3]   20    False

I want to remove from col1 the lists whose row has del_lst set to True, and remove them step by step. For example, in [[b1], [b2]] both b1 and b2 are True, so b1 is deleted first, then b2. I tried the following, but unfortunately my code doesn't work.

def func_del(df):
    return list(set(df['col1']) - set(df['col2']))


def all_func(df):
    # select only lines with True
    df_tr = df[df['del_lst'] == True]
    for i, row in df_tr.iterrows():
        df_tr['new_col1'] = df_tr.apply(func_del, axis=1)

    # I want to get a dictionary where the key is column col1 and the value is new_col1
    dict_replace = dict(zip(df_tr['col1'], df_tr['new_col1']))
    # so that I replace the old values in the initial dataframe
    df['col1_replaced'] = df['col1'].apply(lambda word: dict_replace.get(word, word))
    return df

df_new = df.apply(all_func, axis=1)

I would like to end up with a dataframe like this:

   col1                col2  col1_replaced  day  del_lst
0  [[a1]]              [a1]  []              18     True
1  [[b1], [b2]]        [b1]  []              19     True
2  [[b1], [b2]]        [b2]  []              19     True
3  [[c1], [c2], [c3]]  [c1]  []              20     True
4  [[c1], [c2], [c3]]  [c2]  [[c2], [c3]]    20    False
5  [[c1], [c2], [c3]]  [c3]  [[c2], [c3]]    20    False
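
One obstacle in the attempt above, independent of how the rows are iterated, is that func_del calls set() on col1 values, and the dictionary uses col1 values as keys. Both require hashable items, and Python lists are not hashable. The following is a minimal sketch of that behaviour on a small sample shaped like a single row of the frame; the variable names are illustrative and not part of the original code.

```python
# Sample values shaped like one row of the question's frame (illustrative only).
col1 = [['b1'], ['b2']]   # list of lists, as in col1
col2 = ['b1']             # flat list, as in col2

# set() needs hashable items; the inner lists are not hashable.
try:
    set(col1)
    error = None
except TypeError as exc:
    error = str(exc)

# A list comprehension avoids hashing the inner lists:
# only the strings from col2 go into a set.
to_drop = set(col2)
remaining = [item for item in col1 if item[0] not in to_drop]

print(error)
print(remaining)
```

Building the set from the strings in col2 and filtering col1 with a comprehension sidesteps the problem without converting the inner lists to tuples.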

站稳脚跟 2025-02-01 18:58:53

You need to loop here, using a set for the membership test:

S = set(df.loc[df['del_lst'], 'col2'].str[0])


df['col1_replaced'] = [[x for x in l
                        if (x[0] if isinstance(x, list) else x) not in S]
                       for l in df['col1']]

NB: I am assuming that col1 can hold either flat or nested lists. If it only ever holds nested lists, just use x[0] not in S as the condition.
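
To illustrate why the isinstance guard matters, here is a small sketch comparing the two conditions on a nested list and on a flat list. The set S, the helper names, and the value 'z9' are made up for the example. On a flat list, x[0] is the first character of the string, so the simplified condition no longer matches anything in S.

```python
S = {'a1', 'b1'}  # hypothetical set of flagged values

def with_guard(items):
    # Condition from the answer: unwrap one-element lists, keep strings as-is.
    return [x for x in items if (x[0] if isinstance(x, list) else x) not in S]

def without_guard(items):
    # Simplified condition: only valid when every item is itself a list.
    return [x for x in items if x[0] not in S]

nested = [['b1'], ['b2']]
flat = ['a1', 'z9']  # 'z9' is a made-up value

nested_guarded = with_guard(nested)    # [['b2']]
nested_simple = without_guard(nested)  # [['b2']]
flat_guarded = with_guard(flat)        # ['z9']
flat_simple = without_guard(flat)      # ['a1', 'z9'], because 'a1'[0] is 'a'
```

Both conditions agree on nested lists, so the shorter one is safe only when the column is known to contain nothing else.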

output:

                 col1  col2  day  del_lst col1_replaced
0                [a1]  [a1]   18     True            []
1        [[b1], [b2]]  [b1]   19     True            []
2        [[b1], [b2]]  [b2]   19     True            []
3  [[c1], [c2], [c3]]  [c1]   20     True  [[c2], [c3]]
4  [[c1], [c2], [c3]]  [c2]   20    False  [[c2], [c3]]
5  [[c1], [c2], [c3]]  [c3]   20    False  [[c2], [c3]]
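
The same idea can be wrapped in a small helper that works on any three parallel sequences, including pandas Series. This is only a sketch: drop_flagged and blank_flagged_rows are hypothetical names, and the data is the question's frame written as plain lists. The second function is a guess at the intent behind the requested frame, where row 3 is empty even though the output above keeps [[c2], [c3]] there.

```python
def drop_flagged(col1, col2, del_lst):
    """Remove from every col1 entry the sub-lists whose value is flagged."""
    # Flagged values: first element of col2 on rows where del_lst is true.
    flagged = {value[0] for value, flag in zip(col2, del_lst) if flag}
    # Assumes every item of a col1 entry is itself a list (nested lists only).
    return [[x for x in entry if x[0] not in flagged] for entry in col1]

def blank_flagged_rows(filtered, del_lst):
    """Assumed variant: rows that are themselves flagged become empty."""
    return [[] if flag else kept for kept, flag in zip(filtered, del_lst)]

# Data from the question, as plain lists.
col1 = [[['a1']], [['b1'], ['b2']], [['b1'], ['b2']],
        [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']]]
col2 = [['a1'], ['b1'], ['b2'], ['c1'], ['c2'], ['c3']]
del_lst = [True, True, True, True, False, False]

replaced = drop_flagged(col1, col2, del_lst)
blanked = blank_flagged_rows(replaced, del_lst)

for row in zip(replaced, blanked):
    print(row)
```

With the dataframe from the question, the call would be df['col1_replaced'] = drop_flagged(df['col1'], df['col2'], df['del_lst']). Here replaced matches the output shown above, while blanked differs only at row 3, which becomes [].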
