Delete elements from the nested lists in one column based on the lists in another column, then replace them with the new values (Python pandas)
I have a dataframe (the column del_lst has bool type):
import pandas as pd
df = pd.DataFrame({'col1': [[['a1']], [['b1'], ['b2']], [['b1'], ['b2']], [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']]],
                   'col2': [['a1'], ['b1'], ['b2'], ['c1'], ['c2'], ['c3']],
                   'day': [18, 19, 19, 20, 20, 20],
                   'del_lst': [True, True, True, True, False, False]})
df
Output:
                 col1  col2  day  del_lst
0              [[a1]]  [a1]   18     True
1        [[b1], [b2]]  [b1]   19     True
2        [[b1], [b2]]  [b2]   19     True
3  [[c1], [c2], [c3]]  [c1]   20     True
4  [[c1], [c2], [c3]]  [c2]   20    False
5  [[c1], [c2], [c3]]  [c3]   20    False
I want to delete the lists whose del_lst value is True, and delete them step by step. For example, in [[b1], [b2]] both b1 and b2 are True, so b1 is deleted first, then b2. I tried it like this, but unfortunately my code doesn't work.
def func_del(df):
    return list(set(df['col1']) - set(df['col2']))

def all_func(df):
    # select only lines with True
    df_tr = df[df['del_lst'] == True]
    for i, row in df_tr.iterrows():
        df_tr['new_col1'] = df_tr.apply(func_del, axis=1)
    # I want to get a dictionary from where the key is column col1 and the value is new_col1
    dict_replace = dict(zip(df_tr['col1'], df_tr['new_col1']))
    # so that I replace the old values in the initial dataframe
    df['col1_replaced'] = df['col1'].apply(lambda word: dict_replace.get(word, word))
    return df

df_new = df.apply(all_func, axis=1)
I would like to end up with a dataframe like this:
                 col1  col2  col1_replaced  day  del_lst
0              [[a1]]  [a1]             []   18     True
1        [[b1], [b2]]  [b1]             []   19     True
2        [[b1], [b2]]  [b2]             []   19     True
3  [[c1], [c2], [c3]]  [c1]             []   20     True
4  [[c1], [c2], [c3]]  [c2]   [[c2], [c3]]   20    False
5  [[c1], [c2], [c3]]  [c3]   [[c2], [c3]]   20    False
Comments (1)
You need to loop here, using set operations. NB: I am assuming that you have either single or nested lists here; if not, just use if x[0] not in S as the condition.
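A minimal sketch of that loop, assuming the dataframe from the question. The names to_delete, outer and inner are made up for illustration and are not from the original answer. The set is built from every value in col2 on rows where del_lst is True, and isdisjoint is used as the condition so that inner lists with more than one element would also be handled; with strictly single-element inner lists, a test on the first element would do the same job.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [[['a1']], [['b1'], ['b2']], [['b1'], ['b2']],
             [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']], [['c1'], ['c2'], ['c3']]],
    'col2': [['a1'], ['b1'], ['b2'], ['c1'], ['c2'], ['c3']],
    'day': [18, 19, 19, 20, 20, 20],
    'del_lst': [True, True, True, True, False, False],
})

# Collect every value found in col2 on the rows flagged True.
# (to_delete is a hypothetical name; it plays the role of S above.)
to_delete = {item for items in df.loc[df['del_lst'], 'col2'] for item in items}

# Rebuild each col1 entry, keeping only the inner lists that share
# no element with the set of values to delete.
df['col1_replaced'] = [
    [inner for inner in outer if to_delete.isdisjoint(inner)]
    for outer in df['col1']
]

print(df[['col1', 'col2', 'col1_replaced', 'day', 'del_lst']])
```

With this data the set is {'a1', 'b1', 'b2', 'c1'}, so rows 0 to 2 become [] and rows 3 to 5 become [['c2'], ['c3']]. Note that row 3 differs from the desired table in the question, which shows [] there; if rows flagged True should always end up empty, that would need an extra rule that this sketch does not cover.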