pandas: append rows to another dataframe under the similar row, based on a column condition
I have two dataframes, as follows:
import pandas as pd
d1 ={'col1': ['I ate dinner','I ate dinner', 'the play was inetresting','the play was inetresting'],
'col2': ['min', 'max', 'mid','min'],
'col3': ['min', 'max', 'max','max']}
d2 ={'col1': ['I ate dinner',' the glass is shattered', 'the play was inetresting'],
'col2': ['min', 'max', 'max'],
'col3': ['max', 'min', 'mid']}
df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)
I have created a column in df2 called 'exist' and filled it with 'true' or 'false', depending on whether the sentence in df2.col1 also exists in df1.col1:
common = df1.merge(df2,on=['col1'])
index_list = df2[(~df2.col1.isin(common.col1))].index.to_list()
df2['exist'] = ' '
df2.loc[index_list, 'exist'] = 'false'
df2.loc[df2["exist"] == " ",'exist'] = 'true'
What I would like to do now is this: if the value in the 'exist' column is 'true', I want to add that row from df2 under the rows in df1 that have the same col1 value. So the desired output should be:
output:
                       col1  col2  col3
0              I ate dinner   min   min
1              I ate dinner   max   max
2              I ate dinner   min   max
3  the play was inetresting   mid   max
4  the play was inetresting   min   max
5  the play was inetresting   max   mid
I guess I have to use np.where, but I am not sure how to formulate the append to get the desired output.
The first idea is to filter df2 by the values in df1.col1, append the matching rows to df1 with concat, and then sort with DataFrame.sort_values.
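The code that accompanied this answer did not survive on this page, so the following is only a minimal sketch of the filter, concat and sort idea, not the answerer's exact code. The names `matches` and `result` are made up for illustration, and the stable sort (`kind='mergesort'`) is an assumption added so that rows with equal col1 keep their relative order.

```python
import pandas as pd

d1 = {'col1': ['I ate dinner', 'I ate dinner', 'the play was inetresting', 'the play was inetresting'],
      'col2': ['min', 'max', 'mid', 'min'],
      'col3': ['min', 'max', 'max', 'max']}
d2 = {'col1': ['I ate dinner', ' the glass is shattered', 'the play was inetresting'],
      'col2': ['min', 'max', 'max'],
      'col3': ['max', 'min', 'mid']}
df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

# Keep only the df2 rows whose sentence also occurs in df1.col1.
matches = df2[df2['col1'].isin(df1['col1'])]

# Stack the matches under df1, then sort by col1. A stable sort keeps rows
# with the same col1 in their stacked order: df1 rows first, df2 rows after.
result = (
    pd.concat([df1, matches])
      .sort_values('col1', kind='mergesort', ignore_index=True)
)
print(result)
```

Note that sorting orders the groups alphabetically by col1, which happens to match the order in df1 here. If df2 already carries the 'exist' column from the question, it would probably need to be dropped first, for example with `df2.drop(columns='exist')`.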
If only the values common to both DataFrames are needed, the rows can be filtered with numpy.intersect1d.
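Again, the original snippet is missing here, so this is a hedged sketch of how numpy.intersect1d could be used for the filtering. The variable names `common` and `result` are illustrative, and the final stable sort is the same assumption as in the first idea.

```python
import numpy as np
import pandas as pd

d1 = {'col1': ['I ate dinner', 'I ate dinner', 'the play was inetresting', 'the play was inetresting'],
      'col2': ['min', 'max', 'mid', 'min'],
      'col3': ['min', 'max', 'max', 'max']}
d2 = {'col1': ['I ate dinner', ' the glass is shattered', 'the play was inetresting'],
      'col2': ['min', 'max', 'max'],
      'col3': ['max', 'min', 'mid']}
df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

# Unique col1 values that occur in both DataFrames.
common = np.intersect1d(df1['col1'].to_numpy(), df2['col1'].to_numpy())

# Keep only rows with a common col1 value from BOTH frames, stack them,
# and sort stably so df1 rows stay ahead of df2 rows within each group.
result = (
    pd.concat([df1[df1['col1'].isin(common)], df2[df2['col1'].isin(common)]])
      .sort_values('col1', kind='mergesort', ignore_index=True)
)
print(result)
```

Unlike the first idea, this also drops any df1 rows whose col1 value does not appear in df2. With the sample data every df1 sentence has a match, so both approaches give the same six rows.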
IIUC, you want to add the matching row(s) without necessarily relying on sorting.
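The code for this second answer is not preserved on this page either. One possible way to do it without sorting, offered purely as an assumption about what was meant, is to walk through the groups of df1 in their original order and place the matching df2 rows after each group. The names `matches`, `pieces`, and `result` are hypothetical.

```python
import pandas as pd

d1 = {'col1': ['I ate dinner', 'I ate dinner', 'the play was inetresting', 'the play was inetresting'],
      'col2': ['min', 'max', 'mid', 'min'],
      'col3': ['min', 'max', 'max', 'max']}
d2 = {'col1': ['I ate dinner', ' the glass is shattered', 'the play was inetresting'],
      'col2': ['min', 'max', 'max'],
      'col3': ['max', 'min', 'mid']}
df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

# df2 rows whose sentence also occurs in df1.col1.
matches = df2[df2['col1'].isin(df1['col1'])]

pieces = []
# sort=False keeps the groups in the order they first appear in df1.
for key, group in df1.groupby('col1', sort=False):
    pieces.append(group)                            # the df1 rows for this sentence
    pieces.append(matches[matches['col1'] == key])  # then the matching df2 rows

result = pd.concat(pieces, ignore_index=True)
print(result)
```

This keeps the groups in df1's own order rather than alphabetical order. It does assume that rows sharing a col1 value should end up together: if they were scattered through df1, groupby would gather them into one block.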