XOR operation on 2 pandas DataFrames

Posted on 2025-01-12 05:23:59

Is there a way to remove from the first DataFrame all rows that also appear in the second DataFrame, and to add the rows that appear only in the second DataFrame (= XOR)? Here's the twist: the first DataFrame has one column that should be ignored during the comparison.

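import pandas as pd

# First DataFrame; 'spec' must be ignored when comparing rows
df1 = pd.DataFrame({'col1': [1, 2, 3],
                    'col2': [4, 5, 6],
                    'spec': ['A', 'B', 'C']})

# Second DataFrame; it has no 'spec' column
df2 = pd.DataFrame({'col1': [1, 9],
                    'col2': [4, 9]})

# Desired result: rows of df1 not found in df2, plus rows found only in df2
result = pd.DataFrame({'col1': [2, 3, 9],
                       'col2': [5, 6, 9],
                       'spec': ['B', 'C', 'df2']})

# Cast each frame to strings from its own data
df1 = df1.astype(str)
df2 = df2.astype(str)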

This is loosely analogous to a UNION (not UNION ALL) operation, except that rows found in both DataFrames are dropped rather than kept once.

Combine

   col1  col2 spec
0     1     4    A
1     2     5    B
2     3     6    C

and

   col1  col2
0     1     4
1     9     9

into

   col1  col2 spec
1     2     5    B
2     3     6    C
1     9     9  df2
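
If each row's (col1, col2) pair is treated as a set element, the requested XOR is the symmetric difference of the two key sets. The following is only a minimal sketch of that idea using plain Python sets; the names keys, k1, k2, union_keys and xor_keys are introduced here for illustration and do not come from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3],
                    'col2': [4, 5, 6],
                    'spec': ['A', 'B', 'C']})
df2 = pd.DataFrame({'col1': [1, 9],
                    'col2': [4, 9]})

keys = ['col1', 'col2']  # compared columns; 'spec' is ignored (illustrative name)

# One plain tuple per row, built from the compared columns only
k1 = set(df1[keys].itertuples(index=False, name=None))
k2 = set(df2[keys].itertuples(index=False, name=None))

union_keys = k1 | k2   # UNION: every distinct key, shared ones kept once
xor_keys = k1 ^ k2     # XOR: keys present in exactly one frame

print(sorted(union_keys))
print(sorted(xor_keys))
```

The key (1, 4) appears in both frames, so it survives in the union but not in the XOR. This sketch only compares keys; it does not carry the spec column along.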

待天淡蓝洁白时 2025-01-19 05:24:00

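You could concatenate both frames and drop all duplicated rows, comparing only col1 and col2 (keep=False discards every copy of a duplicated row):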

out = (pd.concat((df1, df2.assign(spec='df2')))
       .drop_duplicates(subset=['col1','col2'], keep=False))

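Or filter out the common rows and concatenate. Note that DataFrame.isin with a DataFrame argument matches on index and column labels as well as values, so this variant only works when the shared rows sit at the same index in both frames, as they do here: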

out = pd.concat((df1[~df1[['col1','col2']].isin(df2[['col1','col2']]).all(axis=1)], 
                 df2[~df2.isin(df1[['col1','col2']]).all(axis=1)].assign(spec='df2')))

Output:

   col1  col2 spec
1     2     5    B
2     3     6    C
1     9     9  df2
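
Neither variant above is the only way to do this. As a hedged alternative that does not depend on how the two frames are indexed, an outer merge with indicator=True can mark where each row came from. This is a sketch under the assumption that col1 and col2 are the only columns to compare; the names keys, merged and out are illustrative, and the result is re-indexed from 0 rather than keeping the original row labels.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3],
                    'col2': [4, 5, 6],
                    'spec': ['A', 'B', 'C']})
df2 = pd.DataFrame({'col1': [1, 9],
                    'col2': [4, 9]})

keys = ['col1', 'col2']  # columns compared; 'spec' is ignored

# Outer merge on the key columns; '_merge' records where each row came from
merged = df1.merge(df2, on=keys, how='outer', indicator=True)

# Keep rows found in only one frame (drop those marked 'both')
out = merged[merged['_merge'] != 'both'].copy()

# Rows coming only from df2 have no 'spec' value; label them 'df2'
out.loc[out['_merge'] == 'right_only', 'spec'] = 'df2'

# Drop the helper column and renumber the rows
out = out.drop(columns='_merge').reset_index(drop=True)
print(out)
```

One behavioral difference to be aware of: if df1 itself held two rows with the same (col1, col2) pair that is absent from df2, this sketch keeps both, whereas drop_duplicates(..., keep=False) would discard them.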

About the author

情归归情
