Merge dataframes and extract only the rows of one dataframe that do not exist in the other dataframe
I am trying to merge two dataframes and create a new dataframe containing only the rows from the first dataframe that do not exist in the second one. For example:
The dataframes that I have as input:
The dataframe that I want to have as output:
Do you know if there is a way to do that? If you could help me, I would be more than thankful!! Thanks, Eleni
Creating some data, we have two dataframes:
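The data from the original post did not survive on this page, so the frames below are purely hypothetical stand-ins, assumed only for illustration. The column names id and value are made up; any pair of dataframes with matching columns would do.

```python
import pandas as pd

# Hypothetical sample data (not the original poster's):
#   (2, "b") appears in both frames,
#   id 3 appears in both frames but with a different value,
#   (1, "a") exists only in df1.
df1 = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value": ["b", "x", "d"]})

print(df1)
print(df2)
```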
We can use pandas.merge to combine equal rows, and its indicator=True feature to mark the rows that come only from the left (and only from the right, when applicable). Since we only need the rows that are unique to the left, we can merge using how="left" to be more efficient.
Great, so then the final result is using the merge but only keeping the rows that have an indicator of left_only.
If you want to deduplicate by a subset of the columns, that should be possible. In that case I would do the merge like this, repeating the subset so that we don't get duplicate versions of the other columns from the left and right side:
pd.merge(df1, df2[subset], on=subset, how="left", indicator=True)
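Putting the steps together, here is a minimal sketch of the whole approach. The helper name rows_only_in_left and the sample frames (columns id and value) are assumptions made up for this example, not something from the question. It merges with how="left" and indicator=True, keeps the rows marked left_only in the _merge column that pandas adds, and optionally compares on a subset of columns only.

```python
import pandas as pd


def rows_only_in_left(left, right, subset=None):
    """Return the rows of `left` that have no match in `right`.

    Hypothetical helper: `subset` is an optional list of column names to
    compare on. When it is None, every column shared by both frames is used.
    """
    if subset is not None:
        keys = list(subset)
    else:
        keys = [col for col in left.columns if col in right.columns]

    # Keep only the key columns of `right`, so no other columns come back in
    # duplicated _x/_y versions; dropping repeated keys keeps the merge small.
    right_keys = right[keys].drop_duplicates()

    # indicator=True adds a "_merge" column: "left_only" or "both" here.
    merged = pd.merge(left, right_keys, on=keys, how="left", indicator=True)
    only_left = merged[merged["_merge"] == "left_only"]
    return only_left.drop(columns="_merge").reset_index(drop=True)


# Hypothetical sample data, assumed for illustration only.
df1 = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value": ["b", "x", "d"]})

full_match = rows_only_in_left(df1, df2)          # compare whole rows
by_id = rows_only_in_left(df1, df2, subset=["id"])  # compare on id only

print(full_match)
print(by_id)
```

Comparing whole rows keeps ids 1 and 3 here, because the row with id 3 has a different value in df2; comparing on id alone keeps only id 1.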