Comparing two DataFrame columns of binary data

Posted 2025-02-03 16:26:16 · 344 characters · 4 views · 0 comments


I have two columns of binary data (1s and 0s), and I want to compute the percent similarity between them. Because the data is binary, the match has to be based on the position of each cell, not on the overall counts of 0s and 1s. For example:

column_1     column_2
   0            1
   1            1
   0            0
   1            0

Here both columns contain the same number of 0s and 1s (which would suggest a 100% match); however, taking the position of each value into account, only 50% of the rows match. That position-based figure is the one I'm trying to compute.

I know I could do it with a loop, but that could be a problem for larger lists.


禾厶谷欠 2025-02-10 16:26:17

This builds a Boolean vector that is True where column_1 equals column_2 and False elsewhere, sums it, and divides by the number of samples.

sim = sum(df.column_1 == df.column_2) / len(df.column_1)
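As a minimal sketch of the same idea, here is a self-contained version using the four rows from the question. It assumes pandas is installed and that both columns live in one DataFrame; the name `matches` is only illustrative. Taking the mean of the Boolean Series is equivalent to summing it and dividing by its length.

```python
import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({
    "column_1": [0, 1, 0, 1],
    "column_2": [1, 1, 0, 0],
})

# Element-wise comparison: True where the two columns agree in that row.
matches = df["column_1"] == df["column_2"]

# The mean of a Boolean Series is the fraction of True values.
sim = matches.mean()
print(f"{sim:.0%}")  # 50%
```

The comparison is vectorized, so no explicit Python loop is needed. One caveat: a missing value (NaN) never compares equal to anything, so rows with missing data count as mismatches in this sketch.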


About the author

筱果果
