Matrix %in% matrix
Suppose I have two matrices, each with two columns and a different number of rows. I want to check which pairs (rows) of one matrix are in the other matrix. If these were one-dimensional, I would normally just do a %in% x to get my results. match seems to work only on vectors.
> a
[,1] [,2]
[1,] 1 2
[2,] 4 9
[3,] 1 6
[4,] 7 7
> x
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
I would like the result to be c(FALSE,TRUE,TRUE,FALSE).
Recreate your data:
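A minimal sketch of this step, assuming the values printed in the question. The names a and x follow the question; everything else is plain base R.

```r
# matrix() fills column by column, so each line below is one column.
a <- matrix(c(1, 4, 1, 7,
              2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5,
              6, 7, 8, 9, 10), ncol = 2)

a
x
```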
Define a function %inm% that is a matrix analogue to %in%:
Apply this to your data:
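Both steps can be sketched together as follows. The body of %inm% here is one possible implementation and is an assumption, not necessarily the code the answer originally contained. It expects a vector whose length equals the number of columns of the matrix.

```r
a <- matrix(c(1, 4, 1, 7, 2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ncol = 2)

# TRUE if the vector `row` is identical to at least one row of `mat`.
`%inm%` <- function(row, mat) {
  # t(mat) has one column per row of mat, so `row` recycles down each
  # column and every row of mat is compared element by element.
  n_equal <- colSums(t(mat) == row)
  any(n_equal == ncol(mat))
}

# Test every row of a against x.
result <- apply(a, 1, `%inm%`, x)
result
# [1] FALSE  TRUE  TRUE FALSE
```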
To compare a single row:
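For a single row the operator can be used infix. This sketch repeats a compact, assumed definition of %inm% so that it runs on its own.

```r
a <- matrix(c(1, 4, 1, 7, 2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ncol = 2)

# Assumed definition: TRUE if `row` equals some row of `mat`.
`%inm%` <- function(row, mat) any(colSums(t(mat) == row) == ncol(mat))

third <- a[3, ] %inm% x   # the pair (1, 6) is the first row of x
first <- a[1, ] %inm% x   # the pair (1, 2) does not occur in x
third
# [1] TRUE
first
# [1] FALSE
```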
Another approach would be:
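One way this could look for the two-column case, assuming the "paste with delimiter" idea mentioned later in the thread: each pair is collapsed into a single string, and the ordinary %in% is applied to the strings. The "$$" separator is an arbitrary choice and must not occur in the data.

```r
a <- matrix(c(1, 4, 1, 7, 2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ncol = 2)

# Build one key per row, e.g. "4$$9", then compare the keys as vectors.
a_keys <- paste(a[, 1], a[, 2], sep = "$$")
x_keys <- paste(x[, 1], x[, 2], sep = "$$")

result <- a_keys %in% x_keys
result
# [1] FALSE  TRUE  TRUE FALSE
```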
A more general version of this is:
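A sketch of a generalisation to any number of columns, again under the assumption that the approach is string-based. The helper name rows_in is hypothetical.

```r
# Hypothetical helper: for each row of `a`, is it also a row of `x`?
rows_in <- function(a, x, sep = "$$") {
  stopifnot(ncol(a) == ncol(x))
  # Collapse every row into one delimited string, then use plain %in%.
  a_keys <- apply(a, 1, paste, collapse = sep)
  x_keys <- apply(x, 1, paste, collapse = sep)
  a_keys %in% x_keys
}

a <- matrix(c(1, 4, 1, 7, 2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ncol = 2)
two_col <- rows_in(a, x)

# The same helper on three-column matrices.
a3 <- cbind(a, c(0, 0, 1, 0))
x3 <- cbind(x, c(1, 0, 0, 0, 0))
three_col <- rows_in(a3, x3)
```

Because the comparison is done on strings, values that print identically are treated as equal, so this is best suited to integer-like or character data.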
Andrie's solution is perfectly fine. But if you have big matrices, you might want to try something else, based on recursion. If you work column-wise, you can cut down on the calculation time by excluding everything that doesn't match at the first position:
The comparison:
EDIT:
I checked the accepted answer just for fun. It performs better than the double apply (as you get rid of the inner loop), but recursion still rules! ;-)
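A minimal sketch of the column-wise recursive idea, assuming numeric matrices with the same number of columns. The function names rows_in_recursive and check_row are hypothetical, and this is not necessarily the answer's original code.

```r
# Hypothetical helper: for each row of `a`, is it also a row of `x`?
rows_in_recursive <- function(a, x) {
  nc <- ncol(x)
  check_row <- function(row, col = 1, candidates = rep(TRUE, nrow(x))) {
    # Among the rows of x still in play, keep those matching this column.
    candidates[candidates] <- x[candidates, col] == row[col]
    if (!any(candidates)) return(FALSE)  # nothing left, stop early
    if (col == nc) return(TRUE)          # some row matched every column
    check_row(row, col + 1, candidates)  # recurse into the next column
  }
  apply(a, 1, check_row)
}

a <- matrix(c(1, 4, 1, 7, 2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ncol = 2)
result <- rows_in_recursive(a, x)
result
# [1] FALSE  TRUE  TRUE FALSE
```

The saving comes from the shrinking candidate set: later columns are only compared for rows of x that matched in all earlier columns.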
Here is another approach, using the digest package and creating checksums for each row, which are generated using a hashing algorithm (the default being md5).
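A minimal sketch of the checksum idea, assuming the digest package is installed. digest() hashes the serialized object with md5 by default, so both matrices should have the same storage type (both double here), otherwise equal-looking rows hash differently.

```r
library(digest)

a <- matrix(c(1, 4, 1, 7, 2, 9, 6, 7), ncol = 2)
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ncol = 2)

# One md5 checksum (a character string) per row.
a_hash <- apply(a, 1, digest)
x_hash <- apply(x, 1, digest)

# Rows with identical contents have identical checksums.
result <- a_hash %in% x_hash
result
```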
Coming in late to the game: I had previously written an algorithm using the "paste with delimiter" method, and then found this page. I was guessing that one of the code snippets here would be the fastest, but:
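The timings themselves are not reproduced here. A harness along these lines could be used to rerun such a comparison. The test matrices, their sizes, and the helper names by_paste and by_rowwise are assumptions made for illustration, and only two candidate methods are shown.

```r
set.seed(1)
# Random two-column test matrices, small value range so matches occur.
a <- matrix(sample(1:50, 2000, replace = TRUE), ncol = 2)
x <- matrix(sample(1:50, 4000, replace = TRUE), ncol = 2)

# Candidate 1: collapse rows to strings, then one set operation.
by_paste <- function(a, x) {
  apply(a, 1, paste, collapse = "$$") %in% apply(x, 1, paste, collapse = "$$")
}

# Candidate 2: compare each row of a against all rows of x.
by_rowwise <- function(a, x) {
  apply(a, 1, function(row) any(colSums(t(x) == row) == ncol(x)))
}

t_paste <- system.time(res_paste <- by_paste(a, x))
t_row   <- system.time(res_row <- by_rowwise(a, x))

# Check that both methods agree before comparing their speed.
same <- identical(res_paste, res_row)
elapsed <- c(paste = t_paste[["elapsed"]], rowwise = t_row[["elapsed"]])
same
elapsed
```

Elapsed times depend on the machine and on the matrix sizes, so results from a run like this are only indicative.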
So apparently constructing character strings and doing a single set operation is fastest!
(PS I checked and all 3 algorithms give the same result)