Finding repeated patterns in R
Say I have a 5 x 100 matrix of numbers between 0 and 100, for instance:
1 5 10 15 3
2 15 3 8 27
1 22 34 45 35
28 27 32 3 8
......
I would like to find repeated "patterns" of numbers (mainly couples or triplets).
So in my example I would have the couple 3,15 appearing twice and the triplet 3, 8, 27 also appearing twice (I don't care about the order).
How would you implement that in R?
I would like to have couples and triplets separately and have their count.
thanks
nico
Comments (1)
Here is one way. For each row of your 100-row matrix, you find all pairs/triples of numbers (using `combn`) and do a frequency count (using `table`) of the pairs/triples. The `pasteSort` function I defined creates a string out of a vector after sorting it. We apply this function to each pair/triple in each row, and collect all pairs/triples from the matrix before doing the frequency count. Note that if a pair repeats on the same row, it's counted as a "repeat".
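The code that went with this answer is not reproduced here, so what follows is only a minimal sketch of the approach described, not necessarily the original implementation. The matrix `mtx` holds just the four rows shown in the question, the body of `pasteSort` is a guess at a function matching the description, and `countCombos` is a hypothetical helper introduced for illustration.

```r
# Example matrix built from the four rows shown in the question
# (the real data would have 100 rows).
mtx <- matrix(c( 1,  5, 10, 15,  3,
                 2, 15,  3,  8, 27,
                 1, 22, 34, 45, 35,
                28, 27, 32,  3,  8),
              ncol = 5, byrow = TRUE)

# Assumed definition: sort a vector and collapse it into one string, so the
# order inside a pair/triple does not matter (c(15, 3) and c(3, 15) both
# become "3 15").
pasteSort <- function(x) paste(sort(x), collapse = " ")

# Hypothetical helper: tabulate every size-k combination over all rows.
countCombos <- function(m, k) {
  combos <- unlist(lapply(seq_len(nrow(m)), function(i) {
    # combn() returns one combination per column; turn each into a key.
    apply(combn(m[i, ], k), 2, pasteSort)
  }))
  table(combos)
}

pairFreqs   <- countCombos(mtx, 2)
tripleFreqs <- countCombos(mtx, 3)

# Keep only the combinations that occur more than once.
repeatedPairs   <- pairFreqs[pairFreqs > 1]
repeatedTriples <- tripleFreqs[tripleFreqs > 1]

print(repeatedPairs)
print(repeatedTriples)
```

On these four rows the pair 3 15 and the triple 3 8 27 each get a count of 2. The pairs 3 8, 3 27 and 8 27 are also reported twice, because every repeated triple contributes its three sub-pairs to the pair counts. Pairs and triples stay in separate tables, as the question asks.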