计算较大矩阵内矩阵出现次数的算法
我现在面临一个问题,我需要计算某个 MxM 矩阵出现在 NxN 矩阵中的次数(这个矩阵应该比第一个矩阵大)。关于如何执行此操作有任何提示吗?我将用 C 语言实现它,并且没有选项可以更改它。
修订版1
大家好,我真的很感谢所有关于此事的答案和意见。我应该告诉你,经过几个小时的努力,我们得出了一个解决方案,它并不严格类似于博耶-摩尔方法,而是我自己的算法。我计划在测试和完成后发布它。这些解决方案现在正在使用大学集群和 C 库 MPI 进行并行化,以实现速度优化。
i'm facing a problem right now, i need to count the number of time a certain MxM matrix appears inside a NxN one (this one should be larger than the first one). Any hints on how to do this? I'm going to implement it in C and there is no option to change it.
Revision 1
Hi everyone, i would really like to say thank to all the answers and opinions on the matter. I should tell you that after many hours of hard work we have come to a solutions that is not strictly like the Boyer-Moore approach, but rather an algorithm on my own. I'm planning on publishing it once is tested and finished. The solutions is now being adapted to be paralelized for speed optimization using the university cluster with the C Library MPI.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
嗯,听起来像是字符串匹配的二维版本。我想知道Boyer-Moore是否有2D版本?
二维匹配的 Boyer-Moore 方法
啊,显然有。 :-)
Hm, sounds like a 2-dimensional version of string matching. I wonder if there's a 2D version of Boyer-Moore?
A Boyer-Moore Approach for Two-Dimensional Matching
Ah, apparently there is. :-)
立即想到的一种方法:
将大矩阵视为 N*N“字符”的线性字符串,将小矩阵视为长度为 (M+1)*M 的线性正则表达式,前 M 为“文字字符”每个“行”的位置以及每行剩余位置中的
.{NM}
的等价物。将正则表达式编译为 DFA 并运行它。找到所有匹配项的时间是 O(N*N)。我怀疑有更高级的算法可以工作(更高级的一维子串搜索算法的改编),但这个还不错。
One approach that immediately came to mind:
Treat the big matrix as a linear string of N*N "characters" and the small matrix as a linear regular expression of length (M+1)*M with "literal characters" in the first M positions of each "row" and the equivalent of
.{N-M}
in the remaining position of each row. Compile the regular expression to a DFA and run it.Time to find all matches is O(N*N). I suspect there are fancier algorithms that would work (adaptations of fancier 1-d substring search algorithms) but this one's not too bad.
最近,已经开发出用于构建二维后缀树的快速算法。这些可用于及时查找较大矩阵中的任何方形子矩阵,与较大矩阵的大小无关:
您可能需要成为订阅者可以查看这些论文。
我还发现了这个免费提供的算法,它描述了一种预期线性时间的随机构造算法:
我希望构建一个如果您需要在单个矩阵中搜索许多不同的子矩阵,2D 后缀树和搜索会更快,但如果您每次在不同的矩阵中搜索,它可能比 2D Boyer-Moore 慢。
Recently, fast algorithms have been developed for building 2D suffix trees. These can be used to find any square submatrix in a larger matrix in time independent of the size of the larger matrix:
You might need to be a subscriber to see those papers.
I also found this freely available one, which describes a randomised construction algorithm that is expected linear-time:
I expect that constructing a 2D suffix tree and searching with it would be faster if you need to search for many different submatrices in a single matrix, but it's probably slower than 2D Boyer-Moore if you'll be searching inside a different matrix each time.