Equality of m * n matrices

Posted on 2024-07-21 09:17:37

What is the most efficient way to check two m * n matrices for equality and, more importantly, to find the [i][j] index (or indices) that caused the two matrices to be unequal?

In my case, 'm' is relatively small (<=4) and n is relatively large (<=512).

Context of the problem: I have an active/standby setup for my application. Whenever an event causes a state change, the active sends an update to the standby. However, we have observed that the standby is sometimes out of sync with the active even though the active has sent all updates. We are therefore planning to run an audit on the synced data structure. The audit will calculate a checksum on the active and send it to the standby. The standby will do the same and return a NAK if the checksums do not match. The active will then sync the standby again. My problem is that I want the standby to return the [i][j] position that caused the checksum mismatch.

Edit: the language is C.

Comments (4)

霊感 2024-07-28 09:17:37

While it's not much use when one dimension dwarfs the other (m >> n or n >> m), if m ~ n you can checksum all rows and columns individually, giving you a total of m + n checksums to transmit. By doing this, you know that when the ith row checksum and the jth column checksum both fail to match their counterparts, entry A_ij of the matrix is the likely culprit. But there could be other problems, depending on how robust your checksums are and how often they allow false negatives (a changed row or column that still produces the same checksum).

For your case, sending 516 separate checksums is not a significant win over sending the whole matrix of 2048 entries, and so implementing this is probably just wasting your time with premature optimization. But for a 512×512 matrix, sending 1024 checksums is much nicer than sending 262,144 entries.
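
A minimal sketch of the row/column idea, in case it helps. Everything named here is an assumption for illustration: the cells are taken to be 32-bit values stored row-major in a flat array, FNV-1a stands in for whatever checksum the audit really uses, and `line_checksums` / `diff_indices` are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mix one 32-bit cell into an FNV-1a hash, low byte first.
 * FNV-1a is only a stand-in for the audit's real checksum. */
static uint32_t fnv1a_step(uint32_t h, uint32_t cell)
{
    for (int k = 0; k < 4; k++) {
        h ^= (cell >> (8 * k)) & 0xffu;
        h *= 16777619u;              /* FNV-1a 32-bit prime */
    }
    return h;
}

/* Fill row_sum[0..m-1] and col_sum[0..n-1] for a row-major m x n matrix. */
void line_checksums(const uint32_t *a, size_t m, size_t n,
                    uint32_t *row_sum, uint32_t *col_sum)
{
    assert(a != NULL && row_sum != NULL && col_sum != NULL);
    for (size_t i = 0; i < m; i++)
        row_sum[i] = 2166136261u;    /* FNV-1a 32-bit offset basis */
    for (size_t j = 0; j < n; j++)
        col_sum[j] = 2166136261u;
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            uint32_t v = a[i * n + j];
            row_sum[i] = fnv1a_step(row_sum[i], v);
            col_sum[j] = fnv1a_step(col_sum[j], v);
        }
    }
}

/* Store the positions where x and y differ in idx (room for len entries)
 * and return how many there are. */
size_t diff_indices(const uint32_t *x, const uint32_t *y, size_t len,
                    size_t *idx)
{
    size_t count = 0;

    for (size_t k = 0; k < len; k++) {
        if (x[k] != y[k])
            idx[count++] = k;
    }
    return count;
}
```

The active would send its m + n checksums, and the standby would run `diff_indices` once on the row checksums and once on the column checksums. A single bad cell then shows up as exactly one differing row and one differing column.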

别想她 2024-07-28 09:17:37

Since you have no idea where the matrices mismatch, you'll have to compare them element by element. Just iterate over the matrices and compare.

You have to watch out for cache-miss penalties: scan the matrices in an order that doesn't cause unnecessary cache line reloads. This is language dependent. In C, for example, arrays are stored row-major, so the outer loop should iterate over the first index and the inner loop over the second.
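
Something along these lines, as a rough sketch rather than a tuned implementation. The 32-bit cell type, the flat row-major layout, and the names `struct mismatch` and `find_mismatches` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one mismatching position. */
struct mismatch {
    size_t i;   /* row index */
    size_t j;   /* column index */
};

/*
 * Compare two row-major m x n matrices element by element and store up to
 * `cap` mismatching positions in `out`. Returns the total number of
 * mismatches found, which may be larger than `cap`.
 */
size_t find_mismatches(const uint32_t *a, const uint32_t *b,
                       size_t m, size_t n,
                       struct mismatch *out, size_t cap)
{
    size_t count = 0;

    assert(a != NULL && b != NULL);
    assert(out != NULL || cap == 0);
    for (size_t i = 0; i < m; i++) {        /* outer loop: first index */
        for (size_t j = 0; j < n; j++) {    /* inner loop: contiguous memory */
            if (a[i * n + j] != b[i * n + j]) {
                if (count < cap) {
                    out[count].i = i;
                    out[count].j = j;
                }
                count++;
            }
        }
    }
    return count;
}
```

A real two-dimensional array such as `uint32_t x[4][512]` can be passed as `&x[0][0]`. If the matrices are known to be identical most of the time, a single `memcmp` over the whole buffer first is a cheap way to skip the loop in the common case.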

走野 2024-07-28 09:17:37

As sharptooth stated, checksums mostly cannot be reversed. If you can hash only parts of the matrix, you can do a sort of binary search, where you eliminate half of the remaining range at every iteration. This can work even when there's more than one mismatched element: you just have to check both halves.
Also, your matrix has about 2000 cells, which is in fact very small. Comparing them should therefore be quick. If every object contains a lot of data, you can hash every object (so you have 2000 hashes, which should be much smaller than your objects) and compare the matrices of hashes. Then you'll know for sure where the problem is.
Again, keep in mind that calculating a checksum means going over the whole matrix, so the best way to compare them is probably one by one, as suggested.
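
Here is a rough single-process sketch of that binary search. It assumes both copies are visible as flat arrays of 32-bit cells, which is not true across two servers: there, each hash comparison would be one request/response between active and standby. FNV-1a is only a stand-in hash, and `hash_range`, `bisect` and `locate_by_bisection` are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over cells [lo, hi) of a flat array; a stand-in for the real hash. */
static uint32_t hash_range(const uint32_t *a, size_t lo, size_t hi)
{
    uint32_t h = 2166136261u;            /* FNV-1a 32-bit offset basis */

    for (size_t c = lo; c < hi; c++) {
        for (int k = 0; k < 4; k++) {
            h ^= (a[c] >> (8 * k)) & 0xffu;
            h *= 16777619u;              /* FNV-1a 32-bit prime */
        }
    }
    return h;
}

/* Recurse into every half whose hashes differ; record single-cell ranges. */
static size_t bisect(const uint32_t *a, const uint32_t *b,
                     size_t lo, size_t hi,
                     size_t *out, size_t cap, size_t count)
{
    if (lo >= hi || hash_range(a, lo, hi) == hash_range(b, lo, hi))
        return count;                    /* range agrees (or hashes collide) */
    if (hi - lo == 1) {
        if (count < cap)
            out[count] = lo;             /* flat index of the bad cell */
        return count + 1;
    }
    size_t mid = lo + (hi - lo) / 2;
    count = bisect(a, b, lo, mid, out, cap, count);   /* check both halves */
    return bisect(a, b, mid, hi, out, cap, count);
}

/* Store up to cap flat indices of differing cells; return how many differ. */
size_t locate_by_bisection(const uint32_t *local, const uint32_t *remote,
                           size_t len, size_t *out, size_t cap)
{
    assert(local != NULL && remote != NULL);
    assert(out != NULL || cap == 0);
    return bisect(local, remote, 0, len, out, cap, 0);
}
```

For an m x n matrix, a flat index `f` maps back to `i = f / n` and `j = f % n`. Each level rehashes its whole range, so this does more work than a plain element-wise compare; it only pays off when transferring the data is the expensive part.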

空名 2024-07-28 09:17:37

Information theory tells us that you can't get something for nothing here. If there are m * n cells and each of them contains k bits of information (e.g. 16-bit integers), then the possibility space of your matrix occupies m * n * k bits.

If you want to be able to send one single "message" and handle every case from "they are in sync" to "every cell is different in a unique and strange way", then the laws of nature require you to make this message m * n * k bits long. If you use m * n * k - 1 bits, I will be able to construct two situations that you cannot distinguish. In fact, half of your state space will become indistinguishable.
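
To put numbers on the counting argument (a worked example, not a new result): the values m = 4 and n = 512 are the upper bounds from the question, and k = 16 is just the example cell width mentioned above. The `\text` command assumes amsmath.

```latex
% Pigeonhole: more possible matrices than possible shorter messages,
% so at least two matrices must map to the same message.
\[
  \underbrace{2^{mnk}}_{\text{possible matrices}}
  \;>\;
  \underbrace{2^{mnk-1}}_{\text{messages of } mnk-1 \text{ bits}}
\]
% Worked size for the question's bounds:
\[
  m = 4,\quad n = 512,\quad k = 16
  \;\Longrightarrow\;
  mnk = 4 \cdot 512 \cdot 16 = 32768 \text{ bits} = 4096 \text{ bytes}
\]
```

So under these assumptions a message that can describe every possible state is 4096 bytes, which is the same as sending the matrix itself.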

Now, if you describe your requirements further, we can cut some possibility space. What you can get on the cheap, for example, is the ability to recognize 1 cell out of sync, as has been described by others. Keep in mind that an algorithm designed to locate 1 diff will completely fail if there are 2 diffs. For example, it will tell you that cell A is out of sync when it's really cells B and C.
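
A small sketch of how a locate-one scheme degrades with two diffs, using the row/column checksum idea mentioned by others. XOR is chosen here as a deliberately weak stand-in checksum so the effect is easy to reason about; `count_suspects`, the 32-bit cells and the flat row-major layout are all assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the cells a row/column-checksum audit would flag as suspect:
 * every (row, column) pair where both the row checksum and the column
 * checksum differ between a and b. The checksum is a plain XOR, so a
 * row's checksums differ exactly when the XOR of (a ^ b) over that row
 * is nonzero.
 */
size_t count_suspects(const uint32_t *a, const uint32_t *b,
                      size_t m, size_t n)
{
    size_t bad_rows = 0, bad_cols = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < m; i++) {
        uint32_t x = 0;
        for (size_t j = 0; j < n; j++)
            x ^= a[i * n + j] ^ b[i * n + j];
        if (x != 0)
            bad_rows++;
    }
    for (size_t j = 0; j < n; j++) {
        uint32_t x = 0;
        for (size_t i = 0; i < m; i++)
            x ^= a[i * n + j] ^ b[i * n + j];
        if (x != 0)
            bad_cols++;
    }
    return bad_rows * bad_cols;
}
```

With one differing cell this returns 1, as intended. With diffs at [0][1] and [1][2] it returns 4, because [0][2] and [1][1] get flagged as well even though they are fine. With two equal changes in the same row, the row XOR cancels out and the function returns 0, so the mismatch is missed entirely.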
