是否有一种有效的行列交换数据结构?
我有一个数字矩阵,我希望能够:
- 交换行
- 交换列
如果我要使用指向行的指针数组,那么我可以在 O(1) 中轻松地在行之间切换,但交换列是 O (N) 其中 N 是行数。
我有一种明显的感觉,没有一个双赢的数据结构可以为这两个操作提供 O(1),尽管我不确定如何证明它。还是我错了?
I have a matrix of numbers and I'd like to be able to:
- Swap rows
- Swap columns
If I were to use an array of pointers to rows, then I can easily switch between rows in O(1) but swapping a column is O(N) where N is the amount of rows.
I have a distinct feeling there isn't a win-win data structure that gives O(1) for both operations, though I'm not sure how to prove it. Or am I wrong?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
没有完全思考这一点:
我认为你关于行指针的想法是正确的开始。然后,为了能够“交换”列,我只需拥有另一个具有列数大小的数组,并在每个字段中存储该列当前物理位置的索引。
现在要交换第 1 列和第 2 列,只需将 c 更改为 {0,2,1}
当您想读取第 1 行时,您可以这样做
Without having thought this entirely through:
I think your idea with the pointers to rows is the right start. Then, to be able to "swap" the column I'd just have another array with the size of number of columns and store in each field the index of the current physical position of the column.
Now to exchange column 1 and 2, you would just change c to {0,2,1}
When you then want to read row 1 you'd do
不过这里只是随机的(没有经验知道这到底有多有效,而且这是一个没有咖啡的深夜):
我的想法是矩阵的内部是一个哈希表而不是一个数组。
数组中的每个单元格都包含三部分信息:
在我看来,这可以很容易地用元组
((i, j) 表示, v)
,其中(i, j)
表示单元格的位置(第 i 行,第 j 列),而 v是 a 的正常表示矩阵。但让我们在这里引出一些想法。让我们将
i
视为某种规范标识符,而不是将i
表示为一个位置(即 0 在 1 在 2 在 3 之前等)。排。让我们对j
执行相同的操作。 (虽然在最一般的情况下,i
和j
可以不受限制,但让我们假设一个简单的情况,它们将保持在范围 [0..M] 和 [ 0..N] 对于 M x N 矩阵,但不表示单元格的实际坐标)。现在,我们需要一种方法来跟踪行的标识符以及与该行关联的当前索引。这显然需要一个键/值数据结构,但由于索引的数量是固定的(矩阵通常不会增长/收缩),并且只处理整数索引,我们可以将其实现为固定的一维数组。对于 M 行的矩阵,我们可以(在 C 中):
对于第 m 行,RowMap[m] 给出当前矩阵中的行的标识符。
我们将对列使用相同的内容:
其中 ColumnMap[n] 是第 n 列的标识符。
现在回到我在开头提到的哈希表:
由于我们拥有完整的信息(矩阵的大小),因此我们应该能够生成完美的哈希函数(无冲突)。这是一种可能性(对于适度大小的数组):
如果这是哈希表的哈希函数,那么对于大多数大小的数组,我们应该获得零冲突。这允许我们在 O(1) 时间内从哈希表中读取/写入数据。
最酷的部分是将每行/列的索引与哈希表中的标识符连接起来:
现在,由于矩阵的接口仅使用这些句柄,而不使用内部标识符,因此可以进行交换操作行或列对应于
RowMap
或ColumnMap
数组中的简单更改:所以应该是这样 - 具有 O(1) 访问和变异以及 O(1) 访问和变异的矩阵(1) 行交换和O(1) 列交换。当然,即使是 O(1) 的哈希访问也会比基于数组的 O(1) 访问慢,并且会使用更多的内存,但至少行/列之间是相等的。
当涉及到如何实现矩阵时,我试图尽可能保持不可知论,所以我写了一些 C 语言。如果你更喜欢另一种语言,我可以更改它(如果你理解的话那就最好了),但我认为它非常具有自我描述性,尽管我不能确保它在 C 方面的正确性,因为我实际上是一个 C++ 人员,现在正试图表现得像一个 C 人员(我是否提到过我没有咖啡?)。就我个人而言,用完整的面向对象语言编写将使条目设计更加公正,并且还使代码更美观,但正如我所说,这是一个快速完成的实现。
Just a random though here (no experience of how well this really works, and it's a late night without coffee):
What I'm thinking is for the internals of the matrix to be a hashtable as opposed to an array.
Every cell within the array has three pieces of information:
In my mind, this is readily represented by the tuple
((i, j), v)
, where(i, j)
denotes the position of the cell (i-th row, j-th column), and vThe would be a somewhat normal representation of a matrix. But let's astract the ideas here. Rather than
i
denoting the row as a position (i.e. 0 before 1 before 2 before 3 etc.), let's just consideri
to be some sort of canonical identifier for it's corresponding row. Let's do the same forj
. (While in the most general case,i
andj
could then be unrestricted, let's assume a simple case where they will remain within the ranges [0..M] and [0..N] for an M x N matrix, but don't denote the actual coordinates of a cell).Now, we need a way to keep track of the identifier for a row, and the current index associated with the row. This clearly requires a key/value data structure, but since the number of indices is fixed (matrices don't usually grow/shrink), and only deals with integral indices, we can implement this as a fixed, one-dimensional array. For a matrix of M rows, we can have (in C):
For the m-th row,
RowMap[m]
gives the identifier of the row in the current matrix.We'll use the same thing for columns:
where
ColumnMap[n]
is the identifier of the n-th column.Now to get back to the hashtable I mentioned at the beginning:
Since we have complete information (the size of the matrix), we should be able to generate a perfect hashing function (without collision). Here's one possibility (for modestly-sized arrays):
If this is the hash function for the hashtable, we should get zero collisions for most sizes of arrays. This allows us to read/write data from the hashtable in O(1) time.
The cool part is interfacing the index of each row/column with the identifiers in the hashtable:
Now, since the interface to the matrix only uses these handles, and not the internal identifiers, a
swap
operation of either rows or columns corresponds to a simple change in theRowMap
orColumnMap
array:So that should be it - a matrix with O(1) access and mutation, as well as O(1) row swapping and O(1) column swapping. Of course, even an O(1) hash access will be slower than the O(1) of array-based access, and more memory will be used, but at least there is equality between rows/columns.
I tried to be as agnostic as possible when it comes to exactly how you implement your matrix, so I wrote some C. If you'd prefer another language, I can change it (it would be best if you understood), but I think it's pretty self descriptive, though I can't ensure it's correctedness as far as C goes, since I'm actually a C++ guys trying to act like a C guy right now (and did I mention I don't have coffee?). Personally, writing in a full OO language would do it the entrie design more justice, and also give the code some beauty, but like I said, this was a quickly whipped up implementation.