An efficient way to find all the zeros in a matrix?
I am trying to think of an efficient algorithm to count the zeros in a matrix, but I can only come up with an O(n^2) solution (i.e. iterating over every row and column). Is there a more efficient way to count the zeros?
For example, given the matrix
    3, 4, 5, 6
    7, 8, 0, 9
    10, 11, 12, 3
    4, 0, 9, 10
I would report that there are two zeros.
Answers (6)
Without storing any extra information, no, you can't do any better than Θ(N^2). The rationale is simple: if you don't look at all N^2 locations in the matrix, you can't guarantee that you've found all of the zeros, and you might end up giving the wrong answer. For example, if I know that you look at fewer than N^2 locations, I can run your algorithm on a matrix and see how many zeros you report. I could then pick a location that you didn't access, flip it (put a zero there if it held a nonzero value, or a nonzero value if it held a zero), and run your algorithm again. Since your algorithm never looks at that location, it reports the same count both times even though the true count has changed, so at least one of the two runs gives the wrong answer.
More generally, when designing algorithms to process data, a good way to see whether you can do better than a certain runtime is to use this sort of "adversarial analysis." Ask yourself the question: if I run faster than some time O(f(n)), could an adversary manipulate the data in a way that changes the answer without my being able to detect it? This is the sort of analysis that, along with some more clever math, proves that comparison-based sorting algorithms cannot do any better than Ω(n log n) in the average case.
If the matrix has some other properties (for example, if it's sorted), then you might be able to do better than O(N^2). As an example, suppose that you know that all rows of the matrix are sorted. Then you can do a binary search on each row to determine how many zeros it contains, which takes O(N log N) time in total and is faster.
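The sorted-rows case can be sketched as follows. This is a minimal illustration, assuming every row is sorted in ascending order; the function name `count_zeros_sorted_rows` and the sample matrix are made up for the example.

```python
from bisect import bisect_left, bisect_right

def count_zeros_sorted_rows(matrix):
    """Count zeros in a matrix whose rows are each sorted ascending (assumption)."""
    total = 0
    for row in matrix:
        # In a sorted row the zeros form one contiguous run [lo, hi).
        lo = bisect_left(row, 0)   # first index whose value is >= 0
        hi = bisect_right(row, 0)  # first index whose value is > 0
        total += hi - lo
    return total

# Hypothetical sample data: each row is sorted, rows are independent.
sorted_matrix = [
    [-2, 0, 0, 5],
    [0, 1, 2, 3],
    [1, 2, 3, 4],
    [-3, -1, 0, 7],
]
print(count_zeros_sorted_rows(sorted_matrix))
```

Each row costs two binary searches, so N rows of length N cost O(N log N) comparisons overall.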
Depending on the parameters of your setup, you might be able to get the algorithm to run faster if you assume that you're allowed to scan in parallel. For example, if your machine has K processors that can be dedicated to the task of scanning the matrix, then you could split the matrix into K roughly evenly-sized groups, have each processor count the number of zeros in its group, then sum the results of these computations. This ends up giving you a runtime of Θ(N^2 / K), since the work is split across multiple cores.
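As a rough sketch of the split-and-sum idea, the snippet below partitions the rows into K groups and counts each group in a separate worker. The names `count_zeros_in_rows` and `count_zeros_parallel` are hypothetical, and a thread pool is used only to keep the example short; in CPython, threads will not give a real K-fold speedup for pure-Python counting, so treat this as an illustration of the partitioning rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor

def count_zeros_in_rows(rows):
    """Count zeros in one group of rows (the work given to one worker)."""
    return sum(row.count(0) for row in rows)

def count_zeros_parallel(matrix, k=4):
    """Split the rows into about k contiguous groups, count each, then sum."""
    chunk = max(1, -(-len(matrix) // k))  # ceiling division, at least 1
    groups = [matrix[i:i + chunk] for i in range(0, len(matrix), chunk)]
    with ThreadPoolExecutor(max_workers=k) as pool:
        return sum(pool.map(count_zeros_in_rows, groups))

matrix = [
    [3, 4, 5, 6],
    [7, 8, 0, 9],
    [10, 11, 12, 3],
    [4, 0, 9, 10],
]
print(count_zeros_parallel(matrix, k=2))
```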
It is always O(n^2), or rather O(n × m) for an n-by-m matrix. You cannot get around it.
But if you know that the matrix is sparse (only a few elements have nonzero values), you can store only the nonzero values and the matrix size. In that case, consider using hashing instead of storing the whole matrix: generally, create a hash that maps a row number to a nested hash.
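A minimal sketch of such a hash-of-hashes representation, assuming Python dictionaries as the hash type. The helper names `to_sparse` and `count_zeros_sparse` and the sample matrix are illustrative, not taken from the answer.

```python
def to_sparse(matrix):
    """Map row index -> {column index -> value}, storing nonzero entries only."""
    sparse = {}
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                sparse.setdefault(i, {})[j] = value
    return sparse

def count_zeros_sparse(sparse, n_rows, n_cols):
    """Zeros = total cells minus stored (nonzero) cells."""
    nonzero = sum(len(cols) for cols in sparse.values())
    return n_rows * n_cols - nonzero

dense = [
    [0, 0, 5],
    [0, 0, 0],
    [7, 0, 0],
]
sparse = to_sparse(dense)  # {0: {2: 5}, 2: {0: 7}}
print(count_zeros_sparse(sparse, n_rows=3, n_cols=3))
```

Building the sparse form from a dense matrix still reads every cell; the saving only appears when the matrix is stored this way from the start, in which case counting costs time proportional to the number of nonempty rows.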
For any unsorted matrix it should be O(n), since we generally use 'n' to denote the total number of elements.
If the matrix contains X rows and Y columns, then X × Y = n.
E.g. a 4 × 4 unsorted matrix has 16 elements in total, so iterating over it with two nested loops takes 4 × 4 = 16 steps. That is O(n), because the array contains 16 elements in total.
Many people voted for O(n^2) because they took the matrix to be n × n.
Please correct me if my understanding is wrong.
Assuming that when you say "in a row of a matrix", you mean that you have the row index `i` and you want to count the number of zeros in the `i`-th row, you can do better than O(N^2).
Suppose `N` is the number of rows and `M` is the number of columns. Store your matrix as a single array `[3,4,5,6,7,8,0,9,10,11,12,3,4,0,9,10]`; then, to access row `i`, you access the array at index `M*i`.
Since arrays have constant-time access, this part doesn't depend on the size of the matrix. You can then iterate over the whole row by visiting the element `M*i + j` for `j` from `0` to `M-1`. This is O(M) (the same as O(N) for a square matrix), provided you know which row you want to visit and you are using an array.
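The row lookup in a flattened, row-major array can be sketched like this. The function name `count_zeros_in_row` is made up for the example, and `n_cols` plays the role of the number of columns.

```python
def count_zeros_in_row(flat, n_cols, i):
    """Count zeros in row i of a row-major flattened matrix with n_cols columns."""
    start = n_cols * i  # index of the first element of row i
    return sum(1 for j in range(n_cols) if flat[start + j] == 0)

# The 4 x 4 matrix from the question, stored row by row.
flat = [3, 4, 5, 6, 7, 8, 0, 9, 10, 11, 12, 3, 4, 0, 9, 10]
print(count_zeros_in_row(flat, n_cols=4, i=1))
```

Only the cells of one row are visited, so the cost depends on the row length and not on the number of rows.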
Assuming the given matrix is `M`, do an `M + (-M)` operation, but do not use the default `+`; use a custom `my_add(int a, int b)` instead, which gives you a new matrix.
Now create `s := 0` and keep adding all elements to `s`: `s += a[i][j]`.
You can even do both in one pass: `s += my_add(a[i][j], (-1)*a[i][j])`.
But it is still `O(m*n)`.
NOTE
To count the number of 1's you generally check all items in the matrix. Without operating on all elements, I don't think you can tell the number of 1's, and looping over all elements takes `(m*n)` steps. It can be faster than `(m*n)` if and only if you can leave some elements unchecked and still tell the number of 1's.
EDIT
However, if you move a `2x2` kernel over the matrix and hop, you will get `(m*n)/k` iterations, e.g. if you operate on the neighboring elements `a[i][j], a[i+1][j], a[i][j+1], a[i+1][j+1]` while `i < m` and `j < n`.
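The definition of `my_add` is not shown in this answer. One reading that fits the note about counting 1's is that `my_add` returns 1 when both operands are zero and 0 otherwise, so that `my_add(x, -x)` marks exactly the zero entries. The sketch below is written under that assumption only; it is a guess at the intent, not the answer's actual code.

```python
def my_add(a, b):
    """Hypothetical stand-in for the answer's custom '+': 1 if both operands are 0."""
    return 1 if a == 0 and b == 0 else 0

def count_zeros_with_my_add(a):
    """Sum my_add(x, -x) over every element in one pass; still O(m*n)."""
    s = 0
    for row in a:
        for x in row:
            s += my_add(x, -x)
    return s

matrix = [
    [3, 4, 5, 6],
    [7, 8, 0, 9],
    [10, 11, 12, 3],
    [4, 0, 9, 10],
]
print(count_zeros_with_my_add(matrix))
```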
This is not a perfect answer for the reasons I'll explain, but it offers an alternative solution potentially faster than the one you described:
Since you don't need to know the position of the zeros in the matrix, you can flatten it into a 1D array.
After that, perform a quicksort on the elements; this may give a running time of O(n log n), depending on the randomness of the matrix you feed in.
Finally, count the zero elements at the beginning of the array until you reach a non-zero number (this assumes the matrix has no negative entries).
In some cases this may work well enough, but keep in mind that sorting still has to read every element, and in a worst-case scenario quicksort takes O(n^2), which, together with the zero counting at the end, may be worse than iterating over each row and column.
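The flatten, sort, and count steps can be sketched as below. Two assumptions to flag: the matrix is taken to contain no negative numbers, so that zeros end up at the front after sorting, and Python's built-in `sorted` (which is not quicksort) stands in for the sort step. The name `count_zeros_by_sorting` is made up for the example.

```python
from itertools import chain, takewhile

def count_zeros_by_sorting(matrix):
    """Flatten, sort ascending, then count the leading zeros.

    Assumes no negative entries; otherwise the zeros are not at the front.
    """
    flat = sorted(chain.from_iterable(matrix))  # built-in sort, not quicksort
    return sum(1 for _ in takewhile(lambda x: x == 0, flat))

matrix = [
    [3, 4, 5, 6],
    [7, 8, 0, 9],
    [10, 11, 12, 3],
    [4, 0, 9, 10],
]
print(count_zeros_by_sorting(matrix))
```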