数组中最常见的值
我将如何找到数组中三个最常见的元素?我正在处理长度为 10,000 的数组,其元素 = 0-100 之间的随机整数。
我正在考虑使用两个数组,其中一个长度为 100,并且通过使用 if 语句来递增。但是,我想知道是否有一种方法可以仅使用一个 for/if 循环(语句)来查找这些值。
How would I go about finding the three most common elements in an array? I am working with an array of length 10,000 with elements = random integer from 0-100.
I was thinking of using two arrays, one of length 100 and just incrementing by using an if statement. However, I was wondering if there is a way that only one for/if loop(statement) could be used to find these values.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果您要以恒定的次数遍历列表来执行此操作,则需要第二个数据结构。
如果该集合中的值有下限和上限,并且值相对密集,那么计数器数组是一个很好的解决方案。
否则,最好使用 Map,其中键是集合的元素,值是计数器。
分析
如果在开始之前您没有该集合的下限/上限,那么您不知道要分配的计数器数组。因此,您必须对数组进行初步遍历以找到边界……现在您有了一个两遍解决方案。
如果确实有下限和上限,但集合很稀疏,则初始化计数数组的成本 + 查找三个最大计数的成本将主导对集合元素进行计数的成本。如果差异足够大(即输入很大且非常稀疏),HashMap 将会更快并且占用更少的内存。
或者
如果允许更改数组,则可以将其按升序排序
O(NlogN)
,然后在排序后的单次遍历中找到三个最常见的元素大批。If you are going to do this in a constant number of passes over the list, you need a second data structure.
If you have lower and upper bounds for the values in that set and the values are relatively dense, then an array of counters is a good solution.
Otherwise, it is better to use a
Map<Integer, Integer>
, where the keys are elements of the set and the values are counters.Analysis
If you don't have lower / upper bounds on the set before you start, then you don't know big an array of counters to allocate. So you have to make a preliminary pass over the array to find the bounds ... and you now have a two pass solution.
If you do have lower and upper bounds but the set is sparse, then the cost of initializing the array of counts + the cost of finding the three largest counts will dominate the cost of counting the set elements. If the difference is large enough (i.e. the input is large & very sparse) a HashMap will be faster and will take less memory.
Alternatively
If you are allowed to change the array, you can sort it into ascending order
O(NlogN)
and then find the three most common elements in a single pass over the sorted array.您可以在一个循环中完成它,但我认为您仍然需要第二个数组。
即循环输入数组,每次看到一个值时,都会增加“计数器”数组中的适当索引。但是,还要保留 3 个“顶级”索引(已排序)。每次增加时,请根据前 3 个索引的值检查新值,考虑到您可能只是简单地重新排序“顶部”值列表这一事实。
You can do it in one loop, but I think you still need that second array.
I.e. loop over your input array, and each time you see a value, you increment the appropriate index in your 'counter' array. But, also keep 3 'top' indexes (sorted). Each time you increment, check your new value against the value at the top 3 indexes, accounting for the fact that you may be dealing with simply re-ordering your list of 'top' values.
可能有更好的方法可以做到这一点,但这是一种方法。我刚刚打印了模式数组,但您可以对其进行排序以查看实际出现次数最多的数字。这很简单,因为我们知道我们正在处理的数字的上限和下限,但如果您不知道这些界限,那么您需要接受 Stephen C 给出的建议。
There are probably better ways to do this, but this is a way. I just printed the mode array, but you can sort it to see what number actually occurred the most. This is simple because we know the upper and lower bounds of the numbers we are messing with, but if you don't know those bounds then you need to take the advice Stephen C gave.
抱歉,这是一个蹩脚的解决方案,但它按照你的要求工作,希望它有帮助
sorry, its a crappy solution, but it works like you asked, hope it helps