我可以在哪里使用多数投票算法的技术

发布于 2024-11-09 11:35:35 字数 264 浏览 4 评论 0 原文

线性时间多数算法?的答案所示,可以以线性方式计算元素数组的多数时间和 log(n) 空间。

结果表明,每个看到该算法的人都认为这是一项很酷的技术。但这个想法是否可以推广到新算法?

该算法的隐藏功能似乎在于保留一个扮演复杂角色的计数器 - 例如“(到目前为止多数元素的计数)-(到目前为止第二多数元素的计数)”。是否有基于相同思想的其他算法?

As seen in the answers to Linear time majority algorithm?, it is possible to compute the majority of an array of elements in linear time and log(n) space.

It was shown that everyone who sees this algorithm believes that it is a cool technique. But does the idea generalize to new algorithms?

It seems the hidden power of this algorithm is in keeping a counter that plays a complex role -- such as "(count of majority element so far) - (count of second majority so far)". Are there other algorithms based on the same idea?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

祁梦 2024-11-16 11:35:35

嗯,让我们首先开始了解该算法为何有效,以便“隔离”其中的想法。

该算法的要点是,如果您有一个多数元素,那么您可以将它的每次出现与“另一个”元素相匹配,然后您就有更多的“备用”元素。

因此,我们只有一个计数器来计算客人答案的“备用”出现次数。
如果它达到 0,那么从我们“选举”“当前”元素作为“当前”位置的客体主要元素开始,它就不是子序列的多数元素。
此外,由于我们的“来宾”元素与所考虑的子序列中出现的所有其他元素相匹配,因此所考虑的子序列中没有主要元素。

现在,因为:

  1. 我们的算法仅在存在主要元素时给出正确答案,并且
  2. 如果存在主要元素,那么当计数器变为零时,如果我们忽略“当前”子序列,那么它仍然会

是正确的答案通过矛盾可以看出,如果存在一个主要元素,那么当计数器永远不会为零时,我们就会得到整个序列的后缀。

现在:可以在新的 O(1) 大小 O(n) 时间算法中利用的想法是什么?

对我来说,每当您需要计算元素序列上的属性 P 时,您都可以应用此技术:

  1. 可以从 seq[n, m] 扩展到 Q(seq[n, m+1]) 不成立,则在 O(1) 时间内 >seq[n, m+1]
  2. , m]) 可以在 O(1) 时间和空间内根据 P(seq[n, j])P(seq[j, m])< 进行计算/code> if Q(seq[n, j]) 成立

在我们的例子中,P 是我们“选出的”主要元素的“备用”出现,而 Q 是“P 为零”。

如果您以这种方式看待事物,最长公共子序列利用了相同的想法(不知道它的“凉爽系数”;))

Umh, let's first start to understand why the algorithm works, in order to "isolate" the ideas there.

The point of the algorithm is that if you have a majority element, then you can match each occurrence of it with an "another" element, and then you have some more "spare".

So, we just have a counter which counts the number of "spare" occurrences of our guest answer.
If it reaches 0, then it isn't a majority element for the subsequence starting from when we have "elected" the "current" element as the guest major element to the "current" position.
Also, since our "guest" element matches every other element occurrence in the considered subsequence, there are no major elements in the considered subsequence.

Now, since:

  1. our algorithm gives a correct answer only if there is a major element, and
  2. if there is a major element, then it'll still be if we ignore the "current" subsequence when the counter goes to zero

it is obvious to see by contradiction that, if a major element exists, then we have a suffix of the whole sequence when the counter never gets to zero.

Now: what's the idea that can be exploited in new, O(1) size O(n) time algorithms?

To me, you can apply this technique whenever you have to compute a property P on a sequence of elements which:

  1. can be exteded from seq[n, m] to seq[n, m+1] in O(1) time if Q(seq[n, m+1]) doesn't hold
  2. P(seq[n, m]) can be computed in O(1) time and space from P(seq[n, j]) and P(seq[j, m]) if Q(seq[n, j]) holds

In our case, P is the "spare" occurrences of our "elected" major element and Q is "P is zero".

If you see things in that way, longest common subsequence exploits the same idea (dunno about its "coolness factor" ;))

苄①跕圉湢 2024-11-16 11:35:35

Jaydev Misra 和 David Gries 有一篇论文,名为查找重复元素ACM 页面)将其概括为重复次数超过 n/k 的元素次(k=2 是多数问题)。

当然,这可能与原始问题非常相似,您可能正在寻找“不同”的算法。

这是一个可能不同的示例。

给出一个算法来检测括号字符串(“(”和“)”)是否格式正确。

我相信标准的解决方案是维护一个计数器。

旁注:

对于不能为常数空间等的答案,请向他们询问计算模型。例如,在 WORD RAM 模型中,假设整数/数组索引等为 O(1)。

很多人错误地混合和匹配模型。例如,他们很乐意让 n 个整数的输入数组为 O(n),数组索引为 O(1) 空间,但他们认为计数器是 Omega(log n) 等,这是无意义的。如果他们想考虑以位为单位的大小,那么输入本身就是 Omega(n log n) 等。

Jaydev Misra and David Gries have a paper called Finding Repeated Elements (ACM page) which generalizes it to an element repeating more than n/k times (k=2 is the majority problem).

Of course, this is probably very similar to the original problem, and you are probably looking for 'different' algorithms.

Here is an example which is possibly different.

Give an algorithm which will detect if a string of parentheses ( '(' and ')') is well formed.

I believe the standard solution is to maintain a counter.

Side note:

As to answers which claim cannot be constant space etc, ask them for the model of computation. In the WORD RAM model for instance, you assume the integers/array indices etc are O(1).

A lot of folks incorrectly mix and match models. For instance, they will happily have the input array of n integers be O(n), have an array index be O(1) space, but a counter they consider Omega(log n) etc, which is nonsense. If they want to consider the size in bits, then the input itself is Omega(n log n) etc.

爱给你人给你 2024-11-16 11:35:35

对于想要了解该算法的作用及其原理的人:查看我的详细答案

在这里我将描述该算法的自然扩展(或概括)。因此,在标准多数投票算法中,您必须找到一个在流中至少出现 n/2 次的元素,其中 n 是流的大小. 您可以在 O(n) 时间内完成此操作(使用一个很小的常量并在 O(log(n)) 空间内完成,最坏的情况并且极不可能


通用算法允许您找到 k 个最频繁的项目,其中每次在原始流中至少出现 n/(k+1) 次。< /strong> 请注意,如果 k=1,您最终会得到原来的问题,

该问题的解决方案与原来的问题非常相似,只是您现在维护 k 个计数器和 k 个可能的元素,而不是一个计数器和一个可能的元素。逻辑以类似的方式进行。您迭代数组,如果该元素在可能的元素中,则增加它的计数器,如果其中一个计数器为零,则用新元素替换该计数器的元素,否则只需减少该计数器。价值观。

与原始多数投票算法一样,您需要保证拥有这 k 个多数元素,否则您必须再次遍历数组以验证之前找到的可能元素是否正确。这是我的 python 尝试(尚未进行彻底的测试)。

from collections import defaultdict
def majority_element_general(arr, k=1):
    counter, i = defaultdict(int), 0
    while len(counter) < k and i < len(arr):
        counter[arr[i]] += 1
        i += 1

    for i in arr[i:]:
        if i in counter:
            counter[i] += 1
        elif len(counter) < k:
            counter[i] = 1
        else:
            fields_to_remove = []
            for el in counter:
                if counter[el] > 1:
                    counter[el] -= 1
                else:
                    fields_to_remove.append(el)
            for el in fields_to_remove:
                del counter[el]

    potential_elements = counter.keys()
    # might want to check that they are really frequent.
    return potential_elements

For people who want to understand what does this algorithm do and why does it works: look at my detailed answer.

Here I will describe a natural extension of this algorithm (or a generalization). So in a standard majority voting algorithm you have to find an element which appears at least n/2 times in the stream, where n is the size of the stream. You can do this in O(n) time (with a tiny constant and in O(log(n)) space, worse case and highly unlikely.


The generalized algorithm allows you to find k most frequent items, where each time appeared at least n/(k+1) times in the original stream. Note that if k=1, you end up with your original problem.

Solution to this problem is really similar to the original one, except instead of one counter and one possible element, you maintain k counters and k possible elements. Now the logic goes in a similar way. You iterate through the array and if the element is in the possible elements, you increase it's counter, if one of the counters is zero - substitute the element of this counter with new element. Otherwise just decrease the values.

As with original majority voting algorithm, you need to have a guarantee that you have these k majority elements, otherwise you have to do another pass over the array to verify that your previously found possible elements are correct. Here is my python attempt (have not done a thorough testing).

from collections import defaultdict
def majority_element_general(arr, k=1):
    counter, i = defaultdict(int), 0
    while len(counter) < k and i < len(arr):
        counter[arr[i]] += 1
        i += 1

    for i in arr[i:]:
        if i in counter:
            counter[i] += 1
        elif len(counter) < k:
            counter[i] = 1
        else:
            fields_to_remove = []
            for el in counter:
                if counter[el] > 1:
                    counter[el] -= 1
                else:
                    fields_to_remove.append(el)
            for el in fields_to_remove:
                del counter[el]

    potential_elements = counter.keys()
    # might want to check that they are really frequent.
    return potential_elements
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文