Computing the statistical mode

Posted 2024-07-13 08:34:49 · 189 words · 14 views · 0 comments

I'm trying to determine whether, given an unsorted array A of length n and an integer k, some element occurs n/k times or more.

My idea was to compute the mode and compare its count to n/k. However, I don't know how to compute the mode quickly. The final algorithm needs to run in O(n log k), and I really have no idea how to get there. The fastest I could find was O(nk)...

Comments (5)

故笙诉离歌 2024-07-20 08:34:49

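Use a hash table to count the frequency of each value, then scan the counts for the largest one:

uint[int] counts;            // value -> number of occurrences
foreach (num; myArray) {
    counts[num]++;
}

// Scan the table for the value with the largest count.
int mostFrequent;
uint maxCount = 0;
foreach (num, count; counts) {
    if (count > maxCount) {
        mostFrequent = num;
        maxCount = count;
    }
}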
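
The snippet above stops once the mode is found. To answer the original question, its count still has to be compared with n/k. A minimal Python sketch of that last step, assuming k is a positive integer and the values are hashable (the function name `mode_meets_threshold` is made up for illustration):

```python
from collections import Counter


def mode_meets_threshold(values, k):
    """Return (mode, count, ok); ok is True when count >= len(values) / k."""
    if not values:
        return None, 0, False
    counts = Counter(values)                  # value -> number of occurrences
    mode, count = counts.most_common(1)[0]    # most frequent value and its count
    # count >= n / k, rewritten as a multiplication to avoid division
    return mode, count, count * k >= len(values)
```

For example, `mode_meets_threshold([3, 1, 3, 2, 3], 2)` finds that 3 occurs three times out of five, which meets the n/k = 2.5 threshold.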
银河中√捞星星 2024-07-20 08:34:49

Let m = n/k, rounded up. Run a quicksort, but discard any sublist shorter than m. An element that occurs at least m times always sits in a sublist of length at least m, so discarding the shorter ones cannot lose it.

As with quicksort, you can get unlucky and repeatedly pick pivots that are close to the ends, but that is unlikely if you choose the pivots randomly.

The recursion then has O(log k) levels, and each level takes O(n) time, for O(n log k) overall.
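
A rough Python sketch of this pruned quicksort, under a few assumptions that are not spelled out in the answer: the elements are mutually comparable, k is a positive integer, the partition is three-way, and the number of elements equal to each pivot is checked against m. The function name `has_frequent_element` is made up for illustration.

```python
import random


def has_frequent_element(values, k):
    """Return True if some value occurs at least len(values) / k times."""
    n = len(values)
    if n == 0 or k <= 0:
        return False
    m = -(-n // k)  # ceil(n / k) using integer arithmetic
    stack = [list(values)]  # explicit stack instead of recursion
    while stack:
        sub = stack.pop()
        if len(sub) < m:
            continue  # too short to hold m copies of any value
        pivot = random.choice(sub)
        less = [x for x in sub if x < pivot]
        greater = [x for x in sub if x > pivot]
        equal_count = len(sub) - len(less) - len(greater)
        if equal_count >= m:
            return True  # the pivot value itself is frequent enough
        stack.append(less)
        stack.append(greater)
    return False
```

Because a count is an integer, "at least n/k" and "at least m" are the same test. Both sublists are strictly shorter than their parent, so the loop always terminates. The running time depends on the pivot choices, as the answer notes.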

烟雨扶苏 2024-07-20 08:34:49

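Just walking the array and keeping counts in a hash table/dictionary, returning true as soon as some count reaches n/k and false otherwise, would be O(n).

Edit: something like:

def has_frequent_element(numbers, k):
    # True if some value occurs at least len(numbers) / k times.
    n = len(numbers)
    counts = {}
    for x in numbers:
        counts[x] = counts.get(x, 0) + 1
        # counts[x] >= n / k, written without division
        if counts[x] * k >= n:
            return True
    return False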
孤千羽 2024-07-20 08:34:49

Calculating the statistical mode in F# (.NET) for a data set of integers that has a single mode:

let foundX (num: int, dList) = List.filter (fun x -> x = num) dList
let groupNum dList =
    dList
    |> (List.map (fun x -> foundX (x, dList)))
    |> (List.maxBy (fun x -> x.Length))

let Mode (dList: int List) = 
    let x = groupNum dList
    x.Head

// Using Mode
let data = [1;1;1;1;1;1;1;1;2;2;3;3;3;1;4;4;4;4;4]
Mode data;;

遥远的绿洲 2024-07-20 08:34:49

Pseudocode:

 found = false
 value = null
 B = new hashtable
 for (i = 0; i < |A| and !found; ++i)
    j = A[i]                // read inside the loop so A[|A|] is never accessed
    if B contains key j
       B[j] = B[j] + 1
    else
       B[j] = 1
    endif
    if B[j] * k >= |A|      // same as B[j] >= |A|/k, i.e. "n/k times or more"
       found = true
       value = j
    endif
 end for

Assuming that your hashtable implementation has O(1) insert/lookup, this should be O(n).
