Compute how nice a set of values looks (how well it is distributed)

Posted 2024-10-14 11:11:29

This set of values:
1 2 3 3 4 1
looks pretty nice if you picture it as a bar chart:

*   *
* * * *
=======
1 2 3 4 

while this one looks bad:
1 2 2 2 2 2 2 2 2 9 8

  *
  *
  * 
  * 
  *
  *
  *
* *           * *
=================
1 2 3 4 5 6 7 8 9

This is because there are a lot of 2s and a big gap between 2 and 8.

I need to find a formula that computes how nice a set of numbers looks.
I think I'll need some kind of deviation function. Any ideas?

thanks

Comments (3)

贱贱哒 2024-10-21 11:11:29

A chi-square analysis is probably what you're looking for. If used in the right way it will give you a number describing how close your distribution is to a discrete uniform distribution. A discrete uniform distribution will be flat (i.e. have approximately the same number of elements in each of the histogram buckets), which seems to fit your definition of 'nice'.
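As a rough illustration of that idea, here is a minimal Python sketch of the Pearson chi-square statistic measured against a flat histogram. The helper name `chi_square_uniform` is hypothetical, and the sketch assumes integer data where every integer between the smallest and largest value counts as one bucket, so empty buckets in a gap are penalized too.

```python
from collections import Counter

def chi_square_uniform(nums):
    """Pearson chi-square statistic of nums against a discrete uniform
    distribution over every integer from min(nums) to max(nums)."""
    values = range(min(nums), max(nums) + 1)
    expected = len(nums) / len(values)  # same expected count for each bucket
    counts = Counter(nums)              # values missing from nums count as 0
    return sum((counts[v] - expected) ** 2 / expected for v in values)

good = [1, 2, 3, 3, 4, 1]
bad = [1, 2, 2, 2, 2, 2, 2, 2, 2, 9, 8]
print(chi_square_uniform(good))  # small: the histogram is close to flat
print(chi_square_uniform(bad))   # large: one tall spike plus a wide gap
```

The first sample comes out at roughly 0.67 and the second at roughly 43.82, so lower means flatter. The raw statistic grows with the number of samples, so comparing sets of very different sizes would probably need a normalization or a p-value (for example from `scipy.stats.chisquare`, if SciPy is available).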

迟到的我 2024-10-21 11:11:29

This seems reasonable to me, but I have pretty limited knowledge of statistics (Python 3):

from collections import Counter

def tonums(s):
    # parse a space-separated string of integers
    return [int(x) for x in s.split()]

def nice(nums):
    # how far do they spread
    used_range = range(min(nums), max(nums) + 1)

    # how often would each number occur if they were equally distributed
    expected = len(nums) / len(used_range)

    # how often do they actually occur (values inside a gap count as 0)
    counter = Counter(nums)

    # sum of squared deviations from the expected count; lower is nicer
    return sum((counter[value] - expected) ** 2 for value in used_range)


# should be fst < snd
print(nice(tonums('1 2 3 3 4 1')))
print(nice(tonums('1 2 2 2 2 2 2 2 2 9 8')))

# these should be 0
print(nice(tonums('1')))
print(nice(tonums('1 1 1 1')))

# should be equal
print(nice(tonums('1 1 2 3')))
print(nice(tonums('1 2 2 3')))

我纯我任性 2024-10-21 11:11:29

Your definition of "nice" is somewhat broad. I'd suggest two approaches, based on my interpretation of what you mean by nice:

  1. Compute (or estimate) how far your data is from being normally distributed. A stats textbook or stats package should discuss this.
  2. Perform some kind of Fourier transform; a lot of high-frequency components probably isn't "nice".
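
For the first suggestion, one possible sketch in Python scores the data by its skewness and excess kurtosis, both of which are 0 for a perfect normal distribution. The function name `normality_gap` and the choice to add the two absolute values are assumptions made for illustration; this is not a formal normality test.

```python
def normality_gap(nums):
    """Rough distance from a normal shape: |skewness| + |excess kurtosis|.

    Illustrative scoring rule only (an assumption, not a standard test).
    """
    n = len(nums)
    mean = sum(nums) / n
    # Central moments of order 2, 3 and 4 (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in nums) / n
    if m2 == 0:
        return 0.0  # convention chosen here: identical values have no shape
    m3 = sum((x - mean) ** 3 for x in nums) / n
    m4 = sum((x - mean) ** 4 for x in nums) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return abs(skewness) + abs(excess_kurtosis)

good = [1, 2, 3, 3, 4, 1]
bad = [1, 2, 2, 2, 2, 2, 2, 2, 2, 9, 8]
print(normality_gap(good))
print(normality_gap(bad))
```

With the two samples from the question, the first set scores lower than the second, mostly because the second is strongly skewed. Note that a perfectly flat histogram does not score 0 under this measure, so it captures a different notion of "nice" than a uniformity check. A stats package offers proper tests for this, for example `scipy.stats.normaltest` or `scipy.stats.shapiro` if SciPy is available.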