Computing how nice a set of values looks (how well distributed it is)

Posted 2024-10-14 11:11:29

This set of values:
1 2 3 3 4 1
looks pretty nice if you picture it as a bar chart:

*   *
* * * *
=======
1 2 3 4

while this one looks bad:
1 2 2 2 2 2 2 2 2 9 8

  *
  *
  * 
  * 
  *
  *
  *
* *           * *
=================
1 2 3 4 5 6 7 8 9

This is because there are a lot of 2s and a big gap between the 2 and the 8.

I need to find a formula that computes how nice a set of numbers looks.
I think I'll need some kind of deviation function. Any ideas?

Thanks

贱贱哒 2024-10-21 11:11:29

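Chi-square analysis may be what you are looking for. Used correctly, it gives you a single number describing how close your distribution is to a discrete uniform distribution. A discrete uniform distribution is flat (i.e. every histogram bucket holds roughly the same number of elements), which seems to match your definition of "nice".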
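
As a rough illustration of that idea, here is a minimal sketch of the Pearson chi-square statistic measured against a flat distribution. It assumes one histogram bucket per integer from the smallest to the largest value, and the function name `chi_square_uniform` is made up for this example. A result of 0 means perfectly flat; larger means less "nice".

```python
from collections import Counter


def chi_square_uniform(nums):
    """Pearson chi-square statistic of nums against a discrete uniform
    distribution over min(nums)..max(nums) (assumed bucketing)."""
    buckets = range(min(nums), max(nums) + 1)
    # count each bucket would hold if the values were spread evenly
    expected = len(nums) / len(buckets)
    # Counter returns 0 for values that never occur, so gaps are penalized
    counts = Counter(nums)
    return sum((counts[v] - expected) ** 2 / expected for v in buckets)


# the first set from the question should score lower (nicer) than the second
print(chi_square_uniform([1, 2, 3, 3, 4, 1]))
print(chi_square_uniform([1, 2, 2, 2, 2, 2, 2, 2, 2, 9, 8]))
```

Note that the statistic grows with the number of values, so it is most meaningful when comparing sets of similar size.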

迟到的我 2024-10-21 11:11:29

This seems reasonable to me, but I have pretty limited knowledge of statistics:

from collections import Counter


def tonums(s):
    # split on whitespace so multi-digit numbers are parsed correctly
    return [int(x) for x in s.split()]


def nice(nums):
    # how far do they spread
    used_range = range(min(nums), max(nums) + 1)

    # how often would each number occur if they were equally distributed
    expected = len(nums) / len(used_range)

    # how often do they actually occur
    counter = Counter(nums)

    # sum of squared deviations from the expected count (lower is nicer);
    # values inside the range that never occur are not part of this sum
    return sum((count - expected) ** 2 for count in counter.values())


# should be fst < snd
print(nice(tonums('1 2 3 3 4 1')))
print(nice(tonums('1 2 2 2 2 2 2 2 2 9 8')))

# these should be 0
print(nice(tonums('1')))
print(nice(tonums('1 1 1 1')))

# should be equal
print(nice(tonums('1 1 2 3')))
print(nice(tonums('1 2 2 3')))
