How do I calculate standard deviation in Ruby?
I have several records with a given attribute, and I want to find the standard deviation.
How do I do that?
Testing it:
Update 01/17/2012: fixed "sample_variance", thanks to Dave Sag.
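The code for this answer is not reproduced here. As a rough sketch only, an Enumerable extension of the kind discussed in this thread might look like the following. Only the name sample_variance appears in the thread; mean and standard_deviation are assumed names, and the sample (n - 1) definition is assumed throughout.

```ruby
# Hypothetical Enumerable extension (a sketch, not the original answer's code).
# Only `sample_variance` is named in the thread; the other names are assumptions.
module Enumerable
  # Arithmetic mean as a Float.
  def mean
    sum / count.to_f
  end

  # Sample variance: sum of squared deviations divided by (n - 1).
  def sample_variance
    m = mean
    sum_of_squares = inject(0.0) { |acc, x| acc + (x - m)**2 }
    sum_of_squares / (count - 1)
  end

  # Sample standard deviation: square root of the sample variance.
  def standard_deviation
    Math.sqrt(sample_variance)
  end
end

values = [2, 4, 4, 4, 5, 5, 7, 9]
puts values.mean               # 5.0
puts values.standard_deviation # roughly 2.138
```

Note that this patches every class that includes Enumerable, which is the trade-off a later answer objects to.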
It appears that Angela may have wanted an existing library. After playing with statsample, array-statistics, and a few others, I'd recommend the descriptive_statistics gem if you're trying to avoid reinventing the wheel.
I can't speak to its statistical correctness, or to your comfort with monkey-patching Enumerable, but it's easy to use and easy to contribute to.
The answer given above is elegant but has a slight error in it. Not being a stats head myself, I sat up and read a number of websites in detail, and found that this one gave the most comprehensible explanation of how to derive a standard deviation: http://sonia.hubpages.com/hub/stddev
The error in the answer above is in the sample_variance method. Here is my corrected version, along with a simple unit test that shows it works.
In ./lib/enumerable/standard_deviation.rb
In ./test
using numbers derived from a simple spreadsheet.
I'm not a big fan of adding methods to Enumerable, since there could be unwanted side effects. It also gives methods that are really specific to an array of numbers to any class that includes Enumerable, which doesn't make sense in most cases.
While this is fine for tests, scripts or small apps, it's risky for larger applications, so here's an alternative based on @tolitius' answer, which was already perfect. This is more for reference than anything else:
And then you use it as such:
The behavior is the same, but it avoids the overhead and risks of adding methods to Enumerable.
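The original code for this alternative is not shown here. A minimal sketch of the idea, assuming a standalone module with module-level functions, could look like this. The module name MyMath and all method names are hypothetical, and the sample (n - 1) definition is assumed.

```ruby
# Hypothetical helper module; MyMath and its method names are assumptions.
# Nothing is added to Enumerable: callers pass the collection explicitly.
module MyMath
  module_function

  # Sum of all elements.
  def sum(a)
    a.inject(0) { |acc, x| acc + x }
  end

  # Arithmetic mean as a Float.
  def mean(a)
    sum(a) / a.length.to_f
  end

  # Sample variance: squared deviations divided by (n - 1).
  def sample_variance(a)
    m = mean(a)
    sum_sq = a.inject(0) { |acc, x| acc + (x - m)**2 }
    sum_sq / (a.length - 1).to_f
  end

  # Sample standard deviation.
  def standard_deviation(a)
    Math.sqrt(sample_variance(a))
  end
end

puts MyMath.standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]) # roughly 2.138
```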
The presented computations are not very efficient because they require several passes through the array (at least two, but often three, because you usually want to present the average in addition to the standard deviation).
I know Ruby is not the place to look for efficiency, but here is my implementation that computes the average and standard deviation with a single pass over the list values:
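The author's implementation is not preserved here, so the following is a stand-in rather than the original: a single-pass sketch using Welford's running update, which may differ from what the author wrote. The function name is hypothetical, and the sample (n - 1) definition is assumed.

```ruby
# Single-pass mean and sample standard deviation using Welford's update.
# A stand-in sketch; the function name is an assumption.
# Returns [mean, standard_deviation].
def mean_and_standard_deviation(values)
  count = 0
  mean = 0.0
  m2 = 0.0 # running sum of squared deviations from the current mean

  values.each do |x|
    count += 1
    delta = x - mean
    mean += delta / count
    m2 += delta * (x - mean)
  end

  # The sample standard deviation is undefined for fewer than two values.
  raise ArgumentError, "need at least two values" if count < 2

  [mean, Math.sqrt(m2 / (count - 1))]
end

avg, sd = mean_and_standard_deviation([2, 4, 4, 4, 5, 5, 7, 9])
puts avg # 5.0
puts sd  # roughly 2.138
```

Because it only calls each, this works on any enumerable source, including ones that can be traversed only once.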
As a simple function, given a list of numbers:
If the records at hand are of type Integer or Rational, you may want to compute the variance using Rational instead of Float to avoid errors introduced by rounding.
For example:
(It would be prudent to add special-case handling for empty lists and other edge cases.)
Then the square root can be defined as:
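The two snippets this answer refers to are missing. A minimal sketch of the approach, assuming Integer or Rational input and the sample (n - 1) definition, might be written as follows. The method names are hypothetical, and the original may have used the population definition instead.

```ruby
# Exact variance for Integer or Rational input, kept as a Rational.
# Method names and the (n - 1) divisor are assumptions.
def variance(list)
  raise ArgumentError, "need at least two values" if list.size < 2

  mean = list.sum(0r) / list.size              # Rational mean, no rounding
  sum_sq = list.sum(0r) { |x| (x - mean)**2 }  # exact sum of squared deviations
  sum_sq / (list.size - 1)
end

# Rounding happens only here, when the exact variance is converted to a Float.
def standard_deviation(list)
  Math.sqrt(variance(list))
end

p variance([2, 4, 4, 4, 5, 5, 7, 9])        # (32/7)
puts standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]) # roughly 2.138
```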
In case people are using Postgres: it provides the aggregate functions stddev_pop and stddev_samp (see the PostgreSQL aggregate functions documentation).
stddev (equivalent to stddev_samp) has been available since at least Postgres 7.1; since 8.2 both stddev_samp and stddev_pop are provided.
Or how about:
You can place this in a helper method and access it everywhere.