Weighted standard deviation in NumPy
numpy.average() has a weights option, but numpy.std() does not. Does anyone have suggestions for a workaround?
Comments (7)
How about the following short "manual calculation"?
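The snippet that belonged to this answer is not preserved on this page. A minimal sketch of such a manual calculation, assuming the weights are non-negative relative weights and the biased (population) form is wanted, could call numpy.average() twice: once for the weighted mean, once for the weighted mean of the squared deviations. The function name weighted_avg_and_std is a hypothetical choice, not something taken from the answer.

```python
import math

import numpy as np


def weighted_avg_and_std(values, weights):
    """Return the weighted mean and the (biased) weighted standard deviation."""
    values = np.asarray(values, dtype=float)
    # Weighted mean, using the weights option the question mentions.
    average = np.average(values, weights=weights)
    # Weighted mean of the squared deviations from that mean.
    variance = np.average((values - average) ** 2, weights=weights)
    return average, math.sqrt(variance)
```

With equal weights this reduces to numpy.std() with its default ddof=0, which is a quick way to sanity-check it.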
There is a class in statsmodels that makes it easy to calculate weighted statistics: statsmodels.stats.weightstats.DescrStatsW.
Assuming this dataset and weights:
You initialize the class (note that you have to pass in the correction factor, the delta degrees of freedom ddof, at this point):
Then you can calculate:
.mean, the weighted mean:
>>> weighted_stats.mean
1.97196261682243
.std, the weighted standard deviation:
>>> weighted_stats.std
0.21434289609681711
.var, the weighted variance:
>>> weighted_stats.var
0.045942877107170932
.std_mean, the standard error of the weighted mean:
>>> weighted_stats.std_mean
0.020818822467555047
Just in case you're interested in the relation between the standard error and the standard deviation: for ddof == 0, the standard error is calculated as the weighted standard deviation divided by the square root of (the sum of the weights minus 1). See the corresponding source for statsmodels version 0.9 on GitHub.
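To make the relation between these quantities concrete, here is a NumPy-only sketch that recomputes them by hand for ddof == 0. The dataset and weights are assumptions: they are not shown on this page and were chosen because they reproduce the outputs quoted above. The variable names are illustrative.

```python
import numpy as np

# Assumed example data (not shown in the text above).
array = np.array([1, 2, 1, 2, 1, 2, 1, 3])
weights = np.ones_like(array)
weights[3] = 100  # one observation dominates the statistics

# With statsmodels installed, the equivalent object would be created as
#   DescrStatsW(array, weights=weights, ddof=0)
sum_weights = weights.sum()

# Counterpart of .mean: the weighted mean.
mean = np.sum(weights * array) / sum_weights
# Counterpart of .var for ddof == 0: weighted mean squared deviation.
var = np.sum(weights * (array - mean) ** 2) / sum_weights
# Counterpart of .std: its square root.
std = np.sqrt(var)
# Counterpart of .std_mean for ddof == 0: std / sqrt(sum of weights - 1).
std_mean = std / np.sqrt(sum_weights - 1)

print(mean, std, var, std_mean)
```

The printed values can be compared against the four outputs listed above.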
Here's one more option:
There doesn't appear to be such a function in numpy/scipy yet, but there is a ticket proposing this added functionality. Included there you will find Statistics.py, which implements weighted standard deviations.
There is a very good example proposed by gaborous:
Correct equation for weighted unbiased sample covariance, URL (version: 2016-06-28)
A follow-up on the "sample" or "unbiased" standard deviation in the "frequency weights" sense, since a Google search for "weighted sample standard deviation python" leads to this post:
Or, modifying the answer by @Eric as follows:
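The code for this answer is not preserved here. As a hedged sketch of the idea, assuming each weight is an integer count of how often the corresponding value was observed, the sample standard deviation applies Bessel's correction to the total number of observations (the sum of the counts) rather than to the number of distinct values. The function name frequency_weighted_sample_std is hypothetical.

```python
import numpy as np


def frequency_weighted_sample_std(values, counts):
    """Sample (ddof=1) standard deviation where counts[i] is how often values[i] occurred."""
    values = np.asarray(values, dtype=float)
    counts = np.asarray(counts)
    # Total number of observations represented by the counts.
    n = counts.sum()
    mean = np.average(values, weights=counts)
    # Divide by n - 1 (Bessel's correction), not by n.
    variance = np.sum(counts * (values - mean) ** 2) / (n - 1)
    return np.sqrt(variance)
```

Under this interpretation the result should agree with np.std(np.repeat(values, counts), ddof=1), which expands the counts into individual observations.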
I was just searching for an API equivalent of the numpy np.std function that also allows the axis parameter to be set:
(I only tested it with two dimensions, so feel free to improve it if something is incorrect.)
Thanks to Eric O Lebigot for the original answer.
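The function from this answer is not reproduced on this page. An axis-aware variant along the lines described could look like the sketch below. It assumes the biased (ddof=0) form and relies on numpy.average() accepting either weights shaped like the data or 1-D weights along the chosen axis. The name weighted_std and its signature are assumptions, not the poster's code.

```python
import numpy as np


def weighted_std(values, weights=None, axis=None):
    """Weighted standard deviation (ddof=0) along the given axis, like np.std."""
    values = np.asarray(values, dtype=float)
    mean = np.average(values, weights=weights, axis=axis)
    if axis is not None:
        # Restore the reduced axis so the mean broadcasts against values.
        mean = np.expand_dims(mean, axis)
    variance = np.average((values - mean) ** 2, weights=weights, axis=axis)
    return np.sqrt(variance)
```

For example, weighted_std(x, weights=[1, 2, 3], axis=0) on an array x of shape (3, 4) returns four values, one per column. With weights=None it matches np.std(x, axis=axis).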