Existing function to combine standard deviations in R?
I have 4 populations with known means and standard deviations. I would like to know the grand mean and grand sd. The grand mean is obviously simple to calculate, but R has a handy utility function, weighted.mean(). Does a similar function exist for combining standard deviations?
The calculation is not complicated, but an existing function would make my code cleaner and easier to understand.
Bonus question, what tools do you use to search for functions like this? I know it must be out there, but I've done a lot of searching and can't find it. Thanks!
Comments (3)
I don't know of a specific package or function name but it seems easy to roll your own function from Wikipedia's page. Assuming no overlap in the populations:
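A minimal sketch of such a roll-your-own pair of functions, assuming the groups do not overlap and that the supplied standard deviations are population (divide-by-n) values. The names grand_mean and grand_sd and the example numbers are illustrative, not taken from any package. It relies on the identity that the combined variance equals the size-weighted mean of (sd^2 + mean^2) minus the squared grand mean.

```r
# Grand mean of non-overlapping groups: size-weighted mean of the group means.
grand_mean <- function(m, n) {
  weighted.mean(m, n)
}

# Grand standard deviation, treating each s as a population (divide-by-n) sd.
# The second moment of the combined group is the weighted mean of (s^2 + m^2).
grand_sd <- function(s, m, n) {
  sqrt(weighted.mean(s^2 + m^2, n) - weighted.mean(m, n)^2)
}

# Illustrative values (made up): two equally sized groups.
m <- c(70, 65)  # group means
s <- c(3, 2)    # group standard deviations
n <- c(1, 1)    # relative group sizes

grand_mean(m, n)
grand_sd(s, m, n)
```

If the inputs are sample standard deviations (n - 1 denominator), this formula is only approximate and the difference shrinks as the group sizes grow.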
Are the populations non-overlapping?
For instance, the example on Wikipedia would work like this:
and the standard deviation would be sqrt(combinevar(xbar,s,n)[2]).
If you don't want to download the library, the function goes like this:
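One way such a function could look, as a sketch rather than the library's actual source: it pools group means and sample variances (n - 1 denominator) for non-overlapping groups. The name combine_var and the example data are hypothetical. It takes variances, so standard deviations need to be squared before being passed in.

```r
# Pool means and sample variances (n - 1 denominator) of non-overlapping groups.
# xbar: group means, s2: group sample variances, n: group sizes.
combine_var <- function(xbar, s2, n) {
  stopifnot(length(xbar) == length(s2), length(s2) == length(n))
  N <- sum(n)
  m <- sum(n * xbar) / N              # combined mean
  within  <- sum((n - 1) * s2)        # within-group sum of squares
  between <- sum(n * (xbar - m)^2)    # between-group sum of squares
  c(mean = m, var = (within + between) / (N - 1))
}

# Check against direct computation on made-up data.
g1 <- c(2, 4, 6, 8)
g2 <- c(1, 3, 5, 7, 9, 11)
res <- combine_var(xbar = c(mean(g1), mean(g2)),
                   s2   = c(var(g1), var(g2)),
                   n    = c(length(g1), length(g2)))
res
all.equal(unname(res["var"]), var(c(g1, g2)))
sqrt(res["var"])                      # combined standard deviation
```

Splitting the total sum of squares into within-group and between-group parts keeps the result exact, so it matches var() applied to the pooled raw data.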
Use the sample.decomp function in the utilities package

Statistical problems of this kind have now been automated in the sample.decomp function in the utilities package. This function can compute pooled sample moments from subgroup moments, or compute missing subgroup moments from the other subgroup moments and the pooled moments. It works for decompositions up to fourth order, i.e., decompositions of sample size, sample mean, sample variance/standard deviation, sample skewness, and sample kurtosis.

How to use the function: Here we give an example where we use the function to compute the sample moments of a pooled sample composed of four subgroups. To do this, we first generate a mock dataset DATA containing four subgroups with unequal sizes, and we pool these as the single dataset POOL. The moments of the subgroups and the pooled sample can be obtained using the moments function in the same package.

Now that we have a set of moments for the subgroups, we can use the sample.decomp function to obtain the pooled sample moments from the subgroup sample moments. As an input to this function you can either use the moments output for the subgroups or you can input the sample sizes and sample moments separately as vectors (here we will do the latter). This gives the same sample moments for the pooled sample as direct computation from the underlying data.

The sample.decomp function therefore allows computation of the pooled sample variance. You can read about this function in the package documentation.
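The walkthrough above can be sketched as follows. The base R part builds DATA and POOL from simulated values (the seed, subgroup sizes, and distribution parameters are made up) and checks the hand-pooled mean and variance against direct computation on POOL. The final call assumes that sample.decomp accepts n, sample.mean, and sample.var arguments; treat that signature as an assumption and check the package documentation. It is wrapped in try() and runs only if the utilities package is installed.

```r
set.seed(1)

# Four subgroups of unequal size (sizes and parameters are illustrative).
DATA <- list(Subgroup1 = rnorm(28,  mean = 0),
             Subgroup2 = rnorm(44,  mean = 1),
             Subgroup3 = rnorm(51,  mean = 2),
             Subgroup4 = rnorm(102, mean = 3))
POOL <- unlist(DATA, use.names = FALSE)

# Subgroup sample sizes, means, and variances.
N    <- sapply(DATA, length)
MEAN <- sapply(DATA, mean)
VAR  <- sapply(DATA, var)

# Pool the first two moments by hand from the subgroup moments.
pooled_mean <- sum(N * MEAN) / sum(N)
pooled_var  <- (sum((N - 1) * VAR) + sum(N * (MEAN - pooled_mean)^2)) /
  (sum(N) - 1)

# Both comparisons should agree with direct computation on POOL.
all.equal(pooled_mean, mean(POOL))
all.equal(pooled_var, var(POOL))

# Assumed call signature; skipped when the package is not installed.
if (requireNamespace("utilities", quietly = TRUE)) {
  try(print(utilities::sample.decomp(n = N, sample.mean = MEAN,
                                     sample.var = VAR)))
}
```

The hand-pooled values cover only the mean and variance; the skewness and kurtosis decompositions mentioned above are left to the package.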