How do I get the mean, median, and other statistics over an entire matrix, array, or data frame?
I know this is a basic question, but for some strange reason I am unable to find an answer.
How should I apply basic statistical functions such as mean, median, etc. over an entire array, matrix, or data frame to get a single value, rather than a vector over rows or columns?
Since this comes up a fair bit, I'm going to treat it a little more comprehensively, to include the 'etc.' piece in addition to `mean` and `median`.

For a matrix or array, as the others have stated, `mean` and `median` will return a single value. However, `var` will compute the covariances between the columns of a two-dimensional matrix. Interestingly, for a multi-dimensional array, `var` goes back to returning a single value. `sd` on a 2-d matrix will work, but is deprecated, returning the standard deviation of the columns. Even better, `mad` returns a single value on a 2-d matrix and a multi-dimensional array. If you want a single value returned, the safest route is to coerce using `as.vector()` first. Having fun yet?

For a `data.frame`, `mean` is deprecated, but will again act on the columns separately. `median` requires that you coerce to a vector first, or `unlist`. As before, `var` will return the covariances, and `sd` is again deprecated but will return the standard deviation of the columns. `mad` requires that you coerce to a vector or `unlist`. In general, for a `data.frame`, if you want something to act on all values, you will usually just `unlist` it first.

Edit: Late-breaking news: in R 3.0.0, `mean.data.frame` is defunct.
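The coercion advice above can be sketched roughly as follows. The objects `m`, `a`, and `df` are made-up illustrations, not data from the original answer, and the sketch only uses the "flatten first" route (`as.vector()` and `unlist()`), since that is the one route described above as safe regardless of how a given function treats matrices or data frames.

```r
# Hypothetical example data (not from the original answer).
m  <- matrix(1:6, nrow = 3)            # 3 x 2 matrix
a  <- array(1:24, dim = c(2, 3, 4))    # 3-dimensional array
df <- data.frame(a = 1:3, b = 4:6)     # all-numeric data frame

# mean(), median() and mad() collapse a matrix to a single value.
m_mean   <- mean(m)
m_median <- median(m)
m_mad    <- mad(m)

# var() on a 2-d matrix returns the covariance matrix of its columns.
m_cov <- var(m)

# Coercing with as.vector() first gives one value over all elements.
m_var <- var(as.vector(m))
m_sd  <- sd(as.vector(m))
a_var <- var(as.vector(a))

# For a data frame, unlist() flattens every column into one vector.
v         <- unlist(df)
df_mean   <- mean(v)
df_median <- median(v)
df_var    <- var(v)
df_sd     <- sd(v)
df_mad    <- mad(v)
```

Note that `unlist()` only gives a numeric vector when every column is numeric; with mixed column types the result is coerced to a common type.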
By default, `mean`, `median`, etc. work over an entire array or matrix.
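As a small illustration of that default behaviour (the matrix `m` here is a made-up example, not the one from the original answer), the overall statistics can be contrasted with the per-column and per-row versions:

```r
# Hypothetical 2 x 3 matrix, filled column by column.
m <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 2)

# Applied directly, mean() and median() use every element.
overall_mean   <- mean(m)      # single value
overall_median <- median(m)    # single value

# Per-column or per-row results have to be requested explicitly.
col_means <- colMeans(m)       # one value per column
row_means <- rowMeans(m)       # one value per row
```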
For data frames, you can coerce them to a matrix first. (The reason the default is to work over columns is that a data frame can have columns with strings in them, which you can't take the mean of.)
Just be careful that your data frame has all numeric columns before coercing to a matrix, or exclude the non-numeric ones.
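A minimal sketch of that coercion, assuming a made-up data frame `df` with one non-numeric column; selecting the numeric columns with `sapply(df, is.numeric)` is one common way to exclude the rest, not necessarily what the original answer used:

```r
# Hypothetical data frame with a non-numeric column.
df <- data.frame(
  x     = c(1, 2, 3),
  y     = c(10, 20, 30),
  label = c("a", "b", "c")
)

# Coercing everything yields a character matrix, which is useless for mean().
chr_matrix <- as.matrix(df)

# Keep only the numeric columns, then coerce to a numeric matrix.
numeric_cols <- df[sapply(df, is.numeric)]
num_matrix   <- as.matrix(numeric_cols)

# Now the statistics run over all values at once.
overall_mean   <- mean(num_matrix)
overall_median <- median(num_matrix)
```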
You can install the dplyr package with `install.packages('dplyr')`, load it with `library(dplyr)`, and then