How to get mean, median, and other statistics over an entire matrix, array, or data frame?

Posted on 2025-01-08 06:03:30

I know this is a basic question but for some strange reason I am unable to find an answer.

How should I apply basic statistical functions like mean, median, etc. over an entire array, matrix, or data frame to get a single value rather than a vector over rows or columns?

Comments (3)

梦亿 2025-01-15 06:03:30

Since this comes up a fair bit, I'm going to treat this a little more comprehensively, to include the 'etc.' piece in addition to mean and median.

  1. For a matrix, or array, as the others have stated, mean and median will return a single value. However, var will compute the covariances between the columns of a two dimensional matrix. Interestingly, for a multi-dimensional array, var goes back to returning a single value. sd on a 2-d matrix will work, but is deprecated, returning the standard deviation of the columns. Even better, mad returns a single value on a 2-d matrix and a multi-dimensional array. If you want a single value returned, the safest route is to coerce using as.vector() first. Having fun yet?

  2. For a data.frame, mean is deprecated, but will again act on the columns separately. median requires that you coerce to a vector first, or unlist. As before, var will return the covariances, and sd is again deprecated but will return the standard deviation of the columns. mad requires that you coerce to a vector or unlist. For a data.frame, if you want a function to act on all values at once, you will generally just unlist it first.

Edit: Late-breaking news: in R 3.0.0, mean.data.frame is defunct. From the release notes:

o   mean() for data frames and sd() for data frames and matrices are
defunct.
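
As a rough sketch of the advice above (flatten first, then summarise), assuming a recent R version. The object names m, a and df are made up for illustration and do not come from the question:

```r
# Hypothetical example data; the names m, a and df are illustrative only.
set.seed(1)
m  <- matrix(runif(20), nrow = 5)          # 5 x 4 matrix
a  <- array(runif(24), dim = c(2, 3, 4))   # 3-d array
df <- as.data.frame(m)                     # all-numeric data frame

# var() on a 2-d matrix gives a covariance matrix, not a single number.
dim(var(m))   # 4 4

# Flattening with as.vector() first makes each function return one value.
var_m <- var(as.vector(m))
sd_a  <- sd(as.vector(a))
mad_m <- mad(as.vector(m))

# For a data frame, unlist() plays the role of as.vector().
mean_df   <- mean(unlist(df))
median_df <- median(unlist(df))
var_df    <- var(unlist(df))
```

Flattening explicitly also means the result does not depend on which R version's matrix or data frame methods happen to be in effect.
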
情栀口红 2025-01-15 06:03:30

By default, mean, median, etc. work over an entire array or matrix.

E.g.:

# array:
m <- array(runif(100),dim=c(10,10))
mean(m) # returns *one* value.

# matrix:
mean(as.matrix(m)) # same as before

For data frames, you can coerce them to a matrix first (historically, mean on a data frame worked column by column, because a data frame can contain columns of strings, which you can't take the mean of):

# data frame
mdf <- as.data.frame(m)
# mean(mdf) returned column means in older R; since R 3.0.0 it returns NA with a warning
mean(as.matrix(mdf)) # one value.

Just make sure that all columns of your data frame are numeric before coercing it to a matrix, or exclude the non-numeric ones.
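
A minimal sketch of excluding non-numeric columns before coercing, in base R. The data frame mixed and its column names are hypothetical, chosen only to illustrate the point:

```r
# Hypothetical mixed-type data frame; the column names are illustrative only.
mixed <- data.frame(
  x     = c(1, 2, 3, 4),
  y     = c(10, 20, 30, 40),
  label = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Coercing the whole data frame would give a character matrix,
# so mean() could not be applied to it.
is.character(as.matrix(mixed))   # TRUE

# Keep only the numeric columns before coercing.
is_num       <- vapply(mixed, is.numeric, logical(1))
numeric_part <- mixed[, is_num, drop = FALSE]

overall_mean   <- mean(as.matrix(numeric_part))     # 13.75
overall_median <- median(as.matrix(numeric_part))   # 7
```

Using drop = FALSE keeps the result a data frame even when only one numeric column remains.
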

云柯 2025-01-15 06:03:30

You can install the dplyr library via install.packages('dplyr') and then use:

library(dplyr)

dataframe.mean <- dataframe %>%
  summarise_all(mean) # swap in median for column medians
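
Note that this approach yields one value per column, not a single number for the whole data frame. A hedged sketch of the difference, assuming dplyr 1.0.0 or later is installed (for across()) and using a made-up data frame named dataframe:

```r
# Assumes the dplyr package is installed; `dataframe` is made-up example data.
library(dplyr)

dataframe <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# One-row data frame of per-column means; across() is the newer
# spelling of summarise_all() in dplyr >= 1.0.0.
col_means <- dataframe %>%
  summarise(across(everything(), mean))   # a = 2, b = 5

# Collapse every value into one vector to get a single overall mean.
grand_mean <- mean(unlist(dataframe))     # 3.5
```
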