Function returning NA by group

Posted on 2025-01-12 20:54:45

I have the following data frame, and I am trying to write a function for it:

df<- structure(list(BLG = c(37.037037037037, 12.0603015075377, 93.5593220338983, 
3.96563119629874, 77.634011090573, 71.608040201005, 3.96563119629874, 
119.775421085465, 44.8765893792072), GSF = c(0, 0, 0, 0, 11.090573012939, 
0, 0, 0, 0), LMB = c(66.6666666666667, 24.1206030150754, 40.6779661016949, 
31.7250495703899, 73.9371534195933, 67.8391959798995, 31.7250495703899, 
22.4578914535246, 31.413612565445), YLB = c(0, 0, 0, 0, 14.7874306839187, 
0, 0, 0, 0), BLC = c(3.7037037037037, 0, 4.06779661016949, 7.93126239259749, 
7.39371534195933, 11.3065326633166, 7.93126239259749, 3.74298190892077, 
22.4382946896036), WHC = c(7.40740740740741, 0, 0, 0, 0, 0, 0, 
7.48596381784155, 4.48765893792072), RSF = c(0, 0, 0, 0, 0, 0, 
0, 0, 4.48765893792072), CCF = c(3.7037037037037, 0, 8.13559322033898, 
0, 0, 0, 0, 0, 0), BLB = c(0, 0, 0, 0, 0, 0, 0, 0, 0), group = c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)), row.names = c(NA, -9L), class = c("data.table", 
"data.frame"))

Function

p_true<- c(83, 10, 47, 8, 9, 6, 12, 5, 8) #true value for each column 

estimate2 = function(df) {
  
  y_est2 = df
  
  sqrt(mean((y_est2-p_true)^2))/p_true*100
}


final<- df %>%
  group_by(group) %>%
  group_modify(~ as.data.frame.list(estimate2(.)))

The final output should be a 3x9 data frame: one value for each column per group. I can get the intended output format with plyr::ddply(df, .(group), estimate2).

Even when I skip the grouping and call estimate2(df) directly (with the group column removed), I still get the warning "argument is not numeric or logical: returning NA".

I'm not sure why, though, because I've run very similar functions that differ only slightly in the equation inside, and they work fine.

Does anyone know where I'm going wrong?

Comments (1)

清风无影 2025-01-19 20:54:45

The problem is the call to mean(). Its help page (?mean) says:

x
An R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for trim = 0, only.

But you are passing it a data frame (the three rows of each group), which is none of these, so mean() returns NA.
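As a small illustration of that point, here is a sketch with made-up toy values (not the data from the question). It shows mean() returning NA for a data frame, and two ways of getting a number back: unlist() for one overall mean, or colMeans() for one mean per column.

```r
# Toy data frame (hypothetical values, not the poster's data)
toy <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# mean() has no data frame method, so this warns and returns NA;
# the warning is suppressed here only to keep the output quiet
m_df <- suppressWarnings(mean(toy))

# Flattening to a numeric vector first gives a single overall mean
m_all <- mean(unlist(toy))

# colMeans() gives one mean per column instead
m_cols <- colMeans(toy)

print(m_df)
print(m_all)
print(m_cols)
```

Which of the two is appropriate depends on whether one value per group or one value per column is wanted.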

I'm not entirely sure whether the following is what you want, but you can unlist the data frame so that it becomes a plain numeric vector. mean() then returns a single value per group, and dividing that value by p_true recycles it across the nine entries of p_true. You can then turn the result back into a one-row data frame:

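library(dplyr)

p_true <- c(83, 10, 47, 8, 9, 6, 12, 5, 8)  # true value for each column

estimate2 <- function(df) {
  y_est2 <- df

  # unlist() flattens the squared errors into one numeric vector, so mean()
  # returns a single value; dividing by p_true then yields nine values
  return_df <- as.data.frame(t(sqrt(mean(unlist((y_est2 - p_true)^2))) / p_true * 100))
  names(return_df) <- names(y_est2)
  return(return_df)
}

final <- df %>%
  group_by(group) %>%
  group_modify(~ as.data.frame.list(estimate2(.)))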

This returns:

# A tibble: 3 x 10
# Groups:   group [3]
  group   BLG   GSF   LMB   YLB   BLC   WHC   RSF   CCF   BLB
  <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1  38.7  321.  68.3  401.  357.  535.  268.  642.  401.
2     2  45.9  381.  81.1  477.  424.  635.  318.  763.  477.
3     3  45.6  378.  80.4  473.  420.  630.  315.  756.  473.
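
Two things are worth checking in that result. Every value in a row comes from the same single RMSE for the group, just divided by a different entry of p_true. Also, subtracting a plain vector from a data frame recycles the vector down the columns rather than matching one entry to each column. If what you actually want is one relative RMSE per column, each measured against that column's own true value, a base R sketch could look like the following. The toy data, truth and rel_rmse are made-up names and values for illustration only, and that this is the intended metric is an assumption on my part.

```r
# Toy data (hypothetical values, not the data from the question)
toy <- data.frame(
  a     = c(1, 3, 2, 4),
  b     = c(10, 10, 8, 12),
  group = c(1L, 1L, 2L, 2L)
)
truth <- c(a = 2, b = 10)  # assumed "true" value for each column

# Pitfall: a vector subtracted from a data frame is recycled down the columns,
# so column a has 2, 10, 2, 10 subtracted rather than 2 in every row
recycled <- toy[c("a", "b")] - truth

# sweep() over margin 2 subtracts truth[j] from every row of column j
matched <- sweep(as.matrix(toy[c("a", "b")]), 2, truth)

# Relative RMSE (in percent) of each column against its own true value
rel_rmse <- function(d, truth) {
  cols <- names(truth)
  out <- mapply(
    function(x, p) sqrt(mean((x - p)^2)) / p * 100,
    d[cols], truth
  )
  as.data.frame(as.list(out))  # one row, one column per measured variable
}

# Apply per group and stack the one-row results (row names are the groups)
pieces <- lapply(split(toy, toy$group), rel_rmse, truth = truth)
result <- do.call(rbind, pieces)
print(result)
```

With dplyr, the same rel_rmse helper could be passed to group_modify() in place of estimate2, since it already returns a one-row data frame.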