Can I aggregate a data frame in R and keep string variables?

Posted on 2024-12-12 03:25:54

I have a data frame of the form:

  Family Code Length Type
1      A    1     11 Alpha
2      A    3      8 Beta
3      A    3      9 Beta
4      B    4      7 Alpha
5      B    5      8 Alpha
6      C    6      2 Beta
7      C    6      5 Beta
8      C    6      4 Beta

I would like to reduce the data set to one containing unique values of Code by taking a mean of Length values, but to retain all string variables too, i.e.

  Family Code Length Type
1      A    1     11 Alpha
2      A    3    8.5 Beta
3      B    4      7 Alpha
5      B    5      8 Alpha
6      C    6   3.67 Beta

I've tried aggregate() and ddply() but these seem to replace strings with NA and I'm struggling to find a way round this.

临风闻羌笛 2024-12-19 03:25:54

Since Family and Type are constant within a Code group, you can group on them as well when you use ddply, and the result does not change. If your original data set is dat, then

library(plyr)  # provides ddply() and the .() quoting helper
ddply(dat, .(Family, Code, Type), summarize, Length=mean(Length))

gives

  Family Code  Type    Length
1      A    1 Alpha 11.000000
2      A    3  Beta  8.500000
3      B    4 Alpha  7.000000
4      B    5 Alpha  8.000000
5      C    6  Beta  3.666667
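The same idea can be sketched in base R without plyr. The construction of dat below is an assumption (the question only shows the printed data frame), and the variable name res is just for illustration. The NA values mentioned in the question most likely come from applying mean() to the character columns; putting Family and Type on the grouping side of the formula avoids that, because only Length is averaged.

```r
# Rebuild the example data from the question (this construction is an
# assumption; the original post only shows the printed data frame).
dat <- data.frame(
  Family = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Code   = c(1, 3, 3, 4, 5, 6, 6, 6),
  Length = c(11, 8, 9, 7, 8, 2, 5, 4),
  Type   = c("Alpha", "Beta", "Beta", "Alpha", "Alpha", "Beta", "Beta", "Beta"),
  stringsAsFactors = FALSE
)

# Average Length within each Family/Code/Type combination. The string
# columns are grouping variables, so they are carried through unchanged.
res <- aggregate(Length ~ Family + Code + Type, data = dat, FUN = mean)

# Sort by Code explicitly rather than relying on aggregate()'s row order.
res <- res[order(res$Code), ]
rownames(res) <- NULL

print(res)
```

This only works because Family and Type never vary inside a Code group; otherwise the grouping would split a Code across several rows.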

If Family and Type are not constant within a Code group, then you would need to define how to summarize/aggregate those values. In this example, I just take the single unique value:

ddply(dat, .(Code), summarize, Family=unique(Family), 
  Length=mean(Length), Type=unique(Type))
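When a string column really does vary within a Code group, unique() returns several values and a rule has to be chosen. The sketch below shows one possible rule in base R: joining the distinct values into a single string. The data frame dat2 and the helper collapse_unique are hypothetical, made up for illustration, and are not part of the original answer.

```r
# Hypothetical data in which Type is NOT constant within Code 1.
dat2 <- data.frame(
  Code   = c(1, 1, 2),
  Length = c(10, 20, 5),
  Type   = c("Alpha", "Beta", "Beta"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the distinct values of x into one string.
collapse_unique <- function(x) paste(sort(unique(x)), collapse = "/")

# Split by Code, summarise each piece to exactly one row, and bind the rows.
res2 <- do.call(rbind, lapply(split(dat2, dat2$Code), function(g) {
  data.frame(
    Code   = g$Code[1],
    Length = mean(g$Length),
    Type   = collapse_unique(g$Type),
    stringsAsFactors = FALSE
  )
}))
rownames(res2) <- NULL

print(res2)
```

Other rules (first value, most frequent value) are equally valid; the point is that each summary must return exactly one value per group.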

Update

Similar options using dplyr are

 library(dplyr)
 dat %>% 
     group_by(Family, Code, Type) %>%
     summarise(Length=mean(Length))

and

  dat %>%
     group_by(Code) %>%
     summarise(Family=unique(Family), Length=mean(Length), Type=unique(Type))
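A note on the second dplyr form: as far as I am aware, dplyr 1.1.0 and later warn when a summarise() expression returns more than one row per group, which unique() would do if a column varied within Code. A minimal sketch that always yields one row per Code uses first() instead. The construction of dat is an assumption based on the table in the question, and picking the first value is only one possible rule.

```r
library(dplyr)

# Rebuild the example data from the question (construction is an assumption).
dat <- data.frame(
  Family = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Code   = c(1, 3, 3, 4, 5, 6, 6, 6),
  Length = c(11, 8, 9, 7, 8, 2, 5, 4),
  Type   = c("Alpha", "Beta", "Beta", "Alpha", "Alpha", "Beta", "Beta", "Beta"),
  stringsAsFactors = FALSE
)

# One row per Code: average Length, keep the first Family and Type seen.
res <- dat %>%
  group_by(Code) %>%
  summarise(
    Family  = first(Family),
    Length  = mean(Length),
    Type    = first(Type),
    .groups = "drop"
  )

print(res)
```

With the example data, first() and unique() agree because Family and Type are constant within each Code.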
