Merge rows that share the same observation in a selected column

Posted on 2025-02-10 12:16:04 · 847 words · 2 views · 0 comments

I'm cleaning a data set and, after removing duplicates, I would like to merge the rows that share the same value in a specific column (e.g. the ID column).

I want to merge/aggregate so that only one row is left per chosen value (here: one row per ID).
If possible, the aggregated row should sum all the other columns and leave the column being merged on (ID) as it is.

Here is a hypothetical setup:

    set.seed(18)
    dat <- data.frame(ID=c(1,2,1,2,2,3),value=c(5,5,7,8,3,2),location=c("NY","LA","NY","LA","LA","LA"))
    dat

I would like to know how to obtain:

    set.seed(9)
    dat1 <- data.frame(id=c(1,2,3),value=c(5+7,5+8+3,2),location=c("NY","LA","LA"))
    dat1

That is, aggregate with respect to ID, sum the "value" observations, and keep the corresponding location.

I would also like to know whether it is possible to group the data frame by location, so as to obtain:

    set.seed(6)
    dat2 <- data.frame(location=c("NY","LA"),value=c(5+7,5+8+3+2),meanvalue=c(mean(5+7),mean(5+8+3+2)))
    dat2

I did not put ID in this table because it does not matter here: it can be summed or dropped, since it will not be used in any further computation.
I know that my meanvalue output is wrong: I am looking for the mean of all rows sharing the same location (i.e. the mean for LA and the mean for NY). I would appreciate it if you could also correct me on this one.

Thank you for your help!


季末如歌 2025-02-17 12:16:04

I see that you included set.seed but did not see any sampling or randomized procedures (unless I missed something).
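To illustrate the point about set.seed, here is a minimal sketch. The shortened data frames `a` and `b` are made up for this example and are not taken from the question. Building a data frame from literal values gives the same result whatever the seed is; the seed only matters once random numbers are drawn.

```r
# Two data frames built from the same literal values under different seeds
set.seed(18)
a <- data.frame(ID = c(1, 2, 1), value = c(5, 5, 7))
set.seed(9)
b <- data.frame(ID = c(1, 2, 1), value = c(5, 5, 7))
identical(a, b)  # TRUE: no random numbers were involved

# The seed becomes relevant only when something is sampled
set.seed(18)
x <- sample(10)
set.seed(18)
y <- sample(10)
identical(x, y)  # TRUE: same seed, same random permutation
```

So the set.seed calls in the question are harmless but can safely be dropped.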

One approach with tidyverse is the following. Let me know if this is what you had in mind.

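For the first part, group by ID and location with group_by, then sum value within each group with summarise:

    library(tidyverse)

    dat %>%
      group_by(ID, location) %>%                            # one group per ID/location pair
      summarise(sum_value = sum(value), .groups = "drop")   # sum within each group, return an ungrouped tibble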

Output:

         ID location sum_value
      <dbl> <chr>        <dbl>
    1     1 NY              12
    2     2 LA              16
    3     3 LA               2
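If you would rather not load tidyverse, a base R alternative is sketched below. It is not part of the approach above, only an assumed equivalent using `aggregate()` from the stats package; the name `by_id` is chosen here for illustration. It assumes, as in the example data, that each ID maps to a single location.

```r
# Same toy data as in the question
dat <- data.frame(
  ID = c(1, 2, 1, 2, 2, 3),
  value = c(5, 5, 7, 8, 3, 2),
  location = c("NY", "LA", "NY", "LA", "LA", "LA")
)

# Sum `value` within each ID/location combination
by_id <- aggregate(value ~ ID + location, data = dat, FUN = sum)

# Sort explicitly by ID and put the columns in the order the question asked for
by_id <- by_id[order(by_id$ID), c("ID", "value", "location")]
rownames(by_id) <- NULL
by_id
```

Grouping on both ID and location keeps the location column in the result without having to decide how to "summarise" a text column.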

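For the second part, if you group_by location only, you can compute both the sum and the mean in a single summarise call:

    dat %>%
      group_by(location) %>%                     # one group per location
      summarise(sum_value = sum(value),          # total of value per location
                mean_value = mean(value))        # mean of value per location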

Output:

      location sum_value mean_value
      <chr>        <dbl>      <dbl>
    1 LA              18        4.5
    2 NY              12        6
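As for why the meanvalue column in the question comes out wrong: `mean(5+7)` first adds the numbers and then takes the mean of the single value 12, so it just returns the sum. The mean has to be taken over a vector of the individual values. The sketch below shows this, followed by an assumed base R equivalent using `tapply()`; the names `sums`, `means` and `dat2` are chosen here for illustration.

```r
# Same toy data as in the question
dat <- data.frame(
  ID = c(1, 2, 1, 2, 2, 3),
  value = c(5, 5, 7, 8, 3, 2),
  location = c("NY", "LA", "NY", "LA", "LA", "LA")
)

mean(5 + 7)     # 12: the mean of one number is that number
mean(c(5, 7))   # 6: the mean of the two observations

# Per-location sum and mean; tapply() returns one entry per location
sums  <- tapply(dat$value, dat$location, sum)
means <- tapply(dat$value, dat$location, mean)

dat2 <- data.frame(
  location  = names(sums),
  value     = as.vector(sums),
  meanvalue = as.vector(means)
)
dat2
```

Both `tapply()` calls group by the same column, so their results come back in the same location order and can be combined directly.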

About the author

南七夏