R: tidy - merge only duplicate columns in a tibble

Posted 2025-02-13 04:32:02 · 2144 words · 3 views · 0 comments

Some columns of my tibble have been split into two columns, and I would like to merge them back together. The duplicated columns have the same name in the file, so read_delim() appends "...2" and "...3" to make the column names unique. There shouldn't be two numeric values in the same row of a duplicated pair, but it would be nice if the code could handle this exception (taking the mean of both would be fine). It frequently happens that both duplicated columns contain NA in the same row. Some columns occur only once (like Date&Time, PYRANO#1, ...). "Date&Time" is the only consistent column without NAs.

The data looks like this:

head(df)

# A tibble: 6 × 10
  `Date&Time`         `SNOWDEPTH#1#HS...2` `SNOWDEPTH#1#HS...3` `PYRANO#1#RSWR…`
  <dttm>                             <dbl>                <dbl>            <dbl>
1 1997-11-19 16:30:00                    0                   NA               NA
2 1997-11-19 17:00:00                   NA                   10               NA
3 1997-11-19 17:30:00                    9                   NA               NA
4 1997-11-19 18:00:00                   NA                   NA               NA
5 1997-11-19 18:30:00                    9                   NA               NA
6 1997-11-19 19:00:00                    9                   NA               NA

# with 6 more variables: `MODEL_SNOWPACK#1#SWE` <dbl>,
#   `THERMO_HYGRO#1#TA_30MIN_MEAN...6` <dbl>,
#   `THERMO_HYGRO#1#TA_30MIN_MEAN...7` <dbl>,
#   `IRTHERMO#1#TSS_30MIN_MEAN...8` <dbl>,
#   `IRTHERMO#1#TSS_30MIN_MEAN...9` <dbl>,
#   `SNOWTHERMO#1#TS0_30MIN_MEAN` <dbl>

I would like to use a for-loop to loop through many of these files, but unfortunately the duplicate columns aren't always the same. Ideally the code should find duplicate columns and merge them automatically.

What I have tried so far:

substr(colnames(df), 1, 7)

[1] "Date&Ti" "SNOWDEP" "SNOWDEP" "PYRANO#" "MODEL_S" "THERMO_" "THERMO_"
[8] "IRTHERM" "IRTHERM" "SNOWTHE"

df %>% 
      group_by(., substr(colnames(.), 1, 7), na.rm=TRUE) %>% 
      summarise_all()

Error in group_by():
! Problem adding computed columns.
Caused by error in mutate():
! Problem while computing ..1 = substr(colnames(.), 1, 7).
✖ ..1 must be size 407400 or 1, not 10.
Run rlang::last_error() to see where the error occurred.

Thanks a lot for your help!


娇柔作态 2025-02-20 04:32:03

A possible solution is to pivot the data into long format, strip the "...2", "...3" (etc.) suffixes from the column names, and then pivot back to wide format, using a function to aggregate the values that now share a name:

library(tidyverse)

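df |>
  pivot_longer(-date) |>
  # drop the "...<position>" suffix; \\d+ also covers positions 10 and above
  mutate(name = str_remove(name, "\\.\\.\\.\\d+$")) |>
  # duplicated names now collide, so average the non-missing values
  pivot_wider(values_fn = ~ mean(., na.rm = TRUE))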

Output:

# A tibble: 6 × 2
   date `SNOWDEPTH#1#HS`
  <int>            <dbl>
1     1                0
2     2               10
3     3                9
4     4              NaN
5     5                9
6     6                9
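
Row 4 is NaN because mean() with na.rm = TRUE is applied to a pair in which both values are NA, leaving nothing to average. If NA is preferred in that case, a small helper can be passed as values_fn instead. This is only a minimal sketch: `mean_or_na` is a hypothetical helper name, and the pattern `\\.\\.\\.\\d+$` assumes the suffixes look like the "...2" and "...3" shown in the question.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Same toy data as used in this answer
df <- tibble(date = 1:6,
             `SNOWDEPTH#1#HS...2` = c(0, NA, 9, NA, 9, 9),
             `SNOWDEPTH#1#HS...3` = c(NA, 10, NA, NA, NA, NA))

# Hypothetical helper: mean of the non-missing values, NA if there are none
mean_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

merged <- df |>
  pivot_longer(-date) |>
  mutate(name = str_remove(name, "\\.\\.\\.\\d+$")) |>
  pivot_wider(values_fn = mean_or_na)
```

With this helper, the fourth row of `SNOWDEPTH#1#HS` stays NA rather than becoming NaN, while rows holding one or two values are averaged as before.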

And some data:

df <- tibble(date = 1:6,
             `SNOWDEPTH#1#HS...2` = c(0, NA, 9, NA, 9, 9),
             `SNOWDEPTH#1#HS...3` = c(NA, 10, NA, NA, NA, NA))
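
For the layout in the question (a `Date&Time` key column, some columns duplicated, some occurring once), a column-wise variant that avoids pivoting may also be worth trying, since it can be wrapped in a function and called inside a loop over files. The sketch below is an assumption-laden illustration, not a tested solution for the real files: `merge_duplicate_columns` is a hypothetical function name, the toy values are made up, and it assumes every non-key column is numeric and that duplicates differ only by a trailing "...<number>".

```r
library(tibble)

# Toy data mimicking the question's layout (values invented for illustration)
df <- tibble(
  `Date&Time`          = as.POSIXct("1997-11-19 16:30:00", tz = "UTC") + 1800 * 0:3,
  `SNOWDEPTH#1#HS...2` = c(0, NA, 9, NA),
  `SNOWDEPTH#1#HS...3` = c(NA, 10, 11, NA),
  `PYRANO#1#RSWR`      = c(NA, 1, NA, NA)
)

# Hypothetical helper: merge columns that share a name once the suffix is removed
merge_duplicate_columns <- function(data, key = "Date&Time") {
  # Strip the "...<position>" suffix to recover the base names
  base_names <- sub("\\.\\.\\.\\d+$", "", names(data))
  is_value <- base_names != key

  # Group the value columns by base name, keeping their original order
  f <- factor(base_names[is_value], levels = unique(base_names[is_value]))
  groups <- split.default(data[is_value], f)

  # Row-wise mean within each group; rows with no values become NA, not NaN
  merged_cols <- lapply(groups, function(g) {
    m <- rowMeans(g, na.rm = TRUE)
    m[is.nan(m)] <- NA_real_
    unname(m)
  })

  as_tibble(c(setNames(list(data[[key]]), key), merged_cols))
}

merged <- merge_duplicate_columns(df)
```

Columns that occur only once form a group of one and pass through unchanged, and a row holding two values (9 and 11 in the toy data) is averaged.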
