R loop: add a column to a table if it does not already exist

Posted on 2024-11-30 13:32:39

I am trying to compile data from several files using for loops in R. I would like to get all the data into one table. The following calculation is just an example.

library(reshape)

dat1 <- data.frame("Specimen" = paste("sp", 1:10, sep=""), "Density_1" = rnorm(10,4,2), "Density_2" = rnorm(10,4,2), "Density_3" = rnorm(10,4,2))
dat2 <- data.frame("Specimen" = paste("fg", 1:10, sep=""), "Density_1" = rnorm(10,4,2), "Density_2" = rnorm(10,4,2))

dat <- c("dat1", "dat2")
for(i in seq_along(dat)){
  data <- get(dat[i])
  melt.data <- melt(data, id = 1)
  assign(paste(dat[i], "tbl", sep=""), cast(melt.data, ~ variable, mean))
}

rbind(dat1tbl, dat2tbl)

What is the smoothest way to add an extra column to dat2? I would like it to get the same column name ("Density_3" in this case) and be filled with zeros if it does not already exist. Assume that I have ~100 tables, with the number of columns (Density_1, 2, 3, etc.) varying between 5 and 6.

I tried the following, but it didn't work:

if(names(data) %in% "Density_3" == FALSE){
dat.all$Density_3 <- 0
} else {
dat.all$Density_3 <- dat.all$Density3}

Another question: is there a smooth way to rbind() the tables? It seems that rbind(get(dat)) does not work.

若无相欠,怎会相见 2024-12-07 13:32:39

After staring at this question for a while, I think its intent may have been obscured by the unnecessary get and assign manipulations. I think the answer is plyr::rbind.fill.

I would have constructed "dat" not as a character vector but as a list of the two data frames, used aggregate(..., FUN=mean) (because I haven't gotten on the reshape2/plyr bus, except for melt and rbind.fill, that is), and then called do.call(rbind.fill, ...) on the resulting list. At any rate, this is what I think you want. I do not think it is a good idea to add zeros for what are really missing values.

> rbind.fill(dat1tbl, dat2tbl)
  value Density_1 Density_2 Density_3
1 (all)  5.006709  4.088988  2.958971
2 (all)  4.178586  3.812362        NA
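
The list-based approach described above can be sketched in base R as follows. This is only an illustration under stated assumptions: fill_bind is a hypothetical helper standing in for plyr::rbind.fill so that the sketch runs without extra packages, colMeans is swapped in for the aggregate(..., FUN=mean) step, and set.seed is there only to make the run reproducible.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility

dat1 <- data.frame(Specimen  = paste0("sp", 1:10),
                   Density_1 = rnorm(10, 4, 2),
                   Density_2 = rnorm(10, 4, 2),
                   Density_3 = rnorm(10, 4, 2))
dat2 <- data.frame(Specimen  = paste0("fg", 1:10),
                   Density_1 = rnorm(10, 4, 2),
                   Density_2 = rnorm(10, 4, 2))

# Keep the tables in a list instead of looking them up by name with get().
tables <- list(dat1 = dat1, dat2 = dat2)

# One-row data frame of column means per table (the Specimen column is dropped).
col_means <- lapply(tables, function(d) {
  as.data.frame(t(colMeans(d[, -1, drop = FALSE])))
})

# Hypothetical helper: pad every data frame to the union of all column
# names, then row-bind. Missing columns get `fill` (NA by default).
fill_bind <- function(dfs, fill = NA) {
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- fill
    }
    d[all_cols]  # same column order for every table
  })
  do.call(rbind, padded)
}

result <- fill_bind(col_means)
print(result)
```

Calling fill_bind(col_means, fill = 0) would give the zero-filled table the question asks for, although NA keeps the distinction between "measured as zero" and "not measured". With plyr installed, plyr::rbind.fill(col_means) should produce the same columns with NA in the gaps.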


黎夕旧梦 2024-12-07 13:32:39

This is an old post, but in any case: I believe the code you mention above would have worked if you had switched the order:

if("Density_3" %in% names(data) == FALSE){
  # column is missing: create it and fill it with zeros
  dat.all$Density_3 <- 0
} else {
  # column already exists: keep it as it is
  dat.all$Density_3 <- dat.all$Density_3
}

As you have it, the part names(data) %in% "Density_3" == FALSE gives you a vector of TRUE/FALSE values (one for each column), while what if() needs is a single value for that specific column. So you need to ask whether that column name is present among the names of the data frame, and not the opposite.
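
The difference between the two orderings can be seen in a small sketch. The data frame below is made up for illustration and is not taken from the question.

```r
# Hypothetical two-row table without a Density_3 column.
data <- data.frame(Specimen  = c("fg1", "fg2"),
                   Density_1 = c(4.1, 3.9),
                   Density_2 = c(4.4, 3.6))

# One logical value per column name: not usable as an if() condition.
per_column <- names(data) %in% "Density_3"

# A single logical value: is the name present at all?
has_col <- "Density_3" %in% names(data)

# Add the zero-filled column only when it is missing.
if (!has_col) {
  data$Density_3 <- 0
}

print(length(per_column))
print(has_col)
print(names(data))
```

Here per_column has as many elements as the table has columns, while has_col is a single value, which is what the if() condition requires.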
