cbind a data frame with an empty data frame - cbind.fill?

Posted on 2024-12-12 20:46:16

I think I'm looking for an analog of rbind.fill (in Hadley's plyr package) for cbind. I looked, but there is no cbind.fill.

What I want to do is the following:

#set these just for this example
one_option <- TRUE
diff_option <- TRUE

return_df <- data.frame()

if (one_option) {
    #do a bunch of calculations, produce a data.frame, for simplicity the following small_df
    small_df <- data.frame(a=1, b=2)
    return_df <- cbind(return_df,small_df)
}

if (diff_option) {
    #do a bunch of calculations, produce a data.frame, for simplicity the following small2_df
    small2_df <- data.frame(l="hi there", m=44)
    return_df <- cbind(return_df,small2_df)
}

return_df

Understandably, this produces an error:

Error in data.frame(..., check.names = FALSE) : 
arguments imply differing number of rows: 0, 1

My current fix is to replace the line return_df <- data.frame() with return_df <- data.frame(dummy=1) and then the code works. I then just remove dummy from the return_df at the end. After adding the dummy and running the above code, I get

      dummy a b        l  m
1     1 1 2 hi there 44

I then just need to get rid of the dummy, e.g.:

> return_df[,2:ncol(return_df)]
  a b        l  m
1 1 2 hi there 44

I'm sure I'm missing an easier way to do this.

edit: I guess I'm not looking for a cbind.fill because that would mean that an NA value would be created after the cbind, which is not what I want.

Comments (10)

守护在此方 2024-12-19 20:46:18

I suggest a modification of Tyler's answer. My function allows cbind-ing data.frames and/or matrices with vectors without losing column names, as happens in Tyler's solution:

cbind.fill <- function(...){
  nm <- list(...) 
  dfdetect <- grepl("data.frame|matrix", unlist(lapply(nm, function(cl) paste(class(cl), collapse = " ") )))
  # first cbind vectors together 
  vec <- data.frame(nm[!dfdetect])
  n <- max(sapply(nm[dfdetect], nrow)) 
  vec <- data.frame(lapply(vec, function(x) rep(x, n)))
  if (nrow(vec) > 0) nm <- c(nm[dfdetect], list(vec))
  nm <- lapply(nm, as.data.frame)

  do.call(cbind, lapply(nm, function (df1) 
    rbind(df1, as.data.frame(matrix(NA, ncol = ncol(df1), nrow = n-nrow(df1), dimnames = list(NULL, names(df1))))) )) 
}

cbind.fill(data.frame(idx = numeric()), matrix(0, ncol = 2), 
           data.frame(qwe = 1:3, rty = letters[1:3]), type = "GOOD", mark = "K-5")
#       idx V1 V2 qwe rty type mark
#     1  NA  0  0   1   a GOOD  K-5
#     2  NA NA NA   2   b GOOD  K-5
#     3  NA NA NA   3   c GOOD  K-5

凉城 2024-12-19 20:46:18

I just found a trick: when you want to add columns to an empty data frame, rbind the first one, then cbind the rest.

    newdf <- data.frame()
    # add the first column
    newdf <- rbind(newdf,data.frame("col1"=c("row1"=1,"row2"=2)))
    # add the second column
    newdf <- cbind(newdf,data.frame("col2"=c("row1"=3,"row2"=4)))
    # add more columns
    newdf <- cbind(newdf,data.frame("col3"=c("row1"=5,"row2"=6)))
    # result
    #     col1 col2 col3
    #row1    1    3    5
    #row2    2    4    6

I don't know why, but it works for me.
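
A possible explanation, offered as a sketch rather than a definitive account: the data frame method of rbind drops zero-column arguments before combining, so rbind(data.frame(), x) effectively hands back x, and every later cbind then sees matching row counts. The names empty, piece and combined below are made up for illustration.

```r
# Hypothetical names, chosen only for this illustration.
empty <- data.frame()                                # 0 rows, 0 columns
piece <- data.frame(col1 = c(row1 = 1, row2 = 2))

# rbind's data frame method ignores the zero-column argument,
# so the result carries the columns and rows of `piece`.
combined <- rbind(empty, piece)

# From here on, cbind works because the row counts agree.
combined <- cbind(combined, data.frame(col2 = c(3, 4)))

dim(empty)
names(combined)
nrow(combined)
```

In other words, the first rbind only serves to replace the empty starting value with a real data frame.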

究竟谁懂我的在乎 2024-12-19 20:46:18

We could add id column then use merge:

df1 <- mtcars[1:5, 1:2]
#                    mpg cyl
# Mazda RX4         21.0   6
# Mazda RX4 Wag     21.0   6
# Datsun 710        22.8   4
# Hornet 4 Drive    21.4   6
# Hornet Sportabout 18.7   8

df2 <- mtcars[6:7, 3:4]
#            disp  hp
# Valiant     225 105
# Duster 360  360 245

#Add id column then merge
df1$id <- seq(nrow(df1)) 
df2$id <- seq(nrow(df2)) 

merge(df1, df2, by = "id", all.x = TRUE, check.names = FALSE)
#   id  mpg cyl disp  hp
# 1  1 21.0   6  225 105
# 2  2 21.0   6  360 245
# 3  3 22.8   4   NA  NA
# 4  4 21.4   6   NA  NA
# 5  5 18.7   8   NA  NA

冷清清 2024-12-19 20:46:18

We can use a list instead of data.frame and convert it to a data.frame at the end. For instance:

df = list()
df2 = data.frame(col1 = 1:3, col2 = c('a','b','c'))
df = as.data.frame(cbind(df, as.matrix(df2)))
df

#   col1 col2
# 1    1    a
# 2    2    b
# 3    3    c
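
A closely related sketch, applied to the code in the question: collect each optional piece in a plain list and call cbind once at the end, so no empty data frame is ever bound. It assumes all pieces have the same number of rows; pieces is a hypothetical name that does not appear in the question.

```r
# Flags from the question, set just for this example.
one_option <- TRUE
diff_option <- TRUE

# Hypothetical accumulator: an unnamed list of data frames.
pieces <- list()

if (one_option) {
  small_df <- data.frame(a = 1, b = 2)
  pieces <- c(pieces, list(small_df))
}

if (diff_option) {
  small2_df <- data.frame(l = "hi there", m = 44)
  pieces <- c(pieces, list(small2_df))
}

# Bind everything in one call; fall back to an empty data frame
# when no option was switched on.
return_df <- if (length(pieces) > 0) do.call(cbind, pieces) else data.frame()
return_df
```

The list is left unnamed on purpose: with a named list, cbind would prefix the column names with the list names.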

错爱 2024-12-19 20:46:18

To cbind.fill named lists, where the names overlap and you wish to cbind by names, Tyler's answer may be modified to the following:

cbind.fill <- function(...){
    nm <- list(...) 
    nm <- lapply(nm, as.matrix)
    names <- unique(do.call(c, lapply(nm, rownames)))
    res <- matrix(nrow = length(names), ncol = length(nm))
    rownames(res) <- names
    for(i in 1:length(nm))
      res[rownames(nm[[i]]),i] <- nm[[i]][,1]
    res
}
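
A usage sketch for the name-based variant, under the assumption that the inputs are named numeric vectors (the answer itself shows no example). The function is restated as cbind_by_name, a hypothetical name, so that the snippet runs on its own; v1 and v2 are made-up inputs.

```r
# Same logic as the function above, restated under a hypothetical name.
cbind_by_name <- function(...) {
  nm <- lapply(list(...), as.matrix)               # named vector -> one-column matrix
  all_names <- unique(do.call(c, lapply(nm, rownames)))
  res <- matrix(nrow = length(all_names), ncol = length(nm))
  rownames(res) <- all_names
  for (i in seq_along(nm)) {
    # place each input in its own column, aligned by row name
    res[rownames(nm[[i]]), i] <- nm[[i]][, 1]
  }
  res
}

v1 <- c(a = 1, b = 2, c = 3)
v2 <- c(b = 20, d = 40)

res <- cbind_by_name(v1, v2)
res
```

Rows are the union of the names in order of first appearance, and a name missing from one input stays NA in that input's column.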

吹泡泡o 2024-12-19 20:46:17

While I think Tyler's solution is the most direct and the best one here, here is another way, using rbind.fill(), which we already have.

require(plyr) # requires plyr for rbind.fill()
cbind.fill <- function(...) {
  transposed <- lapply(list(...),t)
  transposed_dataframe <- lapply(transposed, as.data.frame)
  return (data.frame(t(rbind.fill(transposed_dataframe))))
}

潇烟暮雨 2024-12-19 20:46:17

Using rowr::cbind.fill

rowr::cbind.fill(df1,df2,fill = NA)
   A B
1  1 1
2  2 2
3  3 3
4  4 4
5  5 5
6 NA 6

鯉魚旗 2024-12-19 20:46:17

cbind.na from the qpcR package can do that.

    install.packages("qpcR")
    library(qpcR)
    qpcR:::cbind.na(1, 1:7)

逆夏时光 2024-12-19 20:46:17

When a and b are data frames, the following should work just fine:

ab <- merge(a, b, by="row.names", all=TRUE)[,-1]

or another possibility:

rows <- unique(c(rownames(a), rownames(b)))
ab <- cbind(a[rows ,], b[rows ,])
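
A small worked example of the merge variant, with made-up data frames a and b whose row names only partly overlap. It is a sketch under that assumption, not a general recipe.

```r
# Made-up inputs: `b` has no row "r3".
a <- data.frame(x = 1:3, u = c("p", "q", "r"),
                row.names = c("r1", "r2", "r3"))
b <- data.frame(y = c(10, 20),
                row.names = c("r1", "r2"))

# all = TRUE keeps rows found in either input; the first column of the
# result (the row names used as the key) is dropped by [, -1].
ab <- merge(a, b, by = "row.names", all = TRUE)[, -1]
ab
```

Note that merge sorts the result by the key and the original row names are not kept as row names of ab, so this fits best when row order and row names do not matter afterwards.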

够运 2024-12-19 20:46:16

Here's a cbind fill:

cbind.fill <- function(...){
    nm <- list(...) 
    nm <- lapply(nm, as.matrix)
    n <- max(sapply(nm, nrow)) 
    do.call(cbind, lapply(nm, function (x) 
        rbind(x, matrix(, n-nrow(x), ncol(x))))) 
}

Let's try it:

x<-matrix(1:10,5,2)
y<-matrix(1:16, 4,4)
z<-matrix(1:12, 2,6)

cbind.fill(x,y)
cbind.fill(x,y,z)
cbind.fill(mtcars, mtcars[1:10,])

I think I stole this from somewhere.

EDIT STOLE FROM HERE: LINK
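
One thing worth checking before relying on the matrix-based approach above: as.matrix() turns a data frame with mixed column types into a character matrix, so numeric columns come back as strings. The sketch below restates the same padding idea as cbind_fill_matrix (a hypothetical name, used so the snippet is self-contained) and shows both the resulting shape and the coercion.

```r
# Same padding idea as above, restated under a hypothetical name.
cbind_fill_matrix <- function(...) {
  parts <- lapply(list(...), as.matrix)
  n <- max(sapply(parts, nrow))                    # tallest input decides the row count
  do.call(cbind, lapply(parts, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))   # pad shorter inputs with NA rows
}

x <- matrix(1:10, 5, 2)
y <- matrix(1:16, 4, 4)

res <- cbind_fill_matrix(x, y)
dim(res)             # 5 rows, 2 + 4 columns
res[5, 3:6]          # the padded row of y is all NA

# A data frame with a numeric and a character column becomes character.
mixed <- data.frame(id = 1:2, label = c("a", "b"))
out <- cbind_fill_matrix(mixed, x)
is.character(out)
```

If column types need to survive, a data-frame-based variant such as the one in the first comment may be a better fit.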
