R - rbind problem -> duplicated columns with the same name and format

Posted 2025-01-16 11:25:42

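Recently I ran into a problem when using rbind in RStudio to append one data table to another. Both tables have two columns with exactly the same names and formats (I checked this with str()). But when I bind them (code: table3<-rbind(table1, table2,fill=T)), one of two things happens: either the columns are duplicated, so that table3 contains two columns with exactly the same name (the first holding the entries for all rows from table1, the second those for all rows from table2), or the column name appears only once but every entry in the rows from table2 is NA. Both are very annoying, and the problem is new: I used exactly the same code before and it worked fine. I am using R version 4.1.1. Am I overlooking something, or might there be a bug in this version?

Thank you very much for your help.

Table 1 looks like this: 1. Table 2 looks like this: 2.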

甜点 2025-01-23 11:25:42

Try using merge(), e.g.:

table3 <- merge(table1,table2, by.x = 'names of table1', by.y = 'names of table2')

不知在何时 2025-01-23 11:25:42

Check to ensure that the encoding is the same on both.

Reproducible examples.

Encoding on the column names

library(data.table)
quux1 <- quux2 <- data.table(Rechtsträger="a")
rbind(quux1, quux2)
#    Rechtsträger
#          <char>
# 1:            a
# 2:            a
Encoding(names(quux1))
# [1] "unknown"
Encoding(names(quux1)) <- "latin1"
rbind(quux1, quux2)
# Error in rbindlist(l, use.names, fill, idcol) : 
#   Column 1 ['Rechtsträger'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.
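
One way to see why two names that print identically can still fail to match is to look at their bytes. The following is a minimal base-R sketch, not the asker's data: the variables `name_utf8` and `name_latin1` are made up for illustration, and the `\u00e4` escape plus `iconv()` are used so that the result should not depend on the session's locale.

```r
# Hypothetical example: the same visible name stored in two encodings.
name_utf8 <- "Rechtstr\u00e4ger"   # \u escape yields a UTF-8 marked string
name_latin1 <- iconv(name_utf8, from = "UTF-8", to = "latin1")

Encoding(name_utf8)
# [1] "UTF-8"
Encoding(name_latin1)
# [1] "latin1"

# Base R translates before comparing, so the two look equal here:
name_utf8 == name_latin1
# [1] TRUE

# The underlying bytes differ: a-umlaut takes two bytes in UTF-8, one in latin1.
length(charToRaw(name_utf8))
# [1] 13
length(charToRaw(name_latin1))
# [1] 12
```

This may explain why str() shows nothing unusual: the difference is only visible in the declared encoding and the raw bytes.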

One fix? Copy the encoding:

Encoding(names(quux2)) <- Encoding(names(quux1))
rbind(quux1, quux2)
#    Rechtsträger
#          <char>
# 1:            a
# 2:            a
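
Note that assigning with `Encoding<-` only changes what R believes the bytes mean; it does not convert them. A possibly more robust variant, sketched here under the assumption that each name's declared encoding is actually correct, is to convert all names to UTF-8 with `enc2utf8()`. The example uses plain data frames and made-up objects (`df1`, `df2`) so that it runs in base R; for a data.table, `setnames(dt, enc2utf8(names(dt)))` should be the analogous call.

```r
# Hypothetical tables whose single column name differs only in encoding.
nm_utf8 <- "Rechtstr\u00e4ger"
nm_latin1 <- iconv(nm_utf8, from = "UTF-8", to = "latin1")

df1 <- data.frame(x = "a")
df2 <- data.frame(x = "a")
names(df1) <- nm_latin1
names(df2) <- nm_utf8

# Convert (not just re-label) both sets of names to UTF-8.
names(df1) <- enc2utf8(names(df1))
names(df2) <- enc2utf8(names(df2))

# The names are now byte-for-byte identical.
identical(charToRaw(names(df1)), charToRaw(names(df2)))
# [1] TRUE

table3 <- rbind(df1, df2)
```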

Encoding on the contents

I don't think this is the problem, as it is producing a different (though similar) error.

vec <- "Rechtsträger"
quux1 <- data.table(Rechtsträger=vec)
Encoding(vec)
# [1] "latin1"
Encoding(vec) <- "UTF-8"
Encoding(vec)
# [1] "UTF-8"
quux2 <- data.table(Rechtsträger=vec)
rbind(quux1, quux2)
# Error in nchar(x) : invalid multibyte string, element 2

Similar fix:

Encoding(quux2[[1]]) <- Encoding(quux1[[1]])
rbind(quux1, quux2)
#    Rechtsträger
#          <char>
# 1: Rechtsträger
# 2: Rechtsträger
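
If the contents rather than the names are affected, the same normalization idea can be applied to every character column before binding. This is only a sketch with a hypothetical helper, `to_utf8()`, shown on base-R data frames; it assumes the declared encoding of each string is correct, since `enc2utf8()` converts according to that mark.

```r
# Hypothetical helper: convert all character columns of a data frame to UTF-8.
to_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  if (any(is_chr)) {
    df[is_chr] <- lapply(df[is_chr], enc2utf8)
  }
  df
}

# Made-up inputs: the same value, once marked latin1 and once UTF-8.
vec_utf8 <- "Rechtstr\u00e4ger"
vec_latin1 <- iconv(vec_utf8, from = "UTF-8", to = "latin1")
df1 <- data.frame(name = vec_latin1, stringsAsFactors = FALSE)
df2 <- data.frame(name = vec_utf8, stringsAsFactors = FALSE)

combined <- rbind(to_utf8(df1), to_utf8(df2))
Encoding(combined$name)
# [1] "UTF-8" "UTF-8"
```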

Note to future self: my assumption that the output from dput is perfectly unambiguous turned out to be flawed, because it does not show the declared encoding of the names:

quux1 <- quux2 <- data.table(Rechtsträger="a")
Encoding(names(quux1)) <- "latin1"
dput(quux1)
# structure(list(Rechtsträger = "a"), row.names = c(NA, -1L), class = c("data.table", 
# "data.frame"), .internal.selfref = <pointer: 0x0000000004501ef0>)
dput(quux2)
# structure(list(Rechtsträger = "a"), row.names = c(NA, -1L), class = c("data.table", 
# "data.frame"), .internal.selfref = <pointer: 0x0000000004501ef0>)

identical(deparse1(quux1), deparse1(quux2))
# [1] TRUE
identical(Encoding(names(quux1)), Encoding(names(quux2)))
# [1] FALSE
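
Since dput does not reveal the difference, a small inspection helper can make it visible. The function below, `name_encodings()`, is hypothetical and written for this illustration only; it lists each column name with its declared encoding and its bytes in hex, and it works on anything that has `names()`, including a data.table.

```r
# Hypothetical helper: show declared encoding and raw bytes of each name.
name_encodings <- function(x) {
  nms <- names(x)
  data.frame(
    name = nms,
    encoding = Encoding(nms),
    bytes = vapply(nms, function(n) paste(charToRaw(n), collapse = " "),
                   character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Made-up tables whose names print the same but are encoded differently.
nm_utf8 <- "Rechtstr\u00e4ger"
nm_latin1 <- iconv(nm_utf8, from = "UTF-8", to = "latin1")
df1 <- data.frame(x = "a")
df2 <- data.frame(x = "a")
names(df1) <- nm_latin1
names(df2) <- nm_utf8

enc1 <- name_encodings(df1)
enc2 <- name_encodings(df2)
enc1$encoding
# [1] "latin1"
enc2$encoding
# [1] "UTF-8"
```

Comparing the `bytes` column of the two results side by side should show exactly where the names diverge.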
