R: losing column names when adding rows to an empty data frame
I am just starting with R and encountered a strange behaviour: when inserting the first row in an empty data frame, the original column names get lost.
example:
a<-data.frame(one = numeric(0), two = numeric(0))
a
#[1] one two
#<0 rows> (or 0-length row.names)
names(a)
#[1] "one" "two"
a<-rbind(a, c(5,6))
a
# X5 X6
#1 5 6
names(a)
#[1] "X5" "X6"
As you can see, the column names one and two were replaced by X5 and X6.
Could somebody please tell me why this happens, and whether there is a right way to do this without losing the column names?
A shotgun solution would be to save the names in an auxiliary vector and then add them back when finished working on the data frame.
Thanks
Context:
I created a function which gathers some data and adds them as a new row to a data frame received as a parameter.
I create the data frame, iterate through my data sources, passing the data.frame to each function call to be filled up with its results.
Comments (10)
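The rbind help page specifies that zero-length vectors are ignored, and that the data frame method first drops all zero-column and zero-row arguments. So, in fact, a is ignored in your rbind instruction. Not totally ignored, it seems: because a is a data frame, the call is still dispatched to rbind.data.frame, which then sees only the vector c(5,6) and builds the column names from its values.
Maybe one way to insert the row could be: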
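A minimal sketch of one such approach, assuming the row is assigned by index instead of being bound with rbind (the same idiom appears in another answer on this page). The data frame a is the one from the question:

```r
# Empty data frame from the question.
a <- data.frame(one = numeric(0), two = numeric(0))

# Assign the new row by index; the existing columns, and so their names, are kept.
a[nrow(a) + 1, ] <- c(5, 6)

names(a)
#[1] "one" "two"
```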
But there may be a better way to do it depending on your code.
I was almost surrendering to this issue.
1) Create the data frame with stringsAsFactors set to FALSE, or you run straight into the next issue.
2) Don't use rbind; no idea why on earth it is messing up the column names. Simply do it this way:
df[nrow(df)+1,] <- c("d","gsgsgd",4)
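A sketch of the two steps together, using a hypothetical three-column data frame df (the column names id, label and n are assumptions, not from the answer). Note that c("d","gsgsgd",4) coerces the 4 to a character string; passing a list instead, as assumed here, keeps each column's type:

```r
# Step 1: empty data frame with character columns kept as character.
df <- data.frame(id = character(0), label = character(0), n = numeric(0),
                 stringsAsFactors = FALSE)

# Step 2: append by index; a list keeps "n" numeric instead of coercing it.
df[nrow(df) + 1, ] <- list("d", "gsgsgd", 4)

str(df)
```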
A workaround would be to bind a data frame rather than a bare vector.
?rbind states that merging objects demands matching names: for data frames, columns are matched by name rather than by position.
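A minimal sketch of such a workaround, assuming the new row is wrapped in a one-row data frame whose names match those of a:

```r
a <- data.frame(one = numeric(0), two = numeric(0))

# The new row carries its own names, so rbind has something to match on.
a <- rbind(a, data.frame(one = 5, two = 6))

names(a)
#[1] "one" "two"
```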
FWIW, an alternative design might have your functions building vectors for the two columns, instead of rbinding to a data frame:
Modify the vectors in your functions:
Repeat as needed, then create your data.frame in one go:
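The three steps above could look like the following sketch. The vector names ones and twos and the helper gather are hypothetical, chosen here only for illustration:

```r
# Start with one empty vector per column.
ones <- numeric(0)
twos <- numeric(0)

# Hypothetical stand-in for one of the data-gathering functions.
gather <- function(i) c(i, i^2)

# Grow the vectors instead of the data frame.
for (i in 1:3) {
  res <- gather(i)
  ones <- c(ones, res[1])
  twos <- c(twos, res[2])
}

# Build the data frame once, with the names given explicitly.
a <- data.frame(one = ones, two = twos)
```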
One way to make this work generically and with the least amount of re-typing the column names is the following. This method doesn't require hacking the NA or 0.
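A sketch of how this might look, assuming each new row is a vector that is given the data frame's own names and then bound as a named list. The name calc and the columns i, square and cube are illustrative assumptions:

```r
rs <- data.frame(i = numeric(0), square = numeric(0), cube = numeric(0))

for (i in 1:4) {
  calc <- c(i, i^2, i^3)
  names(calc) <- names(rs)        # reuse the existing column names
  rs <- rbind(rs, as.list(calc))  # a named list is matched by name
}
```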
rs will have the correct names.
Another way to do this more cleanly is to use data.table.
Notice that a data.table is also a data.frame.
You can do this: give the initial data frame one row (of NAs), then add your new row and take out the NAs.
But watch out that your new row does not contain NAs, or it will be erased too.
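A minimal sketch of this idea, assuming the placeholder row is filled with NA and removed with na.omit (the names a and newrow are illustrative):

```r
# Start with one placeholder row of NAs so the data frame is not empty.
a <- data.frame(one = NA_real_, two = NA_real_)

newrow <- c(5, 6)

# Bind the new row, then drop every row that contains an NA.
a <- na.omit(rbind(a, newrow))

names(a)
#[1] "one" "two"
```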
Cheers
Agus
I use the following solution to add a row to an empty data frame:
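The snippet might look like the following sketch, assuming the new row is passed to rbind as a one-row data frame with stringsAsFactors = FALSE (which is how a later answer on this page describes it). The name d and the column names are hypothetical. Since R 4.0.0, FALSE is the default, so the argument only matters on older versions:

```r
# Define the empty data frame.
d <- data.frame(variable = character(0), before = numeric(0),
                stringsAsFactors = FALSE)

# Append the row as a data frame with the same column names.
d <- rbind(d, data.frame(variable = "test", before = 9,
                         stringsAsFactors = FALSE))
```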
HTH.
Kind regards
Georg
Instead of constructing the data.frame with numeric(0), I use as.numeric(0). This creates an extra initial row.
Bind the additional rows, then use negative indexing to remove the first (bogus) row.
Note: it messes up the index (far left). I haven't figured out how to prevent that (anyone else?), but most of the time it probably doesn't matter.
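A sketch of these steps, reusing the data frame a from the question. The last line is an addition that is not part of the answer: assigning NULL to rownames resets the row index to 1, 2, ...

```r
# as.numeric(0) is a single zero, so this data frame starts with one bogus row.
a <- data.frame(one = as.numeric(0), two = as.numeric(0))

# Bind the additional rows; names are kept because a is not empty.
a <- rbind(a, c(5, 6))
a <- rbind(a, c(7, 8))

# Remove the bogus first row with negative indexing.
a <- a[-1, ]

# Not in the original answer: reset the row index.
rownames(a) <- NULL
```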
Researching this venerable R annoyance brought me to this page. I wanted to add a bit more explanation to Georg's excellent answer (https://stackoverflow.com/a/41609844/2757825), which not only solves the problem raised by the OP (losing field names) but also prevents the unwanted conversion of all fields to factors. For me, those two problems go together. I wanted a solution in base R that doesn't involve writing extra code but preserves the two distinct operations: define the data frame, then append the row(s). That is what Georg's answer provides.
The first two examples below illustrate the problems and the third and fourth show Georg's solution.
Example 1: Append the new row as vector with rbind
Example 2: Append the new row as a data frame inside rbind
Example 3: Append the new row inside rbind as a data frame, with stringsAsFactors=FALSE
Example 4: Like example 3, but adding multiple rows at once.
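A compact sketch contrasting the problem case with the fix, under the assumption of a hypothetical two-column data frame (the names df, name and score are illustrative, not from the answer):

```r
df <- data.frame(name = character(0), score = numeric(0),
                 stringsAsFactors = FALSE)

# In the style of example 1: a bare vector has no names to match and
# is coerced to a single type before it is bound.
bad <- rbind(df, c("test", 9))

# In the style of examples 3 and 4: rows passed as a data frame keep both
# names and column types; several rows can be appended at once.
good <- rbind(df, data.frame(name = c("x", "y"), score = c(9, 12),
                             stringsAsFactors = FALSE))
```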
You can use add_row from the tibble package:
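A minimal sketch, assuming the tibble package is installed (the guard skips the call otherwise) and reusing the data frame a from the question. add_row takes the new values as name-value pairs:

```r
a <- data.frame(one = numeric(0), two = numeric(0))

# Only run when tibble is available; values are matched to columns by name.
if (requireNamespace("tibble", quietly = TRUE)) {
  a <- tibble::add_row(a, one = 5, two = 6)
}

names(a)
```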