Indexing data.frames with arrays of TRUE and FALSE
I'm having some trouble indexing data.frames in R. I'm an R beginner. I have a data.frame called d which has 35512 columns and 77 rows. I have a list called rd which contains 35512 elements. I'd like all the columns of d which correspond to the items in rd less than 100. Here's what I'm doing:
# just to prove I'm not crazy
> length(colnames(d))
[1] 35512
> length(rownames(d))
[1] 77
> length(rd)
[1] 35512
# find all the elements of rd less than 100 (+ unnecessary faffing?)
> i <- unlist(rd<100)
> names(i) <- NULL
# try to extract all the elements of d corresponding to rd < 100
> d <- d[,i]
Error in `[.data.frame`(d, , i) : undefined columns selected
I don't really want to be doing the unlist and names(i) <- NULL stuff but I'm getting seriously paranoid. Can anyone help with what the hell this error message means?
In case it helps, the rd variable is created using the following:
rd = lapply(lapply(d, range), diff)
Which hopefully tells me the difference in the range of each column of d.
P.S. bonus awesomeness for anyone who can tell me a command to find the shape of a data.frame other than querying the length of its row and column names.
Edit: Here's what rd looks like:
> rd[1:3]
$`10338001`
[1] 7198.886
$`10338003`
[1] 4748.963
$`10338004`
[1] 3173.046
And when I've done my faffing, i looks like this:
> i[7:10]
[1] FALSE FALSE FALSE TRUE
Comments (2)
Have you tried this:
Here's a self-contained example:
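A minimal sketch of such an example, assuming a small toy data frame in place of the asker's 35512-column d. The column names and values are made up for illustration, and sapply is used instead of the nested lapply calls so that rd is a plain numeric vector rather than a list; this is one possible approach, not necessarily the code originally posted here.

```r
# Toy stand-in for the asker's data frame: 5 rows, 4 numeric columns
# (hypothetical names and values, chosen only for illustration).
d <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(10, 200, 30, 400, 50),
  c = c(0, 50, 20, 99, 10),
  e = c(-500, 0, 500, 1000, 1500)
)

# Range width (max - min) of each column, as a named numeric vector.
rd <- sapply(d, function(col) diff(range(col)))

# Logical index: TRUE for columns whose range width is below 100.
i <- rd < 100

# drop = FALSE keeps the result a data frame even if only one column matches.
small <- d[, i, drop = FALSE]
names(small)
```

Because rd is already an atomic vector, no unlist or names(i) <- NULL step is needed, and d[, i] selects exactly the columns flagged TRUE.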
To get the shape of a data frame, use nrow and ncol.

Edit:
Based on your response to my NA question, it sounds like you have NA values (neither TRUE nor FALSE) in your index that result from missing values in your list. The best thing to do is to first decide how you want to treat a missing value. Then deal with them using the is.na function (here I extend my example from above):
To deal with this, I will set that NA value to 0 (which means that the respective column will be included in the final data.frame):
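The two steps above can be sketched as follows, again assuming a toy data frame (the names d, rd, kept and kept_strict and all values are illustrative, not the code originally posted). A column containing NA gives an NA range width, the resulting NA in the logical index triggers the "undefined columns selected" error, and is.na is then used to decide what happens to that column.

```r
# Toy data frame where column b contains a missing value (hypothetical data).
d <- data.frame(
  a = c(1, 2, 3),
  b = c(10, NA, 300),
  c = c(0, 500, 1000)
)

# range() returns NA NA for b, so its range width is NA.
rd <- sapply(d, function(col) diff(range(col)))

# The comparison propagates the NA: TRUE, NA, FALSE.
i <- rd < 100

# Indexing columns with an NA in the logical index fails; capture the message.
res <- tryCatch(d[, i], error = function(e) conditionMessage(e))

# Choice described in the answer: treat NA as 0, so the column is included.
rd_zero <- rd
rd_zero[is.na(rd_zero)] <- 0
kept <- d[, rd_zero < 100, drop = FALSE]

# Alternative choice (an assumption, not from the answer): exclude such columns.
kept_strict <- d[, !is.na(rd) & rd < 100, drop = FALSE]
```

Here kept contains columns a and b, while kept_strict contains only a. Passing na.rm = TRUE to range would be a third option if the missing values should simply be ignored when computing the width.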
You need to decide for yourself what to do with the NA values.
For the bonus Q, you can get the "shape" of a data frame or matrix using the "dim" command.
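As a quick illustration on a hypothetical three-row, two-column data frame (the data is made up), dim returns the row and column counts together, and nrow and ncol return them individually:

```r
# Small illustrative data frame: 3 rows, 2 columns.
d <- data.frame(x = 1:3, y = c("a", "b", "c"))

shape <- dim(d)   # integer vector: rows first, then columns
shape             # 3 2
nrow(d)           # 3
ncol(d)           # 2
```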