Dynamically select data frame columns using $ and a character value
I have a vector of different column names and I want to be able to loop over each of them to extract that column from a data.frame. For example, consider the data set mtcars and some variable names stored in a character vector cols. When I try to select a variable from mtcars using a dynamic subset of cols, neither of these works:
cols <- c("mpg", "cyl", "am")
col <- cols[1]
col
# [1] "mpg"
mtcars$col
# NULL
mtcars$cols[1]
# NULL
How can I get these to return the same values as the following?
mtcars$mpg
Furthermore, how can I loop over all the columns in cols to get the values in some sort of loop?
for(x in seq_along(cols)) {
value <- mtcars[ order(mtcars$cols[x]), ]
}
Comments (10)
I had a similar problem because some CSV files used various names for the same column.
This was the solution:
I wrote a function to return the first valid column name in a list, then used that...
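The function from this answer is not preserved here. A minimal sketch of the idea could look like the following; the helper name first_valid_col() and the candidate spellings are hypothetical, chosen only for illustration.

```r
# Hypothetical helper (not the original answer's code): return the first
# candidate name that is actually a column of `df`, or stop if none match.
first_valid_col <- function(df, candidates) {
  hits <- candidates[candidates %in% names(df)]
  if (length(hits) == 0) {
    stop("none of the candidate names is a column of the data frame")
  }
  hits[1]
}

# The candidate spellings below are invented for illustration.
col <- first_valid_col(mtcars, c("MPG", "miles_per_gallon", "mpg"))
values <- mtcars[[col]]  # [[ accepts the name held in `col`
```

Because the lookup returns a plain string, the result can be passed straight to [[ or [, which is what makes the column choice dynamic.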
This has happened to me several times. Use the data.table package. When you only have one column that you need to refer to, use either
or
When you have two or more columns to refer to, make sure to use:
x can hold strings taken from another data.frame.
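The snippets for this answer are missing. The sketch below shows data.table idioms that fit the description; that these are the exact forms the answer used is an assumption, and the object names DT, x and cols are made up for the example. It needs the data.table package installed.

```r
library(data.table)

DT <- as.data.table(mtcars)
x <- "mpg"

# One column: [[ returns a plain vector, ..x returns a one-column data.table
one_vec <- DT[[x]]
one_dt  <- DT[, ..x]

# Two or more columns: the `..` prefix tells data.table to look `cols` up
# in the calling scope instead of treating it as a column name
cols <- c("mpg", "cyl", "am")
many_dt <- DT[, ..cols]

# Older equivalent spelling with with = FALSE
same_dt <- DT[, cols, with = FALSE]
```

The character vector does not have to be typed by hand; it could just as well be a column pulled from another data.frame.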
Too late, but I guess I have the answer -
Here's my sample study.df data frame -
And then -
If you want to select a column with a specific name, then just do
You can run it in a loop as well.
The reverse way is to add a column under a dynamic name. For example, if A is a data frame and xyz is the column to be named x, then I do it like this
Again, this can also be done in a loop.
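The original snippets are not preserved, so here is one way both directions might be written in base R. This is a sketch, not the answer's own code; new_name and the doubled values are hypothetical.

```r
A <- mtcars
x <- "mpg"

# Select the column whose name is stored in x
selected <- A[[x]]

# The same lookup inside a loop over several names
cols <- c("mpg", "cyl", "am")
col_means <- numeric(0)
for (nm in cols) {
  col_means[nm] <- mean(A[[nm]])
}

# Reverse direction: add a column under a name held in a variable
new_name <- "mpg_doubled"  # hypothetical column name
xyz <- A$mpg * 2           # hypothetical values
A[[new_name]] <- xyz
```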
You can't do that kind of subsetting with $. The source code (R/src/main/subset.c) notes that only the first argument of $ is evaluated; the second is a symbol that is matched, not evaluated.
Second argument? What?! You have to realise that $, like everything else in R (including, for instance, (, +, ^ etc.), is a function that takes arguments and is evaluated. df$V1 could be rewritten as `$`(df, V1) or indeed `$`(df, "V1").
But an expression in the second position, for instance `$`(df, paste0("V1")), will never work, nor will anything else that must first be evaluated in the second argument. You may only pass a string, which is never evaluated.
Instead use [ (or [[ if you want to extract only a single column as a vector).
You can perform the ordering without loops, using do.call to construct the call to order.
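To make the difference between $, [[ and [ concrete, here is a small sketch using mtcars. The variable names var and sort_cols are chosen for this example and are not from the original answer.

```r
var <- "mpg"

missing_col <- mtcars$var  # NULL: $ looks for a column literally named "var"
vec <- mtcars[[var]]       # numeric vector, same values as mtcars$mpg
sub <- mtcars[var]         # one-column data.frame

# Sort by several columns named in a character vector, without a loop.
# A data.frame is a list of columns, so the selected columns can be handed
# to order() as separate arguments through do.call().
sort_cols <- c("cyl", "mpg")
keys <- unname(as.list(mtcars[sort_cols]))  # drop names so none clash with order()'s own arguments
sorted <- mtcars[do.call(order, keys), ]
```

The do.call form is handy because the number of sort keys is decided at run time by the length of sort_cols.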
Using dplyr provides an easy syntax for sorting data frames.
It might be useful to use the NSE version, as shown here, to allow the sort list to be built dynamically.
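The answer's own code is missing, so the following is a sketch using tidy-evaluation tools from current dplyr and rlang, which may differ from what the answer originally showed. It assumes both packages are installed; sort_cols and col are names invented for the example.

```r
library(dplyr)

# Fixed column names: plain arrange()
by_name <- mtcars %>% arrange(gear, desc(mpg))

# Column names held in a character vector: turn each string into a symbol
# and splice the symbols into the call with !!!
sort_cols <- c("gear", "mpg")
by_vector <- mtcars %>% arrange(!!!rlang::syms(sort_cols))

# A single name held in a string, via the .data pronoun
col <- "mpg"
by_string <- mtcars %>% arrange(.data[[col]])
```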
If I understand correctly, you have a vector containing variable names and would like to loop through each name and sort your data frame by them. If so, this example should illustrate a solution for you. The primary issue in yours (the full example isn't complete, so I'm not sure what else you may be missing) is that it should be order(Q1_R1000[, parameter[X]]) instead of order(Q1_R1000$parameter[X]), since parameter is an external object that contains a variable name, as opposed to a direct column of your data frame (which is when $ would be appropriate).
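Applied to the loop from the question, the fix might look like this. The sketch uses mtcars and cols in place of Q1_R1000 and parameter, and the list sorted_by is a name introduced here to keep every result instead of overwriting it.

```r
cols <- c("mpg", "cyl", "am")

# Collect one sorted copy of mtcars per column name
sorted_by <- list()
for (x in seq_along(cols)) {
  # cols[x] is evaluated first, then used to pick the column to order by
  sorted_by[[cols[x]]] <- mtcars[order(mtcars[, cols[x]]), ]
}
```

Each element of sorted_by is the full data frame ordered by one of the named columns, for example sorted_by$cyl.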
I would use the sym function of the rlang package. Let's say col has the value "mpg". The idea is to subset with it.
Keep coding!
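The snippet from this answer is not preserved. One way sym could be used for this, relying only on rlang, is sketched below; pairing it with eval_tidy is my assumption, not necessarily what the answer did.

```r
library(rlang)

col <- "mpg"

# sym() turns the string into a symbol; eval_tidy() then evaluates that
# symbol with the data frame's columns in scope
values <- eval_tidy(sym(col), data = mtcars)
```

Inside dplyr verbs the same symbol would typically be unquoted with !!, as in !!sym(col).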
Another solution is to use get:
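The code that followed is missing. A sketch of how get might be applied here, under the assumption that the data frame is treated as an environment for the lookup:

```r
col <- "mpg"

# Convert the data frame to an environment and look the name up in it
values <- get(col, envir = as.environment(mtcars))

# mget() does the same for several names and returns a named list
cols <- c("mpg", "cyl", "am")
several <- mget(cols, envir = as.environment(mtcars))
```

For plain column extraction mtcars[[col]] is shorter, so get is mostly useful when the lookup already happens inside with() or a similar evaluation context.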