How do I paste text and variables into a logical expression in R?

Posted on 2024-11-17 00:06:06

I want to paste variables into the logical expression that I am using to subset data, but the subset function does not see them as column names when pasted (either with or without quotes).

I have a data frame with columns named col1, col2, etc. I want to keep only the rows in which colx < 0.05.

This DOES work:

subsetdata<-subset(dataframe, col1<0.05)

subsetdata<-subset(dataframe, col2<0.05)

This does NOT work:

for (k in 1:2){
subsetdata<-subset(dataframe, paste("col",k,sep="")<0.05)
}

for (k in 1:2){
subsetdata<-subset(dataframe, noquote(paste("col",k,sep=""))<0.05)
}

I can't find the answer; any suggestions?

Comments (3)

王权女流氓 2024-11-24 00:06:06

You're making this a lot harder than it needs to be by trying to use subset. Note that ?subset says the second argument (also named subset) must be an expression and you're not giving it an expression:

> is.expression(paste("col",1:2,sep="")<0.05)
[1] FALSE
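
To see what the pasted version actually computes, here is a minimal sketch. It assumes the same sample data frame used further down in this answer; the names cond and per_row are only illustrative. When a character string is compared with a number, R coerces the number to a string, so the comparison yields a single logical value instead of one value per row.

```r
# Sample data (same shape as the example used later in this answer)
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

k <- 1

# paste() returns the string "col1", not the column col1, so this compares
# a string with a number and produces exactly one logical value
cond <- paste("col", k, sep = "") < 0.05
length(cond)

# Looking the column up by name gives one logical value per row instead
per_row <- dataframe[[paste("col", k, sep = "")]] < 0.05
length(per_row)
```

Because subset() receives a single value rather than a per-row condition, it cannot pick out the rows where the column is below 0.05.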

You could construct an unevaluated expression then evaluate it as you pass it to subset, but there are much easier ways. For example: just take advantage of the vectorized nature of the < operator.

# sample data
set.seed(21)
dataframe <- data.frame(col1=rnorm(10),col2=rnorm(10),col3=1)

logicalCols <- dataframe[,paste("col",1:2,sep="")] < 0.05
#        col1  col2
#  [1,] FALSE  TRUE
#  [2,] FALSE FALSE
#  [3,] FALSE  TRUE
#  [4,]  TRUE FALSE
#  [5,] FALSE FALSE
#  [6,] FALSE FALSE
#  [7,]  TRUE FALSE
#  [8,]  TRUE FALSE
#  [9,] FALSE  TRUE
# [10,]  TRUE  TRUE
ANY <- apply(logicalCols, 1, any)  # any colx < 0.05
ALL <- apply(logicalCols, 1, all)  # all colx < 0.05
dataframe[ANY,]
dataframe[ALL,]
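
As a possible alternative to apply(), the same row selections can be sketched with rowSums(), which counts the TRUE values in each row of the logical matrix. This is a variation on the code above, not part of the original answer; the names cols, anyRow and allRow are illustrative, and it assumes the tested columns contain no NA values.

```r
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

cols <- paste("col", 1:2, sep = "")
logicalCols <- dataframe[, cols] < 0.05

# TRUE counts as 1, so rowSums() gives the number of columns below 0.05
anyRow <- rowSums(logicalCols) > 0              # at least one column < 0.05
allRow <- rowSums(logicalCols) == length(cols)  # every column < 0.05

dataframe[anyRow, ]
dataframe[allRow, ]
```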

む无字情书 2024-11-24 00:06:06

Here are a couple of options that are closer to Jasper's approach. First, you could define the column name as a separate variable and then use it to select the variable from the data.frame as if it were a list (since a data.frame is basically a list):

for(k in 1:2){
  colname <- paste("col",k,sep="")
  subsetdata <- dataframe[dataframe[[colname]] < 0.05, ]
}
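
Note that the loop above overwrites subsetdata on every pass, so only the result for the last column survives. If each subset is needed, one option is to collect them in a named list. This is a minimal sketch on made-up sample data; cols_to_test and subsets are hypothetical names, not from the question.

```r
# Made-up sample data for illustration
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

cols_to_test <- paste("col", 1:2, sep = "")

# One subset per column, kept in a list instead of being overwritten
subsets <- lapply(cols_to_test, function(colname) {
  dataframe[dataframe[[colname]] < 0.05, ]
})
names(subsets) <- cols_to_test

subsets$col1  # rows where col1 < 0.05
subsets$col2  # rows where col2 < 0.05
```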

Or you could select the column by name with single brackets (the name goes in the column position, after the comma):

  subsetdata <- dataframe[dataframe[, colname] < 0.05, ]

Finally, you could use subset, although you need to provide a logical expression (as pointed out by Joshua Ulrich):

  subsetdata <- subset(dataframe, eval(substitute(x < 0.05, list(x = as.name(colname)))))
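
As a quick sanity check, the sketch below runs list-style indexing, single-bracket column indexing and the subset() call on made-up sample data and compares the results. The variable names (by_list, by_bracket, by_subset) are illustrative only.

```r
# Made-up sample data for illustration
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

colname <- "col1"

# List-style indexing with [[ ]]
by_list <- dataframe[dataframe[[colname]] < 0.05, ]

# Column selected by name in the column position of [ , ]
by_bracket <- dataframe[dataframe[, colname] < 0.05, ]

# subset() with a condition built from the column name:
# substitute() turns x < 0.05 into col1 < 0.05, and eval() evaluates it
by_subset <- subset(dataframe,
                    eval(substitute(x < 0.05, list(x = as.name(colname)))))

isTRUE(all.equal(by_list, by_bracket))
isTRUE(all.equal(by_list, by_subset))
```

The bracket forms are usually easier to read and debug; the subset() form is mainly useful if the rest of the code already relies on subset().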

魄砕の薆 2024-11-24 00:06:06

It's not quite clear to me what you're trying to do but perhaps seeing & and | used in a subset operation would be helpful.

Both col1 and col2 less than 0.05:

subsetdata<-subset(dataframe, col1 < 0.05 & col2 < 0.05)

Either col1 or col2 less than 0.05:

subsetdata<-subset(dataframe, col1 < 0.05 | col2 < 0.05)

Joshua's answer is a great way of doing this more easily over many columns.
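
If the column names are only available as strings, the same & and | combinations can be sketched with Reduce() over a list of per-column comparisons. This is one possible approach on made-up sample data, not something from the answers above; cols, below, both and either are hypothetical names.

```r
# Made-up sample data for illustration
set.seed(21)
dataframe <- data.frame(col1 = rnorm(10), col2 = rnorm(10), col3 = 1)

cols <- paste("col", 1:2, sep = "")

# One logical vector per column: is the value below 0.05?
below <- lapply(dataframe[cols], function(v) v < 0.05)

both   <- Reduce(`&`, below)  # all listed columns < 0.05
either <- Reduce(`|`, below)  # at least one listed column < 0.05

subset_both   <- dataframe[both, ]
subset_either <- dataframe[either, ]
```

Adding more columns only requires extending cols; the Reduce() calls stay the same.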
