Filter a matrix by row names (conditionally) in R

Posted 2025-01-11 11:58:43 | 912 characters | 0 views | 0 comments

Suppose you have this matrix:

> dput(b)
structure(c(8.428852043462e-16, 0.98006786315672, 0.0636247563553075, 
-0.246858810409958, -1.37811970502942, -0.281625554642936, -8.91350446654785e-16, 
-0.305283565399869, -1.00802628192793, 0.14027577547337, -1.66288850621351, 
0.16259170026583, -1.3280185195633e-15, 0.278629912397198, -0.188868484543887, 
1.0533053295465, 1.16670767240438, -0.48819960367166), .Dim = c(6L, 
3L), .Dimnames = list(c("(Intercept)", "F_slowPC1", "F_slowPC2", 
"F_slowPC3", "data_yFYFF", "data_yPUNEW"), c("PC1", "PC2", "PC3"
)))

I want to get only the rows whose names start with the string "data_y".

I tried to filter it with a logical condition, where:

stringr::str_detect(rownames(b), "data_y")
[1] FALSE FALSE FALSE FALSE  TRUE  TRUE

So I tried

b[rownames(b) %in% str_detect(rownames(b),"data_y")==T]

but it just gives me numeric(0).

How can I get all the rows that contain "data_y"?

I would prefer to not transform this matrix to data frame.

Comments (4)

_蜘蛛 2025-01-18 11:58:43

Use b[i, ] to subset rows. b[i] subsets the elements of as.vector(b) indexed by i, which is not what you want.
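
To make the difference concrete, here is a minimal sketch. It assumes a stand-in matrix with the same row and column names as the question's b but placeholder values 1 to 18, and it uses base R grepl in place of str_detect so that it runs without extra packages.

```r
# Stand-in for the question's matrix: same dimnames, placeholder values 1..18.
b <- matrix(
  1:18, nrow = 6,
  dimnames = list(
    c("(Intercept)", "F_slowPC1", "F_slowPC2",
      "F_slowPC3", "data_yFYFF", "data_yPUNEW"),
    c("PC1", "PC2", "PC3")
  )
)

i <- grepl("data_y", rownames(b))  # logical vector, one entry per row

# The question's attempt compares row names (character) against logical
# values; %in% coerces those to "TRUE"/"FALSE", so nothing ever matches.
attempt <- rownames(b) %in% i
any(attempt)   # FALSE
b[attempt]     # empty vector: integer(0) here, numeric(0) for a double matrix

# b[i] treats b as a plain vector of length 18 and recycles i over it.
b[i]           # 5 6 11 12 17 18, with the dimnames lost

# b[i, ] indexes rows and keeps the matrix structure.
b[i, ]
```

The empty result in the question therefore comes from the %in% comparison, not from str_detect itself.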

You don't need stringr to construct i, because base R has startsWith and grep. Either of these statements would work:

b[startsWith(rownames(b), "data_y"), , drop = FALSE]
b[grep("^data_y", rownames(b)), , drop = FALSE]

drop = FALSE guarantees a matrix result. By default, the result is a dimensionless vector if only one row is indexed. You can compare b[1, ] and b[1, , drop = FALSE] to see what I mean.
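
A short sketch of the drop behaviour, again assuming a stand-in matrix with the question's dimnames and placeholder values; the prefix "data_yF" is chosen only so that exactly one row matches.

```r
# Stand-in for the question's matrix: same dimnames, placeholder values 1..18.
b <- matrix(
  1:18, nrow = 6,
  dimnames = list(
    c("(Intercept)", "F_slowPC1", "F_slowPC2",
      "F_slowPC3", "data_yFYFF", "data_yPUNEW"),
    c("PC1", "PC2", "PC3")
  )
)

# Illustrative prefix that matches only the row "data_yFYFF".
one_row <- startsWith(rownames(b), "data_yF")

dim(b[one_row, ])                     # NULL: collapsed to a named vector
dim(b[one_row, , drop = FALSE])       # 1 3: still a matrix
rownames(b[one_row, , drop = FALSE])  # "data_yFYFF"
```

This matters mostly when the number of matching rows is not known in advance, for example inside a function that expects a matrix back.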

执手闯天涯 2025-01-18 11:58:43

You just need

b[stringr::str_detect(rownames(b), "data_y"), ]

Add a comma after the index expression to specify that you are selecting rows.
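
Note that the pattern "data_y" matches anywhere in a row name, while the question asks for names that start with it. The sketch below shows the difference using base R grepl and startsWith; the row name "F_data_yX" is hypothetical and was invented only to illustrate the point. stringr::str_detect accepts the same anchored pattern "^data_y".

```r
# Hypothetical row names; "F_data_yX" is invented to show the difference.
rn <- c("(Intercept)", "F_data_yX", "data_yFYFF", "data_yPUNEW")

contains_it <- grepl("data_y", rn)       # matches anywhere in the name
starts_with <- grepl("^data_y", rn)      # ^ anchors the match to the start
also_starts <- startsWith(rn, "data_y")  # literal prefix test, no regex

contains_it   # FALSE  TRUE  TRUE  TRUE
starts_with   # FALSE FALSE  TRUE  TRUE
also_starts   # FALSE FALSE  TRUE  TRUE
```

For the matrix in the question both patterns select the same two rows, so the anchor only matters if other row names could contain "data_y" in the middle.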

我爱人 2025-01-18 11:58:43

Another option with data.table:

library(data.table)
bt <- as.data.table(b, keep.rownames = TRUE)

bt[like(rn,"data_y")]

#            rn        PC1        PC2        PC3
#1:  data_yFYFF -1.3781197 -1.6628885  1.1667077
#2: data_yPUNEW -0.2816256  0.1625917 -0.4881996

*Note: This could be done in one line, but I opted to create a new data.table with as.data.table rather than using setDT, so that the original object is left unaltered.

我很OK 2025-01-18 11:58:43

A possible solution based on the tidyverse, after first converting the matrix to a data frame:

library(tidyverse)

b %>% as.data.frame() %>%
  rownames_to_column("coefficients") %>% 
  filter(str_detect(coefficients, "^data_y"))

#>   coefficients        PC1        PC2        PC3
#> 1   data_yFYFF -1.3781197 -1.6628885  1.1667077
#> 2  data_yPUNEW -0.2816256  0.1625917 -0.4881996