Select rows of a matrix that meet a condition
In R with a matrix:
     one two three four
[1,]   1   6    11   16
[2,]   2   7    12   17
[3,]   3   8    11   18
[4,]   4   9    11   19
[5,]   5  10    15   20
I want to extract the submatrix whose rows have column three = 11. That is:
     one two three four
[1,]   1   6    11   16
[3,]   3   8    11   18
[4,]   4   9    11   19
I want to do this without looping. I am new to R, so this is probably very obvious, but the documentation is often somewhat terse.
The following command will select the first row of the matrix above.
And this will select the last three.
The result will be a matrix in both cases.
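A minimal sketch of commands that match this description, assuming the matrix from the question is stored as m (a name chosen here for illustration) and that the selection is made on the fourth column. Calling subset() on a matrix is one way to get this behaviour, since it keeps the result as a matrix even when a single row matches:

```r
# Rebuild the matrix from the question (the name m is an assumption).
m <- matrix(c(1:5, 6:10, 11L, 12L, 11L, 11L, 15L, 16:20),
            nrow = 5,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# Rows whose fourth column equals 16: only the first row qualifies.
first_row <- subset(m, m[, 4] == 16)

# Rows whose fourth column exceeds 17: the last three rows qualify.
last_three <- subset(m, m[, 4] > 17)

print(first_row)
print(last_three)
```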
If you want to use column names to select columns, you would be best off converting the matrix to a data frame.
Then you can select rows by column name,
or you could use the subset command.
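The data frame route could look like the sketch below. The names m and mf are assumptions made for illustration, and the condition is the one from the question (column three equal to 11):

```r
# Rebuild the matrix from the question (the name m is an assumption).
m <- matrix(c(1:5, 6:10, 11L, 12L, 11L, 11L, 15L, 16:20),
            nrow = 5,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# Convert to a data frame so columns can be referred to by name.
mf <- as.data.frame(m)

# Select rows with a logical index built from the named column.
by_index <- mf[mf$three == 11, ]

# The same selection with subset(), which evaluates the
# condition inside the data frame.
by_subset <- subset(mf, three == 11)

print(by_index)
```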
A simple approach is to use the dplyr package, assuming the data frame is called data.
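A minimal sketch of the dplyr approach, assuming the package is installed and that data holds the values from the question. Using filter() is an assumption about which dplyr verb was meant; it is the usual one for keeping rows that satisfy a condition:

```r
library(dplyr)  # assumes dplyr is installed

# Hypothetical data frame with the values from the question.
data <- data.frame(one   = 1:5,
                   two   = 6:10,
                   three = c(11L, 12L, 11L, 11L, 15L),
                   four  = 16:20)

# Keep only the rows where column three equals 11.
result <- filter(data, three == 11)

print(result)
```

Note that filter() returns a data frame, so a matrix would first need to be converted with as.data.frame().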
Subset is a very slow function, and I personally find it useless.
I assume you have a data.frame, array, or matrix called Mat with A, B, C as column names.
In the case of one condition on one column, let's say column A, select the rows with which on that condition.
In the case of multiple conditions on different columns, you can create a dummy variable. Suppose the conditions are A = 10, B = 5, and C > 2.
In my tests with system.time, the which method was 10x faster than the subset method.
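A minimal sketch of the which approach under these assumptions. The values in Mat and the name dummy are made up for illustration, and drop = FALSE is added here so the result stays a matrix even when a single row matches:

```r
# Hypothetical matrix with columns A, B and C.
Mat <- cbind(A = c(10, 10, 3, 10),
             B = c(5, 7, 5, 5),
             C = c(3, 4, 9, 1))

# One condition on one column: rows where A equals 10.
one_cond <- Mat[which(Mat[, "A"] == 10), , drop = FALSE]

# Several conditions: combine them into a logical dummy variable,
# then index with which() on that variable.
dummy <- Mat[, "A"] == 10 & Mat[, "B"] == 5 & Mat[, "C"] > 2
multi_cond <- Mat[which(dummy), , drop = FALSE]

print(one_cond)
print(multi_cond)
```

Wrapping either expression in system.time() is a way to check the speed difference on your own data; the size of the gap will likely depend on the data and the R version.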
If your matrix is called m, just use:
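A minimal sketch of such a one-liner, assuming m is the matrix from the question. Logical indexing on the third column is one way to select the rows without a loop:

```r
# Rebuild the matrix from the question.
m <- matrix(c(1:5, 6:10, 11L, 12L, 11L, 11L, 15L, 16:20),
            nrow = 5,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# m[, 3] == 11 is a logical vector with one entry per row;
# using it as the row index keeps only the TRUE rows.
rows_11 <- m[m[, 3] == 11, ]

print(rows_11)
```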
If the dataset is called data, then all the rows where the value of column 'pm2.5' is greater than 300 can be retrieved with:
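A sketch of that selection, assuming data is a data frame. The values and the hour column are invented for illustration; only the column name pm2.5 and the threshold 300 come from the text:

```r
# Hypothetical data frame with a pm2.5 column.
data <- data.frame(hour  = 1:4,
                   pm2.5 = c(120, 310, 450, 80))

# Keep the rows whose pm2.5 value is above 300.
high <- data[data[["pm2.5"]] > 300, ]

print(high)
```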
This is easier to do if you convert your matrix to a data frame using as.data.frame(). In that case the previous answers (using subset or m$three) will work, otherwise they will not.
To perform the operation on a matrix, you can define a column by name:
Or by number:
Note that if only one row matches, the result is an integer vector, not a matrix.
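Both forms, and the single-row caveat, can be sketched as follows, assuming m is the matrix from the question stored with integer values. The drop = FALSE argument is an addition not mentioned above; it is the standard way to keep a one-row result as a matrix:

```r
# Rebuild the matrix from the question with integer entries.
m <- matrix(c(1:5, 6:10, 11L, 12L, 11L, 11L, 15L, 16:20),
            nrow = 5,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# Select the column by name or by number; both give the same rows.
by_name   <- m[m[, "three"] == 11, ]
by_number <- m[m[, 3] == 11, ]

# Only one row has three == 12, so the result collapses to a vector.
one_row <- m[m[, "three"] == 12, ]

# drop = FALSE keeps the one-row result as a 1 x 4 matrix.
one_row_matrix <- m[m[, "three"] == 12, , drop = FALSE]

print(by_name)
print(one_row)
print(one_row_matrix)
```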