How do I sort a data frame by one column in descending order and another in ascending order?
I have a data frame that looks like this:
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6 "b"
2  6  4  3 72 NA 59  1 "a"
3  1  5  6 55 48 60  6 "f"
4  2  4  4 65 64 58  2 "b"
I want to sort it by I1 in descending order, and rows with the same value in I1 by I2 in ascending order, getting the rows in the order 1 3 4 2. But the order function seems to take only one decreasing argument, which is then TRUE or FALSE for all ordering vectors at once. How do I get my sort correct?
I used this code to produce your desired output. Is this what you were after?
I use rank:
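The snippet that went with this answer is not preserved here. As a rough sketch of the idea (the data frame name rum is borrowed from the dplyr answer further down, and the exact call is my assumption, not necessarily what the answerer wrote), rank() turns the character column into numbers, so it can be negated inside order():

```r
# Example data from the question
rum <- read.table(text = "P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b", header = TRUE, stringsAsFactors = FALSE)

# decreasing = TRUE applies to every key, so negate the rank of I2
# to make that key effectively ascending
idx <- order(rum$I1, -rank(rum$I2), decreasing = TRUE)
sorted <- rum[idx, ]
print(sorted)
```

Negating the rank flips only the I2 key, which is what makes a single decreasing flag usable for mixed directions.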
I'm afraid Roman Luštrik's answer is wrong. It works on this input only by chance.
Consider, for example, its output on a very similar input (with an additional line similar to the original line 3, with "c" in the I2 column):
This is not the desired result: the first three values of I2 are f b c instead of b c f, which would be expected since the secondary sort is by I2 in ascending order.
To get the reverse order of I2, you want the large values to be small and vice versa. For numeric values, multiplying by -1 will do it, but for characters it's a bit trickier. A general solution for characters/strings is to go through factors: reverse the levels (to make large values small and small values large) and change the factor back to characters:
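The code for this step did not survive on this page. A minimal sketch of the factor-reversal idea, assuming the extended input described above (the original four rows plus a copy of row 3 with "c" in I2, stored under the assumed name rum2), might look like this:

```r
# Question data plus one extra row with "c" in I2 (assumed layout)
rum2 <- read.table(text = "P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b
1 5 6 55 48 60 6 c", header = TRUE, stringsAsFactors = FALSE)

f <- factor(rum2$I2)
# Relabel the levels in reverse: the smallest string gets the largest label
levels(f) <- rev(levels(f))

# Everything is sorted decreasing; the relabelled I2 therefore comes out
# ascending in terms of the original values
idx <- order(rum2$I1, as.character(f), decreasing = TRUE)
sorted <- rum2[idx, ]
print(sorted)
```

With this input the first three I2 values come out as b c f, which is the ascending secondary order the question asks for.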
Let df be a data frame with two fields, A and B.
Case 1: fields A and B are both numeric
df[order(df[,1],df[,2]),] - sorts by fields A and B in ascending order
df[order(df[,1],-df[,2]),] - sorts by field A in ascending and field B in descending order
Priority is given to A.
Case 2: field A or B is non-numeric, say a factor or character
In our case, B is character and we want to sort it in reverse order:
df[order(df[,1],-as.numeric(as.factor(df[,2]))),] - sorts field A (numeric) in ascending and field B (character) in descending order.
Priority is given to A.
The idea is that you can apply the minus sign inside the order function only to numeric values. So to sort character strings in descending order, you have to coerce them to numeric values.
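As a small illustration of Case 2, here is a sketch on a made-up data frame (the values below are invented for the example and are not from the question):

```r
# Hypothetical data: A is numeric, B is character
df <- data.frame(A = c(2, 1, 2, 1),
                 B = c("x", "y", "z", "w"),
                 stringsAsFactors = FALSE)

# as.factor() assigns integer codes in sorted order of the strings;
# negating those codes reverses the direction for B only
idx <- order(df[, 1], -as.numeric(as.factor(df[, 2])))
sorted <- df[idx, ]
print(sorted)
```

Within each value of A, the rows now appear with B from largest to smallest string.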
Simple one without rank:
The default sort is stable, so we sort twice: first by the minor key, then by the major key.
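A sketch of the two-pass idea on the question's data (the names rum, by_minor and sorted are my own; I negate I1 for the descending pass rather than passing decreasing = TRUE, which is one way to keep ties in their existing order):

```r
rum <- read.table(text = "P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b", header = TRUE, stringsAsFactors = FALSE)

# Pass 1: minor key, I2 ascending
by_minor <- rum[order(rum$I2), ]

# Pass 2: major key, I1 descending; rows tied on I1 keep the
# I2 order established in pass 1
sorted <- by_minor[order(-by_minor$I1), ]
print(sorted)
```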
The correct way is:
In @dudusan's example, you could also reverse the order of I1 and then sort ascending:
This seems a bit shorter, since you don't reverse the order of I2 twice.
You can use the amazing package dplyr.
It has a function called arrange.
You just pass the data frame and the columns you want to order by, in the order of priority you choose. The default is ascending order; if you want descending order for a column, wrap it in desc.
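# Rebuild the example data; the header supplies the eight column names
rum <- read.table(textConnection("P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b"), header = TRUE)
library(dplyr)
# Sort by I1 descending, then by I2 ascending within ties
arrange(rum, desc(I1), I2)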
In general, xtfrm() is the generic function to get a numeric vector that sorts like the given input vector. Decreasing sorting can then be done by sorting with the negated value of xtfrm(). (This is exactly how e.g. dplyr's desc() is implemented.) For example, with the data in question:
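The snippet from the answer is not preserved on this page; a minimal sketch of the idea, assuming the question's data is loaded under the name rum as in the dplyr answer, could be:

```r
rum <- read.table(text = "P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b", header = TRUE, stringsAsFactors = FALSE)

# xtfrm() gives a numeric key for any sortable vector, so each key
# can be negated on its own: I1 descending, I2 ascending
idx <- order(-xtfrm(rum$I1), xtfrm(rum$I2))
sorted <- rum[idx, ]
print(sorted)

# The same trick works for a descending character column
idx_desc_chr <- order(-xtfrm(rum$I2))
```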
This approach can be generalized into a base R function to sort data frames by given columns, that also accepts a vector-valued decreasing argument. From my answer to this recent question:
And with the current example data, we (of course) get:
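The answer's own function and its output are not reproduced on this page. As a hedged sketch of what such a generalization might look like (sort_by_columns is a hypothetical name of mine, not the function from the linked answer), one could negate the xtfrm() key of each column flagged as decreasing:

```r
# Hypothetical helper: sort `data` by `columns`, with one decreasing
# flag per column (a single flag is recycled to all columns)
sort_by_columns <- function(data, columns, decreasing = FALSE) {
  decreasing <- rep_len(decreasing, length(columns))
  keys <- Map(function(col, dec) {
    key <- xtfrm(data[[col]])   # numeric key that sorts like the column
    if (dec) -key else key      # flip the direction where requested
  }, columns, decreasing)
  # unname() so column names cannot clash with order()'s own arguments
  data[do.call(order, unname(keys)), , drop = FALSE]
}

rum <- read.table(text = "P1 P2 P3 T1 T2 T3 I1 I2
2 3 5 52 43 61 6 b
6 4 3 72 NA 59 1 a
1 5 6 55 48 60 6 f
2 4 4 65 64 58 2 b", header = TRUE, stringsAsFactors = FALSE)

sorted <- sort_by_columns(rum, c("I1", "I2"), decreasing = c(TRUE, FALSE))
print(sorted)
```

Building the keys first and handing them to order() in one call keeps the usual tie-breaking across columns, so no repeated sorting is needed.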