How does one reorder columns in a data frame?
How would one change this input (with the sequence: time, in, out, files):
Time In Out Files
1 2 3 4
2 3 4 5
To this output (with the sequence: time, out, in, files)?
Time Out In Files
1 3 2 4
2 4 3 5
Here's the dummy R data:
table <- data.frame(Time=c(1,2), In=c(2,3), Out=c(3,4), Files=c(4,5))
table
## Time In Out Files
##1 1 2 3 4
##2 2 3 4 5
Your data frame has four columns, like so:
df[,c(1,2,3,4)]
Note that the first comma means keep all the rows, and the 1,2,3,4 refers to the columns.
To change the order as in the question above, do:
df2 <- df[,c(1,3,2,4)]
If you want to output this file as a csv, do:
write.csv(df2, file="somedf.csv")
You can also use the subset function:
You are better off using the [] operator as in the other answers, but it may be useful to know that you can do a subset and a column reorder operation in a single command.
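A minimal sketch of the subset() approach, using the question's dummy data. The row condition `Time > 1` is an arbitrary choice for illustration and is not part of the question.

```r
df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Reorder only: `select` lists the columns in the desired order
reordered <- subset(df, select = c(Time, Out, In, Files))

# Filter rows and reorder columns in a single command
# (the condition Time > 1 is only an example)
filtered <- subset(df, Time > 1, select = c(Time, Out, In, Files))

names(reordered)
```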
Update:
You can also use the select function from the dplyr package:
I am not sure about the efficiency, but thanks to dplyr's syntax this solution should be more flexible, especially if you have a lot of columns. For example, the following will reorder the columns of the mtcars dataset in the opposite order:
And the following will reorder only some columns, and discard others:
Read more about dplyr's select syntax.
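A sketch of the select() calls described above, assuming the dplyr package is installed. The columns kept in the last call are an arbitrary selection chosen for illustration.

```r
library(dplyr)

df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Reorder by naming the columns in the desired order
reordered <- select(df, Time, Out, In, Files)

# Reverse all columns of mtcars with a range from the last column to the first
reversed <- select(mtcars, carb:mpg)

# Reorder only some columns; any column not named (here: drat) is discarded
some <- select(mtcars, mpg:disp, hp, wt, gear:qsec, starts_with("carb"))
```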
As mentioned in this comment, the standard suggestions for re-ordering columns in a data.frame are generally cumbersome and error-prone, especially if you have a lot of columns.
This function allows you to re-arrange columns by position: specify a variable name and the desired position, and don't worry about the other columns.
Now the OP's request becomes as simple as this:
To additionally swap the Time and Files columns you can do this:
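The function this answer refers to is not reproduced here. As a rough sketch of the idea (name a column, give its target position, leave the rest alone), a hypothetical base R helper could look like the following. The name `move_cols` and its interface are assumptions made for illustration, not the answer's actual function.

```r
# Hypothetical helper: `positions` is a named vector mapping column names
# to their desired 1-based positions; all other columns keep their
# relative order and fill the remaining slots.
move_cols <- function(data, positions) {
  stopifnot(is.data.frame(data),
            !is.null(names(positions)),
            all(names(positions) %in% names(data)),
            !anyDuplicated(names(positions)),
            !anyDuplicated(positions),
            all(positions >= 1), all(positions <= ncol(data)))
  new_order <- character(ncol(data))
  new_order[positions] <- names(positions)      # place the requested columns
  rest <- setdiff(names(data), names(positions))
  new_order[new_order == ""] <- rest            # fill the gaps in original order
  data[, new_order, drop = FALSE]
}

df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# The OP's request: put Out in second position
op_request <- move_cols(df, c(Out = 2))

# Additionally swap Time and Files
swapped <- move_cols(df, c(Out = 2, Files = 1, Time = 4))
```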
dplyr version 1.0.0 includes the relocate() function to easily reorder columns:
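A minimal sketch of two equivalent relocate() calls for the question's data, assuming dplyr 1.0.0 or later is installed:

```r
library(dplyr)

df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Move Out so that it sits immediately before In
before <- relocate(df, Out, .before = In)

# Or, equivalently here, move Out so that it sits immediately after Time
after <- relocate(df, Out, .after = Time)
```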
A dplyr solution (part of the tidyverse package set) is to use select:
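One way this could be written, assuming dplyr is installed, is with the pipe and quoted column names:

```r
library(dplyr)

df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# select() also accepts column names as strings
reordered <- df %>% select("Time", "Out", "In", "Files")
```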
Maybe it's a coincidence that the column order you want happens to have column names in descending alphabetical order. Since that's the case you could just do:
That's what I use when I have large files with many columns.
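A sketch of sorting the columns by name in descending order, in base R. Note that this only matches the requested order because of the particular column names in the question, and that string sorting can depend on the locale.

```r
df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# order() returns the column indices sorted by name, here in descending order
sorted <- df[, order(names(df), decreasing = TRUE)]

names(sorted)
```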
You can use the data.table package:
How to reorder data.table columns (without copying)
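A minimal sketch with data.table's setcolorder(), assuming the data.table package is installed. It reorders the columns by reference, so no assignment is needed.

```r
library(data.table)

dt <- data.table(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Modifies dt in place; no copy of the data is made
setcolorder(dt, c("Time", "Out", "In", "Files"))

names(dt)
```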
The three top-rated answers have a weakness.
If your dataframe looks like this
then it's a poor solution to use
It does the job, but you have just introduced a dependence on the order of the columns in your input.
This style of brittle programming is to be avoided.
The explicit naming of the columns is a better solution
Plus, if you intend to reuse your code in a more general setting, you can simply
which is also quite nice because it fully isolates literals. By contrast, if you use dplyr's select then you'd be setting up those who will read your code later, yourself included, for a bit of a deception. The column names are being used as literals without appearing in the code as such.
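The contrast drawn above can be sketched in base R as follows. The variable names `out_column_name` and `in_column_name`, and the `shuffled` input, are assumptions introduced for illustration.

```r
df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# Brittle: depends on the column order of the input
by_position <- df[, c(1, 3, 2, 4)]

# Better: name the columns explicitly
by_name <- df[, c("Time", "Out", "In", "Files")]

# More general: keep the literals in variables, isolated from the indexing
out_column_name <- "Out"
in_column_name <- "In"
generic <- df[, c("Time", out_column_name, in_column_name, "Files")]

# If the input arrives in a different order, only the named versions still work
shuffled <- df[, c("Files", "Time", "In", "Out")]
wrong <- shuffled[, c(1, 3, 2, 4)]
right <- shuffled[, c("Time", "Out", "In", "Files")]
```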
dplyr has a function that allows you to move specific columns before or after other columns. That is a critical tool when you work with large data frames (if there are only 4 columns, it's faster to use select as mentioned before).
https://dplyr.tidyverse.org/reference/relocate.html
In your case, it would be:
Simple and elegant. It also allows you to move several columns together and move them to the beginning or to the end:
Again: super powerful when you work with big dataframes :)
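A sketch of moving several columns at once with relocate(), assuming dplyr 1.0.0 or later. The columns chosen here are arbitrary examples.

```r
library(dplyr)

df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))

# With no .before/.after, the named columns move to the beginning
to_front <- relocate(df, Out, Files)

# Move several columns together to the end
to_end <- relocate(df, Time, In, .after = last_col())
```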
The only one I have seen work well is from here.
Use like this:
Works like a charm.