R sort() data.frame

Posted on 2024-10-31 04:29:55

I have the following data frame

head(stockdatareturnpercent)
                  SPY         DIA        IWM        SMH        OIH        
2001-04-02  8.1985485   7.8349806   7.935566  21.223832  13.975655  
2001-05-01 -0.5621328   1.7198760   2.141846 -10.904936  -4.565291  
2001-06-01 -2.6957979  -3.5838102   2.786250   4.671762 -23.241009 
2001-07-02 -1.0248091  -0.1997433  -5.725078  -3.354391  -9.161594  
2001-08-01 -6.1165559  -5.0276558  -2.461728  -6.218129 -13.956695  
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913 

There are actually more stocks, but I cut the table down for purposes of illustration. For each month I want to know the performers ranked from best to worst (or worst to best). I played around with the sort() function and this is what I came up with.

N <- dim(stockdatareturnpercent)[1]  
for (i in 1:N) {  
    s <- sort(stockdatareturnpercent[i,])  
    print(s)  
}  

                 UPS     FDX      XLP      XLU      XLV     DIA      IWM      SPY      XLE      XLB      XLI      OIH      XLK      SMH     MSFT
2001-04-02 0.6481585 0.93135 1.923136 4.712996 7.122751 7.83498 7.935566 8.198549 9.826701 10.13465 10.82522 13.97566 14.98789 21.22383 21.41436
                 SMH       FDX       OIH       XLK        XLE        SPY       XLU      XLP      DIA     MSFT      IWM     UPS      XLV      XLB      XLI
2001-05-01 -10.90494 -5.045544 -4.565291 -4.182041 -0.9492803 -0.5621328 0.6987724 1.457579 1.719876 2.088734 2.141846 3.73587 3.748309 3.774033 4.099748
                 OIH       XLE       XLI     XLU     XLP       XLB      DIA       UPS       SPY       XLV       FDX      XLK     IWM      SMH     MSFT
2001-06-01 -23.24101 -10.02403 -6.594324 -5.8602 -5.0532 -3.955192 -3.58381 -2.814685 -2.695798 -1.177474 0.4987542 1.935544 2.78625 4.671762 5.374764
                MSFT       OIH      XLK       IWM       SMH       XLV       UPS       XLE       SPY        XLU        XLB        XLI        DIA      FDX
2001-07-02 -9.793005 -9.161594 -7.17351 -5.725078 -3.354391 -2.016818 -1.692442 -1.159914 -1.024809 -0.9029407 -0.2723560 -0.2078283 -0.1997433 2.868898
                XLP
2001-07-02 2.998604

This is a very inefficient and crude way to see the results. It would be nice to create an object that stores this data. However, if I type 's' at the R prompt I only get the value of the last row, because each iteration of the for loop overwrites the previous data.

I would greatly appreciate some guidance. Thank you kindly.

Comments (2)

莫相离 2024-11-07 04:29:55

Use order() for this, because sort() drops the names when it is called through the *apply functions:

id <- t(apply(Data,1,order))
lapply(1:nrow(id),function(i)Data[i,id[i,]])

Storing the results of order() in an id matrix also lets you do, for example:

matrix(names(Data)[id],ncol=ncol(Data))
     [,1]  [,2]  [,3]  [,4]  [,5] 
[1,] "DIA" "IWM" "SPY" "OIH" "SMH"
[2,] "SMH" "OIH" "SPY" "DIA" "IWM"
[3,] "OIH" "DIA" "SPY" "IWM" "SMH"
[4,] "OIH" "IWM" "SMH" "SPY" "DIA"
[5,] "OIH" "SMH" "SPY" "DIA" "IWM"
[6,] "SMH" "OIH" "IWM" "DIA" "SPY"

This shows which ones were the best at a given moment (each row is ordered from worst to best).
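As a minimal sketch of how the id-matrix idea could be pushed one step further, the block below ranks each row from best to worst and pulls out the top and bottom ticker per month. The names id_desc, ranked, best, and worst are illustrative and not part of the original answer; the data are the five tickers from the question, and passing decreasing = TRUE through apply() to order() is the only change to the technique.

```r
# Sample data copied from the question (first five tickers only)
Data <- data.frame(
  SPY = c(8.1985485, -0.5621328, -2.6957979, -1.0248091, -6.1165559, -8.8900629),
  DIA = c(7.8349806, 1.7198760, -3.5838102, -0.1997433, -5.0276558, -12.2663267),
  IWM = c(7.935566, 2.141846, 2.786250, -5.725078, -2.461728, -15.760037),
  SMH = c(21.223832, -10.904936, 4.671762, -3.354391, -6.218129, -39.321172),
  OIH = c(13.975655, -4.565291, -23.241009, -9.161594, -13.956695, -16.902913),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01",
                "2001-07-02", "2001-08-01", "2001-09-04")
)

# Column indices for each row, largest return first
id_desc <- t(apply(Data, 1, order, decreasing = TRUE))

# Ticker names in ranked order; dimnames keep the dates attached to the rows
ranked <- matrix(names(Data)[id_desc], ncol = ncol(Data),
                 dimnames = list(rownames(Data),
                                 paste0("rank", seq_len(ncol(Data)))))

best  <- ranked[, 1]             # top performer per month
worst <- ranked[, ncol(ranked)]  # bottom performer per month
```

Because ranked is an ordinary character matrix, something like ranked["2001-06-01", ] would return the full ranking for a single month.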

If you want to keep your loop, you can use a list. As Joshua said, you overwrite s on every iteration, so initialize a list first to store the results. This loop gives the same results as the lapply() code above, but without the id matrix. There is no gain in speed from either approach, although using apply has other benefits:

N <- nrow(Data)
s <- vector("list",N)
for (i in 1:N) {
    s[[i]] <- sort(Data[i,])
}

I tested the code using the following sample data (please provide your own in the future, using either this approach or, for example, dput()):

zz <- textConnection(" SPY         DIA        IWM        SMH        OIH
  8.1985485   7.8349806   7.935566  21.223832  13.975655
 -0.5621328   1.7198760   2.141846 -10.904936  -4.565291
 -2.6957979  -3.5838102   2.786250   4.671762 -23.241009
 -1.0248091  -0.1997433  -5.725078  -3.354391  -9.161594
 -6.1165559  -5.0276558  -2.461728  -6.218129 -13.956695
 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913 ")

Data <- read.table(zz, header = TRUE)
close(zz)
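
For readers unfamiliar with the dput() route mentioned above, here is a small sketch of the round trip. The two-column Data and the names txt and Data2 are illustrative assumptions, not the asker's real object: dput() prints an R expression that rebuilds the object, and anyone can paste that text into their own session.

```r
# A tiny stand-in for the real data frame (values taken from the question)
Data <- data.frame(
  SPY = c(8.1985485, -0.5621328),
  DIA = c(7.8349806, 1.7198760)
)

# dput() writes a structure(list(...)) expression to the console;
# that text is what you would paste into a question
dput(Data)

# Simulate the reader's side: capture the text and evaluate it again
txt   <- paste(capture.output(dput(Data)), collapse = "\n")
Data2 <- eval(parse(text = txt))
```

For a large object, dput(head(Data)) keeps the pasted text short.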
渡你暖光 2024-11-07 04:29:55

Using your original code to save each sorted row in a list:

stockdatareturnpercent <- read.table(textConnection("                  SPY         DIA        IWM        SMH        OIH        
2001-04-02  8.1985485   7.8349806   7.935566  21.223832  13.975655  
2001-05-01 -0.5621328   1.7198760   2.141846 -10.904936  -4.565291  
2001-06-01 -2.6957979  -3.5838102   2.786250   4.671762 -23.241009 
2001-07-02 -1.0248091  -0.1997433  -5.725078  -3.354391  -9.161594  
2001-08-01 -6.1165559  -5.0276558  -2.461728  -6.218129 -13.956695  
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913"))

x <- vector("list", nrow(stockdatareturnpercent))

## use unlist to drop the data.frame structure
for (i in 1:nrow(stockdatareturnpercent)) {  
    x[[i]] <- sort(unlist(stockdatareturnpercent[i,])  )
} 
## use the row names to name each list element
names(x) <- rownames(stockdatareturnpercent)

x

$`2001-04-02`
      DIA       IWM       SPY       OIH       SMH 
 7.834981  7.935566  8.198548 13.975655 21.223832 

$`2001-05-01`
        SMH         OIH         SPY         DIA         IWM 
-10.9049360  -4.5652910  -0.5621328   1.7198760   2.1418460 

$`2001-06-01`
       OIH        DIA        SPY        IWM        SMH 
-23.241009  -3.583810  -2.695798   2.786250   4.671762 

$`2001-07-02`
       OIH        IWM        SMH        SPY        DIA 
-9.1615940 -5.7250780 -3.3543910 -1.0248091 -0.1997433 

$`2001-08-01`
       OIH        SMH        SPY        DIA        IWM 
-13.956695  -6.218129  -6.116556  -5.027656  -2.461728 

$`2001-09-04`
       SMH        OIH        IWM        DIA        SPY 
-39.321172 -16.902913 -15.760037 -12.266327  -8.890063 

Alternatively, use apply directly to sort each row. Note that this does not preserve the element names:

apply(stockdatareturnpercent, 1, sort)

That returns a matrix where each column is the sorted row. Then transpose:

sortmat <- t(apply(stockdatareturnpercent, 1, sort))

If you need the result as a data.frame, convert it with as.data.frame():

sortdf <- as.data.frame(sortmat)

Finally, all of that in one line:

sortdf <- as.data.frame(t(apply(stockdatareturnpercent, 1, sort)))
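
Since the apply() version loses the ticker names, one possible way to keep them while still ending up with a single rectangular object is a long-format table. This is only a sketch under assumptions: the column names date, rank, ticker, and ret and the object ranked_long are made up for illustration, and only the first three months are used to keep it short.

```r
# First three months of the question's data, five tickers
stockdatareturnpercent <- data.frame(
  SPY = c(8.1985485, -0.5621328, -2.6957979),
  DIA = c(7.8349806, 1.7198760, -3.5838102),
  IWM = c(7.935566, 2.141846, 2.786250),
  SMH = c(21.223832, -10.904936, 4.671762),
  OIH = c(13.975655, -4.565291, -23.241009),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# For each date: sort the row best to worst, then record rank, ticker, return
ranked_long <- do.call(rbind, lapply(rownames(stockdatareturnpercent), function(d) {
  r <- sort(unlist(stockdatareturnpercent[d, ]), decreasing = TRUE)
  data.frame(date = d, rank = seq_along(r), ticker = names(r),
             ret = unname(r), stringsAsFactors = FALSE)
}))

# Top performer of each month
ranked_long[ranked_long$rank == 1, ]
```

The trade-off is one row per date and ticker instead of one row per date, which makes filtering by rank or ticker straightforward.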