Reshaping time series data from wide to tall format (for plotting)

Posted on 2024-07-29 05:56:28

I have a data frame containing multiple time series of returns, stored in columns.

The first column contains dates, and subsequent columns are independent time series each with a name. The column headers are the variable names.

## I have a data frame like this
t <- seq(as.Date('2009-01-01'),by='days',length=10)
X <- rnorm(10,0,1)
Y <- rnorm(10,0,2)
Z <- rnorm(10,0,4)

dat <- data.frame(t,X,Y,Z)

## which appears as
           t          X          Y         Z
1 2009-01-01 -1.8763317 -0.1885183 -6.655663
2 2009-01-02 -1.3566227 -2.1851226 -3.863576
3 2009-01-03 -1.3447188  2.4180249 -1.543931

I want to plot each time series as a line in its own panel of a lattice plot, with each panel labeled by the variable name.

To plot this with lattice, the data must be in a tall format, like this:

           t symbol      price
1 2009-01-01      X -1.8763317
2 2009-01-01      Y -0.1885183
3 2009-01-01      Z -6.655663

What is a good function call to do this?

Comments (5)

江城子 2024-08-05 05:56:28

You can also use melt() from the 'reshape' package (I think it's easier to use than reshape() itself). That saves you the extra step of adding the time column back in:

> library(reshape)
> m <- melt(dat,id="t",variable_name="symbol")
> names(m) <- sub("value","price",names(m))
> head(m)
           t symbol       price
1 2009-01-01      X -1.14945096
2 2009-01-02      X -0.07619870
3 2009-01-03      X  0.01547395
4 2009-01-04      X -0.31493143
5 2009-01-05      X  1.26985167
6 2009-01-06      X  1.31492397
> class(m$t)
[1] "Date"
> library(lattice)                                                              
> xyplot( price ~ t | symbol, data=m ,type ="l", layout = c(1,3) )

For this particular task, however, I would consider using the 'zoo' package, which does not require you to reshape the data frame:

> library(zoo)                                                                  
> zobj <- zoo(dat[,-1],dat[,1])                                                 
> plot(zobj,col=rainbow(ncol(zobj))) 

R developers and contributors (Gabor and Hadley in this case) have blessed us with many great choices. And we can't forget Deepayan for the lattice package.

讽刺将军 2024-08-05 05:56:28

From the tidyr gather() help page:

Examples

library(tidyr)
library(dplyr)
# From http://stackoverflow.com/questions/1181060
stocks <- data.frame(
  time = as.Date('2009-01-01') + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4)
)

gather(stocks, stock, price, -time)
stocks %>% gather(stock, price, -time)
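
A note for readers on current tidyr: gather() still works, but its help page marks it as superseded in favor of pivot_longer(). A minimal sketch of the equivalent call, assuming tidyr 1.0.0 or later is installed (the seed and the name 'long' are only for illustration, not part of the help page example):

```r
library(tidyr)

set.seed(1)  # illustrative seed so the example is reproducible
stocks <- data.frame(
  time = as.Date("2009-01-01") + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4)
)

# Stack every column except 'time': column names go to 'stock', values to 'price'
long <- pivot_longer(stocks, cols = -time,
                     names_to = "stock", values_to = "price")
head(long)
```

The result is a tibble with one row per date and stock, so 30 rows for this data.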
遮了一弯 2024-08-05 05:56:28

If it is a multivariate time series, consider storing it as a zoo object by using the package of the same name. This makes indexing, merging, and subsetting a lot easier; see the zoo vignettes.

But since you asked about lattice plots, that can be done too. In this example, we construct a simple 'long' data.frame with a date column, a value column 'val', and a variable id column 'var':

> set.seed(42)
> D <- data.frame(date=rep(seq(as.Date("2009-01-01"),by="week",length.out=30),2),
+                 val=c(cumsum(rnorm(30)), cumsum(rnorm(30))),
+                 var=c(rep("x1",30), rep("x2",30)))

Given that dataset, the plot you describe is produced by xyplot from the lattice package. The formula asks for 'value against date, conditioned on variable', and we turn on lines in each panel:

> library(lattice)
> xyplot(val ~ date | var, data=D, panel=panel.lines)
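
If the only goal is the lattice plot, the reshape step can possibly be skipped altogether: lattice's extended formula interface accepts several variables on the left-hand side, and outer=TRUE puts each one in its own panel. A sketch using the 'dat' data frame from the question, rebuilt here with an arbitrary seed so the snippet runs on its own:

```r
library(lattice)

set.seed(42)  # arbitrary seed, only for reproducibility
t <- seq(as.Date("2009-01-01"), by = "days", length.out = 10)
dat <- data.frame(t, X = rnorm(10, 0, 1), Y = rnorm(10, 0, 2), Z = rnorm(10, 0, 4))

# X + Y + Z on the left-hand side with outer = TRUE gives one panel per column
p <- xyplot(X + Y + Z ~ t, data = dat, outer = TRUE,
            type = "l", layout = c(1, 3))
print(p)  # explicit print() so the plot is also drawn when run as a script
```

The panels share a y-axis by default; passing scales=list(y=list(relation="free")) gives each series its own range.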
魂ガ小子 2024-08-05 05:56:28

For a data frame 'temp' with the date in the first column and values in each of the other columns:

> par(mfrow=c(3,4)) # 3x4 grid of plots
> mapply(plot,temp[,-1],main=names(temp)[-1],MoreArgs=list(x=temp[,1],xlab="Date",type="l",ylab="Value") )
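
A self-contained version of the same idea, written as a plain loop over the 'dat' data frame from the question instead of 'temp'. This is only a sketch: the seed and the 1x3 layout are arbitrary choices, and 'series' is just an illustrative name.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
t <- seq(as.Date("2009-01-01"), by = "days", length.out = 10)
dat <- data.frame(t, X = rnorm(10, 0, 1), Y = rnorm(10, 0, 2), Z = rnorm(10, 0, 4))

series <- names(dat)[-1]                  # every column except the date
op <- par(mfrow = c(1, length(series)))   # one row, one plot per series
for (s in series) {
  plot(dat$t, dat[[s]], type = "l", main = s, xlab = "Date", ylab = "Value")
}
par(op)                                   # restore the previous layout
```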
放肆 2024-08-05 05:56:28

Many thanks for the answers, folks. Dirk's answer was on the mark.

The missing step turned out to be using the stack() function to convert the data frame from a wide to a long format. I'm aware there may be an easier way to do this with the reshape() function; I'd be happy to see an example if someone wants to post it.

So here's what I ended up doing, using the 'dat' data frame mentioned in the question:

## use stack() to reshape the data frame to a long format
## <time> <stock> <price>
stackdat <- stack(dat,select=-t) 
names(stackdat) <- c('price','symbol')

## create a column of date & bind to the new data frame
nsymbol <- length(levels(stackdat$symbol))  
date <- rep(dat$t, nsymbol)  
newdat <- cbind(date,stackdat)

## plot it with lattice
library(lattice)
xyplot(price ~ date | symbol,  ## one panel per 'symbol'
       data=newdat,            ## data source
       type='l',               ## line
       layout=c(nsymbol,1))    ## all panels in a single row

## or plot it with ggplot2
library(ggplot2)
qplot(date, price, data = newdat, geom="line") + facet_grid(. ~ symbol)
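
For completeness, the reshape() route mentioned above might look like the following. It is a sketch against the same 'dat' layout, rebuilt here with an arbitrary seed so it runs on its own; the name 'long' is just for illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
t <- seq(as.Date("2009-01-01"), by = "days", length.out = 10)
dat <- data.frame(t, X = rnorm(10, 0, 1), Y = rnorm(10, 0, 2), Z = rnorm(10, 0, 4))

# Stack X, Y and Z into a single 'price' column; 'symbol' records the source column
long <- reshape(dat, direction = "long",
                varying = c("X", "Y", "Z"), v.names = "price",
                timevar = "symbol", times = c("X", "Y", "Z"),
                idvar = "t")
row.names(long) <- NULL  # drop the "date.symbol" row names reshape() generates
head(long)
```

Because 't' is passed as idvar, the date column is carried along with its Date class, so no separate cbind() step is needed.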