Applying ksmooth to a time series

Posted 2024-10-05 03:52:57

I have the following problem:

I have a data frame "test" that looks more or less like this:

Date         return     price      vol   
20100902     0.3        15         8.5
20100902     0.4        17         8.6
20100902     0.6        19         8.7
.....
20100903     0.2        13         8.2
20100903     0.4        17         8.6
20100903     0.8        21         9.0
.....

So I have values for each date (10 per day). What I would like to do now is apply ksmooth() to each date separately, e.g. ksmooth(return, price, n.points = 50) for each date. This should give me 50 observations per date. In addition, I would like a time stamp for the interpolated values. So the resulting frame should look like

Date         return     price         
20100620     0.3        15  
20100620     0.31       15.2
20100620     0.32       15.3 
20100620     0.4        17         
20100620     0.6        19        
.....
20100621     0.2        13     
20100621     0.21       13.1
20100621     0.22       13.2
20100621     0.4        17         
20100621     0.8        21     
etc.

with 50 observations per day.
So here is what I'm looking for: take the first 10 observations (e.g. date 1 = 20100620), interpolate, and put a time stamp on the interpolated values (20100620). Then take the second 10 observations (date = 20100621), interpolate, and put a time stamp on the interpolated values (20100621), and so on.

I'm quite new to R, but this is what I tried. I thought of using the zoo() function for that. Before implementing anything, I wanted to make my date entries unique, so I just added hours to each entry

test <- read.zoo("test.txt", format = "%Y%m%d")
test <- zoo(test, as.POSIXct(time(test)) + 1:26)

There is probably something wrong with that, because R complained.
Then I thought of using the rollapply() function.

roll.test <- rollapply(test, 10, FUN = function(x,y) ksmooth(test$return,
    test$price, "normal", bandwidth = 20, n.points = 50) )

Unfortunately the result is very confusing, and the by.column = FALSE argument does not work.

I would very much appreciate some help. It does not have to build upon my "trial version" at all.
Thank you very much
Dani

My data looks like this:

"date" "days" "return" "price" 
"66" 20100620 91 0.18 1389.373 
"67" 20100620 91 0.19 1370.57 
"68" 20100620 91 0.19 1353.122 
"69" 20100620 91 0.19 1336.291 
"70" 20100620 91 0.20 1319.774 
"71" 20100620 91 0.20 1303.341 
"72" 20100620 91 0.21 1286.656 
"326" 20100621 91 0.18 1386.28 
"327" 20100621 91 0.18 1367.694 
"328" 20100621 91 0.19 1350.375 
"329" 20100621 91 0.19 1333.615 
"330" 20100621 91 0.20 1317.164 
"331" 20100621 91 0.20 1300.783 
"332" 20100621 91 0.21 1284.113 

Comments (1)

茶底世界 2024-10-12 03:52:57

The problem is that the ksmooth function returns a list, and rollapply stores those lists as they are. By the way, I don't think you even want to use rollapply, as it does not work per date but "rolls" a fixed-width window over the data frame. From your explanation, I believe that is not the desired behaviour.
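To see what the paragraph above is describing, here is a minimal sketch of what ksmooth() hands back for a single date. The three value pairs are taken from the example table in the question; the names `ret`, `price`, `sm` and `sm_df` are only for illustration.

```r
# Toy data for one date (values from the example table in the question)
ret   <- c(0.3, 0.4, 0.6)
price <- c(15, 17, 19)

# ksmooth() is in the stats package, which is attached by default
sm <- ksmooth(ret, price, kernel = "normal", bandwidth = 2, n.points = 50)

is.list(sm)      # the result is a list, not a vector or data frame
names(sm)        # its components are the evaluation points and the fitted values
length(sm$x)     # one entry per requested evaluation point

# as.data.frame() turns the two components into two columns,
# which is what makes the per-date results easy to stack later
sm_df <- as.data.frame(sm)
dim(sm_df)
```

Because each call yields a two-component list rather than a single number per window, a rolling function has no natural way to arrange the results, which is probably why the rollapply output looked confusing.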

I couldn't really work it out using a zoo object, as that class is quite restrictive. Maybe somebody else will show you how. You can construct the data frame using the ddply function from the plyr package:

library(plyr)

# smooth price against return separately for each Date
tt <- ddply(test, .(Date),
  function(x) {
       as.data.frame(ksmooth(x$return, x$price, "normal", bandwidth = 2, n.points = 50))
  })

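tt can then be transformed to a zoo object. The Date column is dropped from the data here because zoo stores its data as a matrix, so keeping a Date column would most likely coerce every value to character:

library(zoo)
tt2 <- zoo(tt[, names(tt) != "Date"], as.POSIXct(tt$Date) + 1:50)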
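The conversion above adds 1:50 seconds to each date, which gives unique time stamps only as long as every date contributes exactly 50 rows. As a hedged alternative, the sketch below builds a within-date counter with ave(), so the offsets stay unique even if the group sizes differ. It uses base R only; the `dates` vector and the names `k` and `stamps` are made up for illustration.

```r
# Hypothetical Date column with unequal group sizes (3 rows, then 2 rows)
dates <- as.Date(c("2010-09-02", "2010-09-02", "2010-09-02",
                   "2010-09-03", "2010-09-03"))

# Running counter that restarts at 1 for every date
k <- ave(seq_along(dates), format(dates), FUN = seq_along)

# Midnight (UTC) of each date plus k seconds: unique within and across dates
stamps <- as.POSIXct(format(dates), tz = "UTC") + k

k
anyDuplicated(stamps)   # 0 means no duplicated time stamps
```

A vector like `stamps` could then be passed as the index in place of `as.POSIXct(tt$Date) + 1:50`.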

Alternatively, you could do it by hand using a bit of list manipulation. Again, the resulting tt can be converted to a zoo object with the line above.

tt <- split(test,test$Date)

tt <- lapply(tt,function(x){
        as.data.frame(ksmooth(x$return,x$price,"normal",bandwidth=2,n.points=50))
      })

tt <- do.call(rbind,tt)
names(tt) <- c("return","price")
tt$Date <- as.Date(gsub("\\.\\d+","",rownames(tt)))

Mind you, I used read.table() to construct test:

zz <- textConnection(
"Date    ,     return ,    price  ,    vol
20100902 ,    0.3  ,      15   ,      8.5
20100902 ,    0.4  ,      17   ,      8.6
20100902 ,    0.6  ,      19   ,      8.7
20100903 ,    0.2  ,      13   ,      8.2
20100903 ,    0.4  ,      17   ,      8.6
20100903 ,    0.8  ,      21   ,      9.0"
)
test <- read.table(zz, header = TRUE, sep = ",")
test$Date <- as.Date(as.character(test$Date),format="%Y%m%d")
close(zz)
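
Putting the pieces together, the following is a self-contained sketch of the split/lapply approach in base R, using the six example rows above. The helper name `smooth_one_day` is an assumption for illustration, and bandwidth = 2 is simply carried over from the code above rather than tuned; a suitable value depends on the scale of the real return data.

```r
# Example data, read from an inline string instead of a text connection
test <- read.table(text = "
Date     return price vol
20100902 0.3    15    8.5
20100902 0.4    17    8.6
20100902 0.6    19    8.7
20100903 0.2    13    8.2
20100903 0.4    17    8.6
20100903 0.8    21    9.0
", header = TRUE)
test$Date <- as.Date(as.character(test$Date), format = "%Y%m%d")

# Hypothetical helper: smooth one day's rows and stamp the result with that day
smooth_one_day <- function(d) {
  sm <- ksmooth(d$return, d$price, kernel = "normal",
                bandwidth = 2, n.points = 50)
  data.frame(Date = d$Date[1], return = sm$x, price = sm$y)
}

# One data frame per date, then stack them
pieces <- lapply(split(test, test$Date), smooth_one_day)
tt <- do.call(rbind, pieces)
rownames(tt) <- NULL

table(tt$Date)   # number of smoothed rows per date
head(tt, 3)
```

Attaching the date inside the helper avoids having to recover it from the row names afterwards.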