R xts and data.table

Posted on 2025-01-06 01:13:12

I can convert a data.table to an xts object just as I do with a data.frame:

> library(xts)
> library(data.table)
> df = data.frame(x = c("a", "b", "c", "d"), v = rnorm(4))
> dt = data.table(x = c("a", "b", "c", "d"), v = rnorm(4))
> xts(df, as.POSIXlt(c("2011-01-01 15:30:00", "2011-01-02 15:30:00", "2011-01-03 15:50:50", "2011-01-04 15:30:00")))
                    x   v           
2011-01-01 15:30:00 "a" "-1.2232283"
2011-01-02 15:30:00 "b" "-0.1654551"
2011-01-03 15:50:50 "c" "-0.4456202"
2011-01-04 15:30:00 "d" "-0.9416562"
> xts(dt, as.POSIXlt(c("2011-01-01 15:30:00", "2011-01-02 15:30:00", "2011-01-03 15:50:50", "2011-01-04 15:30:00")))
                    x   v           
2011-01-01 15:30:00 "a" " 1.3089579"
2011-01-02 15:30:00 "b" "-1.7681071"
2011-01-03 15:50:50 "c" "-1.4375100"
2011-01-04 15:30:00 "d" "-0.2467274"

Is there any issue in using data.table with xts?

Comments (2)

川水往事 2025-01-13 01:13:12

Just to resolve an open question.

As Vincent pointed out in the comments, there is no issue with that.

Conversion between the two is included in data.table 1.9.5. The code below is similar to what was added:

as.data.table.xts <- function(x, keep.rownames = TRUE){
  # pass each condition as its own argument so that all of them are checked
  stopifnot(requireNamespace("xts", quietly = TRUE), !missing(x), xts::is.xts(x))
  r = setDT(as.data.frame(x), keep.rownames = keep.rownames)
  if(!keep.rownames) return(r[])
  setnames(r,"rn","index")
  setkeyv(r,"index")[]
}

as.xts.data.table <- function(x){
  # the first column must be the time index (POSIXct or Date)
  stopifnot(requireNamespace("xts", quietly = TRUE), !missing(x), is.data.table(x), any(class(x[[1]]) %in% c("POSIXct","Date")))
  colsNumeric = sapply(x, is.numeric)[-1] # exclude first col, xts index
  if(any(!colsNumeric)){
    warning(paste("Following columns are not numeric and will be omitted:",paste(names(colsNumeric)[!colsNumeric],collapse=", ")))
  }
  r = setDF(x[,.SD,.SDcols=names(colsNumeric)[colsNumeric]])
  rownames(r) <- x[[1]]
  xts::as.xts(r)
}
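
For reference, here is a minimal round-trip sketch using the methods that ship with data.table, assuming an installed version that exports `as.xts.data.table` and provides the `as.data.table` method for xts objects. The toy data and the column names `index` and `v` are made up for illustration; the first column has to be the time index and the remaining columns numeric.

```r
library(data.table)
library(xts)

# Toy data (hypothetical): the first column is the time index,
# the remaining column is numeric
dt <- data.table(
  index = as.POSIXct(c("2011-01-01 15:30:00", "2011-01-02 15:30:00",
                       "2011-01-03 15:50:50", "2011-01-04 15:30:00"),
                     tz = "UTC"),
  v = c(1.5, -0.5, 0.25, 2)
)

# data.table -> xts: the first column becomes the xts index
x <- as.xts.data.table(dt)

# xts -> data.table: the index comes back as a regular column
dt2 <- as.data.table(x)
```

Keeping only numeric columns matters because an xts object stores its data as a single matrix. A character column, like `x` in the question, forces every value to character, which is why the numbers in the question's output are printed in quotes.
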
冷︶言冷语的世界 2025-01-13 01:13:12

Because of quantmod, it is common to have an xts with the symbol embedded in all the column names. (e.g. "SPY.Open", "SPY.High", etc.). So, here is an alternative to Jan's as.data.table.xts that puts the symbol in a separate column, which is more natural in data.tables (since you're probably going to rbind a bunch of these before doing any analysis).

as.data.table.xts <- function(x, ...) {
  cn <- colnames(x)
  sscn <- strsplit(cn, "\\.")  
  tclass(x) <- c('POSIXct', 'POSIXt') # coerce index to POSIXct (indexClass<- is deprecated in recent xts)
  DT <- data.table(time=index(x), coredata(x))
  #DT <- data.table(IDateTime(index(x)), coredata(x))

  ## If there is a Symbol embedded in the colnames, strip it out and make it a 
  ## column
  if (all(sapply(sscn, "[", 1) == sscn[[1]][1])) {
    Symbol <- sscn[[1]][1]
    # fixed = TRUE so the "." is matched literally rather than as a regex wildcard
    setnames(DT, names(DT)[-1], sub(paste0(Symbol, "."), "", cn, fixed = TRUE))
    DT <- DT[, Symbol:=Symbol]
    setkey(DT, Symbol, time)[]
  } else {
    setkey(DT, time)[]
  }
}

library(data.table)
library(quantmod)
getSymbols("SPY")
as.data.table(SPY)
            time   Open   High    Low  Close   Volume Adjusted Symbol
   1: 2007-01-03 142.25 142.86 140.57 141.37 94807600   120.36    SPY
   2: 2007-01-04 141.23 142.05 140.61 141.67 69620600   120.61    SPY
   3: 2007-01-05 141.33 141.40 140.38 140.54 76645300   119.65    SPY
   4: 2007-01-08 140.82 141.41 140.25 141.19 71655000   120.20    SPY
   5: 2007-01-09 141.31 141.60 140.40 141.07 75680100   120.10    SPY
  ---                                                                
1993: 2014-12-01 206.30 206.60 205.38 205.64 12670100   205.64    SPY
1994: 2014-12-02 205.81 207.34 205.78 207.09 72105500   207.09    SPY
1995: 2014-12-03 207.30 208.15 207.10 207.89 69450000   207.89    SPY
1996: 2014-12-04 207.54 208.27 206.70 207.66 89928200   207.66    SPY
1997: 2014-12-05 207.87 208.47 207.55 208.00 85031000   208.00    SPY
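
To illustrate the rbind step mentioned above without downloading anything, here is a small offline sketch. The helpers `make_toy` and `xts_to_symbol_dt`, the symbols `AAA` and `BBB`, and all the prices are hypothetical and only stand in for real quantmod output. It assumes every column name has the form `Symbol.Field`.

```r
library(data.table)
library(xts)

# Hypothetical helper: build a tiny quantmod-style xts ("AAA.Open", "AAA.Close")
make_toy <- function(symbol, open, close) {
  m <- cbind(open, close)
  colnames(m) <- paste(symbol, c("Open", "Close"), sep = ".")
  xts(m, order.by = as.Date("2014-12-01") + 0:1)
}

# Hypothetical helper: move the symbol out of the column names into its own column
xts_to_symbol_dt <- function(x) {
  cn <- colnames(x)
  symbol <- strsplit(cn[1], ".", fixed = TRUE)[[1]][1]
  DT <- data.table(time = index(x), coredata(x))
  setnames(DT, cn, sub(paste0(symbol, "."), "", cn, fixed = TRUE))
  DT[, Symbol := symbol]
  DT[]
}

aaa <- make_toy("AAA", open = c(10, 11), close = c(10.5, 11.5))
bbb <- make_toy("BBB", open = c(20, 21), close = c(20.5, 21.5))

# Stack all symbols into one long table, then key it by symbol and time
all_dt <- rbindlist(lapply(list(aaa, bbb), xts_to_symbol_dt))
setkey(all_dt, Symbol, time)

# Grouped analysis is now a one-liner
mean_close <- all_dt[, .(mean_close = mean(Close)), by = Symbol]
```

With the symbol in its own column, the stacked table has the same columns for every instrument, so `by = Symbol` replaces looping over separate xts objects.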