colClasses for date and time with read.csv

Posted 2024-10-21 06:02:14

I have some data of the form:

date,time,val1,val2
20090503,0:05:12,107.25,1
20090503,0:05:17,108.25,20
20090503,0:07:45,110.25,5
20090503,0:07:56,106.25,5

that comes from a csv file. I am relatively new to R, so I tried

data <- read.csv("sample.csv", header = TRUE, sep = ",")

and tried using POSIXlt, as well as POSIXct, in the colClasses argument, but I can't seem to create one column or 'variable' out of my date and time data. I want to do this so that I can then choose arbitrary timeframes over which to calculate running statistics such as max, min, and mean (and then boxplots, etc.).

I also thought that I might convert it to a time series and get around it that way,

dataTS <- ts(data)

but I have not yet been able to use the start, end, and frequency arguments to my advantage. Thanks for your help.

Comments (1)

感性 2024-10-28 06:02:14

You can't do this while reading the data into R with the colClasses argument, because the date and time span two "columns" in the CSV file. Instead, load the data and then combine the date and time columns into a single POSIXlt variable:

dat <- read.csv(textConnection("date,time,val1,val2
                               20090503,0:05:12,107.25,1
                               20090503,0:05:17,108.25,20
                               20090503,0:07:45,110.25,5
                               20090503,0:07:56,106.25,5"))
dat <- within(dat, Datetime <- as.POSIXlt(paste(date, time),
                                          format = "%Y%m%d %H:%M:%S"))

[I presume the date is year, month, day. If it is year, day, month instead, use "%Y%d%m %H:%M:%S".]

Which gives:

> head(dat)
      date    time   val1 val2            Datetime
1 20090503 0:05:12 107.25    1 2009-05-03 00:05:12
2 20090503 0:05:17 108.25   20 2009-05-03 00:05:17
3 20090503 0:07:45 110.25    5 2009-05-03 00:07:45
4 20090503 0:07:56 106.25    5 2009-05-03 00:07:56
> str(dat)
'data.frame':   4 obs. of  5 variables:
 $ date    : int  20090503 20090503 20090503 20090503
 $ time    : Factor w/ 4 levels "0:05:12","0:05:17",..: 1 2 3 4
 $ val1    : num  107 108 110 106
 $ val2    : int  1 20 5 5
 $ Datetime: POSIXlt, format: "2009-05-03 00:05:12" "2009-05-03 00:05:17" ...
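
As a variation, here is a minimal sketch that uses colClasses only to keep the raw date and time columns as character, and then builds a POSIXct column instead of POSIXlt. POSIXct stores each timestamp as a single number, which is often more convenient inside a data frame. The names csv_text and dat2 and the choice of tz = "UTC" are assumptions made for illustration; the original data do not state a time zone.

```r
# Sketch: read date/time as character via colClasses, then build one POSIXct column.
# tz = "UTC" is an assumption; the original data do not state a time zone.
csv_text <- "date,time,val1,val2
20090503,0:05:12,107.25,1
20090503,0:05:17,108.25,20
20090503,0:07:45,110.25,5
20090503,0:07:56,106.25,5"

dat2 <- read.csv(text = csv_text,
                 colClasses = c("character", "character", "numeric", "integer"))

# Paste the two text columns together and parse them in one step
dat2$Datetime <- as.POSIXct(paste(dat2$date, dat2$time),
                            format = "%Y%m%d %H:%M:%S", tz = "UTC")

str(dat2)
```

Reading the columns as character also avoids the time column being turned into a factor, as happened in the str() output above.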

You can now delete the date and time columns if you wish:

> dat <- dat[, -(1:2)]
> head(dat)
    val1 val2            Datetime
1 107.25    1 2009-05-03 00:05:12
2 108.25   20 2009-05-03 00:05:17
3 110.25    5 2009-05-03 00:07:45
4 106.25    5 2009-05-03 00:07:56
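
With a single date-time column in place, choosing an arbitrary timeframe and computing statistics over it might look like the sketch below. It rebuilds the sample data so that it runs on its own. The names dat3, from, to, win, window_stats and bin_means, the tz = "UTC" setting, the window bounds, and the 2-minute bin width are all illustrative assumptions, not part of the original question or answer.

```r
# Sketch: subset by an arbitrary time window, then summarise val1.
# tz = "UTC", the window bounds and the bin width are illustrative assumptions.
dat3 <- read.csv(text = "date,time,val1,val2
20090503,0:05:12,107.25,1
20090503,0:05:17,108.25,20
20090503,0:07:45,110.25,5
20090503,0:07:56,106.25,5",
                 colClasses = c("character", "character", "numeric", "integer"))
dat3$Datetime <- as.POSIXct(paste(dat3$date, dat3$time),
                            format = "%Y%m%d %H:%M:%S", tz = "UTC")

# Window of interest: from 00:05:00 (inclusive) to 00:06:00 (exclusive)
from <- as.POSIXct("2009-05-03 00:05:00", tz = "UTC")
to   <- as.POSIXct("2009-05-03 00:06:00", tz = "UTC")

win <- dat3[dat3$Datetime >= from & dat3$Datetime < to, ]
window_stats <- c(max = max(win$val1), min = min(win$val1), mean = mean(win$val1))
print(window_stats)

# Alternatively, group rows into fixed 2-minute bins and average val1 per bin
dat3$bin <- cut(dat3$Datetime, breaks = "2 min")
bin_means <- tapply(dat3$val1, dat3$bin, mean)
print(bin_means)
```

The same win subset can be passed to boxplot(), and the bin column can serve as the grouping variable if per-interval boxplots are wanted.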