R: plotting a smoothed density of time series data
I wish to make a probability distribution of some time series data. My data is in the following format:
00:00, 3
01:00, 50
05:00, 13
10:00, 34
17:00, 80
21:00, 100
The time column has some missing values that R will have to interpolate. I want to get a nice smooth curve to highlight the busy periods. I have tried with ts, density and plot, but these don't produce what I'm after. For example,
data1 <- read.csv(file="c:\\abc\\ts.csv", head=FALSE, sep=",")
data1$V1 <- strptime(data1$V1, format="%H:%M")
plot(data1$V2, density(data1$V1), type="l")
But this gives me lines drawn in crazy order and as a probability distribution.
Comments (2)
I think you are definitely after package zoo, which has several functions to deal with NAs. See na.aggregate, na.approx and na.locf also.
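The zoo functions named in the first answer can be sketched as follows. This is a minimal illustration rather than code from the original thread: it assumes the zoo package is installed, and the hourly grid and the variable names (hours, counts, grid, z_linear, z_locf, z_mean) are made up for the example, with the values copied from the question.

```r
library(zoo)

# Observed hours and counts, copied from the question
hours  <- c(0, 1, 5, 10, 17, 21)
counts <- c(3, 50, 13, 34, 80, 100)

# Put the observations on a regular hourly grid (assumption: one slot per hour);
# hours without an observation stay NA
grid <- rep(NA_real_, 24)
grid[hours + 1] <- counts
z <- zoo(grid, order.by = 0:23)

# Three ways zoo can fill the gaps
z_linear <- na.approx(z, na.rm = FALSE)  # linear interpolation between neighbours
z_locf   <- na.locf(z, na.rm = FALSE)    # carry the last observation forward
z_mean   <- na.aggregate(z)              # replace NAs with the mean of the observed values

# Interpolated line with the original observations overlaid
plot(z_linear, type = "l", xlab = "Hour of day", ylab = "Count")
points(z)
```

With na.rm = FALSE, na.approx leaves the hours after the last observation (22 and 23 here) as NA instead of dropping them, so the result keeps the full 24-hour index.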
You made it a little harder than you might realize. I'll make it easier for now by adding a date in front of your times.
Also, I added a variable "texinp" and a textConnection() statement so you can cut/paste the following code and run it directly. The data is loaded into variable texinp and is read by the read.zoo statement in a similar way to reading a .csv file. For now, this will allow you to plot things and gives you an idea of how to read .csv files using read.zoo.
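The code this answer refers to did not survive in the page as captured here. A rough sketch of the approach it describes might look like the following; only the name texinp and the use of textConnection() and read.zoo come from the answer, while the date 2012-01-01, the format string, the time zone and the plot settings are assumptions made for illustration.

```r
library(zoo)

# Question data with an arbitrary date (assumption) put in front of each time
texinp <- "2012-01-01 00:00, 3
2012-01-01 01:00, 50
2012-01-01 05:00, 13
2012-01-01 10:00, 34
2012-01-01 17:00, 80
2012-01-01 21:00, 100"

# Read the text as if it were a .csv file; supplying tz and format makes
# read.zoo build a POSIXct index from the first column
z <- read.zoo(textConnection(texinp), sep = ",", header = FALSE,
              tz = "UTC", format = "%Y-%m-%d %H:%M")

# Line plot with markers at the observed times
plot(z, type = "o", xlab = "Time", ylab = "Count")
```

To read from disk instead, the textConnection(texinp) argument would be replaced by the path of the .csv file, assuming the file carries the same date-and-time layout.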
From your question, you talked about "busy periods". I may be wrong, but I'm assuming that the value of 100 at time 21:00 is the "busiest period". If that's true, then you don't need a density plot, and the above plot is what you're after.
Let me know if I'm wrong.