R ggplot: how to use a custom smoother (Gaussian process)

Posted on 2024-09-03 22:00:48

I'm using R. I have 25 variables over 15 time points, with 3 or more replicates per variable per time point. I've melted this into a data.frame, which I can plot happily using (amongst other things) ggplot's facet_wrap() command. My melted data frame is called lis; here's its head and tail, so you get an idea of the data:

> head(lis)
  time variable    value
1   10     SELL 8.170468
2   10     SELL 8.215892
3   10     SELL 8.214246
4   15     SELL 8.910654
5   15     SELL 7.928537
6   15     SELL 8.777784
> tail(lis)
    time variable    value
145    1     GAS5 10.92248
146    1     GAS5 11.37983
147    1     GAS5 10.95310
148    1     GAS5 11.60476
149    1     GAS5 11.69092
150    1     GAS5 11.70777

I can get a beautiful plot of all the time series, along with a fitted spline and 95% confidence intervals using the following ggplot2 commands:

library(ggplot2)
library(splines)  # provides ns()
p <- ggplot(lis, aes(x=time, y=value)) + facet_wrap(~variable)
p <- p + geom_point() + stat_smooth(method = "lm", formula = y ~ ns(x,3))

The trouble is that the smoother is not to my liking - the 95% confidence intervals are way off. I would like to use Gaussian Processes (GP) to get a better regression and estimate of covariance for my time series.

I can fit a GP using something like

library(tgp) 
out <- bgp(X, Y, XX = seq(0, 200, length = 100))

which takes time X, observations Y and makes predictions at each point in XX. The object out contains a bunch of things about those predictions, including a covariance matrix I can use in place of the 95% confidence interval I get (I think?) from ns().

The trouble is I'm not sure how to wrap this function to make it interface with ggplot2::stat_smooth(). Any ideas or pointers as to how to proceed would be greatly appreciated!

少钕鈤記 2024-09-10 22:00:48

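It looks like bgp doesn't follow the standard R conventions for model-fitting functions. This means you can't use it inside geom_smooth, and you'll need to fit the model outside the ggplot2 call. You might also want to email the tgp package authors to encourage them to follow R standards.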
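
Fitting outside the ggplot2 call could look roughly like the sketch below: one GP per variable, with the predictions stacked into a single data frame. The function names gp_smoother and smooth_by_group are made up for illustration, and the output fields ZZ.mean, ZZ.q1 and ZZ.q2 are taken from my reading of the tgp documentation, so check them against your installed version.

```r
# Sketch: fit one GP per variable outside ggplot2 and collect the
# predictions in a single data frame. Function names are hypothetical.

# Assumed smoother contract: takes x, y and a prediction grid xx, and
# returns a data frame with columns time, value, ymin, ymax.
gp_smoother <- function(x, y, xx) {
  # Needs the tgp package; verb = 0 silences the MCMC progress output.
  fit <- tgp::bgp(X = x, Z = y, XX = xx, verb = 0)
  data.frame(
    time  = xx,
    value = fit$ZZ.mean,  # predictive mean at XX
    ymin  = fit$ZZ.q1,    # 5% predictive quantile (assumed from tgp docs)
    ymax  = fit$ZZ.q2     # 95% predictive quantile (assumed from tgp docs)
  )
}

# Apply a smoother to each level of `variable` and stack the results.
smooth_by_group <- function(data, smoother, n = 100) {
  pieces <- lapply(split(data, data$variable, drop = TRUE), function(d) {
    xx  <- seq(min(d$time), max(d$time), length.out = n)
    out <- smoother(d$time, d$value, xx)
    out$variable <- d$variable[1]  # keep the facet label
    out
  })
  pred <- do.call(rbind, pieces)
  rownames(pred) <- NULL
  pred
}
```

With the data frame from the question, pred <- smooth_by_group(lis, gp_smoother) should give one row per grid point and variable. If the quantile fields are what I think they are, the resulting band is a 90% interval rather than 95%.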

眉黛浅 2024-09-10 22:00:48

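stat_smooth has y, ymin, and ymax aesthetics that you can use with a custom smoother, as described here: http://had.co.nz/ggplot2/stat_smooth.html. You can create a data frame with the predictions and CIs from your custom smoother and use it directly in stat_smooth (specifying a new data argument). You may be able to use stat_smooth(method="tgp::bgp",XX=seq(0,200,length=100)) but I haven't tried it.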
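
A minimal sketch of the data-frame route, assuming the predictions are already computed. I've swapped in geom_ribbon plus geom_line for stat_smooth here, since those two take ymin and ymax from a precomputed data frame without any refitting. Both obs and pred are made-up toy data standing in for lis and for the smoother output.

```r
library(ggplot2)

# Toy stand-ins (made up for illustration, not the asker's data):
# `obs` mimics the melted `lis`, `pred` mimics a smoother's output.
obs <- data.frame(
  time     = rep(c(1, 5, 10, 15), times = 2),
  variable = rep(c("A", "B"), each = 4),
  value    = c(1.0, 1.8, 3.1, 3.9, 4.2, 3.0, 2.1, 0.9)
)
pred <- data.frame(
  time     = rep(c(1, 15), times = 2),
  variable = rep(c("A", "B"), each = 2),
  value    = c(1, 4, 4, 1)
)
pred$ymin <- pred$value - 0.5  # lower edge of the band
pred$ymax <- pred$value + 0.5  # upper edge of the band

# Points come from the observations; band and line come from `pred`.
# Both data frames carry `variable`, so faceting applies to every layer.
p <- ggplot(obs, aes(x = time, y = value)) +
  facet_wrap(~variable) +
  geom_point() +
  geom_ribbon(data = pred, aes(ymin = ymin, ymax = ymax), alpha = 0.3) +
  geom_line(data = pred)
```

Printing p draws the plot. The prediction data frame needs the same faceting column as the observations, otherwise the band would be repeated in every panel.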