Segmented regression to find breakpoints across many different groups - summary
I essentially have a GLMM that I am trying to break up to assess where the best breakpoint should be. The data comprise many different sites over multiple years. I expect just one breakpoint, where the population changes from declining to increasing/stable. Here are some made-up data:
df2 <-data.frame(site=c('bh', 'bh', 'bh', 'bh', 'bh', 'h1', 'h1', 'h1', 'h1', 'h1', 'h1', 'h1', 'h1', 'ha', 'ha', 'ha', 'ha', 'ha', 'ha', 'ha', 'ho', 'ho', 'ho', 'ho', 'ho', 'ho', 'ho', 'ho', 'sb', 'sb', 'sb', 'sb', 'sb', 'sb', 'sb', 'wm', 'wm', 'wm', 'wm', 'wm'),
ysw=c(0, 2, 3, 5, 12, 0, 1, 2, 3, 4, 5, 6, 13, 0, 1, 2, 3, 4, 5, 8, 0, 1, 2, 3, 4, 5, 6, 13, 1, 2, 3, 4, 5, 6, 13, 1, 2, 3, 5, 12),
lam=c(-1.94246067889122, 0.674238379117774, 0.123986076653627, 1.42751957163926, -1.30153581808348, -0.424812155072339, -0.775553269663735, 0.845160077651946, -0.366964611913739, -0.175440305426086, -0.300162274132754, 0.602168551378997, 0.35237549500052, 0.0568625203766551, 0.0310733148322787, -0.300162274132754, -0.0866196954685079, -0.213169737066003, -0.0409152237837699, 0.0417873189717518, -0.111072044919427, -0.9810987245661, 0.301247088636211, 0.243286146083446, 0.155205859770556, -1.46428403001449, 1.00004342727686, 0.699056854547668, -0.351206453188803, 0.0972573096934199, 0.431524584187451, -0.526810501182215, -0.424812155072339, 0.602168551378997, 0.628491104967123, 0.778223626766096, -0.300162274132754, -0.475820321699244, 0.477265995424853, 0.301247088636211
))
I essentially want to run a segmented regression analysis for each site separately and then build a data frame of the estimated breakpoints, so I can work out the average 'ysw' at which most sites recover from a decline.
I have looked at these questions: "Performing lm() and segmented() on multiple columns in R" and "How to use segmented package when working with data frames with dplyr package to perform piecewise linear regression?"
I now need a way to put the output (at least the estimated breakpoint for every site) into a data frame. Any help would be much appreciated.
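# Attempt 1:
library(dplyr)
library(segmented)

out <- df2 %>%
  nest_by(site) %>%   # one row per site, with that site's rows in `data`
  mutate(my.lm = list(lm(lam ~ ysw, data = data)),
         # store NA when segmented() fails for a site
         my.seg = list(tryCatch(segmented(my.lm, seg.Z = ~ ysw),
                                error = function(e) NA)))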
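To make the goal concrete, here is a minimal base-R sketch of the kind of per-site table I am after. It is not the segmented() estimator: it swaps in a plain grid search over candidate breakpoints, and it runs on a small hypothetical data set (`toy`, with known breaks) rather than on df2. The helper name `find_breakpoint` is made up for illustration. With segmented(), I would expect the per-site estimate to come from the `psi` component of the fitted object instead.

```r
# Hypothetical toy data (not df2): two sites whose trend changes slope
# at a known ysw value, so the result is easy to check.
toy <- data.frame(
  site = rep(c("a", "b"), each = 11),
  ysw  = rep(0:10, times = 2)
)
true_break <- c(a = 4, b = 6)
# Slope of -1 before the break, slope of 0.5 after it.
toy$lam <- -toy$ysw +
  1.5 * pmax(toy$ysw - unname(true_break[as.character(toy$site)]), 0)

# Made-up helper: grid-search a single breakpoint for one site's data.
find_breakpoint <- function(d) {
  xs <- sort(unique(d$ysw))
  # Only interior ysw values are candidates, so both segments contain data.
  candidates <- xs[-c(1, length(xs))]
  if (length(candidates) == 0 || nrow(d) < 4) return(NA_real_)
  # Residual sum of squares of a continuous two-segment line at each candidate.
  rss <- vapply(candidates, function(psi) {
    fit <- lm(lam ~ ysw + pmax(ysw - psi, 0), data = d)
    sum(resid(fit)^2)
  }, numeric(1))
  candidates[which.min(rss)]
}

# One row per site: split by site, estimate, then bind the rows together.
per_site <- lapply(split(toy, toy$site), function(d) {
  data.frame(site = as.character(d$site[1]), breakpoint = find_breakpoint(d))
})
breakpoints <- do.call(rbind, per_site)
rownames(breakpoints) <- NULL

# Average breakpoint across sites, ignoring sites with no estimate.
mean_breakpoint <- mean(breakpoints$breakpoint, na.rm = TRUE)
print(breakpoints)
```

The split/lapply/rbind pattern is the part I care about; the estimator inside the helper could be replaced by a tryCatch-wrapped segmented() call that returns NA for sites where the fit fails.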