Incorporating random effects into random forest regression for time series data
I have time series air pollution data (e.g. PM2.5, CO2, temperature, outdoor PM2.5) from three residences, along with activity diaries recorded by the residents in binary format (e.g. cooking: 1 while the activity is taking place, 0 otherwise). I want to incorporate data from all three locations into a single random forest prediction model for PM2.5, with the main goal of seeing which activities are most strongly predictive of PM2.5 levels.
I am able to model these residences separately but am currently trying to work out a way to incorporate all three in one model. I have thought of trying to apply some sort of random effects where each of the residences is a group of data, but I am unsure how to implement this in R and get data that could then be applied to the RF.
Essentially, my question is: how can I include time series data from three residences, measured over the same variables (except for the outdoor air pollution measurement, which is unique to each house), in one model while accounting for the variation between houses in each of their explanatory variables?
Comments (1)
The REEMtree R package combines the structure of mixed effects models with tree-based estimation methods. There's a paper on it published here: https://link.springer.com/article/10.1007/s10994-011-5258-3
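As a rough illustration of how this might look for the setup in the question, here is a minimal sketch that fits a tree with a random intercept per house. The data are simulated, and the column names (`house`, `cooking`, `cleaning`, `outdoor_pm25`, `pm25`) are assumptions made up for the example, not taken from the real dataset. Note that `REEMtree()` fits a single regression tree rather than a random forest, and that with only three houses the between-house variance is estimated from very few groups, so the result should be read with caution.

```r
# Sketch only: simulated data stand in for the real measurements.
library(REEMtree)  # install.packages("REEMtree") if it is not installed

set.seed(1)
n_per_house <- 200
house <- factor(rep(c("A", "B", "C"), each = n_per_house))
n <- length(house)

# Hypothetical predictors: two binary diary activities and outdoor PM2.5.
cooking      <- rbinom(n, 1, 0.2)
cleaning     <- rbinom(n, 1, 0.1)
outdoor_pm25 <- rgamma(n, shape = 4, scale = 3)

# Simulated indoor PM2.5: a house-specific baseline plus activity effects.
house_offset <- unname(c(A = 0, B = 5, C = -3)[as.character(house)])
pm25 <- 10 + house_offset + 15 * cooking + 4 * cleaning +
  0.5 * outdoor_pm25 + rnorm(n, sd = 2)

# Stack all three houses in one long data frame with a house identifier.
dat <- data.frame(house, cooking, cleaning, outdoor_pm25, pm25)

# The fixed part is estimated by a regression tree;
# `random = ~ 1 | house` requests a random intercept for each house.
fit <- REEMtree(pm25 ~ cooking + cleaning + outdoor_pm25,
                data = dat, random = ~ 1 | house)

print(fit)   # tree structure and estimated variance components
ranef(fit)   # estimated per-house intercepts
```

The main design point is that the three houses are stacked into one data frame and the house identifier enters through `random` rather than as an ordinary predictor, so the splits in the tree describe activity effects shared across houses. The package documentation also describes a `correlation` argument for within-group autocorrelation, which may be relevant for time series, but it is not used in this sketch.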