Panel data: handling lags and a binary dependent variable with plm
I am trying to run a pooled logistic regression on panel data with a binary dependent variable. Since I want to lag some of the variables, I used the plm package to create the lags. Other approaches gave me problems: I can't use plain lag or embed, because this is panel data.
hybridsubsidies <- pdata.frame(reduced, index = c("state", "year"))
lagee       <- lag(hybridsubsidies$eespending, 1)
lagratio    <- lag(hybridsubsidies$ratio, 1)
laggopvote  <- lag(hybridsubsidies$gopvote, 1)
laggasoline <- lag(hybridsubsidies$gasoline, 1)
I wanted to put all the variables into the original data frame (hybridsubsidies) before running the pooled analysis. I'm pretty sure I don't need to, but I'm a visual person and would like to verify that the data are formatted correctly before running any analysis.
From the output below, it looks like everything is done correctly.
head(lag(hybridsubsidies$eespending, 1))
ALABAMA-1999 ALABAMA-2000 ALABAMA-2001 ALABAMA-2002 ALABAMA-2003 ALABAMA-2004
          NA        58294        55378        26982        28264         2566
head(hybridsubsidies$eespending)
ALABAMA-1999 ALABAMA-2000 ALABAMA-2001 ALABAMA-2002 ALABAMA-2003 ALABAMA-2004
       58294        55378        26982        28264         2566        26906
My problem is that when I try to assign this lagged variable as a column in the data frame, like this,
hybridsubsidies$lagee <- lag(hybridsubsidies$eespending, 1)
the assignment works (the new column shows up when I call names on the data frame), but I can no longer view the data frame. R tells me:
Error in edit.data.frame(get(subx, envir = parent), title = subx, ...) :
  can only handle vector and factor elements
How can I solve this so that I can view the data frame before I run the analysis? I want to inspect it because it looks like I will have to use glm instead of plm (pooling) for this analysis: the dependent variable is binary, and plm does not support such dependent variables.
This has been giving me problems for a while now.
col1   ST  YR  EELAG     EE
[1,]    1   1     NA  58294
[2,]    1   2  58294  55378
[3,]    1   3  55378  26982
[4,]    1   4  26982  28264
[5,]    1   5  28264   2566
[6,]    1   6   2566  26906
[7,]    1   7  26906  29466
[8,]    2   1     NA    355
[9,]    2   2    355    259
[10,]   2   3    259    224
[11,]   2   4    224    217
[12,]   2   5    217    241
[13,]   2   6    241    231
[14,]   2   7    231    231
[15,]   3   1     NA   5111
[16,]   3   2   5111   3753
[17,]   3   3   3753   2211
[18,]   3   4   2211   1452
[19,]   3   5   1452   2913
[20,]   3   6   2913   3128
[21,]   3   7   3128   7132
[22,]   4   1     NA   1597
[23,]   4   2   1597    905
Comments (1)
lag returns a time series object. Does work?
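Following up on the point that the lagged column is not a bare vector: with plm, the result of lag on a pdata.frame column has class pseries and carries an index attribute. One possible workaround, sketched below under stated assumptions, is to rebuild an ordinary data frame from bare vectors using as.numeric() and then fit the pooled logit with glm(). The toy `reduced` data, the state labels, the binary outcome `adopted`, and the names `plain` and `fit` are all made up for illustration; only the eespending values come from the table in the question. Whether this clears the viewer error may depend on the plm version.

```r
library(plm)  # provides pdata.frame(), index() and a panel-aware lag() method

# Toy stand-in for `reduced`. The eespending values are from the question's
# table; the state labels and the binary outcome `adopted` are invented.
reduced <- data.frame(
  state = rep(c("A", "B", "C"), each = 7),
  year  = rep(1999:2005, times = 3),
  eespending = c(58294, 55378, 26982, 28264, 2566, 26906, 29466,
                 355, 259, 224, 217, 241, 231, 231,
                 5111, 3753, 2211, 1452, 2913, 3128, 7132),
  adopted = c(0, 0, 1, 0, 1, 1, 0,
              0, 1, 0, 0, 1, 0, 1,
              1, 0, 0, 1, 1, 0, 1)
)

hybridsubsidies <- pdata.frame(reduced, index = c("state", "year"))

# The panel-aware lag is a pseries, not a bare numeric vector.
lagee <- lag(hybridsubsidies$eespending, 1)
class(lagee)

# Rebuild a plain data frame; as.numeric() drops the class and attributes.
plain <- data.frame(
  index(hybridsubsidies),
  eespending = as.numeric(hybridsubsidies$eespending),
  adopted    = as.numeric(hybridsubsidies$adopted),
  lagee      = as.numeric(lagee)
)
str(plain)

# Pooled logit with glm(); rows whose lag is NA are dropped by default.
fit <- glm(adopted ~ lagee, data = plain, family = binomial)
summary(fit)
```

Keeping the pdata.frame for computing lags and a separate plain data frame for viewing and for glm() avoids mixing panel-classed columns into the object passed to the data viewer.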