对 x 和 y 变量列表进行循环面板数据回归
我正在使用面板数据,并希望从变量列表中进行一系列双变量固定效应回归。
我的数据集的一小部分看起来像这样:
library(plm)
library(dplyr)
structure(list(v2x_libdem = c(0.876, 0.88, 0.763, 0.779), v2x_partipdem = c(0.679,
0.68, 0.647, 0.652), v2x_frassoc_thick = c(0.937, 0.937, 0.9,
0.899), country_name = c("Sweden", "Sweden", "Hungary", "Hungary"
), year = c(2000, 2008, 2000, 2008)), row.names = c(NA, -4L), class = c("tbl_df",
"tbl", "data.frame"))
# A tibble: 4 × 5
v2x_libdem v2x_partipdem v2x_frassoc_thick country_name year
<dbl> <dbl> <dbl> <chr> <dbl>
1 0.876 0.679 0.937 Sweden 2000
2 0.88 0.68 0.937 Sweden 2008
3 0.763 0.647 0.9 Hungary 2000
4 0.779 0.652 0.899 Hungary 2008
我的变量列表看起来像这样:
vars <- c("v2x_libdem", "v2x_partipdem", "v2x_frassoc_thick")
这些变量应该是 x 和 y,因此 x 和 y 等变量的每个组合都应该处于回归中。
我编写了一个进行面板数据建模的函数:
paneldata_function <- function (y, x) { #this function will do a PLM
plm(paste(y, "~", x),
data = Vdem_west,
model = "within",
index = c("year", "country_name")) #for a given x & y variable
}
我需要循环遍历 x 和 y 的每个组合。最好,我希望输出是四个值向量,我可以将它们转换成数据集;一项用于系数,一项用于 x 变量,一项用于 y 变量,一项用于 p 值。
如果我做一个模型,我可以轻松访问这些:
e <- paneldata_function("v2x_libdem", "v2x_partipdem")
p_value <- e[["fstatistic"]][["p.value"]]
x_var <- names(e[["model"]])[2]
y_var <- names(e[["model"]])[1]
我如何编写循环以便将这些向量作为输出?
非常感谢任何帮助,我希望我尽可能清楚地表达我的问题。
I am working with paneldata and wish to conduct a series of bivariate fixed effects regressions from a list of variables.
A small part of my dataset looks like this:
library(plm)
library(dplyr)
structure(list(v2x_libdem = c(0.876, 0.88, 0.763, 0.779), v2x_partipdem = c(0.679,
0.68, 0.647, 0.652), v2x_frassoc_thick = c(0.937, 0.937, 0.9,
0.899), country_name = c("Sweden", "Sweden", "Hungary", "Hungary"
), year = c(2000, 2008, 2000, 2008)), row.names = c(NA, -4L), class = c("tbl_df",
"tbl", "data.frame"))
# A tibble: 4 × 5
v2x_libdem v2x_partipdem v2x_frassoc_thick country_name year
<dbl> <dbl> <dbl> <chr> <dbl>
1 0.876 0.679 0.937 Sweden 2000
2 0.88 0.68 0.937 Sweden 2008
3 0.763 0.647 0.9 Hungary 2000
4 0.779 0.652 0.899 Hungary 2008
My list of variables looks something like this:
vars <- c("v2x_libdem", "v2x_partipdem", "v2x_frassoc_thick")
These variables ought to be both x and y, and thus every combination of the variables as x and y should be in a regression.
I have written a function that conducts the paneldata modelling:
paneldata_function <- function (y, x) { #this function will do a PLM
plm(paste(y, "~", x),
data = Vdem_west,
model = "within",
index = c("year", "country_name")) #for a given x & y variable
}
It is this function I need to loop over with each combination of x and y. Preferably, I would want the output to a be four vectors of values which I might turn into a dataset; one for the coefficient, one for the x variable, one for the y variable and one for the p-value.
If I do one single model, I can access these easily:
e <- paneldata_function("v2x_libdem", "v2x_partipdem")
p_value <- e[["fstatistic"]][["p.value"]]
x_var <- names(e[["model"]])[2]
y_var <- names(e[["model"]])[1]
How might I write the loop so that I get these vectors as an output?
Any help is very appreciated, and I hope I phrased my question as clearly as possible.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
要循环遍历可能的字段,首先,您需要能够提供该信息的东西。无论您的数据集大小如何,此函数都将起作用,但您必须调整要保留的字段。 (例如,这里需要排除最后两个字段,所以我选择1:3。制作这个列表的方法有很多,但很多会重复(或者两个字段相同,但一次是左右下一个是从右到左)。此方法使用
RcppAlgos
包。2
表示您想要的变量数量。 code> 仅渲染返回唯一对。使用您现在的函数,我在
map
调用中调用它,它位于tidyverse
中。您没有询问模型名称,但我还添加了
一个调用,以便您也可以查看模型性能。打电话,但你没有要求这部分(实际上我先做了这部分。并意识到我没有回答您的问题。)
您可以使用相同的方法来测试您的模型并查看假设。
To loop through the possible fields, first, you need something that gives you that information. This function will work regardless of your dataset size, you'll have to adjust what fields to keep, though. (For example, here you needed to exclude the last two fields, so I select 1:3. There are a lot of ways to make this list, but many make duplicates (or the same two fields, but one time is left-right and the next is right-left). This method uses the package
RcppAlgos
. The2
is for how many variables you would like. The F orFALSE
renders the return unique pairs only.Then using your function as it is now, I call it in this
map
call.map
is intidyverse
.For the coefficient data, I wrote the output to a data frame. You didn't ask for the model name, but I added that as well.
I also put together a call, so that you could look at the model performance, as well. This could be combined with the previous call, but you didn't ask for this part. (I actually did this part first and realized I wasn't answering your question.)
You could use this same approach to test your model and look at the assumptions.