Two separate predict() inputs return the same set of fitted values

Posted on 2025-02-10 18:35:07

Confession: I attempted to ask this question yesterday, but used a sample dataset resembling my "real" data, in the hope that this would be more convenient for readers here. One issue was resolved, but another remains and appears immutable.

My objective is to create two vectors of predictions from a linear model, "yC.hat" and "yT.hat", which are meant to project average effects for unique observed values of pri2000v as a function of the average poverty level, I(avgpoverty^2), under control (treatment = 0) and treatment (treatment = 1) conditions.

While I appear to have no issues running the regression itself, the inputs to my data argument have no effect on predict(); only the model object itself affects the output. As a result, treatment = 0 and treatment = 1 in the data argument produce the same fitted values. In fact, I can plug ANY value into the data argument and it makes no difference. So I suspect my failure to understand the issue starts here.

Here is my code:

q6rega <- lm(pri2000v ~ treatment + I(log(pobtot1994)) + I(avgpoverty^2)
   #interactions
   + treatment:avgpoverty + treatment:I(avgpoverty^2), data = pga)

## predicted PRI support under the Treatment condition
q6.yT.hat <- predict(q6rega,
data = data.frame(I(avgpoverty^2) = 9:25, treatment = 1))
## predicted PRI support rate under the Control condition
q6.yC.hat <- predict(q6rega,
data = data.frame(I(avgpoverty^2) = 9:25, treatment = 0))

q6.yC.hat == q6.yT.hat

TRUE[417]

dput(pga) has been posted on my GitHub, if needed.

EDIT: There were a few things wrong with my code above, but not specifying pobtot1994 somehow resulted in R behaving as if newdata had been omitted. Since I'm fairly new to statistics, I confused fitted values with the prediction output I was actually trying to obtain. I would have expected an unexpected input to produce an error instead.

黎夕旧梦 2025-02-17 18:35:07

I'm surprised you were able to run a prediction when the new data frame lacks a variable the model requires (pobtot1994).
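A likely explanation for why no error appeared: predict() for lm objects takes its new observations through the newdata argument, not data. An argument named data is absorbed by `...` and ignored, so newdata counts as missing and the in-sample fitted values are returned. The sketch below illustrates this on the built-in mtcars data rather than the asker's pga data; the model and the object names (fit, p_wrong, p_right, res) are assumptions made purely for illustration.

```r
# Illustration on the built-in mtcars data (not the asker's pga data).
fit <- lm(mpg ~ wt + I(hp^2), data = mtcars)

# `data` is not an argument of predict.lm(); it falls into `...` and is ignored,
# so `newdata` is missing and the in-sample fitted values come back.
p_wrong <- predict(fit, data = data.frame(wt = 3, hp = 100))
length(p_wrong)                                             # 32, one per row of mtcars
isTRUE(all.equal(unname(p_wrong), unname(fitted(fit))))     # TRUE

# With `newdata` and untransformed columns, there is one prediction per new row.
p_right <- predict(fit, newdata = data.frame(wt = 3, hp = 100))
length(p_right)                                             # 1

# Once `newdata` is actually used, leaving out a required variable is an error.
res <- try(predict(fit, newdata = data.frame(wt = 3)), silent = TRUE)
inherits(res, "try-error")                                  # TRUE
```

Note that the columns of newdata carry the raw variable names (hp), not the transformed terms (I(hp^2)); the transformation stored in the model formula is applied by predict() itself.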

Anyway, you need to create a new data frame containing the three variables used in the model, in their untransformed form. Since you want to compare the fitted values for avgpoverty from 3 to 5 under treatment 1 and 0, you need to hold the third variable, pobtot1994, constant. I use the mean of pobtot1994 here for simplicity.

newdat <- expand.grid(avgpoverty=3:5, treatment=factor(c(0,1)), pobtot1994=mean(pga$pobtot1994))

  avgpoverty treatment pobtot1994
1          3         0   2037.384
2          4         0   2037.384
3          5         0   2037.384
4          3         1   2037.384
5          4         1   2037.384
6          5         1   2037.384

The predictions now differ between the two conditions.

newdat$fitted <- predict(q6rega, newdata=newdat)

  avgpoverty treatment pobtot1994   fitted
1          3         0   2037.384 38.86817
2          4         0   2037.384 50.77476
3          5         0   2037.384 55.67832
4          3         1   2037.384 51.55077
5          4         1   2037.384 49.03148
6          5         1   2037.384 59.73910
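Once predictions exist for both conditions on the same grid, the treatment effect at each poverty level is the difference between the matching rows. The following is a minimal sketch of that step on the built-in mtcars data, where am (0/1) stands in for treatment, wt for avgpoverty, and hp for pobtot1994; this mapping, the simplified model, and the names grid and effect are assumptions for illustration only, not the asker's actual analysis.

```r
# Stand-in model: am plays the role of treatment, wt of avgpoverty, hp of pobtot1994.
fit <- lm(mpg ~ am * wt + I(log(hp)), data = mtcars)

# expand.grid() varies the first argument fastest, so rows 1-3 are am = 0
# and rows 4-6 are am = 1, each with wt = 2, 3, 4 in the same order.
grid <- expand.grid(wt = 2:4, am = c(0, 1), hp = mean(mtcars$hp))
grid$fitted <- predict(fit, newdata = grid)

# Predicted treatment-minus-control difference at each value of wt.
effect <- with(grid, fitted[am == 1] - fitted[am == 0])
data.frame(wt = 2:4, effect = effect)
```

Holding the third covariate at its mean is only one convention; any fixed value gives a valid comparison, as long as it is the same for both conditions.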
