Calculating the distance between a regression line and a data point

Posted on 2024-11-27 14:31:03

I wonder if there is a way to calculate the distance between an abline in a plot and a data point. For example, what is the distance between the point with concentration == 40 and signal == 643 (element 5) and the abline?

concentration <- c(1,10,20,30,40,50)
signal <- c(4, 22, 44, 244, 643, 1102)
plot(concentration, signal)
res <- lm(signal ~ concentration)
abline(res)

Comments (2)

日久见人心 2024-12-04 14:31:03

You are basically asking for the residuals.

R> residuals(res)
      1       2       3       4       5       6 
 192.61   12.57 -185.48 -205.52  -26.57  212.39 

As an aside, when you fit a linear regression with an intercept, the sum of the residuals is 0 (up to floating-point error):

R> sum(residuals(res))
[1] 8.882e-15

If the model is correct, the residuals should also follow a Normal distribution, which you can check with qqnorm(residuals(res)).

I find working with the standardised residuals easier.

R> rstandard(res)
       1        2        3        4        5        6 
 1.37707  0.07527 -1.02653 -1.13610 -0.15845  1.54918 

These residuals have been scaled to have mean approximately zero and variance approximately equal to one and, if the model is correct, are approximately Normally distributed. Outlying standardised residuals are those larger than 2 in absolute value.
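To tie this back to the question, here is a minimal sketch that pulls out the residual for element 5 and checks it against "observed minus fitted". It assumes the distance wanted is the vertical one (measured along the signal axis); the names r5 and r5_manual are just illustrative.

```r
concentration <- c(1, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 244, 643, 1102)
res <- lm(signal ~ concentration)

# Signed vertical distance of element 5 from the fitted line
r5 <- residuals(res)[5]

# The same value by hand: observed value minus fitted value
r5_manual <- signal[5] - fitted(res)[5]

# Unsigned vertical distance, in the units of signal
abs(r5)

# Visual check of the Normality assumption on the residuals
qqnorm(residuals(res))
qqline(residuals(res))
```

A negative residual means the point lies below the line, so element 5 sits roughly 26.57 signal units under the fit.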

时光病人 2024-12-04 14:31:03

You can use the function below:

http://paulbourke.net/geometry/pointlineplane/pointline.r

Then just extract the slope and intercept:

> coef(res)
  (Intercept) concentration 
   -210.61098      22.00441

So your final answer would be:

concentration <- c(1,10,20,30,40,50)
signal <- c(4, 22, 44, 244, 643, 1102)
plot(concentration, signal)
res <- lm(signal ~ concentration)
abline(res)

cfs <- coef(res)
distancePointLine(y=signal[5], x=concentration[5], slope=cfs[2], intercept=cfs[1])

For a more general way to pick out a particular point, note that concentration == 40 returns a logical vector of length length(concentration). You can use that vector to select points.

> pt.sel <- ( concentration == 40 )
> pt.sel
[1] FALSE FALSE FALSE FALSE  TRUE FALSE
> distancePointLine(y=signal[pt.sel], x=concentration[pt.sel], slope=cfs["concentration"], intercept=cfs["(Intercept)"])
     1.206032

Unfortunately, distancePointLine doesn't appear to be vectorized (or it is, but it returns a warning when you pass it a vector). Otherwise, you could get answers for all points just by leaving the [] selector off the x and y arguments.
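If a vectorized version is wanted, one option is to write the point-to-line formula directly. The sketch below is not the linked script: perp_dist is a hypothetical helper that assumes the line is given as y = slope * x + intercept and uses the standard perpendicular distance |slope * x - y + intercept| / sqrt(slope^2 + 1).

```r
concentration <- c(1, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 244, 643, 1102)
res <- lm(signal ~ concentration)
cfs <- coef(res)

# Hypothetical helper (not from pointline.r): perpendicular distance
# from each point (x, y) to the line y = slope * x + intercept.
# Works element-wise, so x and y can be whole vectors.
perp_dist <- function(x, y, slope, intercept) {
  abs(slope * x - y + intercept) / sqrt(slope^2 + 1)
}

# Distances for all points at once
d <- perp_dist(concentration, signal,
               slope = cfs[["concentration"]],
               intercept = cfs[["(Intercept)"]])

# Distance for the point with concentration == 40
d[concentration == 40]
```

For element 5 this agrees with the 1.206032 shown above. Note that the perpendicular distance mixes the units of concentration and signal, so it changes if either axis is rescaled; the residual (vertical distance) does not have that problem.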
