从 GSL 库中获取 C gsl_fit_linear() 函数中线性回归的 p 值

发布于 2024-10-27 23:04:22 字数 635 浏览 6 评论 0原文

我正在尝试用 C 语言重现 R 中的一些代码，因此我尝试使用 gsl_fit_linear() 函数来拟合线性回归。

在 R 中，我使用 lm() 函数，它使用以下代码返回拟合的 p 值：

lmAvgs<- lm( c(1.23,   11.432,   14.653, 21.6534) ~ c(1970, 1980, 1990, 2000) )
summary(lmAvgs)

我不知道如何从 C 输出到 p 值，我的代码看起来像这样到目前为止：

int main(void)
{
int i, n = 4;
double x[4] = { 1970, 1980, 1990, 2000 };
double y[4] = {1.23,   11.432,   14.653, 21.6534};
double c0, c1, cov00, cov01, cov11, sumsq;
gsl_fit_linear (x, 1, y, 1, n, &c0, &c1, &cov00, &cov01, &cov11, &sumsq);
}

这似乎可以正确计算斜率和截距，但我不知道如何获得 p 值。我是统计和 C 方面的新手！

原文

I'm trying to reporduce some code from R in C, so I'm trying to fit a linear regression using the gsl_fit_linear() function.

In R I'd use the lm() function, which returns a p-value for the fit using this code:

lmAvgs<- lm( c(1.23,   11.432,   14.653, 21.6534) ~ c(1970, 1980, 1990, 2000) )
summary(lmAvgs)

I've no idea though how to go from the C output to a p-value, my code looks something like this so far:

int main(void)
{
int i, n = 4;
double x[4] = { 1970, 1980, 1990, 2000 };
double y[4] = {1.23,   11.432,   14.653, 21.6534};
double c0, c1, cov00, cov01, cov11, sumsq;
gsl_fit_linear (x, 1, y, 1, n, &c0, &c1, &cov00, &cov01, &cov11, &sumsq);
}

This seems to correctly calculate slope and intercept but I don't know how to get a p-value. I'm novice at stats and C!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

乱了心跳 2024-11-03 23:04:22

一切都在：http://en.wikipedia.org/wiki/Ordinary_least_squares。但这里有一段代码显示类似于 R 中的摘要（lmAvgs）的输出。要运行它，您需要 GSL 库：

int n = 4;
double x[4] = { 1970, 1980, 1990, 2000};
double y[4] = {1.23,   11.432,   14.653, 21.6534};
double c0, c1, cov00, cov01, cov11, sumsq;
gsl_fit_linear (x, 1, y, 1, n, &c0, &c1, &cov00, &cov01, &cov11, &sumsq);

cout<<"Coefficients\tEstimate\tStd. Error\tt value\tPr(>|t|)"<<endl;

double stdev0=sqrt(cov00);
double t0=c0/stdev0;
double pv0=t0<0?2*(1-gsl_cdf_tdist_P(-t0,n-2)):2*(1-gsl_cdf_tdist_P(t0,n-2));//This is the p-value of the constant term
cout<<"Intercept\t"<<c0<<"\t"<<stdev0<<"\t"<<t0<<"\t"<<pv0<<endl;

double stdev1=sqrt(cov11);
double t1=c1/stdev1;
double pv1=t1<0?2*(1-gsl_cdf_tdist_P(-t1,n-2)):2*(1-gsl_cdf_tdist_P(t1,n-2));//This is the p-value of the linear term
cout<<"x\t"<<c1<<"\t"<<stdev1<<"\t"<<t1<<"\t"<<pv1<<endl;

double dl=n-2;//degrees of liberty
double ym=0.25*(y[0]+y[1]+y[2]+y[3]); //Average of vector y
double sct=pow(y[0]-ym,2)+pow(y[1]-ym,2)+pow(y[2]-ym,2)+pow(y[3]-ym,2); // sct = sum of total squares
double R2=1-sumsq/sct;
cout<<"Multiple R-squared: "<<R2<<",    Adjusted R-squared: "<<1-double(n-1)/dl*(1-R2)<<endl;
double F=R2*dl/(1-R2);
double p_value=1-gsl_cdf_fdist_P(F,1,dl);
cout<<"F-statistic:  "<<F<<" on 1 and "<<n-2<<" DF,  p-value: "<<p_value<<endl;

给出：

Coefficients    Estimate    Std. Error  t value Pr(>|t|)
Intercept   -1267.91    181.409 -6.98922    0.0198633
x   0.644912    0.0913886   7.05681 0.0194956
Multiple R-squared: 0.961389,   Adjusted R-squared: 0.942083
F-statistic:  49.7986 on 1 and 2 DF,  p-value: 0.0194956

R 给出：

Coefficients:
Estimate Std. Error t value Pr(>|t|)  
(Intercept)               -1.268e+03  1.814e+02  -6.989   0.0199 *
c(1970, 1980, 1990, 2000)  6.449e-01  9.139e-02   7.057   0.0195 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 2.044 on 2 degrees of freedom
Multiple R-squared: 0.9614, Adjusted R-squared: 0.9421 
F-statistic:  49.8 on 1 and 2 DF,  p-value: 0.01950

Everything is on : http://en.wikipedia.org/wiki/Ordinary_least_squares. but here is a piece of code which display an output similar to summary(lmAvgs) in R. To run this, you need the GSL Library :

int n = 4;
double x[4] = { 1970, 1980, 1990, 2000};
double y[4] = {1.23,   11.432,   14.653, 21.6534};
double c0, c1, cov00, cov01, cov11, sumsq;
gsl_fit_linear (x, 1, y, 1, n, &c0, &c1, &cov00, &cov01, &cov11, &sumsq);

cout<<"Coefficients\tEstimate\tStd. Error\tt value\tPr(>|t|)"<<endl;

double stdev0=sqrt(cov00);
double t0=c0/stdev0;
double pv0=t0<0?2*(1-gsl_cdf_tdist_P(-t0,n-2)):2*(1-gsl_cdf_tdist_P(t0,n-2));//This is the p-value of the constant term
cout<<"Intercept\t"<<c0<<"\t"<<stdev0<<"\t"<<t0<<"\t"<<pv0<<endl;

double stdev1=sqrt(cov11);
double t1=c1/stdev1;
double pv1=t1<0?2*(1-gsl_cdf_tdist_P(-t1,n-2)):2*(1-gsl_cdf_tdist_P(t1,n-2));//This is the p-value of the linear term
cout<<"x\t"<<c1<<"\t"<<stdev1<<"\t"<<t1<<"\t"<<pv1<<endl;

double dl=n-2;//degrees of liberty
double ym=0.25*(y[0]+y[1]+y[2]+y[3]); //Average of vector y
double sct=pow(y[0]-ym,2)+pow(y[1]-ym,2)+pow(y[2]-ym,2)+pow(y[3]-ym,2); // sct = sum of total squares
double R2=1-sumsq/sct;
cout<<"Multiple R-squared: "<<R2<<",    Adjusted R-squared: "<<1-double(n-1)/dl*(1-R2)<<endl;
double F=R2*dl/(1-R2);
double p_value=1-gsl_cdf_fdist_P(F,1,dl);
cout<<"F-statistic:  "<<F<<" on 1 and "<<n-2<<" DF,  p-value: "<<p_value<<endl;

Which gives :

Coefficients    Estimate    Std. Error  t value Pr(>|t|)
Intercept   -1267.91    181.409 -6.98922    0.0198633
x   0.644912    0.0913886   7.05681 0.0194956
Multiple R-squared: 0.961389,   Adjusted R-squared: 0.942083
F-statistic:  49.7986 on 1 and 2 DF,  p-value: 0.0194956

R gives :

Coefficients:
Estimate Std. Error t value Pr(>|t|)  
(Intercept)               -1.268e+03  1.814e+02  -6.989   0.0199 *
c(1970, 1980, 1990, 2000)  6.449e-01  9.139e-02   7.057   0.0195 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 2.044 on 2 degrees of freedom
Multiple R-squared: 0.9614, Adjusted R-squared: 0.9421 
F-statistic:  49.8 on 1 and 2 DF,  p-value: 0.01950

回复收藏 0 原文

~没有更多了~