How do I ignore null values in R?

Posted 2024-09-28 16:12:21

I have a data set with some null values in one field. When I try to run a linear regression, it treats the integers in the field as category indicators, not numbers.

E.g., for a field that contains no null values...

summary(lm(rank ~ num_ays, data=a))

Returns:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 10.607597   0.019927 532.317  < 2e-16 ***
num_ays      0.021955   0.007771   2.825  0.00473 **

But when I run the same model on a field with null values, I get:

Coefficients:
              Estimate Std. Error  t value Pr(>|t|)    
(Intercept)  1.225e+01  1.070e+00   11.446  < 2e-16 ***
num_azs0    -1.780e+00  1.071e+00   -1.663  0.09637 .  
num_azs1    -1.103e+00  1.071e+00   -1.030  0.30322    
num_azs10   -9.297e-01  1.080e+00   -0.861  0.38940    
num_azs100   1.750e+00  5.764e+00    0.304  0.76141    
num_azs101  -6.250e+00  4.145e+00   -1.508  0.13161

What's the best and/or most efficient way to handle this, and what are the tradeoffs?

淡淡離愁欲言轉身 2024-10-05 16:12:21

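You can drop the rows with missing values like so (use is.na() rather than is.null(), which tests the whole object and returns a single value):

a[!is.na(a$num_azs), ]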
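
To make the difference concrete, here is a small sketch on a made-up data frame. The values are hypothetical; only the names `a`, `rank` and `num_azs` come from the question. It compares what a row filter built on `is.null()` keeps with what one built on `is.na()` keeps.

```r
# Hypothetical toy data; only the column names come from the question.
a <- data.frame(
  rank    = c(10, 11, 12, 13, 14),
  num_azs = c(0, NA, 1, 10, NA)
)

# is.null() looks at the whole vector and returns a single FALSE,
# so this "filter" is a[TRUE, ] and keeps every row, missing or not.
kept_by_is_null <- a[!is.null(a$num_azs), ]

# is.na() is element-wise, so this keeps only rows with an observed value.
kept_by_is_na <- a[!is.na(a$num_azs), ]

nrow(kept_by_is_null)  # all rows survive
nrow(kept_by_is_na)    # only the rows without NA survive
```

The tradeoff of filtering up front is that the dropped rows are gone for every later step, so it is worth keeping the original data frame around if other columns in those rows are still useful.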

淡墨 2024-10-05 16:12:21

And to build on Shane's answer: you can use that in the data= argument of lm():

summary(lm(rank ~ num_azs, data = a[!is.na(a$num_azs), ]))
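
Row filtering alone may not change the coefficient table in the question. The separate `num_azs0`, `num_azs1`, ... terms suggest the column is stored as a factor or character vector, possibly because missing entries were imported as text such as "NULL". That is an assumption, since the question does not show how the data was loaded. A minimal sketch under that assumption, again with made-up values, converts the column to numeric before fitting:

```r
# Hypothetical toy data: missing entries arrive as the text "NULL"
# (an assumption), which makes the whole column a factor.
a <- data.frame(
  rank    = c(10.2, 10.9, 11.1, 12.4, 12.8, 13.9),
  num_azs = factor(c("0", "NULL", "1", "10", "NULL", "3"))
)

# Convert via character: as.numeric() applied to a factor directly would
# return the internal level codes, not the original numbers.
# "NULL" cannot be parsed as a number, so it becomes NA (the coercion
# warning is expected and suppressed here).
a$num_azs <- suppressWarnings(as.numeric(as.character(a$num_azs)))

# na.omit is R's default na.action; it is spelled out here for clarity.
# Rows with NA in any model variable are dropped before fitting.
fit <- lm(rank ~ num_azs, data = a, na.action = na.omit)

names(coef(fit))  # one numeric slope instead of one term per level
nobs(fit)         # number of rows actually used in the fit
```

If the file is read with `read.csv()`, passing `na.strings = c("NA", "NULL")` at import time is an alternative that avoids the conversion step, at the cost of having to know the missing-value marker in advance.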
