lm() error: NA/NaN/Inf in foreign function call (arg 1)
Say I have a data.frame a.
I use
m.fit <- lm(col2 ~ col3 * col4, na.action = na.exclude)
col2 has some NA values; col3 and col4 have values less than 1.
I keep getting
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in foreign function call (arg 1)
I've checked the mailing list, and it appears that this is because of the NAs in col2, but I tried na.action = na.exclude/omit/pass and none of them seem to work. I've tested lm again on the first 10 entries, and it is definitely not because of the NAs. The problem with this error is that every Google result seems to point at NA.
Did I misinterpret the error, or am I using lm wrongly?
The data is at kaggle. I'm modelling MonthlyIncome using linear regression (as I couldn't get a certain glm family to work). I've created my own variables to use, but if you try to model MonthlyIncome with the variables already present, it fails.
Comments (11)
I know this thread is really old, but the answers don't seem complete, and I just ran into the same problem.
The problem I was having was that the columns with NA values also contained NaN and Inf. Remove those and try again. Specifically:
Hope that helps your 18 month old question!
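The cleanup this answer describes might be sketched as follows. This is an illustration under assumptions, not the answerer's original code: the data frame values are invented, and count_bad is a hypothetical helper rather than an R function. The point is that is.na() is TRUE for NA and NaN but FALSE for Inf and -Inf, so the na.action handlers never drop infinite values; recoding every non-finite value to NA first lets na.exclude handle them.

```r
# Toy data standing in for the question's data frame (values are invented).
a <- data.frame(
  col2 = c(1.2, NA, 3.1, Inf, 2.4, NaN, 4.0, 3.3),
  col3 = c(0.1, 0.4, 0.3, 0.9, 0.5, 0.2, 0.7, 0.6),
  col4 = c(0.8, 0.2, 0.6, 0.1, 0.3, 0.9, 0.4, 0.5)
)

# Hypothetical helper: count each kind of problematic value per column.
count_bad <- function(df) {
  sapply(df, function(v) c(
    n_NA  = sum(is.na(v) & !is.nan(v)),  # plain NA only
    n_NaN = sum(is.nan(v)),
    n_Inf = sum(is.infinite(v))          # Inf and -Inf
  ))
}
bad_counts <- count_bad(a)

# Recode NaN and +/-Inf to NA so that na.exclude can drop those rows.
a_clean <- a
a_clean[] <- lapply(a_clean, function(v) {
  v[!is.finite(v)] <- NA
  v
})

m.fit <- lm(col2 ~ col3 * col4, data = a_clean, na.action = na.exclude)
```

With na.exclude, residuals(m.fit) keeps one entry per original row and fills the dropped rows with NA, which is convenient when the residuals are added back to the data frame.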
You should read the book A Beginner's Guide to R for a complete explanation of this. Specifically, it mentions the following error:
The solution is to add a small constant value to the Intensity data, for example, 1. Note that there is an ongoing discussion in the statistical community about adding a small value. Be that as it may, you cannot use the log of zero when doing calculations in R.
After all possible na.omit and na.exclude checks, I just ran into another possibility. I was fitting something like:
lm(log(x) ~ log(y), data = ...)
without noticing that, for some values in my dataset, x or y could be zero:
log(0) = -Inf
So that's just another thing to watch out for!
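This pitfall can be reproduced in a few lines. The sketch below uses invented data (the names d, x and y are illustrative only). The exact error wording differs between R versions, so the code only captures whether an error is raised rather than matching its text.

```r
# Invented data with one zero in x; log(0) is -Inf, which is not NA.
d <- data.frame(x = c(0, 1, 2, 4, 8), y = c(1, 2, 3, 5, 9))

is.na(log(0))      # FALSE, so na.omit / na.exclude will not drop it
is.finite(log(0))  # FALSE

# The fit fails even with na.action = na.omit, because -Inf survives it.
fit_bad <- tryCatch(
  lm(log(x) ~ log(y), data = d, na.action = na.omit),
  error = function(e) e
)

# Keep only rows where both variables are strictly positive before logging.
d_pos <- subset(d, x > 0 & y > 0)
fit_ok <- lm(log(x) ~ log(y), data = d_pos)
```

Dropping the zero rows is only one option; whether to drop them or shift the data by a constant is a modelling decision that depends on what the zeros mean.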
I solved this type of problem by resetting my options:
options(na.action="na.exclude")
or
options(na.action="na.omit")
I checked my settings and found that I had previously changed the option to "na.pass", which didn't drop the observations with NA in y (where y~x).
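A small sketch of how a leftover global setting can cause this, using invented data. When lm() is called without an na.action argument, it falls back to getOption("na.action"), so a session-wide "na.pass" lets NA values through to the fitting routine. The error text varies by R version, so the code only records that an error occurred.

```r
# Invented data: y has a missing value.
d <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2.1, NA, 6.2, 7.9, 10.1))

# Simulate the leftover setting; options() returns the previous values.
old_opts <- options(na.action = "na.pass")
fit_pass <- tryCatch(lm(y ~ x, data = d), error = function(e) e)

# Switch back to R's default handler and refit.
options(na.action = "na.omit")
fit_omit <- lm(y ~ x, data = d)

# Restore whatever was set before this example ran.
options(old_opts)
```

Running getOption("na.action") at the console is a quick way to see which handler the current session is using.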
Try changing the type of col2 (and all other variables)
I just encountered the same problem. Get the finite elements using
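The code this answer refers to did not survive on this page. One way to keep only the finite rows, offered as an assumption rather than a reconstruction of the original, is to combine is.finite() across the model variables. The data values are invented.

```r
# Invented data with NA, NaN and Inf scattered across the model columns.
a <- data.frame(
  col2 = c(1.2, NA, 3.1, 2.0, 2.4, 3.6, 4.0, 3.3),
  col3 = c(0.1, 0.4, Inf, 0.9, 0.5, 0.2, 0.7, 0.6),
  col4 = c(0.8, 0.2, 0.6, 0.1, NaN, 0.9, 0.4, 0.5)
)

# TRUE only for rows where every model variable is finite
# (is.finite() is FALSE for NA, NaN, Inf and -Inf).
vars <- c("col2", "col3", "col4")
finite_rows <- Reduce(`&`, lapply(a[vars], is.finite))

m.fit <- lm(col2 ~ col3 * col4, data = a[finite_rows, ])
```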
I encountered this error when my equivalent of col2 was an integer64 rather than an integer, and when using natural and polynomial splines (splines::bs and splines::ns), for example:
Converting to a standard integer worked for me:
I got this error when I inverted the arguments when calling reformulate and used the formula in my lm call without checking, so I had the wrong predictor and response variables.
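For reference, reformulate() takes the predictor labels first (termlabels) and the response second (response), so naming the arguments avoids the mix-up. The sketch below uses invented data with the question's column names and checks the left-hand side of the formula before fitting.

```r
# Invented data using the question's column names.
a <- data.frame(
  col2 = c(1.2, 2.3, 3.1, 2.0, 2.4, 3.6),
  col3 = c(0.1, 0.4, 0.3, 0.9, 0.5, 0.2),
  col4 = c(0.8, 0.2, 0.6, 0.1, 0.3, 0.9)
)

# Naming the arguments makes the predictor/response roles explicit.
f <- reformulate(termlabels = c("col3", "col4"), response = "col2")

# all.vars() lists the response first, so this confirms the left-hand side.
lhs <- all.vars(f)[1]
stopifnot(lhs == "col2")

m.fit <- lm(f, data = a)
```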
Another thing to watch out for is that transformations such as log() or sin() can make your x's and y's non-finite, e.g. log(0) = -Inf and sin(Inf) = NaN.
This is what helped in my case: I passed in data from which the NAs and Infs had already been excluded.
Make sure you don't have any 0 in your dependent variable.