R: is there a way to find Inf/-Inf values?
I'm trying to run a randomForest on a large-ish data set (5000x300). Unfortunately I'm getting an error message as follows:
> RF <- randomForest(prePrior1, postPrior1[,6]
+ ,,do.trace=TRUE,importance=TRUE,ntree=100,,forest=TRUE)
Error in randomForest.default(prePrior1, postPrior1[, 6], , do.trace = TRUE, :
NA/NaN/Inf in foreign function call (arg 1)
So I tried to find any NAs using:
> df2 <- prePrior1[is.na(prePrior1)]
> df2
character(0)
> df2 <- postPrior1[is.na(postPrior1[,6])]
> df2
numeric(0)
This leads me to believe that Infs are the problem, as there don't seem to be any NAs.
Any suggestions for how to root out Infs?
Comments (5)
You're probably looking for is.finite, though I'm not 100% certain that the problem is Infs in your input data.
Be sure to read the help for is.finite carefully about which combinations of missing, infinite, etc. it picks out. Specifically, is.finite is FALSE for NA, NaN, Inf and -Inf alike, whereas is.na is TRUE for NA and NaN but FALSE for Inf and -Inf. One of these things is not like the others. Not surprisingly, there's an is.nan function as well.
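A small sketch of that difference, using a made-up vector x (not taken from the question), shows how the four predicates classify each special value:

```r
# Hypothetical vector: one ordinary number plus each special value
x <- c(1, NA, NaN, Inf, -Inf)

# One row per predicate, one column per element of x
checks <- rbind(
  is.finite   = is.finite(x),
  is.infinite = is.infinite(x),
  is.na       = is.na(x),
  is.nan      = is.nan(x)
)
colnames(checks) <- c("1", "NA", "NaN", "Inf", "-Inf")
print(checks)
```

Only the ordinary number passes is.finite, so !is.finite(x) flags NA, NaN and both infinities in a single check.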
randomForest's 'NA/NaN/Inf in foreign function call' is often a false warning, and really irritating.
My fast-and-dirty trick to narrow things down: do a binary search on your variable list, and use token parameters like ntree=2 to get an instant pass/fail on each subset of variables.
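The binary search could be sketched roughly as follows. The names find_bad_column and fails are hypothetical, introduced here for illustration, and are not part of randomForest. The stand-in check only looks for non-finite values so that the sketch runs without the package; the comment shows the kind of randomForest call (an assumption, with ntree = 2) that would take its place. The sketch also assumes that at least one column is responsible and returns one such column.

```r
# Hypothetical helper: repeatedly halve the set of columns, keeping a half
# that still triggers the failure, until a single column is left.
# fails(cols) must return TRUE when fitting on those columns fails.
find_bad_column <- function(cols, fails) {
  while (length(cols) > 1) {
    half <- cols[seq_len(length(cols) %/% 2)]
    cols <- if (fails(half)) half else setdiff(cols, half)
  }
  cols
}

# Made-up data: column "c" contains an Inf
df <- data.frame(a = 1:4, b = c(2, 4, 6, 8), c = c(1, Inf, 3, 4), d = 4:1)

# Stand-in check. With randomForest loaded, one might instead use something like
#   inherits(try(randomForest(df[, cols, drop = FALSE], y, ntree = 2),
#                silent = TRUE), "try-error")
fails <- function(cols) {
  any(!is.finite(as.matrix(df[, cols, drop = FALSE])))
}

bad <- find_bad_column(names(df), fails)
print(bad)
```

It is worth confirming that fails(names(df)) is TRUE before searching, since the loop always narrows down to some column even when nothing fails.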
In analogy to is.na, you can use is.infinite to find occurrences of infinite values.
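For a data frame like the one in the question, a minimal sketch might apply is.infinite column by column, since is.infinite (unlike is.na) does not accept a whole data frame. The data frame df and its column names below are made up for illustration.

```r
# Made-up data frame with infinite values in one numeric column
df <- data.frame(
  x = c(1, 2, 3),
  y = c(Inf, 5, -Inf),
  z = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# TRUE for every column that contains at least one Inf or -Inf.
# is.infinite() is FALSE for character data, so z is reported as clean.
has_inf <- sapply(df, function(col) any(is.infinite(col)))
print(has_inf)

# Names of the affected columns
inf_cols <- names(df)[has_inf]
print(inf_cols)
```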
Take a look at with(), which lets you write such a check using a data frame's column names directly.
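One possible use of with() for this, on a made-up data frame whose column names are hypothetical:

```r
# Made-up data frame; id and ratio are hypothetical column names
df <- data.frame(id = 1:4, ratio = c(0.5, Inf, 2, -Inf))

# Inside with(), ratio refers to df$ratio
inf_rows <- with(df, which(is.infinite(ratio)))
print(inf_rows)

# Rows of df whose ratio is infinite
print(df[inf_rows, ])
```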
joran's answer is what you want, and it is informative. For more details about is.na() and is.infinite(), check out https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/is.na-methods.html
Besides, once you have the logical vector that says whether each element of the original vector is NA/Inf, you can use the which() function to get the indices.
The documentation for which() is here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/any.html
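A short sketch of the which() step, using a made-up vector and matrix rather than the data from the question:

```r
# Made-up vector with special values
v <- c(3, NA, Inf, 7, -Inf)

na_idx  <- which(is.na(v))        # positions of NA (and NaN) values
inf_idx <- which(is.infinite(v))  # positions of Inf / -Inf values
print(na_idx)
print(inf_idx)

# For a matrix, arr.ind = TRUE returns row/column positions instead
m <- matrix(c(1, Inf, 3, 4, 5, NA), nrow = 2)
bad_cells <- which(!is.finite(m), arr.ind = TRUE)
print(bad_cells)
```

The arr.ind = TRUE form is convenient for a 5000x300 input, because it reports which rows and which columns hold the offending values.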