How to handle "ValueError: array must not contain infs or NaNs" when running a regression in Python
I have a df with growth variables, and some initial values are often 0, in which case the growth rate is infinite when the value moves from zero to a non-zero value.
For example:
.. some variables...  var1  var2  var1_growth  var2_growth
                      0     0     NaN          NaN
                      0     1     NaN          inf
                      1     2     inf          1
                      1.5   2.2   0.5          0.1
...
When I run PanelOLS, I get this error message:
ValueError: array must not contain infs or NaNs
Is there a way to ignore these entries and continue with the regression without having to drop them and create a different dataset?
If not, what would be the best way to proceed? Should I drop all rows with 'inf' values in both columns? Is there an easy way to do this? Thanks.
Comments (1)
No, you can't ignore these entries. This issue needs to be handled before training the model; otherwise, the model cannot be fit.
Depending on your data and application, a different method may be preferable for handling these NaN and inf values. One example of code is posted in this SO question. In this case, we are removing all rows that have inf or NaN values.
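The snippet from the referenced question is not reproduced here, so the following is only a minimal sketch of the row-removal approach, not the original code. It assumes pandas and NumPy, uses a made-up DataFrame that mirrors the example columns in the question, and assumes the growth columns were computed as period-over-period changes.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the question's example columns.
df = pd.DataFrame({
    "var1": [0.0, 0.0, 1.0, 1.5],
    "var2": [0.0, 1.0, 2.0, 2.2],
})

# Assumed growth definition: period-over-period change.
# This yields NaN for the first row and for 0 -> 0, and inf for 0 -> non-zero.
growth_cols = ["var1_growth", "var2_growth"]
df["var1_growth"] = df["var1"].pct_change()
df["var2_growth"] = df["var2"].pct_change()

# Treat +inf and -inf as missing, then drop every row that is missing
# a value in any of the growth columns.
clean = df.replace([np.inf, -np.inf], np.nan).dropna(subset=growth_cols)

print(clean)
```

The filtering step returns a new frame and leaves `df` itself unmodified, so `clean` can be passed to the regression while the full data stays available. Whether dropping rows is appropriate, as opposed to imputing or redefining the growth measure, depends on the data and the application.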