LinAlgError: Singular matrix when using stepwise
I am trying to do a stepwise logistic regression, but I get a singular matrix error that doesn't make any sense to me here.
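import numpy as np
import statsmodels.api as sm
import stepwise  # local module, ~/Downloads/stepwise.py
# Column 8 is the target; columns 0-7 are the predictors
log_res = sm.Logit(regdata.iloc[:, 8], regdata.iloc[:, np.r_[0:8]]).fit()
print(log_res.summary())
regdata['churn'] = log_res.predict(regdata.iloc[:, np.r_[0:8]])
forward = stepwise.forwardSelection(regdata.iloc[:, np.r_[0:8]], regdata.iloc[:, 8], model_type="logistic", elimination_criteria="aic", sl=0.05)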
Up to here there is no problem; the error starts after this:
# Backward selection algorithm in order to have our final model
backward = stepwise.backwardSelection(regdata.iloc[:, np.r_[0:8]], regdata.iloc[:, 8], model_type="logistic", elimination_criteria="aic", sl=0.05)
backward[2].summary()
Out:
Character Variables (Dummies Generated, First Dummies Dropped): []
Optimization terminated successfully.
Current function value: 0.376811
Iterations 6
Optimization terminated successfully.
Current function value: 0.375715
Iterations 6
Optimization terminated successfully.
Current function value: 0.368832
Iterations 6
Optimization terminated successfully.
Current function value: 0.361839
Iterations 6
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
Optimization terminated successfully.
Current function value: 0.376569
Iterations 6
Optimization terminated successfully.
Current function value: 0.376726
Iterations 6
Optimization terminated successfully.
Current function value: 0.376594
Iterations 6
Optimization terminated successfully.
Current function value: 0.376811
Iterations 6
Optimization terminated successfully.
Current function value: 0.361839
Iterations 6
Entered : fallbonus_x AIC : 3407.4576859056674
Optimization terminated successfully.
Current function value: 0.360770
Iterations 6
Optimization terminated successfully.
Current function value: 0.353790
Iterations 6
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
Optimization terminated successfully.
Current function value: 0.361579
Iterations 6
Optimization terminated successfully.
Current function value: 0.361790
Iterations 6
Optimization terminated successfully.
Current function value: 0.361649
Iterations 6
Optimization terminated successfully.
Current function value: 0.361839
Iterations 6
Optimization terminated successfully.
Current function value: 0.353790
Iterations 6
Entered : Playsession AIC : 3333.750680420261
Optimization terminated successfully.
Current function value: 0.352849
Iterations 6
Optimization terminated successfully.
Current function value: 0.000003
Iterations 34
Optimization terminated successfully.
Current function value: 0.353789
Iterations 6
Optimization terminated successfully.
Current function value: 0.353719
Iterations 6
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
Optimization terminated successfully.
Current function value: 0.353539
Iterations 6
Optimization terminated successfully.
Current function value: 0.353767
Iterations 6
Optimization terminated successfully.
Current function value: 0.352849
Iterations 6
Entered : CustomerID AIC : 3326.9000578047444
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
Optimization terminated successfully.
Current function value: 0.352848
Iterations 6
Optimization terminated successfully.
Current function value: 0.352783
Iterations 6
Optimization terminated successfully.
Current function value: 0.352595
Iterations 6
Optimization terminated successfully.
Current function value: 0.352822
Iterations 6
Break : Significance Level
Optimization terminated successfully.
Current function value: 0.352849
Iterations 6
Logit Regression Results
==============================================================================
Dep. Variable: churn No. Observations: 4703
Model: Logit Df Residuals: 4699
Method: MLE Df Model: 3
Date: Mon, 28 Feb 2022 Pseudo R-squ.: 0.06359
Time: 13:04:45 Log-Likelihood: -1659.5
converged: True LL-Null: -1772.1
Covariance Type: nonrobust LLR p-value: 1.374e-48
===============================================================================
coef std err z P>|z| [0.025 0.975]
-------------------------------------------------------------------------------
intercept 2.6289 0.119 22.076 0.000 2.395 2.862
fallbonus_x -1.1698 0.095 -12.285 0.000 -1.356 -0.983
Playsession -0.1260 0.014 -8.842 0.000 -0.154 -0.098
CustomerID 9.337e-05 3.15e-05 2.968 0.003 3.17e-05 0.000
===============================================================================
AIC: 3326.9000578047444
BIC: 3352.723881332525
Final Variables: ['intercept', 'fallbonus_x', 'Playsession', 'CustomerID']
Character Variables (Dummies Generated, First Dummies Dropped): []
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
Eliminated : intercept
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
Eliminated : CustomerType
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1810: RuntimeWarning: overflow encountered in exp
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1810: RuntimeWarning: overflow encountered in exp
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1810: RuntimeWarning: overflow encountered in exp
Eliminated : Age
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.000003
Iterations: 35
Eliminated : Income
Warning: Maximum number of iterations has been exceeded.
Current function value: inf
Iterations: 35
/Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1863: RuntimeWarning: divide by zero encountered in log
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
/var/folders/vj/pxgh1vcx2zsf_cs2hld5dxvm0000gn/T/ipykernel_23105/4005659379.py in <module>
5
6 #Bacward selection algorithm in order to have our final model
----> 7 backward=stepwise.backwardSelection(regdata.iloc[:,np.r_[0:8]], regdata.iloc[:,8],model_type ="logistic",elimination_criteria = "aic", sl=0.05)
8
9 backward[2].summary()
~/Downloads/stepwise.py in backwardSelection(X, y, model_type, elimination_criteria, varchar_process, sl)
97 """
98 X = __varcharProcessing__(X,varchar_process = varchar_process)
---> 99 return __backwardSelectionRaw__(X, y, model_type = model_type,elimination_criteria = elimination_criteria , sl=sl)
100
101 def __varcharProcessing__(X, varchar_process = "dummy_dropfirst"):
~/Downloads/stepwise.py in __backwardSelectionRaw__(X, y, model_type, elimination_criteria, sl)
254 if elimination_criteria == "aic":
255 criteria = model.aic
--> 256 new_model = regressor(y,X)
257 new_criteria = new_model.aic
258 if criteria < new_criteria:
~/Downloads/stepwise.py in regressor(y, X, model_type)
244 regressor = sm.OLS(y, X).fit()
245 elif model_type == "logistic":
--> 246 regressor = sm.Logit(y, X).fit()
247 else:
248 print("\nWrong Model Type : "+ model_type +"\nLinear model type is seleted.")
~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py in fit(self, start_params, method, maxiter, full_output, disp, callback, **kwargs)
1972 def fit(self, start_params=None, method='newton', maxiter=35,
1973 full_output=1, disp=1, callback=None, **kwargs):
-> 1974 bnryfit = super().fit(start_params=start_params,
1975 method=method,
1976 maxiter=maxiter,
~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py in fit(self, start_params, method, maxiter, full_output, disp, callback, **kwargs)
225 pass # TODO: make a function factory to have multiple call-backs
226
--> 227 mlefit = super().fit(start_params=start_params,
228 method=method,
229 maxiter=maxiter,
~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py in fit(self, start_params, method, maxiter, full_output, disp, fargs, callback, retall, skip_hessian, **kwargs)
532 Hinv = cov_params_func(self, xopt, retvals)
533 elif method == 'newton' and full_output:
--> 534 Hinv = np.linalg.inv(-retvals['Hessian']) / nobs
535 elif not skip_hessian:
536 H = -1 * self.hessian(xopt)
<__array_function__ internals> in inv(*args, **kwargs)
~/opt/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in inv(a)
543 signature = 'D->D' if isComplexType(t) else 'd->d'
544 extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 545 ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
546 return wrap(ainv.astype(result_t, copy=False))
547
~/opt/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
86
87 def _raise_linalgerror_singular(err, flag):
---> 88 raise LinAlgError("Singular matrix")
89
90 def _raise_linalgerror_nonposdef(err, flag):
LinAlgError: Singular matrix
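For context on the last frames of the traceback: the failing call is `np.linalg.inv(-retvals['Hessian'])`, and NumPy raises `LinAlgError` when the matrix it is asked to invert has linearly dependent rows or columns. A minimal sketch of that behaviour, assuming a toy 2x2 matrix (`H` below is made up for illustration and is not the Hessian from the model above):

```python
import numpy as np

# Toy matrix whose second row is exactly twice the first, so it has rank 1.
# It only stands in for a degenerate Hessian; it is not taken from the model above.
H = np.array([[1.0, 2.0],
              [2.0, 4.0]])

rank = np.linalg.matrix_rank(H)  # number of linearly independent rows
cond = np.linalg.cond(H)         # huge (or inf) when the matrix is singular

try:
    np.linalg.inv(H)
    inverted = True
    message = ""
except np.linalg.LinAlgError as err:
    # Same exception type as in the traceback
    inverted = False
    message = str(err)

print(rank, cond, inverted, message)
```

A rank below the number of columns, or a very large condition number, is the kind of signal that would precede this exception.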
I am trying to run a logistic regression with stepwise selection. I am pretty sure the dataframe is fine, because I had an error previously and debugged it.
I checked the correlations between my variables (I saw a post where someone removed the very low correlations and it worked for him, though that wasn't with stepwise). I have some negative correlations, but I am not sure whether I should remove anything, since the teacher told us to use as much data as possible for better results.
Here's what I have for correlation: [correlation output not preserved]
This is the first time I have had an error like this. Please let me know if there is a way I can fix it.
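On the correlation check: pairwise correlations only compare two columns at a time, so they can look moderate even when one column is an exact linear combination of others. A rough sketch of that situation, assuming synthetic data (the columns `a`, `b`, `c` are invented for illustration and have nothing to do with `regdata`):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic predictors: c is an exact linear combination of a and b.
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + b
X = np.column_stack([a, b, c])

# Pairwise correlations between the columns of X.
corr = np.corrcoef(X, rowvar=False)
off_diag = np.abs(corr[~np.eye(3, dtype=bool)])
max_abs_corr = off_diag.max()

# The rank counts linearly independent columns; 2 < 3 means X is rank deficient.
rank = np.linalg.matrix_rank(X)

print("largest |pairwise correlation|:", round(float(max_abs_corr), 3))
print("rank:", rank, "of", X.shape[1], "columns")
```

With real data, the same two checks could presumably be run on the predictor block, for example `regdata.iloc[:, :8].to_numpy()`, to see whether the design matrix is rank deficient even though no single correlation stands out.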