LinAlgError: Singular matrix when using stepwise

Posted on 2025-01-11 00:46:03

I am trying to run a stepwise logistic regression, but I get a singular matrix error that I cannot make sense of.

    import numpy as np
    import statsmodels.api as sm
    import stepwise  # local module (~/Downloads/stepwise.py in the traceback)

    log_res = sm.Logit(regdata.iloc[:, 8], regdata.iloc[:, np.r_[0:8]]).fit()
    print(log_res.summary())
    regdata['churn'] = log_res.predict(regdata.iloc[:, np.r_[0:8]])

    forward = stepwise.forwardSelection(regdata.iloc[:, np.r_[0:8]], regdata.iloc[:, 8], model_type="logistic", elimination_criteria="aic", sl=0.05)

Up to this point there is no problem. The error appears in the next step:

    # Backward selection algorithm in order to have our final model
    backward = stepwise.backwardSelection(regdata.iloc[:, np.r_[0:8]], regdata.iloc[:, 8], model_type="logistic", elimination_criteria="aic", sl=0.05)

    backward[2].summary()

Out:

    Character Variables (Dummies Generated, First Dummies Dropped): []
    Optimization terminated successfully.
             Current function value: 0.376811
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.375715
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.368832
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.361839
             Iterations 6
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    Optimization terminated successfully.
             Current function value: 0.376569
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.376726
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.376594
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.376811
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.361839
             Iterations 6
    Entered : fallbonus_x   AIC : 3407.4576859056674
    Optimization terminated successfully.
             Current function value: 0.360770
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.353790
             Iterations 6
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    Optimization terminated successfully.
             Current function value: 0.361579
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.361790
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.361649
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.361839
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.353790
             Iterations 6
    Entered : Playsession   AIC : 3333.750680420261
    Optimization terminated successfully.
             Current function value: 0.352849
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.000003
             Iterations 34
    Optimization terminated successfully.
             Current function value: 0.353789
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.353719
             Iterations 6
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    Optimization terminated successfully.
             Current function value: 0.353539
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.353767
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.352849
             Iterations 6
    Entered : CustomerID    AIC : 3326.9000578047444
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    Optimization terminated successfully.
             Current function value: 0.352848
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.352783
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.352595
             Iterations 6
    Optimization terminated successfully.
             Current function value: 0.352822
             Iterations 6
    Break : Significance Level
    Optimization terminated successfully.
             Current function value: 0.352849
             Iterations 6
                               Logit Regression Results                           
    ==============================================================================
    Dep. Variable:                  churn   No. Observations:                 4703
    Model:                          Logit   Df Residuals:                     4699
    Method:                           MLE   Df Model:                            3
    Date:                Mon, 28 Feb 2022   Pseudo R-squ.:                 0.06359
    Time:                        13:04:45   Log-Likelihood:                -1659.5
    converged:                       True   LL-Null:                       -1772.1
    Covariance Type:            nonrobust   LLR p-value:                 1.374e-48
    ===============================================================================
                      coef    std err          z      P>|z|      [0.025      0.975]
    -------------------------------------------------------------------------------
    intercept       2.6289      0.119     22.076      0.000       2.395       2.862
    fallbonus_x    -1.1698      0.095    -12.285      0.000      -1.356      -0.983
    Playsession    -0.1260      0.014     -8.842      0.000      -0.154      -0.098
    CustomerID   9.337e-05   3.15e-05      2.968      0.003    3.17e-05       0.000
    ===============================================================================
    AIC: 3326.9000578047444
    BIC: 3352.723881332525
    Final Variables: ['intercept', 'fallbonus_x', 'Playsession', 'CustomerID']
    Character Variables (Dummies Generated, First Dummies Dropped): []
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    Eliminated : intercept
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    Eliminated : CustomerType
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1810: RuntimeWarning: overflow encountered in exp
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1810: RuntimeWarning: overflow encountered in exp
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py:566: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1810: RuntimeWarning: overflow encountered in exp
    Eliminated : Age
    Warning: Maximum number of iterations has been exceeded.
             Current function value: 0.000003
             Iterations: 35
    Eliminated : Income
    Warning: Maximum number of iterations has been exceeded.
             Current function value: inf
             Iterations: 35
    /Users/nesrinetiar/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py:1863: RuntimeWarning: divide by zero encountered in log
    ---------------------------------------------------------------------------
    LinAlgError                               Traceback (most recent call last)
    /var/folders/vj/pxgh1vcx2zsf_cs2hld5dxvm0000gn/T/ipykernel_23105/4005659379.py in <module>
          5 
          6 #Bacward selection algorithm in order to have our final model
    ----> 7 backward=stepwise.backwardSelection(regdata.iloc[:,np.r_[0:8]], regdata.iloc[:,8],model_type ="logistic",elimination_criteria = "aic", sl=0.05)
          8 
          9 backward[2].summary()
    ~/Downloads/stepwise.py in backwardSelection(X, y, model_type, elimination_criteria, varchar_process, sl)
         97     """
         98     X = __varcharProcessing__(X,varchar_process = varchar_process)
    ---> 99     return __backwardSelectionRaw__(X, y, model_type = model_type,elimination_criteria = elimination_criteria , sl=sl)
        100 
        101 def __varcharProcessing__(X, varchar_process = "dummy_dropfirst"):
    ~/Downloads/stepwise.py in __backwardSelectionRaw__(X, y, model_type, elimination_criteria, sl)
        254             if elimination_criteria == "aic":
        255                 criteria = model.aic
    --> 256                 new_model = regressor(y,X)
        257                 new_criteria = new_model.aic
        258                 if criteria < new_criteria:
    ~/Downloads/stepwise.py in regressor(y, X, model_type)
        244             regressor = sm.OLS(y, X).fit()
        245         elif model_type == "logistic":
    --> 246             regressor = sm.Logit(y, X).fit()
        247         else:
        248             print("\nWrong Model Type : "+ model_type +"\nLinear model type is seleted.")
    ~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py in fit(self, start_params, method, maxiter, full_output, disp, callback, **kwargs)
       1972     def fit(self, start_params=None, method='newton', maxiter=35,
       1973             full_output=1, disp=1, callback=None, **kwargs):
    -> 1974         bnryfit = super().fit(start_params=start_params,
       1975                               method=method,
       1976                               maxiter=maxiter,
    ~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/discrete/discrete_model.py in fit(self, start_params, method, maxiter, full_output, disp, callback, **kwargs)
        225             pass  # TODO: make a function factory to have multiple call-backs
        226 
    --> 227         mlefit = super().fit(start_params=start_params,
        228                              method=method,
        229                              maxiter=maxiter,
    ~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/base/model.py in fit(self, start_params, method, maxiter, full_output, disp, fargs, callback, retall, skip_hessian, **kwargs)
        532             Hinv = cov_params_func(self, xopt, retvals)
        533         elif method == 'newton' and full_output:
    --> 534             Hinv = np.linalg.inv(-retvals['Hessian']) / nobs
        535         elif not skip_hessian:
        536             H = -1 * self.hessian(xopt)
    <__array_function__ internals> in inv(*args, **kwargs)
    ~/opt/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in inv(a)
        543     signature = 'D->D' if isComplexType(t) else 'd->d'
        544     extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
    --> 545     ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
        546     return wrap(ainv.astype(result_t, copy=False))
        547 
    ~/opt/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
         86 
         87 def _raise_linalgerror_singular(err, flag):
    ---> 88     raise LinAlgError("Singular matrix")
         89 
         90 def _raise_linalgerror_nonposdef(err, flag):
    LinAlgError: Singular matrix
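
For context on the last frames of the traceback: for a logit model the Hessian of the log-likelihood is -X'WX with W = diag(p_i (1 - p_i)), and `fit` inverts it to obtain the covariance matrix. If the fitted probabilities are pushed to exactly 0 or 1, every weight is 0 and no inverse exists. The sketch below reproduces that mechanism with NumPy only. The toy data, the coefficient values and the helper names `logit_hessian` and `try_invert` are made up for illustration, and it is an assumption, not a confirmed diagnosis, that the same thing happens with `regdata`.

```python
import numpy as np

def logit_hessian(X, beta):
    """Hessian of the logit log-likelihood: -X' W X with W = diag(p * (1 - p))."""
    X = np.asarray(X, dtype=float)
    with np.errstate(over="ignore"):  # exp overflows for very large |X @ beta|
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    return -(X.T * w) @ X

def try_invert(H):
    """Return inv(-H), or the error message if NumPy reports a singular matrix."""
    try:
        return np.linalg.inv(-H)
    except np.linalg.LinAlgError as err:
        return str(err)

# Hypothetical toy data: an intercept column plus one predictor whose sign
# fully determines the outcome.
X_toy = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])

# Moderate coefficients: weights are positive, the Hessian can be inverted.
result_moderate = try_invert(logit_hessian(X_toy, np.array([0.0, 1.0])))

# Runaway coefficients: p is exactly 0 or 1, so every weight is 0.
result_runaway = try_invert(logit_hessian(X_toy, np.array([0.0, 1000.0])))

print(type(result_moderate).__name__)
print(result_runaway)
```

The `inf` function value and the `overflow encountered in exp` warnings in the log would be consistent with this picture, but other causes, such as exactly collinear columns, can raise the same exception.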


I am trying to run a logistic regression with stepwise selection. I am fairly sure the dataframe itself is fine, because I hit an earlier error and debugged it.
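
One check on the data that might be worth running: several fits in the log end with `Current function value: 0.000003` after hitting the iteration limit, which is what a fit tends to look like when a predictor splits the two outcome classes almost perfectly. Whether that is the case here is an assumption. A minimal sketch of such a check, using NumPy only, with a hypothetical helper name `separating_columns` and made-up toy data:

```python
import numpy as np

def separating_columns(X, y, names):
    """Return the names of columns whose value ranges do not overlap across classes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)
    if y.all() or not y.any():
        raise ValueError("y must contain both classes")
    flagged = []
    for j, name in enumerate(names):
        neg, pos = X[~y, j], X[y, j]
        # No overlap between the classes means a single threshold on this
        # column predicts y perfectly (complete separation).
        if neg.max() < pos.min() or pos.max() < neg.min():
            flagged.append(name)
    return flagged

# Hypothetical toy data: "leak" separates the classes, "noise" does not.
y_toy = np.array([0, 0, 0, 1, 1, 1])
X_toy = np.column_stack([
    [0.1, 0.4, 0.2, 0.9, 0.7, 0.8],  # leak
    [5.0, 1.0, 3.0, 2.0, 4.0, 6.0],  # noise
])
flagged = separating_columns(X_toy, y_toy, ["leak", "noise"])
print(flagged)
```

With the data from the question, the call would presumably be `separating_columns(regdata.iloc[:, 0:8], regdata.iloc[:, 8], regdata.columns[0:8])`. This assumes column 8 still holds the original 0/1 labels. The check only catches separation by a single column, not by a combination of columns.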

I checked the correlations between my variables. (I saw a post where someone removed variables with very low correlations and it worked for them, although that was not with stepwise.) I have some negative correlations, but I am not sure whether I should remove anything, since our teacher told us to use as much data as possible for better results.

Here is what I get for the correlations:
[image: correlation output, not reproduced here]
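
A negative or low pairwise correlation does not by itself make a matrix singular. What matters is whether some column is (nearly) a linear combination of the others, which a pairwise table can miss. A rough numerical check is to compare the rank of the design matrix with its number of columns and to look at its condition number. The sketch below assumes NumPy only; the helper name `collinearity_report` and the synthetic data are made up for illustration.

```python
import numpy as np

def collinearity_report(X):
    """Return (rank, n_columns, condition_number) for a 2-D design matrix."""
    X = np.asarray(X, dtype=float)
    # Standardise the columns so that scale differences (for example an ID
    # column next to a 0/1 flag) do not dominate the condition number.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # a constant column becomes all zeros and is flagged
    Z = (X - X.mean(axis=0)) / std
    return np.linalg.matrix_rank(Z), X.shape[1], np.linalg.cond(Z)

# Synthetic example: three independent columns, then a fourth that is the
# exact sum of the first two.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X_bad = np.column_stack([base, base[:, 0] + base[:, 1]])

rank_ok, ncols_ok, cond_ok = collinearity_report(base)
rank_bad, ncols_bad, cond_bad = collinearity_report(X_bad)
print(rank_ok, ncols_ok)
print(rank_bad, ncols_bad)
```

A rank below the number of columns, or a very large condition number, points to redundant columns. For the question's data the input would presumably be `regdata.iloc[:, 0:8]`.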

This is the first time I have run into an error like this. Please let me know if there is a way to fix it.
