Python logit regression shape error: "ValueError: endog and exog matrices are different sizes"
Basic setup: I'm trying to run a logit regression in Python on the probability of founding a business (the founder variable). The exogenous variables are year, age, edu_cat (education category), and sex.
The X matrix is (4, 650), and the y matrix is (1, 650). All of the variables in the X matrix have 650 non-NaN observations.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
x = np.array([df_all['Year'], df_all['Age'], df_all['Edu_cat'], df_all['sex']])
y = np.array([df_all['founder']])
logit_model = sm.Logit(y, x)
result = logit_model.fit()
print(result)
As far as I can tell the shapes are fine, but Python is telling me otherwise. Am I missing something basic?
Comments (1)
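I believe the issue is with the shape of the y array: wrapping the column in a list makes it 2-D, (1, 650) as built in the question, when it should be 1-D with shape (650,), which is what df_all['founder'] gives by default. Additionally, I needed to transpose the x array so that it is (650, 4), i.e. one row per observation and one column per variable.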