Opposite "signs" of coefficients for two logistic regressions
I am trying to build an xG model using Distance (from goal) as a feature; the target variable is a dummy variable indicating whether the shot resulted in a goal or not. So I am trying to fit a simple logistic regression. I tried to replicate a model where the fitting was done with the statsmodels package, which resulted in a positive coefficient of 0.16 and an intercept of -0.5.
When I fitted the line using scikit-learn, the coefficient was -0.16. The same happened with the intercept, which was around 0.5. So somehow the coefficients have "flipped".
Dataset example:
Goal X Y C Distance Angle
1 12 41 9.0 13.891814 0.474451
0 15 52 2.0 15.803560 0.453823
0 19 33 17.0 22.805811 0.280597
0 25 30 20.0 29.292704 0.223680
0 10 39 11.0 12.703248 0.479051
scikit-learn code:
feature_cols = ['Distance']
X = shots_model[feature_cols] # Features
y = shots_model['Goal'] # Target
y = y.astype('category')
m1 = LogisticRegression()
m1.fit(X_train, y_train)
statsmodels code:
test_model = smf.glm(formula="Goal ~ " + model, data=shots_model,
family=sm.families.Binomial()).fit()
print(test_model.summary())
b=test_model.params
I am probably missing something simple, as I am pretty new to Machine Learning, and this has been puzzling me for some time now. Please help.
2 Answers
I am not sure what your outputs are. However, what you can do now is to test your model on new test data. The predictions obtained are fractional values (between 0 and 1) which denote the probability of the shot resulting in a goal. Then round these values to obtain the discrete values 1 or 0. After that, you can use a confusion matrix or the accuracy_score function to test the accuracy of your models. For more detailed code, you can refer to this article: https://www.geeksforgeeks.org/logistic-regression-using-statsmodels/
I think if you can get corresponding binary outcomes from your two models, and the accuracy of the two models is close, then you do not need to worry much about the flipped coefficient. Basically, my idea is that if you can get accurate predictions (1 or 0) from both methods, then everything is fine. Hope my answer is helpful to you!
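A minimal sketch of that evaluation step, assuming m1 is the fitted sklearn model from the question and that X_test and y_test are a held-out split of shots_model (neither split is shown in the original post):
# Sketch only: assumes m1 (a fitted LogisticRegression) and a held-out
# X_test/y_test split of shots_model, which the question does not show.
from sklearn.metrics import accuracy_score, confusion_matrix

proba = m1.predict_proba(X_test)[:, 1]   # fractional values in (0, 1): P(goal)
preds = (proba >= 0.5).astype(int)       # round to discrete 0/1 at a 0.5 threshold

print(confusion_matrix(y_test, preds))
print(accuracy_score(y_test, preds))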
The algorithms for logistic regression are different in statsmodels and sklearn. Sklearn uses an L2 penalty by default, which means that the loss function has a quadratic term for the coefficients to drive them as close to zero as possible.
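To see the effect of that default penalty, here is a small sketch on synthetic stand-in data (the real shots_model is not available, so the data and coefficient values are illustrative only):
# Sketch on synthetic stand-in data: compare sklearn's default L2-penalized
# fit with an unpenalized one. The true coefficient here is -0.16 by design.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(5, 35, size=(500, 1))               # stand-in for Distance
p_goal = 1 / (1 + np.exp(-(2.0 - 0.16 * X[:, 0])))  # P(goal) falls with distance
goal = rng.binomial(1, p_goal)

penalized = LogisticRegression().fit(X, goal)                # default: penalty='l2', C=1.0
unpenalized = LogisticRegression(penalty=None).fit(X, goal)  # penalty='none' in older sklearn

print(penalized.coef_, unpenalized.coef_)  # penalized estimate is shrunk (slightly) toward 0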
Regarding why your coefficients have flipped: this can happen when you invert the encoding of your target variable.
For example, if the statsmodels model was trained with 0 being a miss and 1 being a goal, while the sklearn model was trained with 0 being a goal and 1 being a miss.
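That explanation is easy to verify: fitting the same unpenalized model on the complement of the target negates both the coefficient and the intercept. A self-contained sketch with synthetic data:
# Sketch: flipping the 0/1 encoding of the target negates the fitted
# coefficient and intercept of a logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(5, 35, size=(500, 1))
goal = rng.binomial(1, 1 / (1 + np.exp(-(2.0 - 0.16 * X[:, 0]))))

m_goal = LogisticRegression(penalty=None).fit(X, goal)      # 1 = goal, 0 = miss
m_miss = LogisticRegression(penalty=None).fit(X, 1 - goal)  # 1 = miss, 0 = goal

print(m_goal.coef_, m_goal.intercept_)
print(m_miss.coef_, m_miss.intercept_)  # same magnitudes, opposite signs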
To be honest, it's hard to tell from the info you have given us. However, here are a couple of things to keep in mind regarding the code you posted:
- m1 is being trained with objects that do not exist in the code you posted (X_train and y_train are not declared).
- Check that there is only one column in X (X_train or X_test).
- Check what the target looks like after y.astype('category').
Basically, make sure that y, y_train and y_test are encoded the same way for both models, and set m1 = LogisticRegression(penalty='none').
In my opinion, it makes sense that the coefficient for Distance is negative, because scoring a goal should become less likely the farther away you are.
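Putting these checks together, a sketch of a side-by-side fit on one frame; shots_model here is synthetic stand-in data with the question's column names, and penalty=None is the current spelling of penalty='none':
# Sketch: with the same 0/1 encoding of Goal and no penalty, statsmodels'
# GLM and sklearn should agree on the coefficients up to optimizer tolerance.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
shots_model = pd.DataFrame({"Distance": rng.uniform(5, 35, size=500)})
shots_model["Goal"] = rng.binomial(
    1, 1 / (1 + np.exp(-(2.0 - 0.16 * shots_model["Distance"])))
)

glm = smf.glm("Goal ~ Distance", data=shots_model,
              family=sm.families.Binomial()).fit()
skl = LogisticRegression(penalty=None).fit(
    shots_model[["Distance"]], shots_model["Goal"])

print(glm.params)                  # Intercept, Distance
print(skl.intercept_, skl.coef_)   # should closely match the GLM above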