How to use all variables in a logistic regression in Python with statsmodels (equivalent to R's glm)

Posted on 2025-01-17 12:36:25

I would like to conduct Logistic Regression in Python.

My reference in R is

model_1 <- glm(status_1 ~., data = X_train, family=binomial)
summary(model_1)

I'm trying to convert this into Python, but I'm not sure how to include all variables.

import statsmodels.api as sm
model = sm.formula.glm("status_1 ~ ", family=sm.families.Binomial(), data=train).fit()
print(model.summary())

How can I use all variables? In other words, what do I need to put after status_1 ~ in the formula?

Comments (2)

同尘 2025-01-24 12:36:25

statsmodels makes it pretty straightforward to do logistic regression, like this:

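import statsmodels.api as sm

# df is assumed to be a pandas DataFrame that already holds the data
Xtrain = df[['gmat', 'gpa', 'work_experience']]
ytrain = df[['admitted']]

# Note: sm.Logit does not add an intercept; wrap Xtrain in sm.add_constant() to match R's glm
log_reg = sm.Logit(ytrain, Xtrain).fit()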

Where gmat, gpa and work_experience are your independent variables.

护你周全 2025-01-24 12:36:25

From your question, I understand that you have binomial data and want to fit a Generalised Linear Model with logit as the link function. Also, as you can see in this thread (jseabold's answer), the feature you mentioned doesn't exist in patsy yet. So I will show you how to fit a Generalised Linear Model to binomial data using the sm.GLM() function.

# Imports
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Suppose that your training data is in a DataFrame called data_train.
# Let's split the data into dependent and independent variables.

At this point I want to mention that the dependent variable can be passed as a 2d array with two columns (a single 0/1 column is also accepted), as the help for the statsmodels GLM function suggests:

Binomial family models accept a 2d array with two columns. If supplied, each observation is expected to be [success, failure].

# Create the array that holds the dependent variable ([successes, failures])
y = data_train[["the name of the column of successes", "the name of the column of failures"]]

# Create the array that holds the independent variables (every remaining column)
X = data_train.drop(columns=["the name of the column of successes", "the name of the column of failures"])

# Add a constant to the independent variables, because an intercept
# isn't included in the model by default
X = sm.add_constant(X)

# Create and fit the model
logit_model = sm.GLM(
    endog=y,
    exog=X,
    family=sm.families.Binomial(link=sm.families.links.Logit())).fit()

# Show some information about the model
print(logit_model.summary())