Why can't I get the same results from sklearn LogisticRegression and my coefficients_sgd method?
from math import exp
import numpy as np
from sklearn.linear_model import LogisticRegression
I used the code below, from How To Implement Logistic Regression From Scratch in Python:
# Make a prediction: sigmoid of the intercept plus the weighted inputs
def predict(row, coefficients):
    yhat = coefficients[0]
    for i in range(len(row) - 1):
        yhat += coefficients[i + 1] * row[i]
    return 1.0 / (1.0 + exp(-yhat))

# Estimate coefficients with stochastic gradient descent
def coefficients_sgd(train, l_rate, n_epoch):
    coef = [0.0 for i in range(len(train[0]))]
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            yhat = predict(row, coef)
            error = row[-1] - yhat
            sum_error += error ** 2
            coef[0] = coef[0] + l_rate * error * yhat * (1.0 - yhat)
            for i in range(len(row) - 1):
                coef[i + 1] = coef[i + 1] + l_rate * error * yhat * (1.0 - yhat) * row[i]
    return coef
dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]
l_rate = 0.3
n_epoch = 100
coef = coefficients_sgd(dataset, l_rate, n_epoch)
print(coef)
[-0.39233141593823756, 1.4791536027917747, -2.316697087065274]
x = np.array(dataset)[:,:2]
y = np.array(dataset)[:,2]
model = LogisticRegression(penalty="none")
model.fit(x,y)
print(model.intercept_.tolist() + model.coef_.ravel().tolist())
[-3.233238244349982, 6.374828107647225, -9.631487530388092]
What should I change to get the same or closer coefficients? How should I choose the initial coefficients, the learning rate, and n_epoch?
1 Answer
Well, there are many nuances here.
First, recall that estimating the coefficients of logistic regression via the (negative) log-likelihood is possible using various optimization methods, including the SGD you implemented, but there is no exact, closed-form solution. So even if you implement an exact copy of scikit-learn's LogisticRegression, you will need to set the same hyperparameters (number of epochs, learning rate, etc.) and random state to obtain the same coefficients.

Second, LogisticRegression offers five different optimization methods (the solver parameter). You run LogisticRegression(penalty="none") with its default parameters, and the default for solver is 'lbfgs', not SGD; so depending on your data and hyperparameters, you may get significantly different results.
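To make that concrete, here is a small sketch (reusing x and y from the question, and assuming a scikit-learn version that still accepts penalty="none"; newer releases spell it penalty=None) that makes the solver choice explicit and compares two of them:

# Fit the same unpenalized model with two different solvers; any gap
# between the printed coefficients comes purely from the optimizer and
# its stopping criteria, not from the model itself.
for solver in ("lbfgs", "newton-cg"):
    model = LogisticRegression(penalty="none", solver=solver, max_iter=10000)
    model.fit(x, y)
    print(solver, model.intercept_.tolist() + model.coef_.ravel().tolist())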
I would suggest comparing your implementation with SGDClassifier(loss='log') first, since LogisticRegression does not offer an SGD solver. Keep in mind, though, that scikit-learn's implementation is more sophisticated, in particular having more hyperparameters for early stopping, like tol.
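For example, a minimal sketch of such a comparison (assumption: in recent scikit-learn the log loss is spelled loss="log_loss"; older releases used loss="log" as written above). The settings mirror the scratch code where the API allows, though the update rules still differ in detail, so do not expect identical numbers:

from sklearn.linear_model import SGDClassifier

sgd = SGDClassifier(
    loss="log_loss",           # logistic (log) loss; loss="log" on older versions
    penalty=None,              # the scratch code has no regularization term
    learning_rate="constant",  # the scratch code never decays l_rate
    eta0=0.3,                  # l_rate in the scratch code
    max_iter=100,              # n_epoch in the scratch code
    tol=None,                  # run all epochs instead of stopping early
    shuffle=False,             # the scratch code visits rows in order
)
sgd.fit(x, y)
print(sgd.intercept_.tolist() + sgd.coef_.ravel().tolist())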
Typically, coefficients for SGD are initialized randomly (e.g., uniform(-1/(2n), 1/(2n))), using some data statistic (e.g., dot(y, w)/dot(w, w) for every coefficient w), or with a pre-trained model's parameters. In contrast, there is no golden rule for the learning rate or the number of epochs. Usually, we set a large number of epochs plus some other stopping criterion (e.g., whether the norm between the current and previous coefficients is smaller than some small tol), pick a moderate learning rate, and on every iteration reduce the learning rate following some rule (see the learning_rate parameter of SGDClassifier or the User Guide) and check the stopping criterion.
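As an illustration, here is a sketch of that recipe grafted onto the question's coefficients_sgd (the 1/(1+epoch) decay, the uniform(-1/(2n), 1/(2n)) initialization, and the tol value are illustrative assumptions, not the exact choices SGDClassifier makes):

import random

def coefficients_sgd_v2(train, l_rate, n_epoch, tol=1e-6, seed=0):
    random.seed(seed)
    n = len(train[0])
    # small random initialization, uniform in (-1/(2n), 1/(2n))
    coef = [random.uniform(-0.5 / n, 0.5 / n) for _ in range(n)]
    for epoch in range(n_epoch):
        rate = l_rate / (1.0 + epoch)  # decay the step size every epoch
        previous = list(coef)
        for row in train:
            yhat = predict(row, coef)
            error = row[-1] - yhat
            coef[0] += rate * error * yhat * (1.0 - yhat)
            for i in range(len(row) - 1):
                coef[i + 1] += rate * error * yhat * (1.0 - yhat) * row[i]
        # stop once a full epoch barely moves the coefficients
        if sum((c - p) ** 2 for c, p in zip(coef, previous)) ** 0.5 < tol:
            break
    return coef

print(coefficients_sgd_v2(dataset, l_rate=0.3, n_epoch=10000))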