LogisticRegression() vs LogisticRegressionCV() and its Cs hyperparameter
I've built a model using LogisticRegression(), and after a grid search the data suggests that C = .0000001 (the inverse of regularization strength) is the "best" value for making my predictions.

This parameter works fine for LogisticRegression(), but since I want to cross-validate, I decided to use LogisticRegressionCV(). The equivalent of the C parameter there is denoted Cs, yet when I try to pass the same value, Cs = .0000001, I get an error:

    797 warm_start_sag = {"coef": np.expand_dims(w0, axis=1)}
    799 coefs = list()
--> 800 n_iter = np.zeros(len(Cs), dtype=np.int32)
    801 for i, C in enumerate(Cs):
    802 if solver == "lbfgs":
TypeError: object of type 'float' has no len()

The documentation for LogisticRegressionCV() says:

    If Cs is as an int, then a grid of Cs values are chosen in a logarithmic scale between 1e-4 and 1e4.

How would I then still input a value of Cs = .0000001? I'm confused about how to proceed.
Comments (1)
LogisticRegressionCV is not meant to be just cross-validation-scored logistic regression; it is a hyperparameter-tuned (by cross-validation) logistic regression. That is, it tries several different regularization strengths and selects the best one using cross-validation scores, then refits a single model on the entire training set using that best C.

Cs can be a list of values to try for C, or an integer to let sklearn create a list for you (as in your quoted doc). A bare float is neither, which is why len(Cs) raises the TypeError.

If you just want to score your model with a fixed C, use cross_val_score or cross_validate.

(You probably can use LogisticRegressionCV, setting Cs=[0.0000001], but it's not the right semantic usage.)
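
As a rough illustration of the two options, here is a minimal sketch. The data comes from make_classification as a stand-in (the question's dataset is not shown), and the candidate list passed to Cs and the choice of cv=5 are assumptions made only for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption: the real dataset is not available here).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Option 1: C is already chosen, so score a fixed-C model by cross-validation.
fixed_model = LogisticRegression(C=1e-7)
scores = cross_val_score(fixed_model, X, y, cv=5)
print("mean CV accuracy with fixed C:", scores.mean())

# Option 2: let LogisticRegressionCV choose C from explicit candidates.
# Cs is a list (or an int grid size), not a bare float.
# The candidate values below are illustrative only.
tuned_model = LogisticRegressionCV(Cs=[1e-7, 1e-4, 1e-1, 1.0], cv=5)
tuned_model.fit(X, y)
print("C selected by cross-validation:", tuned_model.C_)
```

Option 1 matches the case in the question, where C has already been picked by a grid search. Option 2 only makes sense if you want the estimator itself to do the search over the candidates you list.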