GridSearchCV for a multi-output RandomForestRegressor
I have created a multi-output RandomForestRegressor using sklearn.ensemble.RandomForestRegressor. I now want to run GridSearchCV to find good hyperparameters and to output the R^2 score for each individual target feature. The code is as follows:
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

param_grid = {
    'model__bootstrap': [True],
    'model__max_depth': [8, 10, 12],
    'model__max_features': [3, 4, 5],
    'model__min_samples_leaf': [3, 4, 5],
    'model__min_samples_split': [3, 5, 7],
    'model__n_estimators': [100, 200, 300]
}

model = RandomForestRegressor()
pipe = Pipeline(steps=[
    ('scaler', StandardScaler()),
    ('model', model)])

# One R^2 value per target column instead of a single number
scorer = make_scorer(r2_score, multioutput='raw_values')
search = GridSearchCV(pipe, param_grid, scoring=scorer)
search.fit(X_train, y_train)
print(f'Best parameter score {ship_type} {target}: {search.best_score_}')
When I run this code, I get the following error:
File "run_xgb_rf_regressor.py", line 75, in <module>
model, X = run_regression(ship_types[2], targets)
File "run_xgb_rf_regressor.py", line 50, in run_regression
search.fit(X_train, y_train)
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/utils/validation.py", line 63, in inner_f
return f(*args, **kwargs)
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 841, in fit
self._run_search(evaluate_candidates)
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 1296, in _run_search
evaluate_candidates(ParameterGrid(self.param_grid))
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 795, in evaluate_candidates
out = parallel(delayed(_fit_and_score)(clone(base_estimator),
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/parallel.py", line 1043, in __call__
if self.dispatch_one_batch(iterator):
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
self._dispatch(tasks)
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
self.results = batch()
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
return [func(*args, **kwargs)
File "/home/lucas/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
return [func(*args, **kwargs)
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/utils/fixes.py", line 222, in __call__
return self.function(*args, **kwargs)
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 625, in _fit_and_score
test_scores = _score(estimator, X_test, y_test, scorer, error_score)
File "/home/lucas/.local/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 721, in _score
raise ValueError(error_msg % (scores, type(scores), scorer))
ValueError: scoring must return a number, got [0.57359176 0.54407165 0.40313057 0.32515033 0.346224 0.39513717
0.34375699] (<class 'numpy.ndarray'>) instead. (scorer=make_scorer(r2_score, multioutput=raw_values))
The error indicates that the scorer may only return a single numeric value, which in my case would be the average R^2 score over all target features. Does anyone know how I can use GridSearchCV so that I can output the individual R^2 scores per target?
Many thanks.
1 Answer
I think I would use the multimetric option for the scoring parameter: per the docs, scoring can also be a dict mapping scorer names to scorer callables, in which case GridSearchCV records one score per entry. So something like the sketch below.
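A minimal sketch of that idea, reusing pipe and param_grid from the question. It assumes y_train is a 2-D NumPy array with one column per target (7 in the traceback); the r2_for_target helper and the scorer names are illustrative, not from the original answer:

from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import GridSearchCV

def r2_for_target(i):
    # Score only the i-th output column; assumes y_true/y_pred are 2-D arrays.
    def score(y_true, y_pred):
        return r2_score(y_true[:, i], y_pred[:, i])
    return score

n_targets = y_train.shape[1]  # 7 targets, per the traceback
scoring = {f'r2_target_{i}': make_scorer(r2_for_target(i)) for i in range(n_targets)}
# Extra entry that averages over targets, so the search can pick one "best" candidate.
scoring['r2_mean'] = make_scorer(r2_score, multioutput='uniform_average')

search = GridSearchCV(pipe, param_grid, scoring=scoring, refit='r2_mean')
search.fit(X_train, y_train)

# Per-target R^2 for the parameter set chosen by refit:
best = search.best_index_
for name in scoring:
    print(name, search.cv_results_[f'mean_test_{name}'][best])

With refit='r2_mean' the search still selects a single parameter set (by the averaged R^2), while cv_results_ keeps the per-target scores you wanted; best_score_ then refers to the r2_mean entry.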
Note though that, per the docs, refit needs to be set more carefully with multimetric searches: it must name the single metric used to pick the best parameters (or be a callable). Maybe deciding the "best" parameters should be done by some average, in which case you can add another entry to the custom scorer, as with the r2_mean entry above. Other useful parts of the User Guide:
https://scikit-learn.org/stable/modules/grid_search.html#multimetric-grid-search
https://scikit-learn.org/stable/modules/model_evaluation.html#implementing-your-own-scoring-object