Can I take the best parameters and the best model from an Optuna study and apply that model directly in my notebook?
I wrote an Optuna objective function to find the best model between LightGBM and XGBoost for my data, but I was wondering whether I can take the best model and use it directly in my notebook (extracting the best model as an object so I can reuse it later).
Here is my objective function:
import lightgbm as lgb
import optuna
import sklearn.metrics
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

best_booster = None
gbm = None

def objective(trial, random_state=22, n_jobs=1, early_stopping_rounds=50):
    # X_train and y_train are defined earlier in the notebook.
    regressor_name = trial.suggest_categorical("regressor", ["XGBoost", "lightgbm"])
    train_x, valid_x, train_y, valid_y = train_test_split(X_train, y_train, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)
    # Step 2. Setup values for the hyperparameters:
    if regressor_name == "XGBoost":
        params = {
            "verbosity": 0,  # 0 (silent) - 3 (debug)
            "objective": "reg:squarederror",
            "n_estimators": 10000,
            "max_depth": trial.suggest_int("max_depth", 4, 12),
            "learning_rate": trial.suggest_float("learning_rate", 0.005, 0.05, log=True),
            "colsample_bytree": trial.suggest_float("colsample_bytree", 0.2, 0.6, log=True),
            "subsample": trial.suggest_float("subsample", 0.4, 0.8, log=True),
            "alpha": trial.suggest_float("alpha", 0.01, 10.0, log=True),
            "lambda": trial.suggest_float("lambda", 1e-8, 10.0, log=True),
            # Each hyperparameter needs its own unique name within a trial.
            "gamma": trial.suggest_float("gamma", 1e-8, 10.0, log=True),
            "min_child_weight": trial.suggest_float("min_child_weight", 10, 1000, log=True),
            "seed": random_state,
            "n_jobs": n_jobs,
        }
        model = XGBRegressor(**params)
        model.fit(train_x, train_y)
        # Score on the held-out split created above.
        y_pred = model.predict(valid_x)
        return sklearn.metrics.mean_absolute_error(valid_y, y_pred)
    else:
        param = {
            # Regression objective, scored with MAE like the XGBoost branch.
            "objective": "regression",
            "metric": "l1",
            "verbosity": -1,
            "boosting_type": "gbdt",
            "lambda_l1": trial.suggest_float("lambda_l1", 1e-8, 10.0, log=True),
            "lambda_l2": trial.suggest_float("lambda_l2", 1e-8, 10.0, log=True),
            "num_leaves": trial.suggest_int("num_leaves", 2, 256),
            "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
            "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
            "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
            "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
        }
        gbm = lgb.train(param, dtrain)
        preds_gbm = gbm.predict(valid_x)
        return sklearn.metrics.mean_absolute_error(valid_y, preds_gbm)
And here is how I tried to solve this issue:
def callback(study, trial):
    global best_booster
    if study.best_trial == trial:
        best_booster = gbm

if __name__ == "__main__":
    # The objective returns a mean absolute error, so lower is better.
    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=100, callbacks=[callback])
I think it is about importing something. If you have any tips on my Optuna function, please share them.
Comments (1)
If I understood your question correctly, then yes, that is what models are for.
Bring your saved model into your notebook, feed it data with the same structure as the data you trained it on, and it should serve its purpose. You can also use it in a pipeline.
Even a single row works, as long as it is a NumPy array with the same structure. For example, my model predicts whether a loan should be approved.
A bank customer wants a loan and submits their information. The bank officer enters it into the system, which transforms it into a single NumPy array with the same structure as the dataset used to train the model.
The system then uses the model to predict whether the loan should be approved.
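The single-row case can be sketched roughly as follows. This is an illustration only: the loan data is synthetic, the five feature values are made up, and scikit-learn's `GradientBoostingClassifier` stands in for whatever model was actually trained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for a loan dataset: 5 numeric features, label 1 = approve.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# One new applicant as a flat vector, in the same column order as training.
applicant = np.array([0.3, -1.2, 0.8, 0.0, 1.5])

# Estimators expect a 2-D array of shape (n_samples, n_features),
# so a single applicant becomes a (1, 5) array.
row = applicant.reshape(1, -1)
decision = model.predict(row)  # array holding one label, 0 or 1
```

The reshape is the important step: passing the flat vector directly would not match the two-dimensional layout the model was trained on.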
I save my Optuna XGBoost models as JSON, e.g.
my_model.get_booster().save_model(f'{savepath}my_model.json')