H2O MOJO model always returns the same prediction value
I have trained a stacked ensemble model with the automl() function provided by H2O (3.36.0.4) for R. Once the model was trained, I exported it to .zip format with the download_mojo() function.
I have created a Java app following the instructions, but when I run the program, the model always predicts the same value.
This is the app developed in Java:
import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;

public class App {
    public static void main(String[] args) throws Exception {
        EasyPredictModelWrapper model = new EasyPredictModelWrapper(MojoModel.load("StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip"));
        BufferedReader csvReader = new BufferedReader(new FileReader("dataset.csv"));
        RowData inputrow = new RowData();
        String row;
        String[] colnames = new String[36];
        String[] data;
        for (int i = 0; i <= 10; i++) {
            row = csvReader.readLine();
            if (i == 0) {
                colnames = row.split(",");
            } else {
                data = row.split(",");
                for (int k = 0; k < colnames.length - 2; k++) {
                    inputrow.put(colnames[k], data[k]);
                }
                RegressionModelPrediction prediction = model.predictRegression(inputrow);
                System.out.println("Prediction "+i+": " + prediction.value);
            }
        }
        System.out.println("");
    }
}
It returns this:
Prediction 1: 0.09239077248718583
Prediction 2: 0.09239077248718583
Prediction 3: 0.09239077248718583
Prediction 4: 0.09239077248718583
Prediction 5: 0.09239077248718583
Prediction 6: 0.09239077248718583
Prediction 7: 0.09239077248718583
Prediction 8: 0.09239077248718583
Prediction 9: 0.09239077248718583
Prediction 10: 0.09239077248718583
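One thing that could be checked here, offered as a sketch rather than a confirmed cause: a constant prediction would be consistent with none of the keys put into the RowData matching the model's feature names, for example if the header names in dataset.csv still carry the double quotes that R's write.csv() adds by default. The helper below uses only the Java standard library. The class HeaderCheck, its methods, and the feature names "x1" and "x2" are hypothetical placeholders; in the real app the expected names would have to come from the loaded model, and how to obtain them should be verified against the h2o-genmodel documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HeaderCheck {

    // Trims whitespace and removes one pair of surrounding double quotes.
    public static String clean(String raw) {
        String s = raw.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1);
        }
        return s;
    }

    // Returns the CSV column names that are not among the model's feature names.
    public static List<String> unmatched(String[] csvNames, String[] modelNames) {
        Set<String> known = new HashSet<>(Arrays.asList(modelNames));
        List<String> missing = new ArrayList<>();
        for (String name : csvNames) {
            if (!known.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical feature names standing in for the ones the MOJO expects.
        String[] modelNames = {"x1", "x2"};

        // Hypothetical header line as write.csv() would produce it (quoted names).
        String[] rawHeader = "\"x1\",\"x2\"".split(",");

        // Same header after stripping the quotes.
        String[] cleanedHeader = Arrays.stream(rawHeader)
                .map(HeaderCheck::clean)
                .toArray(String[]::new);

        System.out.println("Unmatched (raw):     " + unmatched(rawHeader, modelNames));
        System.out.println("Unmatched (cleaned): " + unmatched(cleanedHeader, modelNames));
    }
}
```

If the raw header produces unmatched names while the cleaned one does not, the same cleaning would also need to be applied to each data value before it is put into the row.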
For more details: the training and test datasets have the same variables, and the model has been trained with the following parameters:
aml <- h2o::h2o.automl(y = y,
                       training_frame = df_h2o,
                       nfolds = 10,
                       max_models = 150,
                       max_runtime_secs = NULL,
                       keep_cross_validation_predictions = TRUE,
                       stopping_metric = 'RMSE',
                       sort_metric = 'RMSE',
                       verbosity = "info")
I have also performed the following checks in R, with the same dataset used in Java:
modelPath <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828")
loaded_model <- h2o.loadModel(modelPath)
val_df <- read.csv("dataset.csv")
input_row <- val_df[1:10 ,1:36]
new_data <- as.h2o(input_row)
h2omodel_preditions <-as.vector(h2o.predict(loaded_model, new_data, exact_quantiles=TRUE))
original_mojo_path <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip")
mojo_model <- h2o.upload_mojo(original_mojo_path)
mojo_predictions <- as.vector( h2o.predict(mojo_model, new_data))
> h2omodel_preditions
[1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
> mojo_predictions
[1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
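To make the mismatch explicit on the Java side, the reference values printed by R above can be compared against what the Java app returns. This is only a minimal sketch: the class PredictionDiff and the method maxAbsDiff are hypothetical names, the reference array is copied from the R output above, and the Java array is filled with the constant value the app printed.

```java
import java.util.Arrays;

public class PredictionDiff {

    // Largest absolute difference between two prediction arrays of equal length.
    public static double maxAbsDiff(double[] expected, double[] actual) {
        if (expected.length != actual.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        double max = 0.0;
        for (int i = 0; i < expected.length; i++) {
            max = Math.max(max, Math.abs(expected[i] - actual[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        // Predictions reported by h2o.predict() in R for the first 10 rows.
        double[] fromR = {
            0.27564401, 0.25663341, 0.17848737, 0.05179671, 0.02977053,
            0.28588998, 0.29157313, 0.19800770, 0.06251480, 0.23992213
        };

        // The Java app printed the same value for every row.
        double[] fromJava = new double[fromR.length];
        Arrays.fill(fromJava, 0.09239077248718583);

        double diff = maxAbsDiff(fromR, fromJava);
        System.out.println("Max absolute difference: " + diff);
        System.out.println(diff < 1e-6 ? "Predictions match" : "Predictions differ");
    }
}
```

In the real app the second array would be filled from prediction.value inside the loop, so that the check fails as soon as the Java and R results diverge.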