H2O MOJO model always returns the same prediction value

Posted 01-23 22:38

I have trained a stacked ensemble model with the automl() function provided by H2O (3.36.0.4) for R. Once the model was trained, I exported it to .zip format with the download_mojo() function.

I have created a Java app following the instructions, but when I run the program, the model always predicts the same value.

This is the app developed in Java:

import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;

public class App {
    public static void main(String[] args) throws Exception {

        EasyPredictModelWrapper model = new EasyPredictModelWrapper(MojoModel.load("StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip"));

        BufferedReader csvReader = new BufferedReader(new FileReader("dataset.csv"));
        RowData inputrow = new RowData();
        String row;
        String[] colnames = new String[36];
        String[] data;
        for (int i = 0; i <= 10; i++) {
            row = csvReader.readLine();
            if (i == 0) {
                colnames = row.split(",");
            } else {
                data = row.split(",");
                for (int k = 0; k < colnames.length - 2; k++) {
                    inputrow.put(colnames[k], data[k]);
                }
                RegressionModelPrediction prediction = model.predictRegression(inputrow);
                System.out.println("Prediction "+i+": " + prediction.value);
            }
        }
        System.out.println("");
    }
}

It returns this:

Prediction 1: 0.09239077248718583
Prediction 2: 0.09239077248718583
Prediction 3: 0.09239077248718583
Prediction 4: 0.09239077248718583
Prediction 5: 0.09239077248718583
Prediction 6: 0.09239077248718583
Prediction 7: 0.09239077248718583
Prediction 8: 0.09239077248718583
Prediction 9: 0.09239077248718583
Prediction 10: 0.09239077248718583

For more details: the training and test datasets have the same variables, and the model was trained with the following parameters:

aml <- h2o::h2o.automl(y = y,
                       training_frame = df_h2o,
                       nfolds = 10,
                       max_models = 150,
                       max_runtime_secs = NULL,
                       keep_cross_validation_predictions = TRUE,
                       stopping_metric = 'RMSE',
                       sort_metric ='RMSE',
                       verbosity = "info")

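I have also performed the following checks in R, with the same dataset used in Java. Both the saved model and the MOJO uploaded back into H2O return identical predictions that vary from row to row: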

modelPath <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828")
loaded_model <- h2o.loadModel(modelPath)


val_df <-  read.csv("dataset.csv")
input_row <- val_df[1:10 ,1:36]
new_data <- as.h2o(input_row)

h2omodel_preditions <-as.vector(h2o.predict(loaded_model, new_data, exact_quantiles=TRUE))


original_mojo_path <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip")
mojo_model <- h2o.upload_mojo(original_mojo_path)
mojo_predictions  <- as.vector( h2o.predict(mojo_model, new_data))


> h2omodel_preditions
 [1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
> mojo_predictions
 [1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
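
Since the same MOJO gives varying predictions in R, one thing that may be worth ruling out on the Java side is the column names read from the CSV header. This is a possible cause to check, not a confirmed diagnosis. As far as I understand, RowData entries whose keys the model does not recognise are not used, so if every key is off (for example because the header is quoted, as R's write.csv does by default, or starts with a BOM), every row would look like all-missing input and get the same prediction. The sketch below uses only the standard library. `HeaderCheck` and its methods are hypothetical helpers, not part of the H2O API, and the expected names `x1`, `x2`, `x3` are placeholders for the feature names of the real model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of H2O): compares CSV header names
// against the feature names a model is assumed to expect.
class HeaderCheck {

    // Remove a UTF-8 BOM, surrounding whitespace, and one pair of double quotes.
    static String normalize(String rawName) {
        String name = rawName.replace("\uFEFF", "").trim();
        if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
            name = name.substring(1, name.length() - 1);
        }
        return name.trim();
    }

    // Split a header line on commas and normalize every name.
    static String[] normalizeAll(String headerLine) {
        String[] names = headerLine.split(",");
        for (int i = 0; i < names.length; i++) {
            names[i] = normalize(names[i]);
        }
        return names;
    }

    // Return the names that are NOT among the expected feature names.
    static List<String> unmatched(String[] names, Set<String> expectedNames) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (!expectedNames.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder feature names; in the real app these would come from the loaded model.
        Set<String> expected = new LinkedHashSet<>(Arrays.asList("x1", "x2", "x3"));
        // What a quoted CSV header looks like after readLine().
        String header = "\"x1\",\"x2\",\"x3\"";

        // Same naive split as in the app above: the quotes stay part of each name.
        System.out.println("Unmatched as read: " + unmatched(header.split(","), expected));
        // After normalization the names line up with the expected ones.
        System.out.println("Unmatched after normalizing: " + unmatched(normalizeAll(header), expected));
    }
}
```

With the placeholder data, all three names are reported as unmatched when the header is split as in the app, and none after normalization. If the real header turns out to be quoted, the values in each data row would presumably need the same treatment before being put into the RowData.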
