H2O MOJO model always returns the same prediction value

Posted on 01-23 22:38

I have trained a stacked ensemble model with the automl() function provided by H2O (3.36.0.4) for R. Once the model was trained, I exported it to .zip format with the download_mojo() function.

I created a Java app following the instructions, but when I run the program, the model always predicts the same value.

This is the app developed in Java:

import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;

public class App {
    public static void main(String[] args) throws Exception {

        EasyPredictModelWrapper model = new EasyPredictModelWrapper(MojoModel.load("StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip"));

        BufferedReader csvReader = new BufferedReader(new FileReader("dataset.csv"));
        RowData inputrow = new RowData();
        String row;
        String[] colnames = new String[36];
        String[] data;
        for (int i = 0; i <= 10; i++) {
            row = csvReader.readLine();
            if (i == 0) {
                colnames = row.split(",");
            } else {
                data = row.split(",");
                for (int k = 0; k < colnames.length - 2; k++) {
                    inputrow.put(colnames[k], data[k]);
                }
                RegressionModelPrediction prediction = model.predictRegression(inputrow);
                System.out.println("Prediction "+i+": " + prediction.value);
            }
        }
        System.out.println("");
    }
}

It returns this:

Prediction 1: 0.09239077248718583
Prediction 2: 0.09239077248718583
Prediction 3: 0.09239077248718583
Prediction 4: 0.09239077248718583
Prediction 5: 0.09239077248718583
Prediction 6: 0.09239077248718583
Prediction 7: 0.09239077248718583
Prediction 8: 0.09239077248718583
Prediction 9: 0.09239077248718583
Prediction 10: 0.09239077248718583

For more detail: the training and test datasets have the same variables, and the model was trained with the following parameters:

aml <- h2o::h2o.automl(y = y,
                       training_frame = df_h2o,
                       nfolds = 10,
                       max_models = 150,
                       max_runtime_secs = NULL,
                       keep_cross_validation_predictions = TRUE,
                       stopping_metric = 'RMSE',
                       sort_metric ='RMSE',
                       verbosity = "info")

I have also run the following checks in R, using the same dataset as in Java:

modelPath <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828")
loaded_model <- h2o.loadModel(modelPath)


val_df <-  read.csv("dataset.csv")
input_row <- val_df[1:10 ,1:36]
new_data <- as.h2o(input_row)

h2omodel_preditions <-as.vector(h2o.predict(loaded_model, new_data, exact_quantiles=TRUE))


original_mojo_path <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip")
mojo_model <- h2o.upload_mojo(original_mojo_path)
mojo_predictions  <- as.vector( h2o.predict(mojo_model, new_data))


> h2omodel_preditions
 [1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
> mojo_predictions
 [1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
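
Since the same MOJO scores correctly in R, one thing worth ruling out (a guess, not a confirmed diagnosis) is that the keys put into RowData do not match the model's column names. As far as I can tell, EasyPredictModelWrapper skips keys it does not recognise and treats the corresponding features as missing, which would give the same prediction for every row. This could happen if dataset.csv has quoted headers (R's write.csv quotes column names by default, and read.csv strips the quotes again), because row.split(",") keeps the quotes. The sketch below uses only the standard library; the feature names "age" and "income" are hypothetical stand-ins, and in the real app the expected names would come from the model (I believe model.m.getNames() returns them, but treat that as an assumption to verify).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class HeaderCheck {

    // Remove a byte-order mark, surrounding whitespace, and one pair of double quotes.
    static String normalize(String token) {
        String t = token.replace("\uFEFF", "").trim();
        if (t.length() >= 2 && t.startsWith("\"") && t.endsWith("\"")) {
            t = t.substring(1, t.length() - 1);
        }
        return t;
    }

    // Return the raw header tokens that do not exactly match any expected name.
    static List<String> unmatched(String headerLine, Set<String> expected) {
        List<String> bad = new ArrayList<>();
        for (String token : headerLine.split(",")) {
            if (!expected.contains(token)) {
                bad.add(token);
            }
        }
        return bad;
    }

    // Same check, but applied after normalizing each token.
    static List<String> unmatchedAfterNormalize(String headerLine, Set<String> expected) {
        List<String> bad = new ArrayList<>();
        for (String token : headerLine.split(",")) {
            String name = normalize(token);
            if (!expected.contains(name)) {
                bad.add(name);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hypothetical feature names standing in for the model's real column names.
        Set<String> expected = new LinkedHashSet<>(Arrays.asList("age", "income"));

        // A header line as it might look if the CSV was written with quoted names.
        String header = "\"age\",\"income\"";

        System.out.println("Unmatched raw tokens:        " + unmatched(header, expected));
        System.out.println("Unmatched after normalizing: " + unmatchedAfterNormalize(header, expected));
    }
}
```

If the raw check reports unmatched tokens and the normalized one does not, applying the same normalization to colnames (and to the data values) before calling inputrow.put would be the next thing to try. If both lists are empty, the header is not the cause and the problem lies elsewhere.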
