H2O MOJO model always returns the same value for prediction


I have trained a stacked ensemble model with the h2o.automl() function provided by H2O (3.36.0.4) for R. Once the model was trained, I exported it to .zip format with the h2o.download_mojo() function.

I have created a Java app following instructions, but when running the program, the model always predicts the same value.

This is the app developed in Java:

import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;

public class App {
    public static void main(String[] args) throws Exception {

        EasyPredictModelWrapper model = new EasyPredictModelWrapper(MojoModel.load("StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip"));

        BufferedReader csvReader = new BufferedReader(new FileReader("dataset.csv"));
        RowData inputrow = new RowData();
        String row;
        String[] colnames = new String[36];
        String[] data;
        for (int i = 0; i <= 10; i++) {
            row = csvReader.readLine();
            if (i == 0) {
                colnames = row.split(",");
            } else {
                data = row.split(",");
                for (int k = 0; k < colnames.length - 2; k++) {
                    inputrow.put(colnames[k], data[k]);
                }
                RegressionModelPrediction prediction = model.predictRegression(inputrow);
                System.out.println("Prediction "+i+": " + prediction.value);
            }
        }
        System.out.println("");
    }
}

It returns this:

Prediction 1: 0.09239077248718583
Prediction 2: 0.09239077248718583
Prediction 3: 0.09239077248718583
Prediction 4: 0.09239077248718583
Prediction 5: 0.09239077248718583
Prediction 6: 0.09239077248718583
Prediction 7: 0.09239077248718583
Prediction 8: 0.09239077248718583
Prediction 9: 0.09239077248718583
Prediction 10: 0.09239077248718583

For more details: the training and test datasets have the same variables, and the model was trained with the following parameters:

aml <- h2o::h2o.automl(y = y,
                       training_frame = df_h2o,
                       nfolds = 10,
                       max_models = 150,
                       max_runtime_secs = NULL,
                       keep_cross_validation_predictions = TRUE,
                       stopping_metric = 'RMSE',
                       sort_metric ='RMSE',
                       verbosity = "info")

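I have also performed the following checks in R, using the same dataset as in the Java app. Both the native model and the MOJO loaded back into H2O give the expected, varying predictions: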

modelPath <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828")
loaded_model <- h2o.loadModel(modelPath)


val_df <-  read.csv("dataset.csv")
input_row <- val_df[1:10 ,1:36]
new_data <- as.h2o(input_row)

h2omodel_preditions <-as.vector(h2o.predict(loaded_model, new_data, exact_quantiles=TRUE))


original_mojo_path <- paste0(getwd(), "/StackedEnsemble_BestOfFamily_8_AutoML_1_20220407_144828.zip")
mojo_model <- h2o.upload_mojo(original_mojo_path)
mojo_predictions  <- as.vector( h2o.predict(mojo_model, new_data))


> h2omodel_preditions
 [1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
> mojo_predictions
 [1] 0.27564401 0.25663341 0.17848737 0.05179671 0.02977053 0.28588998 0.29157313 0.19800770 0.06251480 0.23992213
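
One thing that may be worth ruling out on the Java side, stated as an assumption because the contents of dataset.csv are not shown: if the CSV was written from R with default settings, the header names and string values may be wrapped in double quotes. A plain split(",") would then yield names such as "\"x1\"" instead of "x1", and values keyed by names the model does not recognise may never reach it, which could leave every row looking identical to the model. The sketch below is not tied to H2O; it only shows one way to clean each field and build a fresh name-to-value map per row. The class name CsvRowCheck, the helper names, and the sample columns x1, x2 and y are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CsvRowCheck {

    // Trims whitespace and removes one pair of surrounding double quotes, if present.
    public static String clean(String field) {
        String s = field.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1);
        }
        return s;
    }

    // Builds a new column -> value map for one data line, ignoring the last
    // `skipLast` columns (the original loop stops at colnames.length - 2).
    // Assumption: no field contains an embedded comma.
    public static Map<String, String> toRow(String headerLine, String dataLine, int skipLast) {
        String[] names = headerLine.split(",", -1);
        String[] values = dataLine.split(",", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int k = 0; k < names.length - skipLast && k < values.length; k++) {
            row.put(clean(names[k]), clean(values[k]));
        }
        return row;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, only to illustrate the cleaning step.
        String header = "\"x1\",\"x2\",\"y\"";
        String line = "1.5,\"a\",0.27";
        // Printing the map makes it easy to compare the keys with the model's column names.
        System.out.println(toRow(header, line, 1));
    }
}
```

In the app above, the entries of such a map could be copied into a new RowData for each row before calling predictRegression, and printing the keys once would show whether they match the column names used in training.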