Building GBM model in R and exporting POJO and MOJO model

Share on Facebook0Share on Google+0Tweet about this on TwitterShare on LinkedIn0

Get the dataset:

Training:

http://h2o-training.s3.amazonaws.com/pums2013/adult_2013_train.csv.gz

Test:

http://h2o-training.s3.amazonaws.com/pums2013/adult_2013_test.csv.gz

Here is the script to build GBM grid model and export MOJO model:

library(h2o)
h2o.init()

# Importing Dataset
trainfile <- file.path("/Users/avkashchauhan/learn/adult_2013_train.csv.gz")
adult_2013_train <- h2o.importFile(trainfile, destination_frame = "adult_2013_train")
testfile <- file.path("/Users/avkashchauhan/learn/adult_2013_test.csv.gz")
adult_2013_test <- h2o.importFile(testfile, destination_frame = "adult_2013_test")

# Display Dataset
adult_2013_train
adult_2013_test

# Feature Engineering
actual_log_wagp <- h2o.assign(adult_2013_test[, "LOG_WAGP"], key = "actual_log_wagp")

for (j in c("COW", "SCHL", "MAR", "INDP", "RELP", "RAC1P", "SEX", "POBP")) {
 adult_2013_train[[j]] <- as.factor(adult_2013_train[[j]])
 adult_2013_test[[j]] <- as.factor(adult_2013_test[[j]])
}
predset <- c("RELP", "SCHL", "COW", "MAR", "INDP", "RAC1P", "SEX", "POBP", "AGEP", "WKHP", "LOG_CAPGAIN", "LOG_CAPLOSS")

# Building GBM Model:
log_wagp_gbm_grid <- h2o.gbm(x = predset,
 y = "LOG_WAGP",
 training_frame = adult_2013_train,
 model_id = "GBMModel",
 distribution = "gaussian",
 max_depth = 5,
 ntrees = 110,
 validation_frame = adult_2013_test)

log_wagp_gbm_grid

# Prediction 
h2o.predict(log_wagp_gbm_grid, adult_2013_test)

# Download POJO Model:
h2o.download_pojo(log_wagp_gbm_grid, "/Users/avkashchauhan/learn", get_genmodel_jar = TRUE)

# Download MOJO model:
h2o.download_mojo(log_wagp_gbm_grid, "/Users/avkashchauhan/learn", get_genmodel_jar = TRUE)

You will see GBM_model.java (as POJO Model) and GBM_model.zip (MOJO model) at the location where you will save these models.

Thats it, enjoy!

 

Advertisements

Share on Facebook0Share on Google+0Tweet about this on TwitterShare on LinkedIn0