Using H2O AutoML for Kaggle Porto Seguro Safe Driver Prediction Competition

If you are into competitive machine learning, you are probably visiting Kaggle routinely. Currently you can compete for cash and recognition in Porto Seguro’s Safe Driver Prediction competition as well. I tried the given training dataset (as is) with H2O AutoML, which ran for about 5 hours, and I was able to get into top […]



Handling exception “Argument python_obj should be a …”

Recently I hit the following exception when running Python code with H2O functions on a new machine; the exception does not happen on my main machine. The exception was as below: H2OTypeError: Argument `python_obj` should be a None | list | tuple | dict | numpy.ndarray | pandas.DataFrame | scipy.sparse.issparse, got H2OTwoDimTable Error in […]



Exploring & transforming H2O Data Frame in R and Python

Sometimes you need to ingest a dataset for building models, and your first task is to explore all the features and their types. Once that is done, you may want to change the feature types to the ones you want. Here is the code snippet in Python: df = h2o.import_file('https://raw.githubusercontent.com/h2oai/sparkling-water/master/examples/smalldata/prostate.csv') df.types […]



Python example of building GLM, GBM and Random Forest Binomial Model with H2O

Here is an example of using the H2O machine learning library to build GLM, GBM and Distributed Random Forest models for a categorical response variable. Let's import the h2o library and initialize the H2O machine learning cluster: import h2o h2o.init() Importing the dataset and getting familiar with it: df = h2o.import_file("https://raw.githubusercontent.com/h2oai/sparkling-water/master/examples/smalldata/prostate.csv") df.summary() df.col_names Let's configure our predictors and […]



Visualizing H2O GBM and Random Forest MOJO Model Trees in Python

In this example we will first build a tree-based model using the H2O machine learning library and then save that model as a MOJO. Using the GraphViz/Dot library we will extract individual trees/cross-validated model trees from the MOJO and visualize them. If you are new to H2O MOJO models, learn here. You can also get full […]



Stacked Ensemble Model in Scala using H2O GBM and Deep Learning Models

In this full Scala sample we will be using the H2O Stacked Ensembles algorithm. Stacked ensembling is a process of first building models of various types with cross-validation, keeping the fold columns for each model, and then building the stacked ensemble model using all the CV folds. You can learn more about Stacked Ensembles here. […]



Logistic Regression with H2O Deep Learning in Scala

Here is sample code that shows how to use the feed-forward-network-based Deep Learning algorithm from H2O to perform logistic regression. First, let's import the key classes specific to H2O: import org.apache.spark.h2o._ import water.Key import java.io.File Now we will create the H2O context so we can call key H2O functions specific to data ingest and […]



H2O AutoML examples in Python and Scala

AutoML is included in H2O version 3.14.0.1 and above. You can learn more about AutoML in the H2O blog here. H2O’s AutoML can be used for automating a large part of the machine learning workflow, which includes automatic training and tuning of many models within a user-specified time limit. The user can also use a performance […]



Building Regression and Classification GBM models in Scala with H2O

In the full code below you will learn to build H2O GBM models (regression and binomial classification) in Scala. Let's first import all the classes we need for this project: import org.apache.spark.SparkFiles import org.apache.spark.h2o._ import org.apache.spark.examples.h2o._ import org.apache.spark.sql.{DataFrame, SQLContext} import water.Key import java.io.File import water.support.SparkContextSupport.addFiles import water.support.H2OFrameSupport._ // Create SQL support implicit val sqlContext = […]
