In the full code below, you will learn to build an H2O GBM model (regression and binomial classification) in Scala. Let's first import all the classes we need for this project:

    import org.apache.spark.SparkFiles
    import org.apache.spark.h2o._
    import org.apache.spark.examples.h2o._
    import org.apache.spark.sql.{DataFrame, SQLContext}
    import water.Key
    import java.io.File
    import water.support.SparkContextSupport.addFiles
    import water.support.H2OFrameSupport._

    // Create SQL support
    implicit val sqlContext = […]

# RANSAC and Nonlinear Regression in Python

We use Python 3. More details can be found in Sebastian Raschka’s book: https://www.goodreads.com/book/show/25545994-python-machine-learning?ac=1&from_search=true. Find the data here: https://archive.ics.uci.edu/ml/datasets/Housing. Linear regression models can be heavily affected by outliers. As an alternative to discarding outliers, we will look at a robust regression method, the RANdom SAmple Consensus (RANSAC) algorithm, which is […]
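
A minimal sketch of the RANSAC idea for a single-feature line fit, using only the Python standard library. It is not the code from the post or the book; scikit-learn's `RANSACRegressor` is the usual library route. The synthetic data, the residual `threshold` of 1.0, the 200 trials, and the helper names `fit_line` and `ransac_line` are all assumptions made for illustration.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, n_trials=200, min_samples=2, threshold=1.0, seed=0):
    """Repeatedly fit a line to a small random sample and keep the largest
    consensus set (points whose absolute residual is within `threshold`)."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_trials):
        sample = rng.sample(points, min_samples)
        if len({x for x, _ in sample}) < 2:
            continue  # degenerate sample: identical x values define no line
        slope, intercept = fit_line(sample)
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    # Final model: refit on all inliers of the best candidate.
    return fit_line(best_inliers), best_inliers


# Made-up data: points on y = 2x + 1 plus three gross outliers.
clean = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(3.0, 60.0), (10.0, -40.0), (15.0, 90.0)]
data = clean + outliers

ols_slope, ols_intercept = fit_line(data)
(ransac_slope, ransac_intercept), inliers = ransac_line(data)

print("OLS on all points:", ols_slope, ols_intercept)
print("RANSAC:", ransac_slope, ransac_intercept, "inliers:", len(inliers))
```

The plain least-squares fit is pulled toward the outliers, while the RANSAC fit is computed only from the consensus set. In practice the threshold has to be chosen for the data at hand.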

# Examples and Summary of Non-linear Regression in R, with IMDB Movie Data

As digital production of information becomes increasingly cheap and easy, people are offered more and more options for consuming those digital products in a limited time. Ratings (scores) have thus become an essential ingredient in helping people make those choices. Predicting rating scores in turn becomes a lucrative business for effective marketing. Use the […]

# CART, A Regression Tree Model for Wine Choosing

Compared with traditional regression, decision trees may be better suited for tasks with many features or many complex, non-linear relationships between the features and the outcome. These situations present challenges for regression. Regression modeling also makes assumptions about how numeric data is distributed, and these are often violated in real-world data. This is not the case for trees. […]
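
To make the contrast concrete, here is a minimal Python sketch of the core step of a regression tree: choosing the threshold on one feature that minimizes the summed squared error of the two resulting groups. It is an illustration only, not the post's model; the alcohol and quality values are made up, and the helper names `sse` and `best_split` are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost, left_mean, right_mean) for the split of a
    single numeric feature that minimizes SSE(left) + SSE(right)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost,
                    sum(left) / len(left), sum(right) / len(right))
    return best


# Made-up example: quality jumps once alcohol passes a cut-off, a step
# relationship that a straight line cannot represent exactly.
alcohol = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
quality = [5.0, 5.0, 5.0, 5.0, 7.0, 7.0, 7.0, 7.0]

threshold, cost, left_mean, right_mean = best_split(alcohol, quality)
print(threshold, cost, left_mean, right_mean)
```

A full tree applies this search recursively to every feature and to each resulting group, then predicts the group mean at each leaf. No assumption about the distribution of the feature is involved.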

# Example of Ridge and Lasso Regression

Data and background: https://charleshsliao.wordpress.com/2017/02/28/dplyr-rename-and-a-lame-regression/

Math comparison:

Now let us build ridge and lasso regression models to hunt down the smallest RMSE.

Ridge:

    library(car)
    ##
    ## Attaching package: ‘car’
    ## The following object is masked from ‘package:dplyr’:
    ##
    ##     recode
    vif(lmst)
    ##   lcavol  lweight      age     lbph      svi      lcp  gleason    pgg45
    ## 2.318496 1.472295 1.356604 1.383429 2.045313 3.117451 […]
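
As a rough illustration of how the two penalties differ, the Python sketch below works through the one-feature case, where both estimators have a closed form. It is not the post's R code: the data are made up, the objectives are stated in the comments as assumptions about scaling, and the function names are hypothetical.

```python
def ols_coef(xs, ys):
    """Least-squares slope for centered data with no intercept."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def ridge_coef(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam * b^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def lasso_coef(xs, ys, lam):
    """Minimizer of 0.5 * sum((y - b*x)^2) + lam * |b| (soft-thresholding)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    shrunk = max(abs(sxy) - lam, 0.0)
    return shrunk / sxx if sxy >= 0 else -shrunk / sxx


# Made-up, already centered data with a slope close to 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.1, -1.9, 0.0, 2.1, 3.9]

print("OLS:  ", ols_coef(xs, ys))
print("ridge:", ridge_coef(xs, ys, lam=10.0))
print("lasso:", lasso_coef(xs, ys, lam=5.0))
print("lasso, large penalty:", lasso_coef(xs, ys, lam=25.0))
```

Ridge shrinks the coefficient proportionally and never sets it exactly to zero for a finite penalty, whereas lasso subtracts a fixed amount and sets the coefficient to zero once the penalty is large enough. That is why lasso can also act as a variable selector.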