We use Python 3. More details can be found in Sebastian Raschka’s book: https://www.goodreads.com/book/show/25545994-python-machine-learning?ac=1&from_search=true The data is available here: https://archive.ics.uci.edu/ml/datasets/Housing. Linear regression models can be heavily affected by outliers. As an alternative to throwing outliers out, we will look at a robust regression method based on the RANdom SAmple Consensus (RANSAC) algorithm, which is […]
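To make the RANSAC idea concrete, here is a minimal pure-Python sketch for a single-feature line fit. It is not the code from the post: the function names (`fit_line`, `ransac_line`), the trial count, the residual threshold, and the synthetic data are all assumptions chosen for illustration, and the Housing data is not used.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, n_trials=200, threshold=1.0, seed=0):
    """Hypothetical helper: keep the candidate line with the most inliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_trials):
        # Draw a minimal sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate, cannot be written as y = a*x + b
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Inliers are points whose absolute residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no usable consensus set found")
    # Refit by least squares on the consensus set only.
    slope, intercept = fit_line(best_inliers)
    return slope, intercept, best_inliers


# Synthetic data (an assumption): an exact line y = 2x + 1 plus three outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -25.0), (15.0, 90.0)]

ols_slope, ols_intercept = fit_line(points)          # pulled by the outliers
slope, intercept, inliers = ransac_line(points)      # fitted on inliers only
```

Comparing `ols_slope` with `slope` shows how the plain least-squares fit is distorted by three bad points, while the consensus-set fit recovers the underlying line. The threshold is the main tuning choice: too small and good points are rejected, too large and outliers slip into the consensus set.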

# Examples and Summary of Non-linear Regression in R, with IMDB Movie Data

As digital production of information becomes increasingly cheap and easy, people are offered more and more options for consuming that content in a limited time. Ratings (scores) have thus become an essential ingredient in helping people make those choices. Predicting rating scores in turn becomes a lucrative business for effective marketing. Use the […]

# CART, A Regression Tree Model for Wine Choosing

Compared with traditional regression, decision trees may be better suited to tasks with many features or many complex, non-linear relationships between the features and the outcome. These situations present challenges for regression. Regression modeling also makes assumptions about how numeric data is distributed that are often violated in real-world data. This is not the case for trees. […]
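As a rough illustration of why trees need no distributional assumptions, the sketch below finds one regression-tree split on a single feature by minimising the summed squared error of the two resulting groups. This is only the core step of a CART-style tree, not the model from the post; the helper names and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost, left_mean, right_mean) for the best split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical feature values
        # Candidate threshold: midpoint between neighbouring feature values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost,
                    sum(left) / len(left), sum(right) / len(right))
    return best


# Toy data (made up): the outcome jumps from 5 to 9 once the feature passes 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [5.0, 5.0, 5.0, 9.0, 9.0, 9.0]
threshold, cost, left_mean, right_mean = best_split(xs, ys)
```

A step-shaped relationship like this one is awkward for a straight-line fit but is captured exactly by a single split. A full tree applies the same search recursively to each side and across all features.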

# Example of Ridge and Lasso Regression

Data and background: https://charleshsliao.wordpress.com/2017/02/28/dplyr-rename-and-a-lame-regression/ Math comparison: Now let us build Ridge and Lasso regression to hunt down the smallest RMSE. Ridge: library(car) ## ## Attaching package: ‘car’ ## The following object is masked from ‘package:dplyr’: ## ## recode vif(lmst) ## lcavol lweight age lbph svi lcp gleason pgg45 ## 2.318496 1.472295 1.356604 1.383429 2.045313 3.117451 […]
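The difference between the two penalties can be seen in the simplest possible case: one centered feature and no intercept. The sketch below uses the closed-form solutions under the stated objectives. The objectives' scaling, the function names, and the toy data are assumptions for illustration; this is not the R code from the post and does not use its prostate data.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, clipping at exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def ridge_coef(xs, ys, lam):
    """Minimiser of sum((y - b*x)^2) + lam * b^2 for a single feature."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def lasso_coef(xs, ys, lam):
    """Minimiser of 0.5 * sum((y - b*x)^2) + lam * |b| for a single feature."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return soft_threshold(sxy, lam) / sxx


# Toy centered data (made up): y = 2x exactly, so the unpenalised slope is 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.0, -2.0, 0.0, 2.0, 4.0]

ridge_b = ridge_coef(xs, ys, lam=10.0)        # shrunk, but never exactly zero
lasso_b = lasso_coef(xs, ys, lam=5.0)         # shrunk by a constant amount
lasso_b_strong = lasso_coef(xs, ys, lam=25.0) # penalty large enough to zero it
```

Ridge divides the coefficient by a growing factor, so it approaches zero without reaching it, while Lasso subtracts a fixed amount and can set the coefficient to exactly zero. That is why Lasso also acts as a variable selector when many predictors are present.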