System-Aware Distributed Optimization for Machine Learning

Introduction: The scale of modern datasets necessitates the design and development of efficient and theoretically grounded distributed optimization algorithms for machine learning. Distributed systems offer the promise of scalability, vertically and horizontally, along both computation and storage dimensions, while at the same time posing unique challenges for algorithm designers. One particularly critical challenge is to […]



Introduction to Support Vector Machines: Part I

In this post, you will learn the basics of Support Vector Machines (SVMs), a well-regarded supervised machine learning algorithm. This technique belongs in everyone’s toolbag, especially for those who aspire to be data scientists one day. Since there’s a lot to learn, I’ll introduce SVMs to you across two posts […]



Basic SVM in Python

In Python, we can build an SVM model for classification with the sklearn library. We can use the basic LinearSVC, or SVC, which has more parameters to tune. We use data from the sklearn library, and the editor is Sublime Text 3. Most of the code comes from the book: https://www.goodreads.com/book/show/32439431-introduction-to-machine-learning-with-python?from_search=true
from sklearn.svm import LinearSVC
from sklearn.svm import SVC
import […]
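
As a rough illustration of the LinearSVC versus SVC choice described above, here is a minimal sketch. The excerpt does not say which sklearn dataset the post uses, so the breast cancer dataset, the feature scaling step, and all parameter values below are assumptions made for the example, not the post's actual code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

# Assumed dataset: any built-in sklearn classification dataset would do.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# LinearSVC: linear decision boundary, few parameters (mainly C).
linear_model = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000))
linear_model.fit(X_train, y_train)
linear_acc = linear_model.score(X_test, y_test)

# SVC: adds a kernel choice and kernel parameters (gamma, degree, ...) to tune.
rbf_model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
rbf_model.fit(X_train, y_train)
rbf_acc = rbf_model.score(X_test, y_test)

print(f"LinearSVC test accuracy: {linear_acc:.3f}")
print(f"SVC (RBF) test accuracy: {rbf_acc:.3f}")
```

Scaling the features first is a common choice for SVMs because both models are sensitive to feature magnitudes; it is included here as a design assumption.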



Quick Example of Parallel Computation in R for SVM/Random Forest, with MNIST and Credit Data

It is generally acknowledged that the SVM algorithm is relatively slow to train, even with tuned parameters such as cost and kernel. The usual way to speed it up is to use the “parallel”, “doParallel”, and “doSNOW” packages together with the foreach function. Data and background: https://charleshsliao.wordpress.com/2017/02/24/svm-tuning-based-on-mnist/
##########################################################
#1. set up load data function
load_image_file […]
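
The post itself works in R with foreach-style parallelism. As a loosely analogous sketch in Python (a swapped-in technique, not the post's method), the same idea of spreading SVM training across worker processes can be shown with scikit-learn's `GridSearchCV` and its `n_jobs` parameter. The small built-in digits dataset, the parameter grid, and the worker count are assumptions chosen to keep the example quick.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Assumed stand-in data: sklearn's 8x8 digits, much smaller than MNIST.
X, y = load_digits(return_X_y=True)

# Candidate values for the cost (C) and kernel parameters.
param_grid = {"C": [1, 10], "kernel": ["linear", "rbf"]}

# n_jobs=2 fits the candidate models on two worker processes in parallel.
search = GridSearchCV(SVC(gamma="scale"), param_grid, cv=3, n_jobs=2)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```

Each parameter combination and fold is an independent fit, which is why this kind of search parallelizes well.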



Kernels, SVM and a Letter Recognition Example

This article is again about SVM and its related parameters, especially the one called the kernel. We can use different kernel methods to project, or map, data into a higher-dimensional space. This is typically useful for non-linear problems in real life. The linear kernel does not transform the data at all. The polynomial kernel of degree […]
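
To make the effect of the kernel choice concrete, here is a small Python sketch on a synthetic non-linear problem. The concentric-circles data from `make_circles`, the degree-2 polynomial kernel, and all parameter values are assumptions for illustration; the post itself uses a letter recognition dataset.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line separates the classes.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf"):
    # degree is only used by the polynomial kernel; other kernels ignore it.
    model = SVC(kernel=kernel, degree=2, gamma="scale")
    model.fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)

for kernel, acc in scores.items():
    print(f"{kernel:>6} kernel test accuracy: {acc:.3f}")
```

On data like this, the linear kernel is expected to do poorly because it leaves the inputs untransformed, while the non-linear kernels can separate the rings.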



SVM to Recognize Hand Written Digits in R

Background: https://charleshsliao.wordpress.com/2017/02/24/svm-tuning-based-on-mnist/

load_image_file <- function(filename) {
  ret = list()
  f = file(filename, 'rb')
  readBin(f, 'integer', n=1, size=4, endian='big')
  ret$n = readBin(f, 'integer', n=1, size=4, endian='big')
  nrow = readBin(f, 'integer', n=1, size=4, endian='big')
  ncol = readBin(f, 'integer', n=1, size=4, endian='big')
  x = readBin(f, 'integer', n=ret$n*nrow*ncol, size=1, signed=F)
  ret$x = matrix(x, ncol=nrow*ncol, byrow=T)
  close(f)
  ret
}

load_label_file <- function(filename) {
  f = file(filename, 'rb')
  readBin(f, 'integer', n=1, size=4, endian='big')
  n = readBin(f, 'integer', n=1, size=4, endian='big')
  y = readBin(f, 'integer', n=n, size=1, signed=F)
  close(f)
  y
}

#show handwritten digit with show_digit(matriximage$x[n,]),n […]



SVM (e1071 of R) Tuning with MNIST

Background: Handwriting recognition is a well-studied subject in computer vision and has found wide application in our daily life (such as USPS mail sorting). In this project, we will explore various machine learning techniques for recognizing handwritten digits. The dataset you will be using is the well-known MNIST dataset. (1) The MNIST database of handwritten […]
