Analysis of Historical Weather Data for Los Angeles, CA

This post explores historical weather data for Los Angeles, California from 1906 to the present using Pandas and Matplotlib. The data in the post was collected from the National Centers for Environmental Information website and is available for download here. Organizing the data by year, an animation of the maximum temperatures throughout […]
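As a rough illustration of the "organize by year" step, here is a minimal Pandas sketch. The column names `DATE` and `TMAX` and the four sample rows are assumptions made up for this example; they are not taken from the post's dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real download.
# The column names DATE and TMAX are assumptions for this sketch.
records = [
    {"DATE": "1906-01-01", "TMAX": 61},
    {"DATE": "1906-07-15", "TMAX": 84},
    {"DATE": "1907-02-10", "TMAX": 66},
    {"DATE": "1907-08-20", "TMAX": 91},
]
df = pd.DataFrame(records)

# Parse the date strings so the year can be extracted.
df["DATE"] = pd.to_datetime(df["DATE"])

# Group the daily readings by calendar year and keep the hottest day of each year.
yearly_max = df.groupby(df["DATE"].dt.year)["TMAX"].max()

print(yearly_max)
```

The resulting Series is indexed by year, which is a convenient shape to hand to Matplotlib for a per-year plot or animation frame.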


Measuring Data Science Business Value

This blog post covers metrics that help data science leaders ensure their team’s work is aligned to business value. Data science managers and executives, whether they came up through the technical side or the management side, all struggle to give their team visibility and to show how the team’s work is aligned to business value. It is […]


Quick Tips for Getting A Data Science Team Off the Ground

Should you start a data science team? Or not? It isn’t an easy decision. This blog post provides tips to help leaders at startups and early-stage companies decide whether it is the right time to start building a data science team. Why Data Science? An increasing number of startups and early-stage companies are realizing they […]


Applying Data Science to Robotics

Author: Ammar A. Raja
Source: http://www.datasciencecentral.com/profiles/blogs/how-data-science-apply-to-robotics

1. SHORT BIO OF THE AUTHOR

Dr. Ammar A. Raja is an assistant professor at COMSATS Institute of Information Technology, Pakistan. He received his PhD degree in Finance from The London School of Economics and Political Science (LSE) in 2012. Apart from conducting research in data analytics, he also […]


Recommender Systems through Collaborative Filtering

This is a technical deep dive into the collaborative filtering algorithm and how to use it in practice. From Amazon recommending products you may be interested in based on your recent purchases to Netflix recommending shows and movies you may want to watch, recommender systems have become popular across many applications of data science. Like […]
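To make the idea concrete, here is a minimal sketch of one common variant, user-based collaborative filtering with cosine similarity. It is not necessarily the formulation the post uses. The `ratings` dictionary, the user names, and the item labels are all made up for illustration, and the similarity is computed over co-rated items only, which is a simplifying assumption.

```python
from math import sqrt

# Made-up ratings for illustration: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity between two users, restricted to items both have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    numerator = 0.0
    denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        numerator += sim * their_ratings[item]
        denominator += abs(sim)
    return numerator / denominator if denominator else None


# Alice has not rated item "D"; estimate her rating from Bob and Carol.
predicted = predict("alice", "D")
print(predicted)
```

Because Alice's ratings resemble Bob's more than Carol's, the estimate lands closer to Bob's rating of 4 than to Carol's rating of 2.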


Downloading more than 20 years of The New York Times

Articles for the period from 1987 to present are available without subscription. Their copyright notice is web scraping friendly: “… you may download material from The New York Times on the Web (one machine readable copy and one print copy per page) for your personal, noncommercial use only.” Why waste the opportunity to download these […]


A framework for building and evaluating data products

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Pinterest data scientist Grace Huang on lessons learned in the course of machine learning product launches. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, […]


A Neural Network in 10 lines of C++ Code

Purpose: For educational purposes only. The code demonstrates a supervised learning task using a very simple neural network. In my next post, I am going to replace the vast majority of subroutines with CUDA kernels. Reference: Andrew Trask’s post. The core component of the code, the learning algorithm, is only 10 lines: The loop above runs for 50 iterations […]
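The post's C++ listing is not reproduced in this excerpt. As a stand-in, here is a minimal Python sketch of the same general idea: a single-layer sigmoid network trained for 50 iterations with a plain gradient update. The four-row toy dataset, the zero initial weights, and the function names are assumptions chosen for this sketch, not the author's code.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


# Assumed toy dataset: the target simply equals the first input column.
X = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
y = [0, 0, 1, 1]

# Zero initial weights keep the sketch deterministic (an assumption;
# random initialization is the more common choice).
weights = [0.0, 0.0, 0.0]


def forward(row, w):
    """Output of the single sigmoid unit for one input row."""
    return sigmoid(sum(x * wi for x, wi in zip(row, w)))


for _ in range(50):
    outputs = [forward(row, weights) for row in X]
    # Error scaled by the sigmoid derivative, o * (1 - o).
    deltas = [(t - o) * o * (1 - o) for t, o in zip(y, outputs)]
    # Full-batch update: each weight moves by the sum of input * delta.
    weights = [
        w + sum(row[j] * d for row, d in zip(X, deltas))
        for j, w in enumerate(weights)
    ]

predictions = [forward(row, weights) for row in X]
print(predictions)
```

After 50 iterations the outputs are not yet close to 0 and 1, but they already fall on the correct side of 0.5 for each row.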


How to install NVIDIA CUDA 8.0, cuDNN 5.1, TensorFlow, and Keras on Ubuntu 16.04

Please follow the instructions below and you will be rewarded with Keras with a TensorFlow backend and, most importantly, GPU support.

Step 1. Linux

Update apt repositories and install the linux-image-extra-virtual package. This package includes the kernel module that’s required by the NVIDIA drivers.

sudo apt-get update
sudo apt-get install -y linux-image-extra-virtual

Install the version of the headers that matches the freshly installed […]


Data Science != Software Engineering

Domino’s guide, “What Engineering Leaders Need to Know About Data Science”, provides insights to help engineering leaders increase data science productivity and decrease engineering time spent on avoidable tickets. This post covers the differences between data science and engineering, because it is an initial step toward more efficient data science workflows, tooling, and infrastructure. For […]
