“[Video game] AI is still in the dark ages,” Epic CEO Tim Sweeney told a crowd gathered for GamesBeat’s 2017 industry summit. The video game industry has witnessed tremendous growth, thanks to the incredible increase in computational power for visual representation. Using the parallel computation ability of GPUs, powerful […]

# Basics of Computational Reinforcement Learning

Link: http://videolectures.net/rldm2015_littman_computational_reinforcement/ In machine learning, reinforcement learning plays an important role. It concerns how a system’s decision-making ability can be improved through interacting with the world and evaluating feedback. This tutorial introduces basic concepts and vocabulary in this field, and it also presents recent advances in the theory and practice of reinforcement learning. To […]

# Markov decision processes

Markov decision processes (MDPs), named after Andrey Markov, provide a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying a wide range of optimization problems solved via dynamic programming and reinforcement learning. MDPs were known at least as early […]
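
To make the framework above a little more concrete, here is a minimal sketch of value iteration, one of the dynamic programming methods commonly used to solve an MDP. The two-state `toy_mdp`, its action names, the rewards, and the discount factor are all made up for illustration and do not come from the post itself.

```python
from typing import Dict, List, Tuple

# One possible outcome of taking an action: (probability, next_state, reward).
Outcome = Tuple[float, int, float]
# mdp[state][action] -> list of possible outcomes (the "partly random" part).
MDP = Dict[int, Dict[str, List[Outcome]]]


def q_value(outcomes: List[Outcome], values: Dict[int, float], gamma: float) -> float:
    """Expected discounted return of one action, given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(mdp: MDP, gamma: float = 0.9, tol: float = 1e-8,
                    max_iters: int = 10_000) -> Dict[int, float]:
    """Repeatedly apply the Bellman optimality backup until values stop changing."""
    values = {s: 0.0 for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in mdp.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return values


def greedy_policy(mdp: MDP, values: Dict[int, float], gamma: float = 0.9) -> Dict[int, str]:
    """Pick, in each state, the action with the highest expected return."""
    return {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in mdp.items()
    }


# Hypothetical toy problem: state 1 pays a reward of 1 per step; from state 0
# the "go" action reaches state 1 with probability 0.8 and otherwise stays put.
toy_mdp: MDP = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)]},
}

values = value_iteration(toy_mdp)
policy = greedy_policy(toy_mdp, values)
```

The decision maker controls which action is taken (the keys of each state's dictionary), while the outcome lists capture the randomness that is outside its control.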

# Kaggle Announces Code Competitions

When I checked this morning, the number was 3,735,359. 3,735,359 Kaggle submissions. Each one was packaged up, sent as blips of ones and zeros, over miles of copper, kilometers of fiber optics, furlongs of undersea cables, through cell towers and satellites. They were created by world experts and total beginners alike. Some were full of errors, rife with overfitting, as […]

# Applying Temporal Difference Methods to Machine Learning — Part 3

In this third part of Applying Temporal Difference Methods to Machine Learning, I will be experimenting with the intra-sequence update variant of TD learning. In this method, the parameters are updated after each time step rather than at the end of the sequence. This post relates to my class project for the Reinforcement […]
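
The difference between the two update schedules mentioned above can be sketched as follows. This is only an illustration under assumptions of my own: a linear predictor, the TD(0) special case, and a made-up three-step sequence with one-hot features. It is not the code used in the project.

```python
from typing import List, Sequence


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear prediction: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def td0_intra_sequence(w: Sequence[float], states: List[List[float]],
                       outcome: float, alpha: float = 0.5) -> List[float]:
    """Apply each TD(0) update immediately, so later steps see updated weights."""
    w = list(w)
    for t, x in enumerate(states):
        last = t == len(states) - 1
        # The target is the next prediction, or the final outcome at the last step.
        target = outcome if last else predict(w, states[t + 1])
        error = target - predict(w, x)
        w = [wi + alpha * error * xi for wi, xi in zip(w, x)]
    return w


def td0_end_of_sequence(w: Sequence[float], states: List[List[float]],
                        outcome: float, alpha: float = 0.5) -> List[float]:
    """Accumulate TD(0) updates with fixed weights and apply them once at the end."""
    delta = [0.0] * len(w)
    for t, x in enumerate(states):
        last = t == len(states) - 1
        target = outcome if last else predict(w, states[t + 1])
        error = target - predict(w, x)
        delta = [d + alpha * error * xi for d, xi in zip(delta, x)]
    return [wi + d for wi, d in zip(w, delta)]


# Hypothetical sequence visiting state A, then B, then A again, ending with outcome 1.
state_a, state_b = [1.0, 0.0], [0.0, 1.0]
sequence = [state_a, state_b, state_a]
initial_w = [0.2, 0.8]

w_intra = td0_intra_sequence(initial_w, sequence, outcome=1.0)
w_batch = td0_end_of_sequence(initial_w, sequence, outcome=1.0)
```

Because state A is visited twice, the intra-sequence version uses an already-updated weight for it on the later steps, so the two schedules end with different weights from the same data.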

# Applying Temporal Difference Methods to Machine Learning — Part 2

In this Part 2 of Applying Temporal Difference Methods to Machine Learning, I will show results of applying what Sutton refers to as the traditional machine learning approach, compared to the Temporal Difference approach. For more information on this series, refer to the first part. An important consideration with regard to the problem I am using […]

# Applying Temporal Difference Methods to Machine Learning — Part 1

In this post I detail my project for the course Reinforcement Learning (COMP767) taken at McGill, applying Temporal Difference (TD) methods in a Machine Learning setting. This concept was first discussed by Sutton when he introduced this family of learning algorithms. I aim to go over what was discussed in the paper and see how it performs on a […]

# Learning to Reinforcement Learn

Paper source: https://arxiv.org/abs/1611.05763 Paper discussion: https://www.reddit.com/r/MachineLearning/comments/5dm7yu/r_learning_to_reinforcement_learn/ Paper Authors: 1. Introduction Reinforcement learning (RL) methods have achieved human and superhuman-level performance in many complex and large-scale environments, like Atari games and Go. However, compared to human performance, previous deep RL systems have at least two shortcomings: Deep RL typically requires a massive volume of training data, whereas human […]

# 2017 Trends — What You Want & What Comes

Source: c’t Magazin für Computer Technik 3/17 [Video Source] [Article Source] Don’t be worried about AI. A glance at state-of-the-art research shows that neural networks would still serve us, and artificial general intelligence is not yet in sight. Therefore, robots and language assistants are nowhere near as smart as what the […]

# Theoretical & empirical analysis of Expected Sarsa

Jupyter Notebook submitted as an assignment for the Reinforcement Learning class at McGill, analyzing the theoretical and empirical performance of Expected Sarsa. The theoretical analysis component presented is a review of Section 5 from A Theoretical and Empirical Analysis of Expected Sarsa, van Seijen, van Hasselt, Whiteson, and Wiering (2009). I later show the empirically […]
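
For readers unfamiliar with the algorithm, the following is a minimal sketch of a single tabular Expected Sarsa update, assuming an epsilon-greedy policy. The function names, the tiny Q-table, and the parameter values are hypothetical choices for illustration and are not taken from the notebook.

```python
from typing import Dict, List


def epsilon_greedy_probs(q_values: List[float], epsilon: float) -> List[float]:
    """Action probabilities of an epsilon-greedy policy (ties go to the first best action)."""
    n = len(q_values)
    best = max(range(n), key=lambda a: q_values[a])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def expected_sarsa_update(q: Dict[int, List[float]], s: int, a: int, r: float,
                          s_next: int, alpha: float, gamma: float,
                          epsilon: float, terminal: bool = False) -> None:
    """Move Q(s, a) toward r + gamma * E_pi[Q(s_next, .)], updating q in place."""
    if terminal:
        expected = 0.0
    else:
        probs = epsilon_greedy_probs(q[s_next], epsilon)
        # Expectation over next actions, instead of the single sampled action used by Sarsa.
        expected = sum(p * v for p, v in zip(probs, q[s_next]))
    q[s][a] += alpha * (r + gamma * expected - q[s][a])


# Hypothetical two-state, two-action Q-table.
q_table = {0: [0.0, 0.0], 1: [1.0, 3.0]}
expected_sarsa_update(q_table, s=0, a=0, r=1.0, s_next=1,
                      alpha=0.5, gamma=0.9, epsilon=0.2)
```

Taking the expectation over the policy's next-action distribution removes the variance that Sarsa incurs from sampling the next action, which is the property the cited paper analyzes.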