Notes on Reinforcement Learning (2): Dynamic Programming

Policy Evaluation

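Consider a sequence of approximate value functions $v_0, v_1, v_2, \dots$, each mapping $\mathcal{S}^+$ to $\mathbb{R}$. The initial approximation, $v_0$, is chosen arbitrarily (except that the terminal state, if any, must be given value 0), and each successive approximation is obtained by using the Bellman equation for $v_\pi$ as an update rule:

$$v_{k+1}(s) = \mathbb{E}_\pi\left[R_{t+1} + \gamma v_k(S_{t+1}) \mid S_t = s\right] = \sum_a \pi(a \vert s) \sum_{s', r} p(s', r \vert s, a)\left[r + \gamma v_k(s')\right]$$

for all $s\in\mathcal{S}$.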
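As a minimal sketch of the update rule described above, the following R code runs iterative policy evaluation on a small made-up MDP. The three-state chain, its rewards, the discount `gamma = 0.9`, the equiprobable policy, and the function name `policy_evaluation` are all assumptions for illustration, not part of the original notes. The sketch works with the expected reward `R[s, a]` and the transition probabilities `P[s, a, s']`, which is equivalent to summing over the joint $p(s', r \vert s, a)$.

```r
# Iterative policy evaluation on a hypothetical 3-state chain.
# States 1 and 2 are non-terminal; state 3 is terminal (its value stays 0).
n_states <- 3
n_actions <- 2
gamma <- 0.9   # assumed discount factor

# P[s, a, s2] = probability of moving from s to s2 under action a.
P <- array(0, dim = c(n_states, n_actions, n_states))
P[1, 1, 1] <- 1   # action 1 in state 1: stay
P[1, 2, 2] <- 1   # action 2 in state 1: move to state 2
P[2, 1, 1] <- 1   # action 1 in state 2: move back to state 1
P[2, 2, 3] <- 1   # action 2 in state 2: reach the terminal state
P[3, , 3] <- 1    # the terminal state is absorbing

# Expected reward R[s, a]: each step from a non-terminal state costs -1.
R <- matrix(c(-1, -1,
              -1, -1,
               0,  0), nrow = n_states, byrow = TRUE)

# Equiprobable random policy: policy[s, a] = pi(a | s).
policy <- matrix(0.5, nrow = n_states, ncol = n_actions)

policy_evaluation <- function(P, R, policy, gamma, tol = 1e-10, max_iter = 10000) {
  n <- dim(P)[1]
  v <- rep(0, n)                      # v_0: arbitrary start, terminal value 0
  for (k in seq_len(max_iter)) {
    v_new <- v
    for (s in seq_len(n)) {
      # q[a] = R[s, a] + gamma * sum over s2 of P[s, a, s2] * v[s2]
      q <- R[s, ] + gamma * as.vector(P[s, , ] %*% v)
      v_new[s] <- sum(policy[s, ] * q)
    }
    delta <- max(abs(v_new - v))      # largest change in this sweep
    v <- v_new
    if (delta < tol) break
  }
  v
}

v_pi <- policy_evaluation(P, R, policy, gamma)
print(round(v_pi, 4))
```

Each sweep computes $v_{k+1}$ from a full copy of $v_k$, so this is the two-array version of the update. Updating `v` in place also converges and is a common alternative.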

Notes on Reinforcement Learning (1): Finite Markov Decision Processes

The Agent-Environment Interface

  • The agent and environment interact at each of a sequence of discrete time steps, $t=0,1,2,3,\dots$
  • At each time step $t$, the agent receives some representation of the environment’s state, $S_t\in\mathcal{S}$, where $\mathcal{S}$ is the set of possible states.
  • On that basis, the agent selects an action, $A_t \in \mathcal{A}(S_t)$, where $\mathcal{A}(S_t)$ is the set of actions available in state $S_t$.
  • One time step later, in part as a consequence of its action, the agent receives a numerical reward, $R_{t+1} \in \mathcal{R} \subset \mathbb{R}$, and finds itself in a new state, $S_{t+1}$.
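The interaction loop described in the list above can be sketched in a few lines of R. Everything concrete here is hypothetical: the five-state corridor environment, the reward of 1 for reaching the last state, the uniformly random agent, and the helper names `env_step`, `available_actions`, and `select_action` are assumptions chosen only to make the loop runnable.

```r
set.seed(1)

# Hypothetical environment: a corridor with states 1..5; state 5 ends the episode.
env_step <- function(state, action) {
  next_state <- min(max(state + action, 1), 5)   # walls at both ends
  reward <- if (next_state == 5) 1 else 0
  list(state = next_state, reward = reward, done = next_state == 5)
}

# A(s): the set of actions available in state s (the same in every state here).
available_actions <- function(state) c(-1, 1)

# Hypothetical agent: chooses uniformly among the available actions.
select_action <- function(state) {
  acts <- available_actions(state)
  acts[sample.int(length(acts), 1)]
}

state <- 1                       # S_0
t <- 0
states <- c(); actions <- c(); rewards <- c()
repeat {
  action <- select_action(state)         # agent picks A_t given S_t
  out <- env_step(state, action)         # environment returns R_{t+1}, S_{t+1}
  states <- c(states, state)
  actions <- c(actions, action)
  rewards <- c(rewards, out$reward)
  state <- out$state
  t <- t + 1
  if (out$done || t >= 1000) break       # safety cap on episode length
}

# One row per time step: S_t, A_t, and the reward R_{t+1} that followed.
trajectory <- data.frame(t = seq_along(states) - 1, S = states,
                         A = actions, R_next = rewards)
print(head(trajectory))
```

Each row of `trajectory` pairs $S_t$ and $A_t$ with the reward $R_{t+1}$ received one step later, matching the indexing used in the list.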

ML With R (3): Logistic Regression

Code in R for Coursera-ML and CMPUT466/551 at the University of Alberta

This post refers to programming exercise 2 in Coursera-ML. All the code can be found in this repo.


Logistic Regression

Visualizing the data

Use a scatter plot to visualize the data.

library(ggplot2)

# Load the two exam scores (V1, V2) and the admission label (V3).
data <- read.csv("ex2data1.txt", header=FALSE)
X <- as.matrix(data[,c("V1", "V2")])
y <- as.matrix(data$V3)
# Scatter plot of the exam scores, colored by admission outcome.
ggplot(data, aes(x=V1, y=V2, col=factor(V3))) +
  geom_point(shape=3, size=3) +
  labs(x="Exam1 score", y="Exam2 score") +
  scale_color_manual(name="", labels=c("Not admitted", "Admitted"), values = c("blue", "red"))

ML With R (2): Regularized Linear Regression

Code in R for Coursera-ML and CMPUT466/551 at the University of Alberta

This post refers to programming exercise 5 in Coursera-ML. All the code can be found in this repo.


Regularized Linear Regression

Visualizing the dataset

Use a scatter plot to visualize the data.

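library(R.matlab)  # provides readMat()
library(ggplot2)

# Load the training, validation, and test splits from the MATLAB file.
data <- readMat("ex5data1.mat")
X <- data$X
y <- data$y
Xtest <- data$Xtest
ytest <- data$ytest
Xval <- data$Xval
yval <- data$yval
df <- data.frame(X=X, y=y)
# Plot the training data; guide="none" hides the single-entry legend.
ggplot(df, aes(X, y, col="Training data")) + geom_point(shape=4, size=3) +
  labs(x="Change in water level (x)", y="Water flowing out of the dam (y)") +
  scale_color_manual(guide="none", values=c("Red"))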

ML With R (1): Linear Regression

Code in R for Coursera-ML and CMPUT466/551 at the University of Alberta

This post refers to programming exercise 1 in Coursera-ML. All the code can be found in this repo.


Linear regression with one variable

Plotting the Data

Use a scatter plot to visualize the data.

library(ggplot2)

# Load city population (V1) and profit (V2), then plot one against the other.
data <- read.csv("ex1data1.txt", header=FALSE)
ggplot(data, aes(V1, V2, col="Training data")) +
  geom_point(shape=4, size=3) +
  labs(x="Population of City in 10,000s", y="Profit in $10,000s")


Probabilistic Graphical Models (1): Introduction

What is a probabilistic graphical model (PGM)? Forget the high-level, complicated definitions in textbooks or on Wikipedia. From my perspective, it’s all about graphs, distributions, and the connections between them.

There are three aspects to a graphical model:

  • The graph
  • A model of the data based on the graph
  • The data itself

The data is generated by an underlying distribution, and the model is the connection between the graph and that distribution. This raises three fundamental issues:

  • Representation: How to represent the three aspects mentioned above succinctly?
  • Inference: How do I answer questions/queries according to my model and the given data? For example, what is the probability of a certain variable given some observations, $P(X_i \vert \mathcal{D})$?
  • Learning: What model is “right” for my data? We may want to “learn” the parameters of the model, or the model itself or even the topology of the graph from the data.
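To make the inference question concrete, here is a minimal sketch in R for the smallest possible graph, two binary variables with one edge. The network (Rain → WetGrass) and all of its probabilities are made-up assumptions for illustration. The graph gives the factorization $P(R, W) = P(R)\,P(W \vert R)$, and the query $P(R \vert W = \text{yes})$ is answered by conditioning the joint distribution on the observation.

```r
# Hypothetical two-node network: Rain -> WetGrass (all numbers are made up).
p_rain <- c(no = 0.8, yes = 0.2)                    # P(Rain)

# P(WetGrass | Rain): rows index Rain, columns index WetGrass.
p_wet_given_rain <- matrix(c(0.9, 0.1,
                             0.2, 0.8),
                           nrow = 2, byrow = TRUE,
                           dimnames = list(rain = c("no", "yes"),
                                           wet  = c("no", "yes")))

# Joint from the graph factorization: P(R, W) = P(R) * P(W | R).
# The length-2 vector is recycled down columns, so row i is scaled by p_rain[i].
joint <- p_rain * p_wet_given_rain

# Inference: condition on the observation WetGrass = "yes" and renormalize.
evidence <- joint[, "yes"]
posterior <- evidence / sum(evidence)               # P(Rain | WetGrass = yes)
print(posterior)
```

With these numbers, observing wet grass raises the probability of rain from the prior 0.2 to 2/3. Enumerating the full joint like this only works for tiny graphs; larger models rely on the graph structure to avoid building it.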

GSoC 2016: Inferring Infobox Template Class Mappings From Wikipedia and WikiData

This page is the public project page and will be updated every week.

Project Description

There are many infoboxes on Wikipedia. Here is an example of a football infobox:

As the example shows, every infobox has a set of properties, and every infobox follows a certain template. The goal of my project is to find mappings between the classes in the DBpedia ontology (e.g. dbo:Person, dbo:City) and the infobox templates on pages of Wikipedia resources, using machine learning techniques.

There are lots of infobox mappings available for a few languages, but not as many for other languages. In order to infer mappings for all the languages, cross-lingual knowledge validation should also be considered.

The main output of the project will be a list of new high-quality infobox-class mappings.

Trace of My Study on Machine Learning

This blog records the timeline, resources, and projects from my study of machine learning since 2015. I will keep updating it.


Machine Learning Basics (1): Linear Regression

As a beginner in machine learning, I plan to sketch out my learning process. This is the first post in the series.

1. Definition

We have an input vector $X^T=(X_1, X_2, \dots, X_p)$ and want to predict a real-valued output $Y$. The linear regression model has the form

$$f(X) = \beta_0 + \sum_{j=1}^p X_j \beta_j.$$

Here the $\beta_j$’s are unknown parameters or coefficients.
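The excerpt above does not say how the coefficients are estimated, so the following R sketch assumes ordinary least squares, which is the usual choice. The simulated data, the "true" coefficients `beta_true`, and the helper `f` are all hypothetical. The sketch solves the normal equations directly and compares the result with R's built-in `lm()`.

```r
set.seed(42)

# Simulated data (an assumption for illustration): p = 2 inputs, n = 100 rows.
n <- 100
p <- 2
X <- matrix(rnorm(n * p), nrow = n)
beta_true <- c(1, 2, -3)                       # intercept beta_0, then beta_1, beta_2
y <- as.vector(cbind(1, X) %*% beta_true + rnorm(n, sd = 0.1))

# Prepend a column of ones so the intercept beta_0 is estimated with the rest.
X1 <- cbind(1, X)

# Least squares estimate from the normal equations: (X1' X1) beta = X1' y.
beta_hat <- solve(crossprod(X1), crossprod(X1, y))

# The fitted linear model: intercept plus a weighted sum of the inputs.
f <- function(x_new) as.vector(cbind(1, x_new) %*% beta_hat)

# Compare with R's built-in fit; the two columns should agree closely.
print(cbind(normal_eq = as.vector(beta_hat), lm = unname(coef(lm(y ~ X)))))
print(f(matrix(c(0.5, -1), nrow = 1)))         # prediction for one new input
```

Solving the normal equations is fine for a small, well-conditioned example like this one. `lm()` uses a QR decomposition instead, which is numerically more stable when the inputs are strongly correlated.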