# A simple policy gradient implementation with keras (part 2)

This post describes how to set up a simple policy gradient network with Keras and Pong.

Machine learning and Python.

In this post I’ll show how to set up a standard Keras network so that it optimizes a reinforcement learning objective using policy gradients, following Karpathy’s excellent explanation.
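In a policy gradient setup, the log-probability of each action is weighted by the discounted reward that followed it. A minimal sketch of that weighting step is shown below. The function name `discount_rewards`, the default `gamma`, and the reset on non-zero rewards (in Pong a non-zero reward marks the end of a rally) are assumptions made for illustration, not code taken from the post.

```python
def discount_rewards(rewards, gamma=0.99):
    """Return the discounted return for every time step of an episode.

    Assumption: a non-zero reward ends a Pong rally, so the running sum
    is reset there and credit does not leak across rallies.
    """
    discounted = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the rewards that follow it.
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # a point was scored: start a new rally
        running = running * gamma + rewards[t]
        discounted[t] = running
    return discounted
```

The resulting values could then be used as per-sample weights on the network's loss, so that actions followed by a win are reinforced and actions followed by a loss are discouraged.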

In my opinion there are two Vims: the interface (modal editing), and the environment (plugins and customizations).

Bidirectional recurrent neural networks (BiRNNs) enable us to classify each element in a sequence while using information from that element’s past and future. Keras provides a high level interface to Theano and TensorFlow. In this post I’ll describe how to implement BiRNNs with Keras without using `go_backwards` (there are different ways to skin a cat).

*Continues from [Numpy character embeddings](/Numpy-character-embeddings/).*

The `numpy` embedding model turned out to be extremely slow because it wasn’t vectorised. `Chainer` is a Python deep learning package that enables us to implement the model easily with automatic differentiation and the resulting vectorised operations are fast - and can be run on a GPU if you want. In this post I’ll explore how the different optimisers perform out of the box.

*Continues from [Embedding derivative derivation](/Embedding-derivative-derivation/).*

Let’s implement the embedding model in `numpy`, train it on some characters, generate some text, and plot two of the components over time.

Sometimes you can’t use automatic differentiation to train a model and then you have to do the derivation yourself. This derivation is for a simple two-layer model where the input layer is followed by an embedding layer which is followed by a fully connected softmax layer. It is based on some [old and new matrix algebra (and calculus) useful for statistics](http://research.microsoft.com/en-us/um/people/minka/papers/matrix/).

Character N-gram language models are an exciting idea that looks like the direction language modelling is taking. Functional programming is another idea that is receiving attention in the machine learning community. To learn more about them I’m playing around with a simple N-gram counter and text generator in Haskell.

Socionics comes across as the more serious and academic eastern European cousin of MBTI (which is much better known by English speakers). Although many of the same criticisms that apply to personality theories/tools such as MBTI are also applicable to socionics, I find these models fascinating. Here I’ll use socionics to set up a fun application/demonstration of Markov random fields.

`pyugm` is a package for learning undirected graphical models (discrete only at this stage) in Python. For inference it implements loopy belief propagation (LBP) on cluster graphs and Gibbs sampling. In this post I’ll show how a simple image segmentation model can be built and calibrated.

When we want to classify sequences, HCRFs are - if we can forget about recurrent neural networks for a moment - discriminative counterparts to hidden Markov models.

*Continues from [Spelling correction with pyhacrf](/Spelling-correction-with-pyhacrf/).*

In this post I’ll look in more detail at how [`pyhacrf`](https://github.com/dirko/pyhacrf) can be used to generate spelling corrections for incorrect tokens. We’ll use cross-validation to set the regularisation parameter, add character transition features, and compare the model’s spelling suggestions to a Levenshtein baseline.
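For reference, a Levenshtein baseline of the kind mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `levenshtein` and `suggest`, the tie-breaking by alphabetical order, and the example vocabulary are assumptions, not the code used in the post.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    # previous[j] holds the distance between the prefix of a seen so far
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def suggest(token, vocabulary, n=3):
    """Return the n vocabulary words closest to token (ties broken alphabetically)."""
    return sorted(vocabulary, key=lambda word: (levenshtein(token, word), word))[:n]
```

For example, `suggest("helo", ["hello", "help", "world"], n=2)` ranks `hello` and `help` first, since each is one edit away from the misspelled token.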

What happens when we run singular value decomposition (SVD) on images? In this post I’ll show how to do SVD on images with Python and some of the interesting visual effects that result.
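One common way to see what SVD does to an image is to keep only the largest few singular values and rebuild the picture from them. The sketch below assumes a greyscale image stored as a 2-D `numpy` array; the function name `low_rank_approximation` is hypothetical and the post itself may take a different approach.

```python
import numpy as np


def low_rank_approximation(image, k):
    """Rebuild a 2-D array from its k largest singular values."""
    # full_matrices=False gives the compact SVD: image == u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    # Scale the first k columns of u by their singular values, then project back.
    return (u[:, :k] * s[:k]) @ vt[:k, :]
```

Small values of `k` keep only the coarse structure of the image, and the reconstruction error shrinks as `k` grows.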

In this post I’ll describe how to generate suggestions for incorrectly spelled words using the [`pyhacrf`](https://github.com/dirko/pyhacrf) package. `pyhacrf` implements the Hidden Alignment Conditional Random Field (HACRF) model in Python with a `sklearn`-like interface.

First post! I plan to add some posts about the `pyhacrf` project soon, and later something on probabilistic graphical models and relational learning with deep networks.