
Neural Networks & Reinforcement Learning

Bachelor thesis on multilayer perceptrons and the universal approximation theorem, with a reinforcement learning agent trained to play Flappy Bird.


Overview

My second-year bachelor mathematics project explored the mathematical foundations of neural networks — from the perceptron to multilayer architectures — culminating in a reinforcement learning agent that learned to play Flappy Bird.

The Mathematics

The project covers the theory behind how neural networks learn:

  • The Perceptron — the building block of neural networks, with a proof of convergence of its training algorithm for linearly separable classification problems
  • Multilayer Perceptrons — defining the architecture, then explaining gradient descent and backpropagation in detail
  • Topology & Neural Networks — using topological arguments to analyze the capabilities of multilayer perceptrons
  • Universal Approximation Theorem — a sketch of the proof for a simple case, showing that neural networks can approximate any continuous function
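
The perceptron training rule from the first chapter can be sketched in a few lines. The AND data set, the learning rate of 1, and the epoch cap below are illustrative assumptions, not details taken from the thesis; since AND is linearly separable, the convergence theorem guarantees the loop terminates with zero errors.

```python
import numpy as np

# A minimal sketch of the perceptron training algorithm on a linearly
# separable problem (logical AND). Data and hyperparameters are
# illustrative assumptions.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])            # AND is linearly separable

w = np.zeros(2)
b = 0.0

for epoch in range(25):
    errors = 0
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        update = target - pred        # in {-1, 0, 1}
        w += update * xi              # perceptron learning rule
        b += update
        errors += int(update != 0)
    if errors == 0:                   # converged: every point classified
        break

print(w, b)
print([1 if xi @ w + b > 0 else 0 for xi in X])
```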

Flappy Bird — Reinforcement Learning

As a practical application, I trained a neural network agent to play Flappy Bird using reinforcement learning. The agent learns entirely from trial and error — receiving rewards for surviving and penalties for crashing — and eventually masters the game.
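
The trial-and-error loop can be sketched with tabular Q-learning; the thesis itself trains a neural-network agent, and the toy "stay in the corridor" environment, reward values, and hyperparameters below are all illustrative assumptions standing in for the real game.

```python
import random
from collections import defaultdict

# A simplified sketch of reward-driven learning: flapping raises the
# bird, doing nothing lets it fall, and leaving the corridor crashes.
# Environment and hyperparameters are illustrative assumptions.
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = [0, 1]                      # 0 = do nothing (fall), 1 = flap (rise)

Q = defaultdict(float)                # Q[(height, action)] -> value estimate

def step(height, action):
    """Toy dynamics: flapping raises the bird one unit, otherwise it falls."""
    height += 1 if action == 1 else -1
    if 0 <= height <= 5:
        return height, 1.0, False     # reward for surviving the step
    return height, -10.0, True        # penalty for crashing out of bounds

random.seed(0)
for episode in range(2000):
    h = 3
    for t in range(200):              # cap episode length
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)                     # explore
        else:
            a = max(ACTIONS, key=lambda act: Q[(h, act)])  # exploit
        h2, r, done = step(h, a)
        target = r if done else r + GAMMA * max(Q[(h2, act)] for act in ACTIONS)
        Q[(h, a)] += ALPHA * (target - Q[(h, a)])          # Q-learning update
        if done:
            break
        h = h2
```

After training, the table encodes the sensible policy: flap near the bottom of the corridor, fall near the top.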

Abstract

The aim of this project is to explain the mathematics behind the training algorithms of the multilayer perceptron, to study the universal approximation theorem, and to train multilayer perceptrons that we implement ourselves.

We first discuss the perceptron, the building block of the multilayer perceptron: how it learns and what kinds of problems it can solve. In this chapter, we prove that the perceptron training algorithm converges for classification problems whose classes are linearly separable.

After this, we define the multilayer perceptron and explain in detail the gradient descent and backpropagation algorithms used to train it. We also discuss how topology can be used to analyze some of the capabilities of the multilayer perceptron.
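
These two algorithms can be sketched together for a one-hidden-layer network: the forward pass computes the output, the backward pass propagates errors via the chain rule, and each step moves the weights against the gradient. The XOR task, layer sizes, learning rate, and iteration count below are illustrative assumptions.

```python
import numpy as np

# A minimal sketch of gradient descent with backpropagation for a
# one-hidden-layer MLP with sigmoid activations and squared-error loss,
# trained on XOR. All hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 1.0

for _ in range(10000):
    # Forward pass
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)
    # Backward pass: chain rule applied layer by layer
    d_out = (out - y) * out * (1 - out)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    # Gradient-descent step
    W2 -= lr * hidden.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hidden
    b1 -= lr * d_hidden.sum(axis=0)

print(out.round(2))                   # outputs should approach [0, 1, 1, 0]
```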

We then turn to a simple case of the universal approximation theorem and sketch its proof.
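
For reference, one common formulation of the theorem for single-hidden-layer networks with a sigmoidal activation is due to Cybenko (1989); the notation below is chosen for illustration and need not match the thesis.

```latex
% Universal approximation, single hidden layer, sigmoidal activation
% (Cybenko, 1989). Notation chosen for illustration.
\begin{theorem}
Let $\sigma$ be a continuous sigmoidal function and $f \in C([0,1]^n)$.
Then for every $\varepsilon > 0$ there exist $N \in \mathbb{N}$,
$v_i, b_i \in \mathbb{R}$ and $w_i \in \mathbb{R}^n$ such that
\[
  \Big| f(x) - \sum_{i=1}^{N} v_i \, \sigma(w_i^{\top} x + b_i) \Big|
  < \varepsilon \quad \text{for all } x \in [0,1]^n .
\]
\end{theorem}
```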

In the last chapter, we discuss some numerical experiments that we did with our own implementations of the multilayer perceptron.