TITLE: Optimisation and Information Geometry for Randomised Algorithms

SPEAKER: Roman Belavkin (Middlesex University)

ABSTRACT: Many learning and optimisation algorithms are iterative procedures that use local information (e.g. the gradient of the objective function) to generate new steps towards the goal (e.g. the maximum of the objective function). Usually, local information is insufficient to solve the problem in a single iteration, and minimising the convergence time depends on how well this local information is used. The iterative nature of these algorithms means that they describe processes with the Markov property, which allows us to formulate the optimisation of an algorithm's parameters as an information-theoretic variational problem with constraints on the Shannon capacity of the Markov transition kernel corresponding to the algorithm. The Pythagorean theorem for the Shannon capacity allows us to analyse these constraints, and this may help in finding the optimal values of the parameters. Optimisation of the mutation rate in a simple genetic algorithm will be used as an example.
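
One schematic way to read the variational formulation mentioned in the abstract is sketched below. This is an illustrative reconstruction under stated assumptions, not the speaker's exact statement: the objective f, the choice of expected improvement as the quantity being maximised, and the symbol C for the information bound are assumptions introduced here for concreteness.

% Schematic constrained variational problem (illustrative, not taken verbatim from the talk):
% maximise the expected one-step improvement over Markov transition kernels T(y|x)
% whose mutual information between the current state X and the next state Y is bounded.
\[
  \max_{T(\cdot\mid\cdot)} \; \mathbb{E}\bigl[f(Y) - f(X)\bigr]
  \quad \text{subject to} \quad
  I(X;Y) \;=\; \sum_{x,y} p(x)\,T(y\mid x)\,
  \ln\frac{T(y\mid x)}{\sum_{x'} p(x')\,T(y\mid x')} \;\le\; C,
\]
where $X$ is the current state, $Y$ the next state produced by one iteration of the algorithm, $f$ the objective function, and $C$ a bound on the information conveyed by the kernel; the Shannon capacity of the kernel is the supremum of $I(X;Y)$ over input distributions $p$.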
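To make the closing example concrete, here is a minimal sketch (not the speaker's code) of a simple genetic algorithm on binary strings whose single control parameter is the per-bit mutation rate. The OneMax fitness function, population size, and truncation selection are illustrative assumptions; the point is only that the transition from one population to the next is a Markov kernel parameterised by the mutation rate, whose value one would like to choose well.

import random

def onemax(bits):
    """Toy objective: the number of ones in the bit string."""
    return sum(bits)

def mutate(bits, rate):
    """Flip each bit independently with probability `rate` (the mutation rate)."""
    return [b ^ (random.random() < rate) for b in bits]

def run_ga(n_bits=50, pop_size=20, mutation_rate=0.02, generations=200, seed=0):
    """Simple mutation + truncation-selection GA; returns the best fitness per generation."""
    random.seed(seed)
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # One iteration of the algorithm: the step from the current population to the
        # next is a Markov transition parameterised by the mutation rate.
        offspring = [mutate(ind, mutation_rate) for ind in pop]
        pop = sorted(pop + offspring, key=onemax, reverse=True)[:pop_size]
        history.append(onemax(pop[0]))
    return history

if __name__ == "__main__":
    # Sweep a few mutation rates to see how the choice affects convergence.
    for rate in (0.001, 0.02, 0.2):
        print(f"mutation rate {rate:.3f}: best fitness = {run_ga(mutation_rate=rate)[-1]}")

Running the sweep shows the familiar trade-off the talk's analysis addresses: too small a mutation rate makes progress slow, too large a rate destroys good solutions, and the optimal value lies in between.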