PyData London 2022

Signature methods for time series data
06-19, 15:00–15:45 (Europe/London), Tower Suite 2

Signatures are a mathematical tool that arises in the study of paths. Roughly speaking, they capture the fine structure of a path. It turns out that signatures are extremely useful for analysing time series data in a data science context. This is partly because they can take irregularly sampled, highly oscillatory data and produce a single array of values of fixed size, which can then be used as features in predictors and other models. In this talk I will give a brief introduction to signatures and a short demonstration of how you can use them to analyse time series data. No mathematical background will be assumed.


Signatures arise from the study of rough paths, which originated with Terry Lyons and his numerous collaborators over the past 20 years. If we think of a path as a sequence of sampled values in n-dimensional space, then the (truncated) signature of this path over a particular interval in the parameter space is a free tensor that describes the oscillation of the path over this interval. Concretely, it is a sum of "square" d-dimensional numpy arrays (each with n entries along every axis), one for each value of d up to the truncation level, flattened out into a single 1-d array.
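To make the description above concrete, here is a minimal sketch that computes the signature truncated at level 2 for a path given as straight-line segments between samples. It is not the implementation used by esig or any other library; the function name `signature_level2` is hypothetical, and treating the samples as a piecewise-linear path is an assumption made for illustration. The levels are combined segment by segment using Chen's identity.

```python
import numpy as np


def signature_level2(path):
    """Depth-2 truncated signature of a piecewise-linear path (illustrative sketch).

    path: array-like of shape (num_samples, n), one row per sample.
    Returns a 1-d array of length 1 + n + n*n: the scalar level 0,
    the level-1 vector and the flattened n-by-n level-2 array.
    """
    path = np.asarray(path, dtype=float)
    n = path.shape[1]
    level1 = np.zeros(n)
    level2 = np.zeros((n, n))
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one straight segment with increment delta.
        # Level 2 is updated first because it needs the level-1 term of the
        # path seen so far, before this segment is added.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return np.concatenate(([1.0], level1, level2.ravel()))


# Two samplings of the same "L"-shaped path in the plane: the second has an
# extra, unevenly placed sample on the first straight segment.
coarse = signature_level2([[0, 0], [1, 0], [1, 1]])
fine = signature_level2([[0, 0], [0.3, 0], [1, 0], [1, 1]])
print(coarse)  # values: 1, 1, 1, 0.5, 1, 0, 0.5
print(np.allclose(coarse, fine))  # True
```

Both samplings give the same array of length 7 (1 + 2 + 4 for n = 2), which illustrates why the output size depends only on the dimension and the truncation level, not on how many samples the path has or how they are spaced.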

Signatures have been used in a wide variety of contexts to generate features in machine learning, usually resulting in an improvement over other techniques. For example, signatures were used to great effect in analysing sepsis data, in handwriting recognition (on the MNIST dataset), and in human action recognition.

There are several open-source libraries for computing signatures in Python, including esig, iisignature, and signatory. (The last of these is based on PyTorch and is easily integrated into Torch deep learning models.)

In the first half of the talk I will give a very high-level overview of what a signature is and how it is computed. In the second half I will demonstrate how to use signatures in a simple example.


Prior Knowledge Expected

No previous knowledge expected

I am a research software engineer working on the DataSig project. This project is all about bringing rough path theory and signature methods to data science applications. I maintain the Python package esig for computing signatures and the C++ library libalgebra that backs esig, along with various other similar libraries. Prior to this role I worked as a lecturer in mathematics, and I am the author of the book "Applying Math with Python".