## Sunday, 9 February 2014

### Pascal's Triangle

While smoothing some evenly-spaced time series data a while ago, I found the need to generate rows from Pascal's triangle; for my own entertainment, I tried to come up with some minimal (but not totally unreadable) code:

def pascalRow(n):
" Returns the nth row of Pascal's triangle"
return [1] if n<=0 else reduce(lambda row, n: row[:-1] + [(row[-1] + n), n], pascalRow(n-1), [0])


This is short and quite fast, but severely lacking in readability; a trait only made slightly excusable by it's modularity, and clarity of function. The question you may be asking though, is 'why Pascal's triangle in the first place'?
Gaussian blurring/smoothing is just another instance of the general rule that 'Gaussians come up everywhere'. When dealing with discreet data, it's important to remember that the binomial distribution is the discrete equivalent of the Gaussian. The $n^{th}$ row of Pascal's triangle is a list of the binomial coefficients $\binom nk$. These are the weightings (up to a normalisation factor of $\frac{1}{2^n}$), describing how each raw data-point will contribute to it's neighbors:

$$x_i^{smoothed} = \sum\limits_{k = 0}^{n} \binom nk x_{i-\frac{n}{2} + k}$$

Each smoothed point is thus heavily influenced by its near neighbours, but not by its more distant ones. Note that $n$ determines the radius of the smoothing filter (in units of your data's sampling period).

This kind of clunky, discrete sum is just a special case of a convolution. Convolutions generalise the notion of smoothing to filtering, enabling a huge range of operations (notably Fourier Transforms) to be performed with these same kinds of discrete sums. Convolutions generalise to to higher dimensions (and continuous spaces) as well. In 2D, the weights or 'filter kernel' become a matrix; filtering is then sliding every pixel in the kernel matrix over each pixel in the data matrix. This gives a complexity of $O(n^2m^2)$ for an $m \times m$ kernel. Gaussians (binomials) have some special properties that can make these sums more efficient however: multidimensional Gaussians are separable. A 2D Gaussian is just the product of two orthogonal 1D Gaussians: $G(x, y) = G(x)G(y)$ [Note that the separated components of a 2D Gaussian might not align with the x and y axes of your data matrix. This is only a problem if the Gaussian is anisotropic, ie. if it has different widths in different directions].

This means that convolutions with Gaussian kernels can be done one dimension at a time; sliding each data-pixel over each pixel in two length $m$ binomial/Gaussian kernels, reducing the time complexity to $O(n^2m)$. This reduction extends to higher dimensions too - you can always break down a Gaussian into 1D components and convolute them sequentially.