Sample from a Gaussian Process

2014 March 24

Because apparently this is a challenge for me.

Cf.
this stackexchange answer:
given a random vector *x* sampled from a Gaussian distribution with mean *m*
and covariance *C*, and also given matrix *A*, then *(A.x)* has mean *(A.m)*
and covariance *(A.C.A^T)*. If we’re drawing uncorrelated samples with mean
zero and variance 1, then C is the identity matrix.

So, given such a sample, if we want to achieve a covariance matrix *D*, we
need *A* such that *A.A^T = D*. One incarnation of such is the Cholesky
decomposition of *D*. *D* must be positive definite for the decomposition to
exist, but it must also be that to be a valid covariance matrix. We must
augment the diagonal of *D* to achieve this condition (i.e., include the
individual measurement error term).

If we’re using the squared exponential GP kernel, we have:

```
x = np.linspace(0, 10, 100)
r = x[None,:] - x[:,None]
cov = np.exp(-0.5 * r**2 / 1**2) + np.eye(100) * 1e-6
chol = np.linalg.cholesky(cov)
y = np.dot(chol, np.random.normal(size=x.size))
```

(Note: without the diagonal term added here, the determinant of `cov`

is
precisely zero. So yeah, we just need to add something miniscule to bump it up
to be positive. i think. I don’t know a ton about linear algebra …)

How to we verify that we did it right? I’m pretty sure we have to draw a bunch
of sample *y* values.

```
t = np.dot(chol, np.random.normal(size=(x.size, 300)))
z = np.cov(t) # rowvar=False changes output significantly!
z.shape == (100, 100) # -> True, duh.
```

An `ndshow.view()`

of *z* should give similar structure to `cov`

, and modeling
should yield parameters compatible with the inputs used to generate `cov`

(lengthscale = 1, measurement σ = 1e-3, here).