A stochastic process $\{X_t\}_{t \in T}$ is a Gaussian process if (and only if) any finite subcollection of random variables $(X_{t_1}, \dots, X_{t_n})$ has a multivariate Gaussian distribution. Here, $T$ is the index set for the stochastic process; most often we have $T = \mathbb{R}$ (to index time) or $T = \mathbb{R}^d$ (to index space).
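To define a Gaussian process, one needs (i) a mean function $m(x)$, and (ii) a covariance function $k(x, x')$, where $x$ and $x'$ are points in the index set. While there are no restrictions on the mean function, the covariance function must be:
- Symmetric, i.e. $k(x, x') = k(x', x)$ for all $x, x'$, and
- Positive semi-definite, i.e. for all $n \in \mathbb{N}$, all points $x_1, \dots, x_n$ in the index set and all $c_1, \dots, c_n \in \mathbb{R}$, we have $\sum_{i=1}^n \sum_{j=1}^n c_i c_j k(x_i, x_j) \geq 0$.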
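The two conditions above can be checked numerically on any finite set of points: the matrix with entries $k(x_i, x_j)$ must be symmetric with non-negative eigenvalues. Below is a minimal sketch in Python with NumPy. The helper names (`gram_matrix`, `passes_covariance_check`) and the two example kernels are made up for illustration. Note that passing the check on one set of points is necessary but not sufficient for a function to be a valid covariance function.

```python
import numpy as np


def gram_matrix(kernel, points):
    """Evaluate `kernel` on every pair of points, giving an n x n matrix."""
    n = len(points)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(points[i], points[j])
    return K


def passes_covariance_check(K, tol=1e-8):
    """Check symmetry and positive semi-definiteness of one Gram matrix."""
    if not np.allclose(K, K.T):
        return False
    # For a symmetric matrix, c^T K c >= 0 for all c is equivalent to
    # all eigenvalues being non-negative (up to rounding error).
    smallest_eigenvalue = np.linalg.eigvalsh(K).min()
    return bool(smallest_eigenvalue >= -tol)


def min_kernel(s, t):
    # Illustrative kernel: a valid covariance function for s, t >= 0.
    return min(s, t)


def distance_kernel(s, t):
    # Illustrative counterexample: symmetric, but not positive semi-definite.
    return abs(s - t)


points = np.linspace(0.1, 2.0, 8)
min_ok = passes_covariance_check(gram_matrix(min_kernel, points))
distance_ok = passes_covariance_check(gram_matrix(distance_kernel, points))
```

The distance function fails because its Gram matrix has a zero diagonal, so its eigenvalues sum to zero and at least one of them must be negative.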
Covariance functions are sometimes called kernels. Here are some commonly used covariance functions (unless otherwise stated, these kernels are applicable for $x, x' \in \mathbb{R}^d$):
Squared exponential (SE) kernel
- Also known as the radial basis function (RBF) kernel, or the Gaussian kernel.
- Has the form $k(x, x') = \sigma^2 \exp \left( -\frac{\|x - x'\|^2}{2\ell^2} \right)$, where $\sigma$ and $\ell$ are hyperparameters. $\sigma^2$ is an overall scale factor that every kernel has, determining the overall variance. $\ell$ is a “length-scale” hyperparameter that determines how “wiggly” the function is: larger $\ell$ means that it is less wiggly.
- The functions drawn from this process are infinitely differentiable (i.e. very smooth). This strong smoothness assumption is probably unrealistic in practice. Nevertheless, the SE kernel remains one of the most popular kernels.
- This kernel is stationary (i.e. the value of $k(x, x')$ depends only on $x - x'$) and isotropic (i.e. the value of $k(x, x')$ depends only on $\|x - x'\|$).
- It is possible for each dimension to have its own length-scale hyperparameter: we would replace the exponent with $-\sum_{j=1}^d \frac{(x_j - x_j')^2}{2\ell_j^2}$. (This generalization can be done for any stationary kernel. Note that the resulting kernel will no longer be isotropic.)
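As a small illustration of the points above, here is a sketch of the SE kernel in Python with NumPy, assuming the standard form $\sigma^2 \exp(-\|x - x'\|^2 / (2\ell^2))$. The function name `se_kernel` and its argument names are my own. Passing a scalar `length_scale` gives the isotropic kernel, while passing one length scale per dimension gives the generalization in the last bullet.

```python
import numpy as np


def se_kernel(x, x_prime, sigma=1.0, length_scale=1.0):
    """Squared exponential kernel between two points in R^d.

    `length_scale` may be a scalar (isotropic kernel) or an array with one
    entry per dimension (a separate length scale for each coordinate).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    x_prime = np.atleast_1d(np.asarray(x_prime, dtype=float))
    ell = np.asarray(length_scale, dtype=float)
    # Sum over dimensions of (x_j - x'_j)^2 / (2 * ell_j^2).
    scaled_sq_dist = np.sum((x - x_prime) ** 2 / (2.0 * ell ** 2))
    return sigma ** 2 * np.exp(-scaled_sq_dist)


# Isotropic: moving one unit along either axis gives the same value.
along_first = se_kernel([0.0, 0.0], [1.0, 0.0])
along_second = se_kernel([0.0, 0.0], [0.0, 1.0])

# Per-dimension length scales: the two directions are no longer equivalent.
ard_first = se_kernel([0.0, 0.0], [1.0, 0.0], length_scale=[1.0, 2.0])
ard_second = se_kernel([0.0, 0.0], [0.0, 1.0], length_scale=[1.0, 2.0])
```

With per-dimension length scales, the kernel still depends only on $x - x'$, so it remains stationary even though it is no longer isotropic.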
Rational quadratic (RQ) kernel
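- Has the form $k(x, x') = \sigma^2 \left( 1 + \frac{\|x - x'\|^2}{2\alpha \ell^2} \right)^{-\alpha}$, where $\sigma$, $\ell$ and $\alpha$ are hyperparameters.
- As in the SE kernel, $\ell$ is a length-scale parameter.
- The rational quadratic kernel can be viewed as a scale mixture of SE kernels with different length scales. The hyperparameter $\alpha$ controls how spread out this mixture is: small values of $\alpha$ mix over a wide range of length scales, while larger values concentrate the mixture around the single length scale $\ell$. As $\alpha \rightarrow \infty$, the RQ kernel becomes the SE kernel.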
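The limiting behaviour of the RQ kernel is easy to see numerically. The sketch below assumes the usual parametrization $\sigma^2 (1 + r^2 / (2\alpha\ell^2))^{-\alpha}$ with $r = \|x - x'\|$. The function names are my own, and both kernels are written as functions of the distance $r$ for brevity.

```python
import numpy as np


def se_kernel(r, sigma=1.0, length_scale=1.0):
    """Squared exponential kernel as a function of the distance r."""
    r = np.asarray(r, dtype=float)
    return sigma ** 2 * np.exp(-r ** 2 / (2.0 * length_scale ** 2))


def rq_kernel(r, sigma=1.0, length_scale=1.0, alpha=1.0):
    """Rational quadratic kernel as a function of the distance r."""
    r = np.asarray(r, dtype=float)
    base = 1.0 + r ** 2 / (2.0 * alpha * length_scale ** 2)
    return sigma ** 2 * base ** (-alpha)


distances = np.linspace(0.0, 3.0, 31)
se_values = se_kernel(distances)
rq_small_alpha = rq_kernel(distances, alpha=1.0)
rq_large_alpha = rq_kernel(distances, alpha=1e6)

# Largest gap between the RQ kernel with a very large alpha and the SE kernel.
max_gap = np.max(np.abs(rq_large_alpha - se_values))
```

For small $\alpha$ the RQ kernel decays more slowly than the SE kernel at large distances, which is one way to see that it mixes in SE kernels with longer length scales.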
Matérn covariance functions
- Named after Swedish statistician Bertil Matérn.
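- Has the form $k(x, x') = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu} \|x - x'\|}{\ell} \right)^\nu K_\nu \left( \frac{\sqrt{2\nu} \|x - x'\|}{\ell} \right)$, where $\Gamma$ is the gamma function and $K_\nu$ is the modified Bessel function of the second kind.
- The hyperparameters are $\sigma$, $\nu$ and $\ell$.
- The functions drawn from this process are $\lceil \nu \rceil - 1$ times differentiable.
- The larger $\nu$ is, the smoother the functions drawn from this process. As $\nu \rightarrow \infty$, this kernel converges to the SE kernel.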
- When $\nu = p + 1/2$ for some non-negative integer $p$, the kernel can be written as a product of an exponential and a polynomial of order $p$. For this reason, the values $\nu = 1/2$, $\nu = 3/2$ and $\nu = 5/2$ are commonly used. The latter two are more popular as the samples from $\nu = 1/2$ are often thought to be too “rough”. (Rasmussen & Williams make the case that it is hard to distinguish between values of $\nu \geq 7/2$ and $\nu = \infty$.)
- This kernel is stationary and isotropic.
- When $\nu = 1/2$, the resulting kernel is known as the exponential covariance function. If we further restrict to $d = 1$, it is the covariance function of the Ornstein-Uhlenbeck process.
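For the three commonly used values of $\nu$, the Bessel function disappears and the kernel can be coded directly. The sketch below uses the standard closed forms for $\nu = 1/2, 3/2, 5/2$ (as given, for example, in Rasmussen & Williams, Chapter 4), written as functions of the distance $r = \|x - x'\|$. The function name `matern_half_integer` is my own, and other values of $\nu$ are deliberately not handled.

```python
import numpy as np


def matern_half_integer(r, nu, sigma=1.0, length_scale=1.0):
    """Matern kernel at distance r for nu in {0.5, 1.5, 2.5}."""
    r = np.asarray(r, dtype=float)
    # Scaled distance sqrt(2 * nu) * r / ell that appears in the general form.
    s = np.sqrt(2.0 * nu) * r / length_scale
    if nu == 0.5:
        polynomial = np.ones_like(s)          # exponential covariance function
    elif nu == 1.5:
        polynomial = 1.0 + s                  # polynomial of order 1
    elif nu == 2.5:
        polynomial = 1.0 + s + s ** 2 / 3.0   # polynomial of order 2
    else:
        raise ValueError("this sketch only supports nu = 0.5, 1.5 or 2.5")
    return sigma ** 2 * polynomial * np.exp(-s)


distances = np.linspace(0.0, 2.0, 5)
values_by_nu = {nu: matern_half_integer(distances, nu) for nu in (0.5, 1.5, 2.5)}
```

Each case is an exponential times a polynomial of order $p$ in the scaled distance, with $p = 0, 1, 2$ respectively.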
Periodic kernel
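- Has the form $k(x, x') = \sigma^2 \exp \left( -\frac{2 \sin^2 (\pi \|x - x'\| / p)}{\ell^2} \right)$, where $\sigma$, $\ell$ and $p$ are hyperparameters. $p$ is the period of the function, determining the distance between repetitions of the function.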
- Good for modeling functions which repeat themselves exactly.
- Sometimes functions repeat themselves almost exactly, not exactly. In this situation, we can use the product of the periodic kernel with another kernel (the product of two kernels is itself a kernel). Such kernels are known as locally periodic kernels.
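Here is a sketch of the periodic kernel and of one locally periodic kernel, assuming the common form $\sigma^2 \exp(-2 \sin^2(\pi r / p) / \ell^2)$ with $r = \|x - x'\|$. Multiplying by an SE kernel is just one possible choice for the second factor. All function and argument names (including `decay_length_scale`) are my own.

```python
import numpy as np


def periodic_kernel(r, sigma=1.0, length_scale=1.0, period=1.0):
    """Periodic kernel as a function of the distance r."""
    r = np.asarray(r, dtype=float)
    return sigma ** 2 * np.exp(
        -2.0 * np.sin(np.pi * r / period) ** 2 / length_scale ** 2
    )


def se_kernel(r, sigma=1.0, length_scale=1.0):
    """Squared exponential kernel as a function of the distance r."""
    r = np.asarray(r, dtype=float)
    return sigma ** 2 * np.exp(-r ** 2 / (2.0 * length_scale ** 2))


def locally_periodic_kernel(r, sigma=1.0, length_scale=1.0, period=1.0,
                            decay_length_scale=5.0):
    """Product of a periodic kernel and an SE kernel (itself a kernel)."""
    return (periodic_kernel(r, sigma, length_scale, period)
            * se_kernel(r, 1.0, decay_length_scale))


# One full period apart: the periodic kernel is back at its maximum,
# while the locally periodic kernel has decayed slightly.
exact_repeat = periodic_kernel(2.0, period=2.0)
approximate_repeat = locally_periodic_kernel(2.0, period=2.0)
```

The SE factor makes the correlation between points one or more periods apart decay with distance, so the repeating pattern is allowed to drift slowly.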
Linear/polynomial kernel
- The linear kernel has the form $k(x, x') = \sigma^2 x^\top x'$, where $\sigma$ is a hyperparameter.
- The polynomial kernel generalizes the linear kernel: $k(x, x') = \sigma^2 (x^\top x')^p$, where $p$ is the degree of the polynomial, usually taken to be 2.
- It is a nonstationary kernel.
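To see what nonstationarity means in practice, the sketch below assumes the simple forms $\sigma^2 x^\top x'$ and $\sigma^2 (x^\top x')^p$ (other parametrizations, e.g. with an added offset, are also in common use). The function names are my own. Shifting both inputs by the same amount changes the kernel value, which could not happen for a stationary kernel.

```python
import numpy as np


def linear_kernel(x, x_prime, sigma=1.0):
    """Linear kernel sigma^2 * <x, x'> (assumed form, no offset term)."""
    return sigma ** 2 * float(np.dot(x, x_prime))


def polynomial_kernel(x, x_prime, sigma=1.0, degree=2):
    """Polynomial kernel sigma^2 * <x, x'>^degree (assumed form)."""
    return sigma ** 2 * float(np.dot(x, x_prime)) ** degree


# Both pairs of points differ by the same amount (x' - x = 1), but the
# kernel values differ: the kernel depends on where the points are.
near_origin = linear_kernel([1.0], [2.0])
far_from_origin = linear_kernel([11.0], [12.0])
```

With `degree=1`, `polynomial_kernel` reduces to `linear_kernel`.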
Brownian motion
- Brownian motion is a one-dimensional Gaussian process with mean zero and covariance function $k(s, t) = \min(s, t)$ for $s, t \geq 0$.
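Since Brownian motion is a Gaussian process, its values on a finite grid of times are multivariate Gaussian with mean zero and covariance matrix $\min(t_i, t_j)$, so paths can be drawn with a Cholesky factor. This is a minimal sketch of that generic approach. The function names, the time grid and the random seed are arbitrary choices of mine. The grid is assumed to consist of distinct, strictly positive times so that the covariance matrix is positive definite.

```python
import numpy as np


def brownian_covariance(times):
    """Covariance matrix with entries min(t_i, t_j)."""
    times = np.asarray(times, dtype=float)
    return np.minimum.outer(times, times)


def sample_brownian_paths(times, n_paths, rng):
    """Draw paths on the grid `times`; returns shape (n_paths, len(times))."""
    K = brownian_covariance(times)
    # K = L L^T, so L z has covariance K when z is standard normal.
    L = np.linalg.cholesky(K)
    z = rng.standard_normal((len(times), n_paths))
    return (L @ z).T


times = np.linspace(0.01, 1.0, 100)
rng = np.random.default_rng(0)
paths = sample_brownian_paths(times, n_paths=5, rng=rng)
```

The same recipe works for any of the kernels above: build the covariance matrix on a grid, factor it, and multiply by standard normal draws (adding a small multiple of the identity if the factorization fails numerically).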
See this webpage for a longer list of kernels.
References:
- Duvenaud, D. The Kernel Cookbook: Advice on Covariance functions.
- Rasmussen, C. E., and Williams, C. K. I. (2006). Gaussian processes for machine learning. Chapter 4: Covariance Functions.
- Snelson, E. (2006). Tutorial: Gaussian process models for machine learning.
- Wikipedia. Matérn covariance function.