Colloquium in Computational and Applied Mathematics, University of Chicago

Chicago, IL

Tensor Moments of Gaussian Mixture Models
FRIDAY, MAY 13, 2022, at 4:00PM
Jones 303, 5747 S. Ellis Ave. Chicago, IL 60637

ABSTRACT: Gaussian mixture models (GMMs) are fundamental tools in statistics and data science, useful for clustering, anomaly detection, density estimation, and more. We are interested in high-dimensional problems (e.g., many features) with a potentially massive number of data points. One way to compute the parameters of a GMM is the method of moments, which matches the sample moments to the model moments. The first moment is the mean, and the second (centered) moment is the covariance. We are interested in third, fourth, and even higher-order moments. The d-th moment of an n-dimensional random variable is a symmetric d-way tensor (multidimensional array) of size n x n x … x n (d times), so working with moments is generally assumed to be prohibitively expensive in both storage and time when d > 2 and n is large. In this talk, we show that the model parameters can be estimated without explicitly forming the model or sample moments. In fact, the cost per iteration of the method of moments is of the same order as that of expectation maximization (EM), making the method of moments competitive. Along the way, we show how to concisely describe the moments of Gaussians and GMMs using tools from algebraic geometry, enumerative combinatorics, and multilinear algebra. Numerical results validate our approaches and illustrate their efficiency.
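
As a small illustration of why moment tensors need not be formed explicitly, the sketch below checks a standard identity: the inner product of the d-th sample moment tensor with the d-fold outer product of a vector v equals the average of (x . v)^d over the samples. This is not the speaker's algorithm, only a minimal example of the implicit idea; the function names, sample data, and sizes are hypothetical choices made for illustration.

```python
import itertools
import math
import random


def explicit_moment(samples, d):
    """Form the d-th sample moment tensor explicitly.

    Stored as a dict keyed by index tuples, so it has n**d entries.
    """
    n = len(samples[0])
    count = len(samples)
    tensor = {}
    for idx in itertools.product(range(n), repeat=d):
        tensor[idx] = sum(math.prod(x[i] for i in idx) for x in samples) / count
    return tensor


def contract_explicit(tensor, v):
    """Inner product of the tensor with the d-fold outer product of v."""
    return sum(val * math.prod(v[i] for i in idx) for idx, val in tensor.items())


def contract_implicit(samples, v, d):
    """Same quantity without forming the tensor: the mean of (x . v)**d."""
    dots = (sum(xi * vi for xi, vi in zip(x, v)) for x in samples)
    return sum(dot ** d for dot in dots) / len(samples)


# Hypothetical toy data: 50 standard normal samples in dimension 3.
random.seed(0)
n, d, num_samples = 3, 3, 50
samples = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(num_samples)]
v = [0.5, -1.0, 2.0]

explicit = contract_explicit(explicit_moment(samples, d), v)
implicit = contract_implicit(samples, v, d)
print(math.isclose(explicit, implicit, rel_tol=1e-9, abs_tol=1e-9))
```

The explicit route stores and touches n^d entries, while the implicit route needs only one dot product per sample, so its cost grows linearly in n and in the number of samples.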

Daniel Sanz-Alonso, Department of Statistics,