GMMParameterEstimation.jl Documentation

GMMParameterEstimation.jl is a package for estimating the parameters of Gaussian k-mixture models using the method of moments. It works for general k with known mixing coefficients, and for k=2,3,4 for unknown mixing coefficients.

Example

The following code snippet will use the given moments to return an estimate of the parameters using the method of moments.

using GMMParameterEstimation
d = 3
k = 2
first_moments = [1.0, 0.980, 1.938, 3.478, 8.909, 20.526, 64.303]
diagonal_moments = [-0.580 5.682 -11.430 97.890 -341.161; -0.480 1.696 -2.650 11.872 -33.239]
off_diag_system = Dict{Vector{Int64}, Float64}([0, 1, 2] => -1.075, [1, 0, 1] => -0.252, [1, 2, 0] => 6.021, [1, 0, 2] => 1.117, [1, 1, 0] => -0.830, [0, 1, 1] => 0.884)
pass, (mixing_coefficients, means, covariances) = estimate_parameters(d, k, first_moments, diagonal_moments, off_diag_system)

$\\~\\$

Parameter estimation

The main functionality of this package stems from

GMMParameterEstimation.estimate_parametersFunction
estimate_parameters(d::Integer, k::Integer, first::Vector{Float64}, second::Matrix{Float64}, last::Dict{Vector{Int64}, Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the unknown mixing coefficient dense covariance matrix case, first should be a list of moments 0 through 3k for the first dimension, second should be a matrix of moments 1 through 2k+1 for the remaining dimensions, and last should be a dictionary of the indices as lists of integers and the corresponding moments. This requires k<= 4.

estimate_parameters(d::Integer, k::Integer, w::Array{Float64}, first::Vector{Float64}, second::Matrix{Float64}, last::Dict{Vector{Int64}, Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the known mixing coefficient dense covariance matrix case, w should be a vector of the mixing coefficients first should be a list of moments 0 through 3k for the first dimension, second should be a matrix of moments 1 through 2k+1 for the remaining dimensions, and last should be a dictionary of the indices as lists of integers and the corresponding moments.

estimate_parameters(d::Integer, k::Integer, first::Vector{Float64}, second::Matrix{Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the unknown mixing coefficient diagonal covariance matrix case, first should be a list of moments 0 through 3k for the first dimension, and second should be a matrix of moments 1 through 2k+1 for the remaining dimensions. This requires k<= 4.

estimate_parameters(d::Integer, k::Integer, w::Array{Float64}, first::Vector{Float64}, second::Matrix{Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the known mixing coefficient diagonal covariance matrix case, w should be a vector of the mixing coefficients first should be a list of moments 0 through 3k for the first dimension, and second should be a matrix of moments 1 through 2k+1 for the remaining dimensions..

which computes the parameter recovery using Algorithm 1 from Estimating Gaussian Mixtures Using Sparse Polynomial Moment Systems.

In one dimension, for a random variable $X$ with density $f$ we define the $i$th moment as $m_i=E[X^i]=\int xf(x)dx$. For a Gaussian mixture model, this results in a polynomial in the parameters. For a sample $\{y_1,y_2,\dots,y_N\}$, we define the $i$th sample moment as $\overline{m_i}=\frac{1}{N}\sum_{j=1}^N y_j^i$. The sample moments approach the true moments as $N\rightarrow\infty$, so by setting the polynomials equal to the empirical moments, we can then solve the polynomial system to recover the parameters.

For a multivariate random variable $X$ with density $f_X$ we define the moments as $m_{i_1,\dots,i_n} = E[X_1^{i_1}\cdots X_n^{i_n}] = \int\cdots\int x_1^{i_1}\cdots x_n^{i_n}f_X(x_1,\dots,x_n)dx_1\cdots dx_n$ and the empirical moments as $\overline{m}_{i_1,\dots,i_n} = \frac{1}{N}\sum_{j=1}^Ny_{j_1}^{i_1}\cdots y_{j_n}^{i_n}$. And again, by setting the polynomials equal to the empirical moments, we can then solve the system of polynomials to recover the parameters. However, choosing which moments becomes more complicated.

$\\~\\$

Generate and sample from Gaussian Mixture Models

Note that the entries of the resulting covariance matrices are generated from a normal distribution centered at 0 with variance 1.

$\\~\\$

GMMParameterEstimation.generateGaussiansFunction
generateGaussians(d::Integer, k::Integer, diagonal::Bool)

Generate means and covariances for k Gaussians with dimension d.

diagonal should be true for spherical case, and false for dense covariance matrices.

The parameters are returned as a tuple, with weights in a 1D vector, means as a k x d array, and variances as a k x d x d array. Note that each entry of each parameter is generated from a normal distribution centered at 0 with variance 1.

$\\~\\$

GMMParameterEstimation.getSampleFunction

getSample(numb::Integer, w::Vector{Float64}, means::Matrix{Float64}, covariances::Vector)

Generate a Gaussian mixture model sample with numb entries, mixing coefficients w, means means, and covariances covariances.

This relies on the Distributions package.

$\\~\\$

GMMParameterEstimation.sampleMomentsFunction
sampleMoments(sample::Matrix{Float64}, k; diagonal = false)

Use the sample to compute the moments necessary for parameter estimation using method of moments.

Returns moments 0 to 3k for the first dimension, moments 1 through 2k+1 for the other dimensions as a matrix, and a dictionary with indices and moments for the off-diagonal system if diagonal is false.

GMMParameterEstimation.diagonalPerfectMomentsFunction

diagonalPerfectMoments(d, k, w, truemeans, truecovariances)

Use the given parameters to compute the exact moments necessary for parameter estimation with diagonal covariance matrices.

Returns moments 0 to 3k for the first dimension, and moments 1 through 2k+1 for the other dimensions as a matrix.

GMMParameterEstimation.densePerfectMomentsFunction
densePerfectMoments(d, k, w, true_means, true_covariances)

Use the given parameters to compute the exact moments necessary for parameter estimation with dense covariance matrices.

Returns moments 0 to 3k for the first dimension, moments 1 through 2k+1 for the other dimensions as a matrix, and a dictionary with indices and moments for the off-diagonal system.

Both expect parameters to be given with weights in a 1D vector, means as a k x d array, and variances as a k x d x d array.

$\\~\\$

Build the polynomial systems

GMMParameterEstimation.build1DSystemFunction
build1DSystem(k::Integer, m::Integer)

Build the polynomial system for a mixture of 1D Gaussians where 'm'-1 is the highest desired moment and the mixing coefficients are unknown.

build1DSystem(k::Integer, m::Integer, a::Union{Vector{Float64}, Vector{Variable}})

Build the polynomial system for a mixture of 1D Gaussians where 'm'-1 is the highest desired moment, and a is a vector of the mixing coefficients.

GMMParameterEstimation.selectSolFunction
selectSol(k::Integer, solution::Result, polynomial::Expression, moment::Number)

Select a k mixture solution from solution accounting for polynomial and moment.

Sort out a k mixture statistically significant solutions from solution, and return the one closest to moment when polynomial is evaluated at those values.

GMMParameterEstimation.mixedMomentSystemFunction
mixedMomentSystem(d, k, mixing, ms, vs)

Build a linear system for finding the off-diagonal covariances entries.

For a d dimensional Gaussian k-mixture model with mixing coefficients mixing, means ms, and covariances vs where the diagonal entries have been filled in and the off diagonals are variables.

Index