
positive_definite_karcher_mean

PURPOSE

Computes a Karcher mean of a collection of positive definite matrices.

SYNOPSIS

function X = positive_definite_karcher_mean(A)

DESCRIPTION

 Computes a Karcher mean of a collection of positive definite matrices.

 function X = positive_definite_karcher_mean(A)

 Input:  A 3D matrix A of size nxnxm such that each slice A(:,:,k) is a
         positive definite matrix of size nxn.
 
 Output: A positive definite matrix X of size nxn which is a Karcher mean
         of the m matrices in A, that is, X minimizes the sum of squared
         Riemannian distances to the matrices in A, normalized as:
            f(X) = 1/(2m) * sum_{k=1}^m dist^2(X, A(:, :, k))
         The distance is defined by the natural metric on the set of
         positive definite matrices: dist(X, Y) = norm(logm(X\Y), 'fro').
 
 This simple example is not the best way to compute Karcher means. Its
 purpose is to serve as base code for exploring other algorithms. In
 particular, in the presence of large noise, this algorithm seems unable
 to reach points with a very small gradient norm. This may be caused by
 insufficient accuracy in the gradient computation.
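
 Example usage (a sketch, not from the original file: the data below is
 random and built only for illustration; any nxnxm stack of positive
 definite slices works):

    n = 4;  m = 3;
    A = zeros(n, n, m);
    for k = 1 : m
        B = randn(n);
        A(:, :, k) = B*B' + eye(n);   % each slice is positive definite
    end
    X = positive_definite_karcher_mean(A);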

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

function f = cost(X)
function g = grad(X)

SOURCE CODE

function X = positive_definite_karcher_mean(A)
% Computes a Karcher mean of a collection of positive definite matrices.
%
% function X = positive_definite_karcher_mean(A)
%
% Input:  A 3D matrix A of size nxnxm such that each slice A(:,:,k) is a
%         positive definite matrix of size nxn.
%
% Output: A positive definite matrix X of size nxn which is a Karcher mean
%         of the m matrices in A, that is, X minimizes the sum of squared
%         Riemannian distances to the matrices in A, normalized as:
%            f(X) = 1/(2m) * sum_{k=1}^m dist^2(X, A(:, :, k))
%         The distance is defined by the natural metric on the set of
%         positive definite matrices: dist(X, Y) = norm(logm(X\Y), 'fro').
%
% This simple example is not the best way to compute Karcher means. Its
% purpose is to serve as base code for exploring other algorithms. In
% particular, in the presence of large noise, this algorithm seems unable
% to reach points with a very small gradient norm. This may be caused by
% insufficient accuracy in the gradient computation.

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main author: Nicolas Boumal, Sept. 3, 2013
% Contributors:
%
% Change log:
%

    % Generate some random data to test the function if none is given.
    if ~exist('A', 'var') || isempty(A)
        n = 5;
        m = 10;
        A = zeros(n, n, m);
        ref = diag(max(.1, 1+.1*randn(n, 1)));
        for i = 1 : m
            noise = 0.01*randn(n);
            noise = (noise + noise')/2;
            [V, D] = eig(ref + noise);
            A(:, :, i) = V*diag(max(.01, diag(D)))*V';
        end
    end

    % Retrieve the size of the problem:
    % there are m matrices of size nxn to average.
    n = size(A, 1);
    m = size(A, 3);
    assert(n == size(A, 2), ...
           ['The slices of A must be square, i.e., the ' ...
            'first and second dimensions of A must be equal.']);

    % Our search space is the set of positive definite matrices of size n.
    % Notice that this is the only place where we specify on which
    % manifold we wish to compute Karcher means. Replacing this factory
    % with another geometry yields code to compute Karcher means on that
    % other manifold, provided it is equipped with a dist function and a
    % logarithmic map log (a sketch of this follows the listing).
    M = sympositivedefinitefactory(n);

    % Define a problem structure, specifying the manifold M, the cost
    % function and its gradient.
    problem.M = M;
    problem.cost = @cost;
    problem.grad = @grad;

    % Explicitly pick an approximate Hessian for the trust-region method.
    problem.approxhess = approxhessianFD(problem, struct('stepsize', 1e-4));

    % The functions below make many redundant computations. This
    % performance hit can be alleviated by using the caching system. We go
    % for a simple implementation here, as a tutorial example.
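    %
    % As an illustration, a cached cost might look like the sketch below
    % (commented out, and not part of the original file; it assumes
    % Manopt's optional store-based signature, and that the distance to
    % A(:, :, k) equals the norm of the corresponding logarithm):
    %
    % function [f, store] = cost(X, store)
    %     if ~isfield(store, 'logs')
    %         store.logs = cell(m, 1);
    %         for k = 1 : m
    %             store.logs{k} = M.log(X, A(:, :, k));
    %         end
    %     end
    %     f = 0;
    %     for k = 1 : m
    %         f = f + M.norm(X, store.logs{k})^2;
    %     end
    %     f = f/(2*m);
    % end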

    % Cost function.
    function f = cost(X)
        f = 0;
        for k = 1 : m
            f = f + M.dist(X, A(:, :, k))^2;
        end
        f = f/(2*m);
    end

    % Riemannian gradient of the cost function.
    function g = grad(X)
        g = M.zerovec(X);
        for k = 1 : m
            % Update g with a linear combination of the form
            % g = g - [something]/m.
            g = M.lincomb(X, 1, g, -1/m, M.log(X, A(:, :, k)));
        end
    end

    % Execute some checks on the derivatives for early debugging.
    % These lines can be commented out, of course.
    % The slopes should agree on at least part of the plot. In this case,
    % it is sometimes necessary to inspect the plot visually to make the
    % call, but the gradient is indeed correct.
    % checkgradient(problem);
    % pause;

    % Execute this if you want to force using a proper parallel vector
    % transport. This is not necessary. If you omit this, the default
    % vector transport is the identity map, which is (of course) cheaper
    % and seems to perform well in practice.
    % M.transp = M.paralleltransp;

    % Issue a call to a solver. Default options are selected.
    % Our initial guess is the first data point.
    X = trustregions(problem, A(:, :, 1));

end
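
To illustrate the genericity noted in the source comments, here is a
sketch (not part of the original file) of averaging rotation matrices
instead. Manopt's rotationsfactory also provides the dist and log maps
this code relies on, and its points are square matrices, so the input
format is unchanged; only the factory line inside the function needs to
become M = rotationsfactory(n):

    n = 3;  m = 10;
    A = zeros(n, n, m);
    for k = 1 : m
        [Q, R] = qr(randn(n));
        Q = Q * diag(sign(diag(R)));                  % random orthogonal
        if det(Q) < 0, Q(:, [1 2]) = Q(:, [2 1]); end % make det(Q) = +1
        A(:, :, k) = Q;
    end
    % With the factory swapped inside the function:
    X = positive_definite_karcher_mean(A);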

Generated on Sat 12-Nov-2016 14:11:22 by m2html © 2005