Computes a Karcher mean of a collection of positive definite matrices. function X = positive_definite_karcher_mean(A) Input: A 3D matrix A of size nxnxm such that each slice A(:,:,k) is a positive definite matrix of size nxn. Output: A positive definite matrix X of size nxn which is a Karcher mean of the m matrices in A, that is, X minimizes the sum of squared Riemannian distances to the matrices in A: f(X) = sum_k=1^m .5*dist^2(X, A(:, :, k)) The distance is defined by the natural metric on the set of positive definite matrices: dist(X,Y) = norm(logm(X\Y), 'fro'). This simple example is not the best way to compute Karcher means. Its purpose it to serve as base code to explore other algorithms. In particular, in the presence of large noise, this algorithm seems to not be able to reach points with a very small gradient norm. This may be caused by insufficient accuracy in the gradient computation.

- sympositivedefinitefactory Manifold of n-by-n symmetric positive definite matrices with
- rlbfgs Riemannian limited memory BFGS solver for smooth objective functions.
- approxhessianFD Hessian approx. fnctn handle based on finite differences of the gradient.

0001 function X = positive_definite_karcher_mean(A) 0002 % Computes a Karcher mean of a collection of positive definite matrices. 0003 % 0004 % function X = positive_definite_karcher_mean(A) 0005 % 0006 % Input: A 3D matrix A of size nxnxm such that each slice A(:,:,k) is a 0007 % positive definite matrix of size nxn. 0008 % 0009 % Output: A positive definite matrix X of size nxn which is a Karcher mean 0010 % of the m matrices in A, that is, X minimizes the sum of squared 0011 % Riemannian distances to the matrices in A: 0012 % f(X) = sum_k=1^m .5*dist^2(X, A(:, :, k)) 0013 % The distance is defined by the natural metric on the set of 0014 % positive definite matrices: dist(X,Y) = norm(logm(X\Y), 'fro'). 0015 % 0016 % This simple example is not the best way to compute Karcher means. Its 0017 % purpose it to serve as base code to explore other algorithms. In 0018 % particular, in the presence of large noise, this algorithm seems to not 0019 % be able to reach points with a very small gradient norm. This may be 0020 % caused by insufficient accuracy in the gradient computation. 0021 0022 % This file is part of Manopt and is copyrighted. See the license file. 0023 % 0024 % Main author: Nicolas Boumal, Sept. 3, 2013 0025 % Contributors: 0026 % 0027 % Change log: 0028 % 0029 0030 % Generate some random data to test the function if none is given. 0031 if ~exist('A', 'var') || isempty(A) 0032 n = 5; 0033 m = 50; 0034 A = zeros(n, n, m); 0035 ref = diag(max(.1, 1+.1*randn(n, 1))); 0036 for i = 1 : m 0037 noise = 0.01*randn(n); 0038 noise = (noise + noise')/2; 0039 [V, D] = eig(ref + noise); 0040 A(:, :, i) = V*diag(max(.01, diag(D)))*V'; 0041 end 0042 end 0043 0044 % Retrieve the size of the problem: 0045 % There are m matrices of size nxn to average. 0046 n = size(A, 1); 0047 m = size(A, 3); 0048 assert(n == size(A, 2), ... 0049 ['The slices of A must be square, i.e., the ' ... 0050 'first and second dimensions of A must be equal.']); 0051 0052 % Our search space is the set of positive definite matrices of size n. 0053 % Notice that this is the only place we specify on which manifold we 0054 % wish to compute Karcher means. Replacing this factory for another 0055 % geometry will yield code to compute Karcher means on that other 0056 % manifold, provided that manifold is equipped with a dist function and 0057 % a logarithmic map log. 0058 M = sympositivedefinitefactory(n); 0059 0060 % Define a problem structure, specifying the manifold M, the cost 0061 % function and its gradient. 0062 problem.M = M; 0063 problem.cost = @cost; 0064 problem.grad = @grad; 0065 0066 % Explicitly pick an approximate Hessian for the trust-region method 0067 problem.approxhess = approxhessianFD(problem, struct('stepsize', 1e-4)); 0068 0069 % The functions below make many redundant computations. This 0070 % performance hit can be alleviated by using the caching system. We go 0071 % for a simple implementation here, as a tutorial example. 0072 0073 % Cost function 0074 function f = cost(X) 0075 f = 0; 0076 for k = 1 : m 0077 f = f + M.dist(X, A(:, :, k))^2; 0078 end 0079 f = f/(2*m); 0080 end 0081 0082 % Riemannian gradient of the cost function 0083 function g = grad(X) 0084 g = M.zerovec(X); 0085 for k = 1 : m 0086 % Update g in a linear combination of the form 0087 % g = g - [something]/m. 0088 g = M.lincomb(X, 1, g, -1/m, M.log(X, A(:, :, k))); 0089 end 0090 end 0091 0092 % Execute some checks on the derivatives for early debugging. 0093 % These things can be commented out of course. 0094 % The slopes should agree on part of the plot at least. In this case, 0095 % it is sometimes necessary to inspect the plot visually to make the 0096 % call, but it is indeed correct. 0097 % checkgradient(problem); 0098 % pause; 0099 0100 % Execute this if you want to force using a proper parallel vector 0101 % transport. This is not necessary. If you omit this, the default 0102 % vector transport is the identity map, which is (of course) cheaper 0103 % and seems to perform well in practice. 0104 % M.transp = M.paralleltransp; 0105 0106 % Issue a call to a solver. Default options are selected. 0107 % Our initial guess is the first data point. Most solvers work well 0108 % with this problem. Limited-memory BFGS is one good example: 0109 X = rlbfgs(problem, A(:, :, 1)); 0110 0111 end

Generated on Fri 08-Sep-2017 12:43:19 by