function M = sympositivedefinitefactory(n)
% Manifold of n-by-n symmetric positive definite matrices with
% the bi-invariant geometry.
%
% function M = sympositivedefinitefactory(n)
%
% A point X on the manifold is represented as a symmetric positive definite
% matrix X (nxn). Tangent vectors are symmetric matrices of the same size
% (but not necessarily definite).
%
% The Riemannian metric is the bi-invariant metric, described notably in
% Chapter 6 of the 2007 book "Positive definite matrices"
% by Rajendra Bhatia, Princeton University Press.
%
% The retraction / exponential map involves expm (the matrix exponential).
% If too large a vector is retracted / exponentiated (e.g., a solver tries
% to make too big a step), this may result in NaN's in the returned point,
% which most likely would lead to NaN's in the cost / gradient / ... and
% will result in failure of the optimization. For trustregions, this can
% be controlled by setting options.Delta0 and options.Delta_bar, to
% prevent too large steps.
%
% Note also that many of the functions involve solving linear systems in X
% (a point on the manifold), taking matrix exponentials and logarithms,
% etc. It could therefore be beneficial to do some precomputation on X (an
% eigenvalue decomposition for example) and store both X and the
% preprocessing in a structure. This would require modifying the present
% factory to work with such structures to represent both points and
% tangent vectors. We omit this in favor of simplicity, but it may be good
% to keep this in mind if efficiency becomes an issue in your application.

% This file is part of Manopt: www.manopt.org.
% Original author: Bamdev Mishra, August 29, 2013.
% Contributors: Nicolas Boumal
% Change log:
%
%   March 5, 2014 (NB)
%       There were a number of mistakes in the code owing to the tacit
%       assumption that if X and eta are symmetric, then X\eta is
%       symmetric too, which is not the case. See the discussion on the
%       Manopt forum started on Jan. 19, 2014. Functions norm, dist, exp
%       and log were modified accordingly. Furthermore, they now only
%       require matrix inversion (as well as matrix log or matrix exp),
%       not matrix square roots or their inverses.
%
%   July 28, 2014 (NB)
%       The dim() function returned n*(n-1)/2 instead of n*(n+1)/2.
%       Implemented proper parallel transport from Sra and Hosseini (not
%       used by default).
%       Also added symmetrization in exp and log (to be sure).
%
%   April 3, 2015 (NB)
%       Replaced trace(A*B) by a faster equivalent that does not compute
%       the whole product A*B, for inner product, norm and distance.
%
%   May 23, 2017 (NB)
%       As seen in a talk of Wen Huang at the SIAM Optimization Conference
%       today, replaced the retraction of this factory (which was simply
%       equal to the exponential map) with a simpler, second-order
%       retraction. That this retraction is second order can be verified
%       numerically with checkretraction(sympositivedefinitefactory(5));
%       Notice that, for this retraction, it is cheap to retract many
%       points along the same tangent direction (that is, for many values
%       of t), which could in principle be exploited to speed up
%       line-searches.
    symm = @(X) .5*(X+X');

    M.name = @() sprintf('Symmetric positive definite geometry of %dx%d matrices', n, n);

    M.dim = @() n*(n+1)/2;

    % Helpers to avoid computing full matrices simply to extract their trace.
    vec     = @(A) A(:);
    trinner = @(A, B) vec(A')'*vec(B);  % = trace(A*B)
    trnorm  = @(A) sqrt(trinner(A, A)); % = sqrt(trace(A^2))

    % Choice of the metric on the orthonormal space is motivated by the
    % symmetry present in the space. The metric on the positive definite
    % cone is its natural bi-invariant metric.
    % The result is equal to: trace( (X\eta) * (X\zeta) )
    M.inner = @(X, eta, zeta) trinner(X\eta, X\zeta);

    % Notice that X\eta is *not* symmetric in general.
    % The result is equal to: sqrt(trace((X\eta)^2)).
    % There should be no need to take the real part, but rounding errors
    % may cause a small imaginary part to appear, so we discard it.
    M.norm = @(X, eta) real(trnorm(X\eta));

    % Same here: X\Y is not symmetric in general.
    % Same remark about taking the real part.
    M.dist = @(X, Y) real(trnorm(real(logm(X\Y))));

    M.typicaldist = @() sqrt(n*(n+1)/2);

    M.egrad2rgrad = @egrad2rgrad;
    function eta = egrad2rgrad(X, eta)
        eta = X*symm(eta)*X;
    end

    M.ehess2rhess = @ehess2rhess;
    function Hess = ehess2rhess(X, egrad, ehess, eta)
        % Directional derivatives of the Riemannian gradient
        Hess = X*symm(ehess)*X + 2*symm(eta*symm(egrad)*X);

        % Correction factor for the non-constant metric
        Hess = Hess - symm(eta*symm(egrad)*X);
    end

    M.proj = @(X, eta) symm(eta);

    M.tangent = M.proj;
    M.tangent2ambient = @(X, eta) eta;

    M.retr = @retraction;
    function Y = retraction(X, eta, t)
        if nargin < 3
            teta = eta;
        else
            teta = t*eta;
        end
        % The symm() call is mathematically unnecessary but numerically
        % necessary.
        Y = symm(X + teta + .5*teta*(X\teta));
    end

    M.exp = @exponential;
    function Y = exponential(X, eta, t)
        if nargin < 3
            t = 1.0;
        end
        % The symm() and real() calls are mathematically not necessary
        % but are numerically necessary.
        Y = symm(X*real(expm(X\(t*eta))));
    end

    M.log = @logarithm;
    function H = logarithm(X, Y)
        % Same remark regarding the calls to symm() and real().
        H = symm(X*real(logm(X\Y)));
    end

    M.hash = @(X) ['z' hashmd5(X(:))];

    % Generate a random symmetric positive definite matrix following a
    % certain distribution. The particular choice of a distribution is of
    % course arbitrary, and specific applications might require different
    % ones.
    M.rand = @random;
    function X = random()
        D = diag(1+rand(n, 1));
        [Q, R] = qr(randn(n)); %#ok<NASGU>
        X = Q*D*Q';
    end

    % Generate a uniformly random unit-norm tangent vector at X.
    M.randvec = @randomvec;
    function eta = randomvec(X)
        eta = symm(randn(n));
        nrm = M.norm(X, eta);
        eta = eta / nrm;
    end

    M.lincomb = @matrixlincomb;

    M.zerovec = @(X) zeros(n);

    % Poor man's vector transport: exploit the fact that all tangent
    % spaces are the set of symmetric matrices, so that the identity is a
    % sort of vector transport. It may perform poorly if the origin and
    % target (X1 and X2) are far apart though. This should not be the case
    % for typical optimization algorithms, which perform small steps.
    M.transp = @(X1, X2, eta) eta;

    % For reference, a proper vector transport is given here, following
    % work by Sra and Hosseini: "Conic geometric optimisation on the
    % manifold of positive definite matrices", to appear in SIAM J. Optim.
    % in 2015; also available here: http://arxiv.org/abs/1312.1039
    % This will not be used by default. To force the use of this
    % transport, execute "M.transp = M.paralleltransp;" on the M returned
    % by the present factory.
    M.paralleltransp = @parallel_transport;
    function zeta = parallel_transport(X, Y, eta)
        E = sqrtm(Y/X);
        zeta = E*eta*E';
    end

    % vec and mat are not isometries, because of the unusual inner metric.
    M.vec = @(X, U) U(:);
    M.mat = @(X, u) reshape(u, n, n);
    M.vecmatareisometries = @() false;

end
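For readers who want to sanity-check the geometry outside of MATLAB, the sketch below is an illustrative NumPy/SciPy translation (not part of Manopt; all function names here are ours) of the factory's inner product, norm, distance, exponential and logarithm. It checks two identities the bi-invariant geometry should satisfy: dist(X, Exp_X(eta)) = ||eta||_X, and Log_X(Exp_X(eta)) = eta.

```python
import numpy as np
from scipy.linalg import expm, logm, solve

def symm(A):
    """Numerical symmetrization, mirroring the factory's symm()."""
    return 0.5 * (A + A.T)

def trinner(A, B):
    """trace(A @ B) without forming the full product."""
    return np.einsum('ij,ji->', A, B)

def inner(X, eta, zeta):
    # = trace((X\eta) * (X\zeta)) in the factory's notation
    return trinner(solve(X, eta), solve(X, zeta))

def norm(X, eta):
    # Discard any tiny imaginary part due to rounding, as the factory does.
    return np.sqrt(np.real(inner(X, eta, eta)))

def dist(X, Y):
    S = np.real(logm(solve(X, Y)))
    return np.sqrt(np.real(trinner(S, S)))

def exp_map(X, eta, t=1.0):
    return symm(X @ np.real(expm(solve(X, t * eta))))

def log_map(X, Y):
    return symm(X @ np.real(logm(solve(X, Y))))

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)              # a random SPD point
eta = symm(rng.standard_normal((n, n)))  # a random tangent vector

Y = exp_map(X, eta)
print(abs(dist(X, Y) - norm(X, eta)))       # tiny: dist to Exp_X(eta) is ||eta||_X
print(np.max(np.abs(log_map(X, Y) - eta)))  # tiny: log inverts exp
```

The `np.einsum('ij,ji->', A, B)` helper plays the role of the factory's `trinner`: it sums A[i,j]*B[j,i], which is trace(A*B) without computing the product.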
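The May 23, 2017 change log entry states that the retraction is second order; `checkretraction` verifies this within Manopt. The same property can be observed in a standalone NumPy/SciPy sketch (again an illustrative translation, not Manopt code): the gap between the retraction and the exponential map should shrink like O(t^3), i.e. by a factor of roughly 8 each time t is halved.

```python
import numpy as np
from scipy.linalg import expm, solve

def symm(A):
    return 0.5 * (A + A.T)

def retraction(X, eta, t=1.0):
    # R_X(t*eta) = X + t*eta + (t^2/2) * eta * X^{-1} * eta, symmetrized
    teta = t * eta
    return symm(X + teta + 0.5 * teta @ solve(X, teta))

def exp_map(X, eta, t=1.0):
    return symm(X @ np.real(expm(solve(X, t * eta))))

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)
eta = symm(rng.standard_normal((n, n)))

# Halving t should divide the retraction/exponential gap by about 8.
for t in (0.1, 0.05, 0.025):
    gap = np.linalg.norm(retraction(X, eta, t) - exp_map(X, eta, t))
    print(t, gap)
```

This also illustrates the remark in the change log about line-searches: for a fixed X and eta, `X\teta` scales linearly in t, so retracting at many values of t reuses essentially the same factorization work.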