Returns a manifold struct of "scaled" vector subspaces. function M = grassmanngeneralizedfactory(n, p) function M = grassmanngeneralizedfactory(n, p, B) Generalized Grassmann manifold: each point on this manifold is a collection of "scaled" vector subspaces of dimension p embedded in R^n. The scaling is due to the symmetric positive definite matrix B. When B is identity, the manifold is the standard Grassmann manifold. The metric is obtained by viewing the generalized Grassmannian a Riemannian quotient manifold of the generalized Stiefel manifold, which is the manifold of "scaled" orthonormal matrices. Specifically, the scaled Stiefel manifold is the set {X : X'*B*X = I}. The generalized Grassmann manifold is the Grassmannian of the generalized Stiefel manifold. The generalized Stiefel manifold is endowed with a scaled metric by viewing it as a Riemannian submanifold of the Euclidean space, which is again endowed with the scaled inner product. Some notions (not all) are from Section 4.5 of the paper "The geometry of algorithms with orthogonality constraints", A. Edelman, T. A. Arias, S. T. Smith, SIMAX, 1998. Paper link: http://arxiv.org/abs/physics/9806030. Note: some computations such as restricted_svd, distance, logarithm, and exponential are new and we believe them to be correct. Also, we hope that the computations are numerically stable. In case some things do not work out as expected or there is some trouble, please contact us at http://www.manopt.org. Note: egrad2rgrad and ehess2rhess involve solving linear systems in B. If this is a bottleneck for a specific application, then a way forward is to create a modified version of this file which preprocesses B to speed this up (typically, by computing a Cholesky factorization of it, then calling an appropriate solver). See also: stiefelgeneralizedfactory stiefelfactory grassmannfactory
0001 function M = grassmanngeneralizedfactory(n, p, B) 0002 % Returns a manifold struct of "scaled" vector subspaces. 0003 % 0004 % function M = grassmanngeneralizedfactory(n, p) 0005 % function M = grassmanngeneralizedfactory(n, p, B) 0006 % 0007 % Generalized Grassmann manifold: each point on this manifold is a 0008 % collection of "scaled" vector subspaces of dimension p embedded in R^n. 0009 % The scaling is due to the symmetric positive definite matrix B. 0010 % 0011 % When B is identity, the manifold is the standard Grassmann manifold. 0012 % 0013 % The metric is obtained by viewing the generalized Grassmannian 0014 % a Riemannian quotient manifold of the generalized Stiefel manifold, 0015 % which is the manifold of "scaled" orthonormal matrices. Specifically, 0016 % the scaled Stiefel manifold is the set {X : X'*B*X = I}. 0017 % The generalized Grassmann manifold is the Grassmannian of the 0018 % generalized Stiefel manifold. 0019 % 0020 % The generalized Stiefel manifold is endowed with a scaled metric 0021 % by viewing it as a Riemannian submanifold of the Euclidean space, which 0022 % is again endowed with the scaled inner product. 0023 % 0024 % Some notions (not all) are from Section 4.5 of the paper 0025 % "The geometry of algorithms with orthogonality constraints", 0026 % A. Edelman, T. A. Arias, S. T. Smith, SIMAX, 1998. 0027 % 0028 % Paper link: http://arxiv.org/abs/physics/9806030. 0029 % 0030 % 0031 % Note: some computations such as restricted_svd, distance, logarithm, and 0032 % exponential are new and we believe them to be correct. 0033 % Also, we hope that the computations are numerically stable. 0034 % In case some things do not work out as expected or there is some trouble, 0035 % please contact us at http://www.manopt.org. 0036 % 0037 % Note: egrad2rgrad and ehess2rhess involve solving linear systems in B. If 0038 % this is a bottleneck for a specific application, then a way forward is to 0039 % create a modified version of this file which preprocesses B to speed this 0040 % up (typically, by computing a Cholesky factorization of it, then calling 0041 % an appropriate solver). 0042 % 0043 % See also: stiefelgeneralizedfactory stiefelfactory grassmannfactory 0044 0045 0046 % This file is part of Manopt: www.manopt.org. 0047 % Original author: Bamdev Mishra, June 30, 2015. 0048 % Contributors: 0049 % 0050 % Change log: 0051 % Jan. 4, 2021 (NB): 0052 % Moved symm() function to end of file for compatibility with Octave 6.1.0 0053 0054 assert(n >= p, ... 0055 ['The dimension n of the ambient space must be larger ' ... 0056 'than the dimension p of the subspaces.']); 0057 0058 if ~exist('B', 'var') || isempty(B) 0059 B = speye(n); % Standard Grassmann manifold. 0060 end 0061 0062 M.name = @() sprintf('Generalized Grassmann manifold Gr(%d, %d)', n, p); 0063 0064 M.dim = @() p*(n - p); 0065 0066 M.inner = @(X, eta, zeta) trace(eta'*(B*zeta)); % Scaled metric, but horizontally invariant. 0067 0068 M.norm = @(X, eta) sqrt(M.inner(X, eta, eta)); 0069 0070 M.dist = @distance; 0071 function d = distance(X, Y) 0072 XtBY = X'*(B*Y); % XtY ---> XtBY 0073 cos_princ_angle = svd(XtBY); % svd(XtY) ---> svd(XtBY) 0074 % Two next instructions not necessary: the imaginary parts that 0075 % would appear if the cosines are not between -1 and 1, when 0076 % passed to the acos function, would be very small, and would 0077 % thus vanish when the norm is taken. 0078 % cos_princ_angle = min(cos_princ_angle, 1); 0079 % cos_princ_angle = max(cos_princ_angle, -1); 0080 square_d = norm(acos(cos_princ_angle))^2; 0081 0082 d = sqrt(square_d); 0083 end 0084 0085 M.typicaldist = @() sqrt(p); 0086 0087 0088 % Orthogonal projection of an ambient vector U onto the 0089 % horizontal space at X. 0090 M.proj = @projection; 0091 function Up = projection(X, U) 0092 BX = B*X; 0093 0094 % Projection onto the tangent space 0095 % U = U - X*symm(BX'*U); 0096 % Projection onto the horizontal space 0097 % Up = U - X*skew(BX'*U); 0098 0099 Up = U - X*(BX'*U); 0100 end 0101 0102 M.tangent = M.proj; 0103 0104 M.egrad2rgrad = @egrad2rgrad; 0105 function rgrad = egrad2rgrad(X, egrad) 0106 0107 % First, scale egrad according to the scaled metric in the 0108 % Euclidean space. 0109 egrad_scaled = B\egrad; 0110 0111 % Second, project onto the tangent space. 0112 % No need to project onto the horizontal space as 0113 % by the Riemannian submersion theory, this quantity automatically 0114 % belongs to the horizontal space. 0115 % 0116 % 0117 % rgrad = egrad_scaled - X*symm((B*X)'*egrad_scaled); 0118 % 0119 % Verify that symm(BX'*egrad_scaled) = symm(X'*egrad). 0120 0121 rgrad = egrad_scaled - X*symm(X'*egrad); 0122 end 0123 0124 0125 M.ehess2rhess = @ehess2rhess; 0126 function rhess = ehess2rhess(X, egrad, ehess, H) 0127 egraddot = ehess; 0128 Xdot = H; 0129 0130 % Directional derivative of the Riemannian gradient. 0131 egrad_scaleddot = B\egraddot; 0132 rgraddot = egrad_scaleddot - Xdot*symm(X'*egrad)... 0133 - X*symm(Xdot'*egrad)... 0134 - X*symm(X'*egraddot); 0135 0136 % Project onto the horizontal space. 0137 rhess = M.proj(X, rgraddot); 0138 end 0139 0140 0141 M.retr = @retraction; 0142 function Y = retraction(X, U, t) 0143 if nargin < 3 0144 t = 1.0; 0145 end 0146 Y = guf(X + t*U); % Ensure that Y'*B*Y is identity. 0147 end 0148 0149 0150 M.exp = @exponential; 0151 function Y = exponential(X, U, t) 0152 if nargin == 3 0153 tU = t*U; 0154 else 0155 tU = U; 0156 end 0157 0158 % restricted_svd is defined later in the file. 0159 [u, s, v] = restricted_svd(tU);% svd(tU, 0) ---> restricted_svd(tU). 0160 cos_s = diag(cos(diag(s))); 0161 sin_s = diag(sin(diag(s))); 0162 Y = X*v*cos_s*v' + u*sin_s*v';% Verify that Y'*B*Y is identity 0163 0164 % From numerical experiments, it seems necessary to 0165 % re-orthonormalize. 0166 Y = guf(Y);% Ensure that Y'*B*Y is identity. 0167 end 0168 0169 0170 0171 % Test code for the logarithm: 0172 % gGr = grassmanngeneralizedfactory(5, 2, diag(rand(5,1))); 0173 % x = gGr.rand() 0174 % y = gGr.rand() 0175 % u = gGr.log(x, y) 0176 % gGr.dist(x, y) % These two numbers should 0177 % gGr.norm(x, u) % be the same. 0178 % z = gGr.exp(x, u) % z needs not be the same matrix as y, but it should 0179 % v = gGr.log(x, z) % be the same point as y on Grassmann: dist almost 0. 0180 % gGr.dist(z, y) 0181 M.log = @logarithm; 0182 function U = logarithm(X, Y) 0183 YtBX = Y'*(B*X); % YtX ---> YtBX. 0184 At = (Y' - YtBX*X'); 0185 Bt = YtBX\At; 0186 [u, s, v] = restricted_svd(Bt');% svd(Bt', 'econ') ---> restricted_svd(Bt'). 0187 0188 u = u(:, 1:p); 0189 s = diag(s); 0190 s = s(1:p); 0191 v = v(:, 1:p); 0192 U = u*diag(atan(s))*v'; % A horizontal vector, i.e., U'*(B*X) is zero. 0193 end 0194 0195 0196 M.hash = @(X) ['z' hashmd5(X(:))]; 0197 0198 M.rand = @random; 0199 function X = random() 0200 X = guf(randn(n, p)); % Ensure that X'*B*X is identity; 0201 end 0202 0203 M.randvec = @randomvec; 0204 function U = randomvec(X) 0205 U = projection(X, randn(n, p)); 0206 U = U / norm(U(:)); 0207 end 0208 0209 M.lincomb = @matrixlincomb; 0210 0211 M.zerovec = @(X) zeros(n, p); 0212 0213 % This transport is compatible with the generalized polar retraction. 0214 M.transp = @(X1, X2, d) projection(X2, d); 0215 0216 M.vec = @(X, u_mat) u_mat(:); 0217 M.mat = @(X, u_vec) reshape(u_vec, [n, p]); 0218 M.vecmatareisometries = @() false; 0219 0220 % Some auxiliary functions 0221 function X = guf(Y) 0222 % Generalized polar decomposition of an n-by-p matrix Y. 0223 % X'*B*X is identity. 0224 0225 % Method 1 0226 [u, ~, v] = svd(Y, 0); 0227 0228 % Instead of the following three steps, an equivalent, but an 0229 % expensive, way is to do X = u*(sqrtm(u'*(B*u))\(v')). 0230 [q, ssquare] = eig(u'*(B*u)); 0231 qsinv = q/sparse(diag(sqrt(diag(ssquare)))); 0232 X = u*((qsinv*q')*v'); % X'*B*X is identity. 0233 0234 0235 % Another computation using restricted_svd 0236 % [u, ~, v] = restricted_svd(Y); 0237 % X = u*v'; % X'*B*X is identity. 0238 0239 end 0240 0241 function [u, s, v] = restricted_svd(Y) 0242 % We compute a thin svd-like decomposition of an n-by-p matrix Y 0243 % into matrices u, s, and v such that u is an n-by-p matrix 0244 % with u'*B*u being identity, s is a p-by-p diagonal matrix 0245 % with positive entries, and v is a p-by-p orthogonal matrix. 0246 % Y = u*s*v'. 0247 0248 [v, ssquare] = eig(symm(Y'*(B*Y))); % Y*B*Y is positive definite 0249 ssquarevec = diag(ssquare); 0250 0251 s = sparse(diag(abs(sqrt(ssquarevec)))); 0252 u = Y*(v/s); % u'*B*u is identity. 0253 end 0254 0255 end 0256 0257 function A = symm(Z) 0258 A = (Z + Z')/2; 0259 end