
grassmanngeneralizedfactory

PURPOSE

Returns a manifold struct of "scaled" vector subspaces.

SYNOPSIS

function M = grassmanngeneralizedfactory(n, p, B)

DESCRIPTION

 Returns a manifold struct of "scaled" vector subspaces.

 function M = grassmanngeneralizedfactory(n, p)
 function M = grassmanngeneralizedfactory(n, p, B)

 Generalized Grassmann manifold: each point on this manifold is a
 "scaled" vector subspace of dimension p embedded in R^n.
 The scaling is due to the symmetric positive definite matrix B.

 When B is identity, the manifold is the standard Grassmann manifold.
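
 As a minimal usage sketch (assuming Manopt is on the path), the call below
 builds the manifold for an arbitrary, illustrative symmetric positive
 definite B and checks that a random point satisfies the defining constraint:

     n = 5; p = 2;
     A = randn(n);
     B = A*A' + n*eye(n);           % an illustrative SPD matrix
     M = grassmanngeneralizedfactory(n, p, B);
     X = M.rand();                  % random point on the manifold
     norm(X'*B*X - eye(p), 'fro')   % should be close to zero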

 The metric is obtained by viewing the generalized Grassmannian as
 a Riemannian quotient manifold of the generalized Stiefel manifold,
 which is the manifold of "scaled" orthonormal matrices. Specifically,
 the generalized Stiefel manifold is the set {X : X'*B*X = I}.
 The generalized Grassmann manifold is the quotient of the generalized
 Stiefel manifold by the right action of the orthogonal group O(p).

 The generalized Stiefel manifold is endowed with the scaled metric
 by viewing it as a Riemannian submanifold of the Euclidean space
 R^(n x p), itself endowed with the scaled inner product trace(U'*B*V).
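
 As a hedged illustration of why this yields a well-defined quotient metric
 (continuing the sketch above; U, V, and Q are illustrative quantities, not
 part of the factory interface): two horizontal vectors rotated by the same
 p-by-p orthogonal matrix Q keep the same inner product.

     U = M.proj(X, randn(n, p));    % horizontal vector: X'*B*U = 0
     V = M.proj(X, randn(n, p));
     [Q, ~] = qr(randn(p));         % a p-by-p orthogonal matrix
     M.inner(X, U, V)               % trace(U'*B*V)
     M.inner(X*Q, U*Q, V*Q)         % same value, up to round-off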

 Some notions (not all) are from Section 4.5 of the paper
 "The geometry of algorithms with orthogonality constraints",
 A. Edelman, T. A. Arias, S. T. Smith, SIMAX, 1998.

 Paper link: http://arxiv.org/abs/physics/9806030.

 
 Note: some computations, such as restricted_svd, distance, logarithm,
 and exponential, are new; we believe them to be correct and hope
 that they are numerically stable.
 If something does not work as expected or you run into trouble,
 please contact us at http://www.manopt.org.

 Note: egrad2rgrad and ehess2rhess involve solving linear systems in B. If
 this is a bottleneck for a specific application, then a way forward is to
 create a modified version of this file which preprocesses B to speed this
 up (typically, by computing a Cholesky factorization of it, then calling
 an appropriate solver).
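
 As a rough sketch of that idea (not implemented in this file), one could
 factor B once and reuse the triangular factor wherever B\egrad appears
 (egrad denotes the Euclidean gradient passed to egrad2rgrad):

     R = chol(B);                      % B = R'*R, R upper triangular
     egrad_scaled = R \ (R' \ egrad);  % same result as B \ egrad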

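 For a hedged end-to-end sketch (the cost function, the symmetric matrix A,
 and the choice of solver below are illustrative assumptions, not part of
 this factory), the manifold can be plugged into a Manopt problem structure:

     A = randn(n); A = (A + A')/2;      % illustrative symmetric matrix
     problem.M = M;                     % M from the sketch above
     problem.cost  = @(X) trace(X'*(A*X));
     problem.egrad = @(X) 2*A*X;        % Euclidean gradient of the cost
     X_opt = trustregions(problem);     % or any other Manopt solver
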
 See also: stiefelgeneralizedfactory  stiefelfactory  grassmannfactory

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

SOURCE CODE

0001 function M = grassmanngeneralizedfactory(n, p, B)
0002 % Returns a manifold struct of "scaled" vector subspaces.
0003 %
0004 % function M = grassmanngeneralizedfactory(n, p)
0005 % function M = grassmanngeneralizedfactory(n, p, B)
0006 %
0007 % Generalized Grassmann manifold: each point on this manifold is a
0008 % "scaled" vector subspace of dimension p embedded in R^n.
0009 % The scaling is due to the symmetric positive definite matrix B.
0010 %
0011 % When B is identity, the manifold is the standard Grassmann manifold.
0012 %
0013 % The metric is obtained by viewing the generalized Grassmannian as
0014 % a Riemannian quotient manifold of the generalized Stiefel manifold,
0015 % which is the manifold of "scaled" orthonormal matrices. Specifically,
0016 % the generalized Stiefel manifold is the set {X : X'*B*X = I}.
0017 % The generalized Grassmann manifold is the quotient of the generalized
0018 % Stiefel manifold by the right action of the orthogonal group O(p).
0019 %
0020 % The generalized Stiefel manifold is endowed with the scaled metric
0021 % by viewing it as a Riemannian submanifold of the Euclidean space
0022 % R^(n x p), itself endowed with the scaled inner product trace(U'*B*V).
0023 %
0024 % Some notions (not all) are from Section 4.5 of the paper
0025 % "The geometry of algorithms with orthogonality constraints",
0026 % A. Edelman, T. A. Arias, S. T. Smith, SIMAX, 1998.
0027 %
0028 % Paper link: http://arxiv.org/abs/physics/9806030.
0029 %
0030 %
0031 % Note: some computations, such as restricted_svd, distance, logarithm,
0032 % and exponential, are new; we believe them to be correct and hope
0033 % that they are numerically stable.
0034 % If something does not work as expected or you run into trouble,
0035 % please contact us at http://www.manopt.org.
0036 %
0037 % Note: egrad2rgrad and ehess2rhess involve solving linear systems in B. If
0038 % this is a bottleneck for a specific application, then a way forward is to
0039 % create a modified version of this file which preprocesses B to speed this
0040 % up (typically, by computing a Cholesky factorization of it, then calling
0041 % an appropriate solver).
0042 %
0043 % See also: stiefelgeneralizedfactory  stiefelfactory  grassmannfactory
0044 
0045 
0046 % This file is part of Manopt: www.manopt.org.
0047 % Original author: Bamdev Mishra, June 30, 2015.
0048 % Contributors:
0049 %
0050 % Change log:
0051 %
0052     
0053     assert(n >= p, ...
0054         ['The dimension n of the ambient space must be at least ' ...
0055         'as large as the dimension p of the subspaces.']);
0056     
0057     if ~exist('B', 'var') || isempty(B)
0058         B = speye(n); % Standard Grassmann manifold.
0059     end
0060     
0061     M.name = @() sprintf('Generalized Grassmann manifold Gr(%d, %d)', n, p);
0062     
0063     M.dim = @() p*(n - p);   
0064     
0065     M.inner = @(X, eta, zeta) trace(eta'*(B*zeta)); % Scaled metric, but horizontally invariant.
0066     
0067     M.norm = @(X, eta) sqrt(M.inner(X, eta, eta));
0068     
0069     M.dist = @distance; 
0070     function d = distance(X, Y)
0071         XtBY = X'*(B*Y); % XtY ---> XtBY
0072         cos_princ_angle = svd(XtBY); % svd(XtY) ---> svd(XtBY)
0073         % The next two instructions are not necessary: if round-off
0074         % pushes some cosines slightly outside [-1, 1], acos returns
0075         % values with tiny imaginary parts, which have a negligible
0076         % effect on the norm computed below.
0077         % cos_princ_angle = min(cos_princ_angle,  1);
0078         % cos_princ_angle = max(cos_princ_angle, -1);
0079         square_d = norm(acos(cos_princ_angle))^2;
0080         
0081         d = sqrt(square_d);
0082     end
0083     
0084     M.typicaldist = @() sqrt(p);
0085     
0086     
0087     % Orthogonal projection of an ambient vector U onto the
0088     % horizontal space at X.
0089     M.proj = @projection;
0090     function Up = projection(X, U)
0091         BX = B*X;
0092         
0093         % Projection onto the tangent space
0094         % U = U - X*symm(BX'*U);
0095         % Projection onto the horizontal space
0096         % Up = U - X*skew(BX'*U);
0097         
0098         Up = U - X*(BX'*U);
0099     end
0100     
0101     M.tangent = M.proj;
0102     
0103     M.egrad2rgrad = @egrad2rgrad;
0104     function rgrad = egrad2rgrad(X, egrad)
0105         
0106         % First, scale egrad according to the scaled metric in the
0107         % Euclidean space.
0108         egrad_scaled = B\egrad;
0109         
0110         % Second, project onto the tangent space.
0111         % No need to project onto the horizontal space as
0112         % by the Riemannian submersion theory, this quantity automatically
0113         % belongs to the horizontal space.
0114         %
0115         %
0116         % rgrad = egrad_scaled - X*symm((B*X)'*egrad_scaled);
0117         %
0118         % Verify that symm(BX'*egrad_scaled) = symm(X'*egrad).
0119         
0120         rgrad = egrad_scaled - X*symm(X'*egrad);
0121     end
0122     
0123     
0124     M.ehess2rhess = @ehess2rhess;
0125     function rhess = ehess2rhess(X, egrad, ehess, H)
0126         egraddot = ehess;
0127         Xdot = H;
0128         
0129         % Directional derivative of the Riemannian gradient.
0130         egrad_scaleddot = B\egraddot;
0131         rgraddot = egrad_scaleddot - Xdot*symm(X'*egrad)...
0132             - X*symm(Xdot'*egrad)...
0133             - X*symm(X'*egraddot);
0134         
0135         % Project onto the horizontal space.
0136         rhess = M.proj(X, rgraddot);
0137     end
0138     
0139     
0140     M.retr = @retraction;
0141     function Y = retraction(X, U, t)
0142         if nargin < 3
0143             t = 1.0;
0144         end
0145         Y = guf(X + t*U); % Ensure that Y'*B*Y is identity.
0146     end
0147     
0148     
0149     M.exp = @exponential;
0150     function Y = exponential(X, U, t)
0151         if nargin == 3
0152             tU = t*U;
0153         else
0154             tU = U;
0155         end
0156         
0157         % restricted_svd is defined later in the file.
0158         [u, s, v] = restricted_svd(tU); % svd(tU, 0) ---> restricted_svd(tU).
0159         cos_s = diag(cos(diag(s)));
0160         sin_s = diag(sin(diag(s)));
0161         Y = X*v*cos_s*v' + u*sin_s*v'; % Verify that Y'*B*Y is identity.
0162         
0163         % From numerical experiments, it seems necessary to
0164         % re-orthonormalize.
0165         Y = guf(Y); % Ensure that Y'*B*Y is identity.
0166     end
0167     
0168     
0169     
0170     % Test code for the logarithm:
0171     % gGr = grassmanngeneralizedfactory(5, 2, diag(rand(5,1)));
0172     % x = gGr.rand()
0173     % y = gGr.rand()
0174     % u = gGr.log(x, y)
0175     % gGr.dist(x, y) % These two numbers should
0176     % gGr.norm(x, u) % be the same.
0177     % z = gGr.exp(x, u) % z need not be the same matrix as y, but it should
0178     % v = gGr.log(x, z) % be the same point as y on Grassmann: dist almost 0.
0179     % gGr.dist(z, y)
0180     M.log = @logarithm;
0181     function U = logarithm(X, Y)
0182         YtBX = Y'*(B*X); % YtX ---> YtBX.
0183         At = (Y' - YtBX*X');
0184         Bt = YtBX\At;
0185         [u, s, v] = restricted_svd(Bt'); % svd(Bt', 'econ') ---> restricted_svd(Bt').
0186         
0187         u = u(:, 1:p);
0188         s = diag(s);
0189         s = s(1:p);
0190         v = v(:, 1:p);
0191         U = u*diag(atan(s))*v'; % A horizontal vector, i.e., U'*(B*X) is zero.
0192     end
0193     
0194     
0195     M.hash = @(X) ['z' hashmd5(X(:))];
0196     
0197     M.rand = @random;
0198     function X = random()
0199         X = guf(randn(n, p)); % Ensure that X'*B*X is identity;
0200     end
0201     
0202     M.randvec = @randomvec;
0203     function U = randomvec(X)
0204         U = projection(X, randn(n, p));
0205         U = U / norm(U(:));
0206     end
0207     
0208     M.lincomb = @matrixlincomb;
0209     
0210     M.zerovec = @(X) zeros(n, p);
0211     
0212     % This transport is compatible with the generalized polar retraction.
0213     M.transp = @(X1, X2, d) projection(X2, d);
0214     
0215     M.vec = @(X, u_mat) u_mat(:);
0216     M.mat = @(X, u_vec) reshape(u_vec, [n, p]);
0217     M.vecmatareisometries = @() false;
0218     
0219     % Some auxiliary functions
0220     symm = @(D) (D + D')/2;
0221     
0222     function X = guf(Y)
0223         % Generalized polar decomposition of an n-by-p matrix Y.
0224         % X'*B*X is identity.
0225         
0226         % Method 1
0227         [u, ~, v] = svd(Y, 0);
0228   
0229         % Instead of the following three steps, an equivalent but more
0230         % expensive way is to compute X = u*(sqrtm(u'*(B*u))\(v')).
0231         [q, ssquare] = eig(u'*(B*u));
0232         qsinv = q/sparse(diag(sqrt(diag(ssquare))));
0233         X = u*((qsinv*q')*v'); % X'*B*X is identity.
0234         
0235         
0236         % Another computation using restricted_svd
0237         % [u, ~, v] = restricted_svd(Y);
0238         % X = u*v'; % X'*B*X is identity.
0239         
0240     end
0241     
0242     function [u, s, v] = restricted_svd(Y)
0243         % We compute a thin svd-like decomposition of an n-by-p matrix Y
0244         % into matrices u, s, and v such that u is an n-by-p matrix
0245         % with u'*B*u being identity, s is a p-by-p diagonal matrix
0246         % with positive entries, and v is a p-by-p orthogonal matrix.
0247         % Y = u*s*v'.
0248         
0249         [v, ssquare] = eig(symm(Y'*(B*Y))); % Y'*B*Y is positive definite
0250         ssquarevec = diag(ssquare);
0251         
0252         s = sparse(diag(abs(sqrt(ssquarevec))));
0253         u = Y*(v/s); % u'*B*u is identity.
0254     end
0255     
0256 end
