Description of fixedrankembeddedfactory

0001 function M = fixedrankembeddedfactory(m, n, k)
0002 % Manifold struct to optimize fixed-rank matrices w/ an embedded geometry.
0003 %
0004 % function M = fixedrankembeddedfactory(m, n, k)
0005 %
0006 % Manifold of m-by-n real matrices of fixed rank k. This follows the
0007 % embedded geometry described in Bart Vandereycken's 2013 paper:
0008 % "Low-rank matrix completion by Riemannian optimization".
0009 %
0010 % Paper link: http://arxiv.org/pdf/1209.3834.pdf
0011 %
0012 % A point X on the manifold is represented as a structure with three
0013 % fields: U, S and V. The matrices U (mxk) and V (nxk) are orthonormal,
0014 % while the matrix S (kxk) is any /diagonal/ full-rank matrix.
0015 % Following the SVD formalism, X = U*S*V'. Note that the diagonal entries
0016 % of S are not constrained to be nonnegative.
0017 %
0018 % Tangent vectors are represented as a structure with three fields: Up, M
0019 % and Vp. The matrices Up (mxk) and Vp (mxk) obey Up'*U = 0 and Vp'*V = 0.
0020 % The matrix M (kxk) is arbitrary. Such a structure corresponds to the
0021 % following tangent vector in the ambient space of mxn matrices:
0022 %   Z = U*M*V' + Up*V' + U*Vp'
0023 % where (U, S, V) is the current point and (Up, M, Vp) is the tangent
0024 % vector at that point.
0025 %
0026 % Vectors in the ambient space are best represented as mxn matrices. If
0027 % these are low-rank, they may also be represented as structures with
0028 % U, S, V fields, such that Z = U*S*V'. There are no restrictions on what
0029 % U, S and V are, as long as their product as indicated yields a real, mxn
0030 % matrix.
0031 %
0032 % The chosen geometry yields a Riemannian submanifold of the embedding
0033 % space R^(mxn) equipped with the usual trace (Frobenius) inner product.
0034 %
0035 % The tools
0036 %    X_triplet = M.matrix2triplet(X_matrix) and
0037 %    X_matrix = M.triplet2matrix(X_triplet)
0038 % can be used to easily convert between full matrix representation (as a
0039 % matrix of size mxn) and triplet representation as a structure with fields
0040 % U, S, V. The tool matrix2triplet also accepts an optional second input r
0041 % to choose the rank of the triplet representation. By default, r = k. If
0042 % the input matrix X_matrix has rank more than r, the triplet represents
0043 % its best rank-r approximation in the Frobenius norm (truncated SVD).
0044 % Note that these conversions are computationally expensive for large m
0045 % and n: this is mostly useful for small matrices and for prototyping.
0046 %
0047 %
0048 % Please cite the Manopt paper as well as the research paper:
0049 %     @Article{vandereycken2013lowrank,
0050 %       Title   = {Low-rank matrix completion by {Riemannian} optimization},
0051 %       Author  = {Vandereycken, B.},
0052 %       Journal = {SIAM Journal on Optimization},
0053 %       Year    = {2013},
0054 %       Number  = {2},
0055 %       Pages   = {1214--1236},
0056 %       Volume  = {23},
0057 %       Doi     = {10.1137/110845768}
0058 %     }
0059 %
0060 % See also: fixedrankfactory_2factors fixedrankfactory_3factors fixedranktensorembeddedfactory
0061 
0062 % This file is part of Manopt: www.manopt.org.
0063 % Original author: Nicolas Boumal, Dec. 30, 2012.
0064 % Contributors: Bart Vandereycken, Eitan Levin
0065 % Change log:
0066 %
0067 %   Feb. 20, 2014 (NB):
0068 %       Added function tangent to work with checkgradient.
0069 %
0070 %   June 24, 2014 (NB):
0071 %       A couple modifications following Bart's feedback:
0072 %       - The checksum (hash) was replaced for a faster alternative: it's a
0073 %         bit less "safe" in that collisions could arise with higher
0074 %         probability, but they're still very unlikely.
0075 %       - The vector transport was changed.
0076 %       The typical distance was also modified, hopefully giving the
0077 %       trustregions method a better initial guess for the trust-region
0078 %       radius, but that should be tested for different cost functions too.
0079 %
0080 %    July 11, 2014 (NB):
0081 %       Added ehess2rhess and tangent2ambient, supplied by Bart.
0082 %
0083 %    July 14, 2014 (NB):
0084 %       Added vec, mat and vecmatareisometries so that hessianspectrum now
0085 %       works with this geometry. Implemented the tangent function.
0086 %       Made it clearer in the code and in the documentation in what format
0087 %       ambient vectors may be supplied, and generalized some functions so
0088 %       that they should now work with both accepted formats.
0089 %       It is now clearly stated that for a point X represented as a
0090 %       triplet (U, S, V), the matrix S needs to be diagonal.
0091 %
0092 %    Sep.  6, 2018 (NB):
0093 %       Removed M.exp() as it was not implemented.
0094 %
0095 %    March 20, 2019 (NB):
0096 %       Added M.matrix2triplet and M.triplet2matrix to allow easy
0097 %       conversion between matrix representations either as full matrices
0098 %       or as triplets (U, S, V).
0099 %
0100 %    Dec. 14, 2019 (EL):
0101 %       The original retraction code was repaced with a somewhat slower but
0102 %       numerically more stable version. With the original code, trouble
0103 %       could arise when the matrices Up, Vp defining the tangent vector
0104 %       being retracted were ill-conditioned.
0105 %
0106 %    Jan. 28, 2020 (NB):
0107 %       In retraction code, moved parameter t around to highlight the fact
0108 %       that it comes up in only one computation.
0109 %       Replaced vec/mat codes: they are still isometries, but they produce
0110 %       representations of length k*(m+n+k) instead of m*n, which is much
0111 %       more efficient: it only exceeds the true dimension by 2k^2. Also,
0112 %       mat does not attempt to project to the tangent space (which it did
0113 %       before but shouldn't): compose mat with tangent if that is the
0114 %       desired effect.
0115 
0116     M.name = @() sprintf('Manifold of %dx%d matrices of rank %d', m, n, k);
0117     
0118     M.dim = @() (m+n-k)*k;
0119     
0120     M.inner = @(x, d1, d2) d1.M(:).'*d2.M(:) + d1.Up(:).'*d2.Up(:) ...
0121                                              + d1.Vp(:).'*d2.Vp(:);
0122     
0123     M.norm = @(x, d) sqrt(norm(d.M, 'fro')^2 + norm(d.Up, 'fro')^2 ...
0124                                              + norm(d.Vp, 'fro')^2);
0125     
0126     M.dist = @(x, y) error('fixedrankembeddedfactory.dist not implemented yet.');
0127     
0128     M.typicaldist = @() M.dim();
0129     
0130     % Given Z in tangent vector format, projects the components Up and Vp
0131     % such that they satisfy the tangent space constraints up to numerical
0132     % errors. If Z was indeed a tangent vector at X, this should barely
0133     % affect Z (it would not at all if we had infinite numerical accuracy).
0134     M.tangent = @tangent;
0135     function Z = tangent(X, Z)
0136         Z.Up = Z.Up - X.U*(X.U'*Z.Up);
0137         Z.Vp = Z.Vp - X.V*(X.V'*Z.Vp);
0138     end
0139 
0140     % For a given ambient vector Z, applies it to a matrix W. If Z is given
0141     % as a matrix, this is straightforward. If Z is given as a structure
0142     % with fields U, S, V such that Z = U*S*V', the product is executed
0143     % efficiently.
0144     function ZW = apply_ambient(Z, W)
0145         if ~isstruct(Z)
0146             ZW = Z*W;
0147         else
0148             ZW = Z.U*(Z.S*(Z.V'*W));
0149         end
0150     end
0151 
0152     % Same as apply_ambient, but applies Z' to W.
0153     function ZtW = apply_ambient_transpose(Z, W)
0154         if ~isstruct(Z)
0155             ZtW = Z'*W;
0156         else
0157             ZtW = Z.V*(Z.S'*(Z.U'*W));
0158         end
0159     end
0160     
0161     % Orthogonal projection of an ambient vector Z represented as an mxn
0162     % matrix or as a structure with fields U, S, V to the tangent space at
0163     % X, in a tangent vector structure format.
0164     M.proj = @projection;
0165     function Zproj = projection(X, Z)
0166             
0167         ZV = apply_ambient(Z, X.V);
0168         UtZV = X.U'*ZV;
0169         ZtU = apply_ambient_transpose(Z, X.U);
0170 
0171         Zproj.M = UtZV;
0172         Zproj.Up = ZV  - X.U*UtZV;
0173         Zproj.Vp = ZtU - X.V*UtZV';
0174 
0175     end
0176 
0177     M.egrad2rgrad = @projection;
0178     
0179     % Code supplied by Bart.
0180     % Given the Euclidean gradient at X and the Euclidean Hessian at X
0181     % along H, where egrad and ehess are vectors in the ambient space and H
0182     % is a tangent vector at X, returns the Riemannian Hessian at X along
0183     % H, which is a tangent vector.
0184     M.ehess2rhess = @ehess2rhess;
0185     function rhess = ehess2rhess(X, egrad, ehess, H)
0186         
0187         % Euclidean part
0188         rhess = projection(X, ehess);
0189         
0190         % Curvature part
0191         T = apply_ambient(egrad, H.Vp)/X.S;
0192         rhess.Up = rhess.Up + (T - X.U*(X.U'*T));
0193         T = apply_ambient_transpose(egrad, H.Up)/X.S;
0194         rhess.Vp = rhess.Vp + (T - X.V*(X.V'*T));
0195         
0196     end
0197 
0198     % Transforms a tangent vector Z represented as a structure (Up, M, Vp)
0199     % into a structure with fields (U, S, V) that represents that same
0200     % tangent vector in the ambient space of mxn matrices, as U*S*V'.
0201     % This matrix is equal to X.U*Z.M*X.V' + Z.Up*X.V' + X.U*Z.Vp'. The
0202     % latter is an mxn matrix, which could be too large to build
0203     % explicitly, and this is why we return a low-rank representation
0204     % instead. Note that there are no guarantees on U, S and V other than
0205     % that USV' is the desired matrix. In particular, U and V are not (in
0206     % general) orthonormal and S is not (in general) diagonal.
0207     % (In this implementation, S is identity, but this might change.)
0208     M.tangent2ambient_is_identity = false;
0209     M.tangent2ambient = @tangent2ambient;
0210     function Zambient = tangent2ambient(X, Z)
0211         Zambient.U = [X.U*Z.M + Z.Up, X.U];
0212         Zambient.S = eye(2*k);
0213         Zambient.V = [X.V, Z.Vp];
0214     end
0215     
0216     % This retraction is second order, following general results from
0217     % Absil, Malick, "Projection-like retractions on matrix manifolds",
0218     % SIAM J. Optim., 22 (2012), pp. 135-158.
0219     %
0220     % Notice that this retraction is only locally smooth. Indeed, the
0221     % following code exhibits a discontinuity when retracting from
0222     % X = [1 0 ; 0 0] along V = [0 0 ; 0 1]:
0223     %
0224     % M = fixedrankembeddedfactory(2, 2, 1);
0225     % X = struct('U', [1;0], 'V', [1;0], 'S', 1);
0226     % V = struct('Up', [0;1], 'Vp', [0;1], 'M', 1);
0227     % entry = @(M) M(1, 1);
0228     % mat = @(X) X.U*X.S*X.V';
0229     % g = @(t) entry(mat(M.retr(X, V, t)));
0230     % ezplot(g, [-2, 2]);
0231     %
0232     M.retr = @retraction;
0233     function Y = retraction(X, Z, t)
0234         if nargin < 3
0235             t = 1.0;
0236         end
0237 
0238         % Mathematically, Z.Up is orthogonal to X.U, and likewise for
0239         % Z.Vp compared to X.V. Thus, in principle, we could call QR
0240         % on Z.Up and Z.Vp alone, which should be about 4 times faster
0241         % than the calls here where we orthonormalize twice as many
0242         % vectors. However, when Z.Up, Z.Vp are poorly conditioned,
0243         % orthonormalizing them can lead to loss of orthogonality
0244         % against X.U, X.V.
0245         [Qu, Ru] = qr([X.U, Z.Up], 0);
0246         [Qv, Rv] = qr([X.V, Z.Vp], 0);
0247         
0248         % Calling svds or svd should yield the same result, but BV
0249         % advocated svd is more robust, and it doesn't change the
0250         % asymptotic complexity to call svd then trim rather than call
0251         % svds. Also, apparently Matlab calls ARPACK in a suboptimal way
0252         % for svds in this scenario.
0253         % Notice that the parameter t appears only here. Thus, in princple,
0254         % we could make some savings for line-search procedures where we
0255         % retract the same vector multiple times, only with different
0256         % values of t. The asymptotic complexity remains the same though
0257         % (up to a constant factor) because of the matrix-matrix products
0258         % below which cost essentially the same as the QR factorizations.
0259         [U, S, V] = svd(Ru*[X.S + t*Z.M, t*eye(k); t*eye(k), zeros(k)]*Rv');
0260     
0261         Y.U = Qu*U(:, 1:k); 
0262         Y.V = Qv*V(:, 1:k); 
0263         Y.S = S(1:k, 1:k);
0264         
0265         % Equivalent but very slow code
0266         % [U, S, V] = svds(X.U*X.S*X.V' + t*(X.U*Z.M*X.V' + Z.Up*X.V' + X.U*Z.Vp'), k);
0267         % Y.U = U; Y.V = V; Y.S = S;
0268     end
0269 
0270 
0271     % Orthographic retraction provided by Teng Zhang. One interest of the
0272     % orthographic retraction is that if matrices are represented in full
0273     % size, it can be computed without any SVDs. If for an application it
0274     % makes sense to represent the matrices in full size, this may be a
0275     % good idea, but it won't shine in the present implementation of the
0276     % manifold.
0277     M.retr_ortho = @retraction_orthographic;
0278     function Y = retraction_orthographic(X, Z, t)
0279         if nargin < 3
0280             t = 1.0;
0281         end
0282         
0283         % First, write Y (the output) as U1*S0*V1', where U1 and V1 are
0284         % orthogonal matrices and S0 is of size r by r.
0285         [U1, ~] = qr(t*(X.U*Z.M  + Z.Up) + X.U*X.S, 0);
0286         [V1, ~] = qr(t*(X.V*Z.M' + Z.Vp) + X.V*X.S, 0);
0287         S0 = (U1'*X.U)*(X.S + t*Z.M)*(X.V'*V1) ...
0288                          + t*((U1'*Z.Up)*(X.V'*V1) + (U1'*X.U)*(Z.Vp'*V1));
0289         
0290         % Then, obtain the singular value decomposition of Y.
0291         [U2, S2, V2] = svd(S0);
0292         Y.U = U1*U2;
0293         Y.S = S2;
0294         Y.V = V1*V2;
0295         
0296     end
0297 
0298 
0299     % Less safe but much faster checksum, June 24, 2014.
0300     % Older version right below.
0301     M.hash = @(X) ['z' hashmd5([sum(X.U(:)) ; sum(X.S(:)); sum(X.V(:)) ])];
0302     %M.hash = @(X) ['z' hashmd5([X.U(:) ; X.S(:) ; X.V(:)])];
0303     
0304     M.rand = @random;
0305     % Factors U, V live on Stiefel manifolds: reuse their random generator.
0306     stiefelm = stiefelfactory(m, k);
0307     stiefeln = stiefelfactory(n, k);
0308     function X = random()
0309         X.U = stiefelm.rand();
0310         X.V = stiefeln.rand();
0311         X.S = diag(sort(rand(k, 1), 1, 'descend'));
0312     end
0313     
0314     % Generate a random tangent vector at X.
0315     % Note: this may not be the uniform distribution over the set of
0316     % unit-norm tangent vectors.
0317     M.randvec = @randomvec;
0318     function Z = randomvec(X)
0319         Z.M  = randn(k);
0320         Z.Up = randn(m, k);
0321         Z.Vp = randn(n, k);
0322         Z = tangent(X, Z);
0323         nrm = M.norm(X, Z);
0324         Z.M  = Z.M  / nrm;
0325         Z.Up = Z.Up / nrm;
0326         Z.Vp = Z.Vp / nrm;
0327     end
0328     
0329     M.lincomb = @lincomb;
0330     
0331     M.zerovec = @(X) struct('M', zeros(k, k), 'Up', zeros(m, k), ...
0332                                               'Vp', zeros(n, k));
0333     
0334     % New vector transport on June 24, 2014 (as indicated by Bart)
0335     % Reference: Absil, Mahony, Sepulchre 2008 section 8.1.3:
0336     % For Riemannian submanifolds of a Euclidean space, it is acceptable to
0337     % transport simply by orthogonal projection of the tangent vector
0338     % translated in the ambient space.
0339     M.transp = @project_tangent;
0340     function Z2 = project_tangent(X1, X2, Z1)
0341         Z2 = projection(X2, tangent2ambient(X1, Z1));
0342     end
0343 
0344     % The function 'vec' is isometric from the tangent space at X to real
0345     % vectors of length k(m+n+k). The function 'mat' is the left-inverse
0346     % of 'vec'. It is sometimes useful to apply 'tangent' to the output of
0347     % 'mat'.
0348     M.vec = @vec;
0349     function Zvec = vec(X, Z) %#ok<INUSL>
0350         A = Z.M;
0351         B = Z.Up;
0352         C = Z.Vp;
0353         Zvec = [A(:) ; B(:) ; C(:)];
0354     end
0355     rangeM = 1:(k^2);
0356     rangeUp = (k^2)+(1:(m*k));
0357     rangeVp = (k^2+m*k)+(1:(n*k));
0358     M.mat = @(X, Zvec) struct('M',  reshape(Zvec(rangeM),  [k, k]), ...
0359                               'Up', reshape(Zvec(rangeUp), [m, k]), ...
0360                               'Vp', reshape(Zvec(rangeVp), [n, k]));
0361     M.vecmatareisometries = @() true;
0362     
0363     
0364     % It is sometimes useful to switch between representation of matrices
0365     % as triplets or as full matrices of size m x n. The function to
0366     % convert a matrix to a triplet, matrix2triplet, allows to specify the
0367     % rank of the representation. By default, it is equal to k. Omit the
0368     % second input (or set to inf) to get a full SVD triplet (in economy
0369     % format). If so, the resulting triplet does not represent a point on
0370     % the manifold.
0371     M.matrix2triplet = @matrix2triplet;
0372     function X_triplet = matrix2triplet(X_matrix, r)
0373         if ~exist('r', 'var') || isempty(r) || r <= 0
0374             r = k;
0375         end
0376         if r < min(m, n)
0377             [U, S, V] = svds(X_matrix, r);
0378         else
0379             [U, S, V] = svd(X_matrix, 'econ');
0380         end
0381         X_triplet.U = U;
0382         X_triplet.S = S;
0383         X_triplet.V = V;
0384     end
0385     M.triplet2matrix = @triplet2matrix;
0386     function X_matrix = triplet2matrix(X_triplet)
0387         U = X_triplet.U;
0388         S = X_triplet.S;
0389         V = X_triplet.V;
0390         X_matrix = U*S*V';
0391     end
0392 
0393 end
0394 
0395 % Linear combination of tangent vectors
0396 function d = lincomb(x, a1, d1, a2, d2) %#ok<INUSL>
0397 
0398     if nargin == 3
0399         d.Up = a1*d1.Up;
0400         d.Vp = a1*d1.Vp;
0401         d.M  = a1*d1.M;
0402     elseif nargin == 5
0403         d.Up = a1*d1.Up + a2*d2.Up;
0404         d.Vp = a1*d1.Vp + a2*d2.Vp;
0405         d.M  = a1*d1.M  + a2*d2.M;
0406     else
0407         error('fixedrank.lincomb takes either 3 or 5 inputs.');
0408     end
0409 
0410 end
fixedrankembeddedfactory

PURPOSE

SYNOPSIS

DESCRIPTION

CROSS-REFERENCE INFORMATION

SUBFUNCTIONS

SOURCE CODE