
truncated_svd

PURPOSE

Returns an SVD decomposition of A truncated to rank p.

SYNOPSIS

function [U, S, V, info] = truncated_svd(A, p)

DESCRIPTION

 Returns an SVD decomposition of A truncated to rank p.

 function [U, S, V, info] = truncated_svd(A, p)

 Input: A real matrix A of size mxn and an integer p <= min(m, n).
 Output: An orthonormal matrix U of size mxp, an orthonormal matrix V of
         size nxp and a diagonal matrix S of size pxp with nonnegative and
         decreasing diagonal entries such that U*S*V.' is the best rank-p
         approximation of A in the Frobenius norm. All real.
         This function produces an output akin to svds.
 
 The decomposition is obtained by maximizing
   f(U, V) = .5*norm(U'*A*V, 'fro')^2
 where U, V are orthonormal. Notice that f(U*Q, V*R) = f(U, V) for all
 Q, R orthogonal pxp matrices. Hence, only the column spaces of U and V
 matter and we may perform the optimization over a product of two
 Grassmannian manifolds.
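
 For instance, the invariance is easy to verify numerically (a standalone
 sketch, independent of the code below; Q and R are random orthogonal
 matrices obtained from QR factorizations):

   A = randn(8, 6); p = 3;
   [U, ~] = qr(randn(8, p), 0);      % orthonormal, 8 x p
   [V, ~] = qr(randn(6, p), 0);      % orthonormal, 6 x p
   [Q, ~] = qr(randn(p));            % orthogonal, p x p
   [R, ~] = qr(randn(p));            % orthogonal, p x p
   f = @(U, V) .5*norm(U'*A*V, 'fro')^2;
   abs(f(U*Q, V*R) - f(U, V))        % zero up to rounding errors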

 It is easy to show that maximizing f is equivalent to minimizing g with
   g(U, V) = min_S norm(U*S*V' - A, 'fro')^2:
 the inner minimum is attained at S = U'*A*V, so that
   g(U, V) = norm(A, 'fro')^2 - 2*f(U, V).
 This confirms that we are going for a best low-rank approximation of A.
 
 The inner workings of the Grassmann manifold use the built-in svd
 function of Matlab but only for matrices of size mxp and nxp to
 re-orthonormalize them.
 
 Notice that we are actually chasing a best fixed-rank approximation of a
 matrix, which is best obtained by working directly over a manifold of
 fixed-rank matrices. This is simply an example script to demonstrate some
 functionalities of the toolbox.
 
 The code can be modified to accept a function handle for A(x) = A*x
 instead of a matrix A, which is often useful. This would further require
 a function handle At for the transpose of A, such that At(x) = A.'*x.
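
 As a hypothetical sketch (Afun and Atfun stand in for those handles, and
 are assumed to accept matrices with p columns), the cost and gradient
 would then only require products with A and A.':

   function f = cost(X)
       f = -.5*norm(X.U'*Afun(X.V), 'fro')^2;
   end
   function g = egrad(X)
       AV  = Afun(X.V);
       AtU = Atfun(X.U);
       g.U = -AV*(AV'*X.U);
       g.V = -AtU*(AtU'*X.V);
   end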

CROSS-REFERENCE INFORMATION

This function calls: (none)
This function is called by: (none)

SUBFUNCTIONS

function f = cost(X)
function g = egrad(X)
function h = ehess(X, H)

SOURCE CODE

function [U, S, V, info] = truncated_svd(A, p)
% Returns an SVD decomposition of A truncated to rank p.
%
% function [U, S, V, info] = truncated_svd(A, p)
%
% Input: A real matrix A of size mxn and an integer p <= min(m, n).
% Output: An orthonormal matrix U of size mxp, an orthonormal matrix V of
%         size nxp and a diagonal matrix S of size pxp with nonnegative and
%         decreasing diagonal entries such that U*S*V.' is the best rank-p
%         approximation of A in the Frobenius norm. All real.
%         This function produces an output akin to svds.
%
% The decomposition is obtained by maximizing
%   f(U, V) = .5*norm(U'*A*V, 'fro')^2
% where U, V are orthonormal. Notice that f(U*Q, V*R) = f(U, V) for all
% Q, R orthogonal pxp matrices. Hence, only the column spaces of U and V
% matter and we may perform the optimization over a product of two
% Grassmannian manifolds.
%
% It is easy to show that maximizing f is equivalent to minimizing g with
%   g(U, V) = min_S norm(U*S*V' - A, 'fro')^2:
% the inner minimum is attained at S = U'*A*V, so that
%   g(U, V) = norm(A, 'fro')^2 - 2*f(U, V).
% This confirms that we are going for a best low-rank approximation of A.
%
% The inner workings of the Grassmann manifold use the built-in svd
% function of Matlab but only for matrices of size mxp and nxp to
% re-orthonormalize them.
%
% Notice that we are actually chasing a best fixed-rank approximation of a
% matrix, which is best obtained by working directly over a manifold of
% fixed-rank matrices. This is simply an example script to demonstrate some
% functionalities of the toolbox.
%
% The code can be modified to accept a function handle for A(x) = A*x
% instead of a matrix A, which is often useful. This would further require
% a function handle At for the transpose of A, such that At(x) = A.'*x.

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main author: Nicolas Boumal, July 5, 2013
% Contributors:
%
% Change log:
%

    % Generate some random data to test the function if none is given.
    if ~exist('A', 'var') || isempty(A)
        A = randn(42, 60);
    end
    if ~exist('p', 'var') || isempty(p)
        p = 5;
    end

    % Retrieve the size of the problem and make sure the requested
    % approximation rank is at most the maximum possible rank.
    [m, n] = size(A);
    assert(p <= min(m, n), 'p must not exceed the smallest dimension of A.');

    % Define the cost and its derivatives on the Grassmann manifold
    tuple.U = grassmannfactory(m, p);
    tuple.V = grassmannfactory(n, p);
    % All of the code will work just as well if we ignore the invariance
    % property of the cost function indicated above and thus place U and V
    % on the Stiefel manifold (orthonormal matrices) instead of the
    % Grassmann manifold. Working on Stiefel is expected to be slower
    % though, partly because the search space is higher dimensional (by
    % p*(p-1)/2 for each factor, the dimension of the orthogonal group
    % O(p)) and partly because the optima are not isolated.
    % tuple.U = stiefelfactory(m, p);
    % tuple.V = stiefelfactory(n, p);
    M = productmanifold(tuple);

    % Define a problem structure, specifying the manifold M, the cost
    % function and its derivatives. Here, to demonstrate the rapid
    % prototyping capabilities of Manopt, we directly define the Euclidean
    % gradient and the Euclidean Hessian egrad and ehess instead of the
    % Riemannian gradient and Hessian grad and hess. Manopt will take care
    % of the conversion. This automatic conversion is usually not
    % computationally optimal though, because much of the computations
    % involved in obtaining the gradient could be reused to obtain the
    % Hessian. After the prototyping stage, when efficiency becomes
    % important, it makes sense to define grad and hess rather than egrad
    % and ehess, and to use the caching system (the store structure).
    problem.M = M;
    problem.cost  = @cost;
    problem.egrad = @egrad;
    problem.ehess = @ehess;

    % The functions below make many redundant computations. This
    % performance hit can be alleviated by using the caching system.

    % Cost function
    function f = cost(X)
        U = X.U;
        V = X.V;
        f = -.5*norm(U'*A*V, 'fro')^2;
    end
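    % As a hypothetical sketch of that caching system (not used in this
    % example): Manopt also accepts cost functions of the form
    % [f, store] = cost(X, store), where products such as A*V can be
    % cached in the store structure and reused across calls, e.g.:
    %
    %   function [f, store] = cost(X, store)
    %       if ~isfield(store, 'AV')
    %           store.AV = A*X.V;
    %       end
    %       f = -.5*norm(X.U'*store.AV, 'fro')^2;
    %   end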
    % Euclidean gradient of the cost function
    function g = egrad(X)
        U = X.U;
        V = X.V;
        AV = A*V;
        AtU = A'*U;
        g.U = -AV*(AV'*U);
        g.V = -AtU*(AtU'*V);
    end
    % Euclidean Hessian of the cost function
    function h = ehess(X, H)
        U = X.U;
        V = X.V;
        Udot = H.U;
        Vdot = H.V;
        AV = A*V;
        AtU = A'*U;
        AVdot = A*Vdot;
        AtUdot = A'*Udot;
        h.U = -(AVdot*AV'*U + AV*AVdot'*U + AV*AV'*Udot);
        h.V = -(AtUdot*AtU'*V + AtU*AtUdot'*V + AtU*AtU'*Vdot);
    end

    % Execute some checks on the derivatives for early debugging.
    % They are commented out here; uncomment them to run the checks.
    % checkgradient(problem);
    % pause;
    % checkhessian(problem);
    % pause;

    % Issue a call to a solver. A random initial guess will be chosen and
    % default options are selected. Here, we specify a maximum trust
    % region radius (which in turn induces an initial trust region radius).
    % Note that this is not required: default values are used if we omit
    % this. The diameter of the manifold scales like sqrt(2*p), hence the
    % form of our (empirical) choice.
    options.Delta_bar = 4*sqrt(2*p);
    [X, Xcost, info] = trustregions(problem, [], options); %#ok<ASGLU>
    U = X.U;
    V = X.V;

    % Finish the job by rotating U and V such that the middle matrix S can
    % be diagonal with nonnegative, decreasing entries. This requires a
    % small svd of size pxp.
    Spp = U'*A*V;
    [Upp, Spp, Vpp] = svd(Spp);
    U = U*Upp;
    S = Spp;
    V = V*Vpp;

    % For our information, Manopt can also compute the spectrum of the
    % Riemannian Hessian on the tangent space at (any) X. Computing the
    % spectrum at the solution gives us some idea of the conditioning of
    % the problem. If we were to implement a preconditioner for the
    % Hessian, this would also inform us on its performance.
    %
    % Notice that if the optimization is performed on a product of Stiefel
    % manifolds instead of a product of Grassmannians, the double
    % invariance under the orthogonal group O(p) will appear as twice
    % p*(p-1)/2, thus p*(p-1) zero eigenvalues in the spectrum of the
    % Hessian. This means that the minimizers are not isolated, which
    % typically hinders convergence of second order algorithms.
    if M.dim() < 512
        evs = hessianspectrum(problem, X);
        stairs(sort(evs));
        title(['Eigenvalues of the Hessian of the cost function ' ...
               'at the solution']);
    end

end
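
A short usage sketch (assuming Manopt is installed and on the path; up to
convergence of the solver to a global optimum, the result matches svds):

  A = randn(100, 70);
  p = 5;
  [U, S, V] = truncated_svd(A, p);
  [Us, Ss, Vs] = svds(A, p);
  norm(U*S*V' - Us*Ss*Vs', 'fro')   % close to zero
  norm(A - U*S*V', 'fro')           % error of the best rank-p approximation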
