
low_rank_matrix_completion

PURPOSE

Given partial observation of a low-rank matrix, attempts to complete it.

SYNOPSIS

function low_rank_matrix_completion()

DESCRIPTION

 Given partial observation of a low-rank matrix, attempts to complete it.

 function low_rank_matrix_completion()

 This example demonstrates how to use the geometry factory for the
 embedded submanifold of fixed-rank matrices, fixedrankembeddedfactory.
 This geometry is described in the paper
 "Low-rank matrix completion by Riemannian optimization"
 Bart Vandereycken - SIAM Journal on Optimization, 2013.

 This can be a starting point for many optimization problems of the form:

 minimize f(X) such that rank(X) = k, size(X) = [m, n].

 Note that the code is long because it showcases quite a few features of
 Manopt: most of the code is optional.

 Input:  None. This example file generates random data.
 
 Output: None.
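
 To adapt this example to a new cost function, the overall shape of the
 code stays as in the source below. As a minimal sketch only (with a
 trivial stand-in cost f(X) = 0.5*||X||_F^2, whose Euclidean gradient is
 X itself; this snippet is not part of the original example):

    problem.M = fixedrankembeddedfactory(m, n, k);
    problem.cost  = @(X) 0.5*norm(X.S, 'fro')^2;  % ||X||_F = ||S||_F
    problem.egrad = @(X) X.U*X.S*X.V';            % gradient of f is X
    X = trustregions(problem);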

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

SOURCE CODE

function low_rank_matrix_completion()
% Given partial observation of a low-rank matrix, attempts to complete it.
%
% function low_rank_matrix_completion()
%
% This example demonstrates how to use the geometry factory for the
% embedded submanifold of fixed-rank matrices, fixedrankembeddedfactory.
% This geometry is described in the paper
% "Low-rank matrix completion by Riemannian optimization"
% Bart Vandereycken - SIAM Journal on Optimization, 2013.
%
% This can be a starting point for many optimization problems of the form:
%
% minimize f(X) such that rank(X) = k, size(X) = [m, n].
%
% Note that the code is long because it showcases quite a few features of
% Manopt: most of the code is optional.
%
% Input:  None. This example file generates random data.
%
% Output: None.

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main author: Nicolas Boumal, July 15, 2014
% Contributors: Bart Vandereycken
%
% Change log:
%

    % Random data generation. First, choose the size of the problem.
    % We will complete a matrix of size mxn of rank k:
    m = 200;
    n = 500;
    k = 10;
    % Generate a random mxn matrix A of rank k.
    L = randn(m, k);
    R = randn(n, k);
    A = L*R';
    % Generate a random mask for observed entries: P(i, j) = 1 if the entry
    % (i, j) of A is observed, and 0 otherwise.
    fraction = 4 * k*(m+n-k)/(m*n);
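    % An aside on this choice: k*(m+n-k) is the dimension of the manifold
    % of m-by-n matrices of rank k, so this observes roughly 4 entries per
    % degree of freedom (the factor 4 is a common oversampling heuristic;
    % this reading is ours, not a claim from the paper).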
    P = sparse(rand(m, n) <= fraction);
    % Hence, the observed entries of A are available through PA:
    PA = P.*A;


    % Pick the manifold of matrices of size mxn of fixed rank k.
    problem.M = fixedrankembeddedfactory(m, n, k);

    % Define the problem cost function. The input X is a structure with
    % fields U, S, V representing a rank k matrix as U*S*V'.
    % f(X) = 1/2 * || P.*(X-A) ||^2
    problem.cost = @cost;
    function f = cost(X)
        % Note that it is quite inefficient to explicitly construct the
        % matrix X in this way. Since we only need the entries of Xmat
        % corresponding to the mask P, it would be far more efficient to
        % compute only those.
        Xmat = X.U*X.S*X.V';
        f = .5*norm( P.*Xmat - PA , 'fro')^2;
    end
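
    % An illustrative aside (not used by this example): the comment in
    % cost above notes that only the entries of X on the mask are needed.
    % The nested helper below computes exactly those entries without ever
    % forming the m-by-n matrix. Its name, masked_entries, is ours and is
    % not part of Manopt.
    function Xomega = masked_entries(X)
        [I, J] = find(P);    % row and column indices of observed entries
        US = X.U*X.S;        % m-by-k left factor
        % Entry (i, j) of U*S*V' is the inner product of row i of U*S
        % with row j of V; compute all observed entries at once,
        % in O(nnz(P)*k) time and memory.
        Xomega = sum(US(I, :) .* X.V(J, :), 2);
    end
    % The cost would then be .5*norm(masked_entries(X) - A_omega)^2, with
    % A_omega the vector of observed entries of A in the same order.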

    % Define the Euclidean gradient of the cost function, that is, the
    % gradient of f(X) seen as a standard function of X.
    % nabla f(X) = P.*(X-A)
    problem.egrad = @egrad;
    function G = egrad(X)
        % Same comment here about Xmat.
        Xmat = X.U*X.S*X.V';
        G = P.*Xmat - PA;
    end

    % This is optional, but it's nice if you have it.
    % Define the Euclidean Hessian of the cost at X, along H, where H is
    % represented as a tangent vector: a structure with fields Up, Vp, M.
    % This is the directional derivative of nabla f(X) at X along Xdot:
    % nabla^2 f(X)[Xdot] = P.*Xdot
    problem.ehess = @euclidean_hessian;
    function ehess = euclidean_hessian(X, H)
        % The function tangent2ambient transforms H (a tangent vector) into
        % its equivalent ambient vector representation. The output is a
        % structure with fields U, S, V such that U*S*V' is an mxn matrix
        % corresponding to the tangent vector H. Note that there are no
        % additional guarantees about U, S and V. In particular, U and V
        % are not orthonormal.
        ambient_H = problem.M.tangent2ambient(X, H);
        Xdot = ambient_H.U*ambient_H.S*ambient_H.V';
        % Same comment here about explicitly constructing the ambient
        % vector as an mxn matrix Xdot: we only need its entries
        % corresponding to the mask P, and this could be computed
        % efficiently.
        ehess = P.*Xdot;
    end
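    % Note that the masked_entries sketch given after cost above would
    % apply verbatim to ambient_H, since it only uses the U, S, V fields.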

    % Check consistency of the gradient and the Hessian. Useful if you
    % adapt this example for a new cost function and you would like to make
    % sure there is no mistake.
    % warning('off', 'manopt:fixedrankembeddedfactory:exp');
    % checkgradient(problem); pause;
    % checkhessian(problem); pause;

    % Compute an initial guess. Points on the manifold are represented as
    % structures with three fields: U, S and V. U and V need to be
    % orthonormal, S needs to be diagonal.
    [U, S, V] = svds(PA, k);
    X0.U = U;
    X0.S = S;
    X0.V = V;
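    % An aside: if no informed initial guess were available, a random
    % point would also do, for example X0 = problem.M.rand(); one can
    % also omit X0 below and let the solver pick a random starting point.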

    % Minimize the cost function using Riemannian trust-regions, starting
    % from the initial guess X0.
    X = trustregions(problem, X0);

    % The reconstructed matrix is X, represented as a structure with fields
    % U, S and V.
    Xmat = X.U*X.S*X.V';
    fprintf('||X-A||_F = %g\n', norm(Xmat - A, 'fro'));
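    % One might also report the relative error, which is easier to
    % interpret across problem sizes (an optional suggestion, not in the
    % original example):
    % fprintf('||X-A||_F / ||A||_F = %g\n', ...
    %         norm(Xmat - A, 'fro') / norm(A, 'fro'));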


    % Alternatively, we could decide to use a solver such as
    % steepestdescent or conjugategradient. These solvers need to solve a
    % line-search problem at each iteration. The standard line searches in
    % Manopt include general-purpose routines to do this. But for the
    % problem at hand, it so happens that we can rather accurately guess
    % how far the line search should look, and it would be a waste not to
    % use that knowledge. Look up the paper referenced above for the
    % mathematical explanation of the code below.
    %
    % To tell Manopt about this special information, we specify the
    % linesearch hint function in the problem structure. Notice that this
    % is not the same thing as specifying a linesearch function in the
    % options structure.
    %
    % Both the SD and the CG solvers will detect that we specify the hint
    % function below and will, as a result, use an appropriate linesearch
    % algorithm by default. Typically, they will try the step t*H first,
    % then, if it does not satisfy an Armijo criterion, decrease t
    % geometrically until satisfaction or failure.
    %
    % Just like the cost, egrad and ehess functions, the linesearch
    % function could use a store structure if you like. The present code
    % does not use the store structure, which means quite a bit of
    % computation is done redundantly, and as a result a better method
    % could appear slower. See the Manopt tutorial about caching when you
    % are ready to switch from a proof-of-concept code to an efficient
    % code.
    %
    % The inputs are X (a point on the manifold) and H, a tangent vector
    % at X that is assumed to be a descent direction. That is, there
    % exists a positive t such that f(Retraction_X(tH)) < f(X). The
    % function below is supposed to output a t that is a good guess at
    % such a t.
    problem.linesearch = @linesearch_helper;
    function t = linesearch_helper(X, H)
        % Note that you would not usually need the Hessian for this.
        residual_omega = nonzeros(problem.egrad(X));
        dir_omega      = nonzeros(problem.ehess(X, H));
        % The backslash solves the 1-D least-squares problem
        % min_t || residual_omega + t*dir_omega ||^2, which exactly
        % minimizes t -> f(X + t*H) in the ambient space and hence
        % approximates the minimizer of t -> f(Retraction_X(t*H)).
        t = - dir_omega \ residual_omega ;
    end
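
    % A sketch of the caching mentioned above, not wired into this
    % example: Manopt passes an optional store structure, cached per
    % point, to functions declared with the signature below, so shared
    % quantities such as the masked residual are computed at most once
    % per point. The name cost_with_store is ours; to use it, one would
    % set problem.cost = @cost_with_store.
    function [f, store] = cost_with_store(X, store)
        if ~isfield(store, 'residual')
            Xm = X.U*X.S*X.V';
            store.residual = P.*Xm - PA;
        end
        f = .5*norm(store.residual, 'fro')^2;
    end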

    % Notice that for this solver, the Hessian is not needed.
    [Xcg, xcost, info, options] = conjugategradient(problem, X0); %#ok<ASGLU>

    fprintf('Take a look at the options that CG used:\n');
    disp(options);
    fprintf('And see how many trials were made at each line search call:\n');
    info_ls = [info.linesearch];
    disp([info_ls.costevals]);


    fprintf('Try it again without the linesearch helper.\n');

    % Remove the linesearch helper from the problem structure.
    problem = rmfield(problem, 'linesearch');

    [Xcg, xcost, info, options] = conjugategradient(problem, X0); %#ok<ASGLU>

    fprintf('Take a look at the options that CG used:\n');
    disp(options);
    fprintf('And see how many trials were made at each line search call:\n');
    info_ls = [info.linesearch];
    disp([info_ls.costevals]);

    % If the problem has a small enough dimension, we may (for analysis
    % purposes) compute the spectrum of the Hessian at a point X. This may
    % help in studying the conditioning of a problem. If you do not
    % provide the Hessian, Manopt approximates it with finite differences
    % of the gradient and tries to estimate the "spectrum" of that
    % approximation (which is not a proper linear operator). This can give
    % some intuition, but should not be relied upon.
    if problem.M.dim() < 100
        fprintf('Computing the spectrum of the Hessian...');
        s = hessianspectrum(problem, X);
        hist(s);
    end
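    % For instance, near a minimizer the eigenvalues in s are nonnegative,
    % and the ratio max(s)/min(s) gives a rough estimate of the
    % conditioning mentioned above.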

end
