Returns an orthonormal basis of the dominant invariant p-subspace of A.

function X = dominant_invariant_subspace(A, p)

Input: A real, symmetric matrix A of size nxn and an integer p < n.
Output: A real, orthonormal matrix X of size nxp such that trace(X'*A*X)
is maximized. That is, the columns of X form an orthonormal basis of a
dominant subspace of dimension p of A. These are thus eigenvectors
associated with the largest eigenvalues of A (in no particular order).
Sign is important: 2 is deemed a larger eigenvalue than -5.

The optimization is performed on the Grassmann manifold, since only the
space spanned by the columns of X matters. The implementation is short to
show how Manopt can be used to quickly obtain a prototype. To make the
implementation more efficient, one might first try to use the caching
system, that is, use the optional 'store' arguments in the cost, grad and
hess functions. Furthermore, using egrad2rgrad and ehess2rhess is quick
and easy, but not always efficient. Having a look at the formulas
implemented in these functions can help rewrite the code without them,
possibly more efficiently.

See also: dominant_invariant_subspace_complex
function [X, info] = dominant_invariant_subspace(A, p)
% Returns an orthonormal basis of the dominant invariant p-subspace of A.
%
% function X = dominant_invariant_subspace(A, p)
%
% Input: A real, symmetric matrix A of size nxn and an integer p < n.
% Output: A real, orthonormal matrix X of size nxp such that trace(X'*A*X)
%         is maximized. That is, the columns of X form an orthonormal basis
%         of a dominant subspace of dimension p of A. These are thus
%         eigenvectors associated with the largest eigenvalues of A (in no
%         particular order). Sign is important: 2 is deemed a larger
%         eigenvalue than -5.
%
% The optimization is performed on the Grassmann manifold, since only the
% space spanned by the columns of X matters. The implementation is short to
% show how Manopt can be used to quickly obtain a prototype. To make the
% implementation more efficient, one might first try to use the caching
% system, that is, use the optional 'store' arguments in the cost, grad and
% hess functions. Furthermore, using egrad2rgrad and ehess2rhess is quick
% and easy, but not always efficient. Having a look at the formulas
% implemented in these functions can help rewrite the code without them,
% possibly more efficiently.
%
% See also: dominant_invariant_subspace_complex

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main author: Nicolas Boumal, July 5, 2013
% Contributors:
%
% Change log:
%
%   NB Dec. 6, 2013:
%       We specify a max and initial trust region radius in the options.
%   NB Jan. 20, 2018:
%       Added a few comments regarding implementation of the cost.
%   XJ Aug. 31, 2021:
%       Added AD to compute the grad and the hess.

    % Generate some random data to test the function.
    if ~exist('A', 'var') || isempty(A)
        A = randn(128);
        A = (A+A')/2;
    end
    if ~exist('p', 'var') || isempty(p)
        p = 3;
    end

    % Make sure the input matrix is square and symmetric.
    n = size(A, 1);
    assert(isreal(A), 'A must be real.');
    assert(size(A, 2) == n, 'A must be square.');
    assert(norm(A-A', 'fro') < n*eps, 'A must be symmetric.');
    assert(p <= n, 'p must not exceed n.');

    % Define the cost and its derivatives on the Grassmann manifold.
    Gr = grassmannfactory(n, p);
    problem.M = Gr;
    problem.cost = @(X) -.5*trace(X'*A*X);
    problem.grad = @(X) -Gr.egrad2rgrad(X, A*X);
    problem.hess = @(X, H) -Gr.ehess2rhess(X, A*X, A*H, H);

    % Notice that it would be more efficient to compute trace(X'*A*X) via
    % the formula sum(sum(X .* (A*X))) -- the code above is written so as
    % to be as close as possible to the familiar mathematical formulas,
    % for ease of interpretation. Also, the product A*X is needed for the
    % cost, the gradient and the Hessian: one can use the caching
    % capabilities of Manopt (the store structures) to save on redundant
    % computations.

    % An alternative way to compute the gradient and the Hessian is to use
    % the automatic differentiation provided in the Deep Learning Toolbox
    % (slower). Notice that the function trace is not supported for AD so
    % far.
    % Replace it with ctrace, described in manoptADhelp:
    % problem.cost = @(X) -.5*ctrace(X'*A*X);
    % The cost can also be specified in a more efficient way:
    % problem.cost = @(X) -.5*sum(sum(X .* (A*X)));
    % Then call manoptAD to prepare AD for the problem structure:
    % problem = manoptAD(problem);

    % Execute some checks on the derivatives for early debugging.
    % These can be commented out.
    % checkgradient(problem);
    % pause;
    % checkhessian(problem);
    % pause;

    % Issue a call to a solver. A random initial guess will be chosen and
    % default options are selected, except for the ones we specify here.
    options.Delta_bar = 8*sqrt(p);
    [X, costX, info, options] = trustregions(problem, [], options); %#ok<ASGLU>

    fprintf('Options used:\n');
    disp(options);

    % For our information, Manopt can also compute the spectrum of the
    % Riemannian Hessian on the tangent space at (any) X. Computing the
    % spectrum at the solution gives us some idea of the conditioning of
    % the problem. If we were to implement a preconditioner for the
    % Hessian, this would also inform us about its performance.
    %
    % Notice that (typically) all eigenvalues of the Hessian at the
    % solution are positive, i.e., we find an isolated minimizer. If we
    % were to replace the Grassmann manifold with the Stiefel manifold,
    % hence still optimizing over orthonormal matrices but ignoring the
    % invariance cost(XQ) = cost(X) for all orthogonal Q, then we would
    % see dim O(p) = p(p-1)/2 zero eigenvalues in the Hessian spectrum,
    % so the minimizer would no longer be isolated.
    if Gr.dim() < 512
        evs = hessianspectrum(problem, X);
        stairs(sort(evs));
        title(['Eigenvalues of the Hessian of the cost function ' ...
               'at the solution']);
        xlabel('Eigenvalue number (sorted)');
        ylabel('Value of the eigenvalue');
    end

end
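As a concrete illustration of the caching system mentioned in the header
comments, here is a minimal sketch of how the cost, gradient and Hessian
could be rewritten with Manopt's optional 'store' argument so that the
product A*X is computed only once per point X. This is an untested sketch:
the helper prepare is introduced here purely for illustration, and these
definitions would have to live inside the body of
dominant_invariant_subspace (as nested functions) so that they capture A
and Gr.

    % Sketch only: these would replace the anonymous handles above.
    problem.cost = @cost;
    problem.grad = @grad;
    problem.hess = @hess;

    function [f, store] = cost(X, store)
        store = prepare(X, store);
        % Equivalent to -.5*trace(X'*A*X), but cheaper.
        f = -.5*sum(sum(X .* store.AX));
    end

    function [g, store] = grad(X, store)
        store = prepare(X, store);
        g = -Gr.egrad2rgrad(X, store.AX);
    end

    function [h, store] = hess(X, H, store)
        store = prepare(X, store);
        h = -Gr.ehess2rhess(X, store.AX, A*H, H);
    end

    % Illustrative helper: computes A*X on first use at a point X and
    % caches it in the store for subsequent cost/grad/hess calls.
    function store = prepare(X, store)
        if ~isfield(store, 'AX')
            store.AX = A*X;
        end
    end

Manopt keeps one store structure per visited point, shared by the cost,
gradient and Hessian, so calls at the same X all reuse the cached store.AX.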