Riemannian Barzilai-Borwein solver with non-monotone line-search.

function [x, cost, info, options] = barzilaiborwein(problem)
function [x, cost, info, options] = barzilaiborwein(problem, x0)
function [x, cost, info, options] = barzilaiborwein(problem, x0, options)
function [x, cost, info, options] = barzilaiborwein(problem, [], options)

Apply the Barzilai-Borwein minimization algorithm to the problem defined
in the problem structure, starting at x0 if it is provided (otherwise, at
a random point on the manifold). To specify options whilst not specifying
an initial guess, give x0 as [] (the empty matrix).

The algorithm uses its own special non-monotone line-search strategy.
Therefore, no line-search algorithm should be specified in the problem
structure or in the options structure.

In most of the examples bundled with the toolbox (see link below), the
solver can be replaced by the present one if need be.

The outputs x and cost are the last reached point on the manifold and its
cost. This is not necessarily the best point generated, since the method
is not monotone. The struct-array info contains information about the
iterations:

  iter : the iteration number (0 for the initial guess)
  cost : cost value
  time : elapsed time in seconds
  gradnorm : Riemannian norm of the gradient
  stepsize : norm of the last tangent vector retracted
  linesearch : information logged by the line-search algorithm

And possibly additional information logged by options.statsfun.
For example, type [info.gradnorm] to obtain a vector of the successive
gradient norms reached.

The options structure is used to overwrite the default values. All
options have a default value and are hence optional. To force an option
value, pass an options structure with a field options.optionname, where
optionname is one of the following and the default value is indicated
between parentheses:

  tolgradnorm (1e-6)
    The algorithm terminates if the norm of the gradient drops below this.
  maxiter (1000)
    The algorithm terminates if maxiter iterations have been executed.
  maxtime (Inf)
    The algorithm terminates if maxtime seconds have elapsed.
  minstepsize (1e-10)
    The algorithm terminates if the line-search returns a displacement
    vector (to be retracted) smaller in norm than this value.
  linesearch (@linesearch_hint)
    This option should not be changed, as the present solver has its own
    dedicated line-search strategy.
  strategy ('direct')
    The strategy used for the Barzilai-Borwein stepsize:
      'direct', compute the direct step <s_k,s_k>/<s_k,y_k>
      'inverse', compute the inverse step <s_k,y_k>/<y_k,y_k>
      'alternate', alternate between the direct and inverse steps
  lambdamax (1e3)
    The maximum stepsize allowed by the Barzilai-Borwein method.
  lambdamin (1e-3)
    The minimum stepsize allowed by the Barzilai-Borwein method.
  lambda0 (1/10)
    The initial stepsize of the Barzilai-Borwein method.
  ls_nmsteps (10)
    The non-monotone line-search checks a sufficient decrease with respect
    to the previous ls_nmsteps objective function values.
  statsfun (none)
    Function handle to a function that will be called after each iteration
    to provide the opportunity to log additional statistics. They will be
    returned in the info struct. See the generic Manopt documentation
    about solvers for further information.
  stopfun (none)
    Function handle to a function that will be called at each iteration to
    provide the opportunity to specify additional stopping criteria. See
    the generic Manopt documentation about solvers for further
    information.
  verbosity (3)
    Integer number used to tune the amount of output the algorithm
    generates during execution (mostly as text in the command window).
    The higher, the more output. 0 means silent.
  storedepth (2)
    Maximum number of different points x of the manifold for which a store
    structure will be kept in memory in the storedb. If the caching
    features of Manopt are not used, this is irrelevant. For this
    algorithm, a store depth of 2 should always be sufficient.
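The three strategy values compute a stepsize from the step s_k and the gradient difference y_k, clamped to [lambdamin, lambdamax]. A minimal Euclidean sketch of that logic, in Python for illustration (the function name bb_stepsize is hypothetical, not part of Manopt; the solver itself works with the Riemannian inner product and transported vectors):

```python
import numpy as np

def bb_stepsize(s, y, strategy='direct', k=0,
                lambdamin=1e-3, lambdamax=1e3):
    """Barzilai-Borwein stepsize from the step s = x_{k+1} - x_k and the
    gradient difference y = g_{k+1} - g_k, clamped to [lambdamin, lambdamax].
    Euclidean illustration of the 'strategy' option."""
    sy = float(np.dot(s, y))
    if sy <= 0:
        # No useful curvature information: fall back to the largest step.
        return lambdamax
    if strategy == 'direct':        # <s,s> / <s,y>
        lam = np.dot(s, s) / sy
    elif strategy == 'inverse':     # <s,y> / <y,y>
        lam = sy / np.dot(y, y)
    elif strategy == 'alternate':   # even iterations: direct; odd: inverse
        lam = np.dot(s, s) / sy if k % 2 == 0 else sy / np.dot(y, y)
    else:
        raise ValueError(strategy)
    return min(lambdamax, max(lambdamin, lam))
```

For a quadratic cost, the direct step is the reciprocal of a Rayleigh quotient of the Hessian, which is why clamping to [lambdamin, lambdamax] is needed on general problems.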
The implementation of the Barzilai-Borwein method is based on the paper:

B. Iannazzo, M. Porcelli, "The Riemannian Barzilai-Borwein method with
nonmonotone line-search and the matrix geometric mean computation",
IMA Journal of Numerical Analysis, to appear,
https://doi.org/10.1093/imanum/drx015.

See also: steepestdescent conjugategradient trustregions
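The non-monotone line-search accepts a step alpha along direction d as soon as the new cost is below the maximum of the last ls_nmsteps cost values, minus an Armijo-type decrease term; otherwise alpha is contracted. A simplified Euclidean backtracking sketch, in Python for illustration (names and the contraction factor are assumptions; the actual solver delegates to linesearch_hint on the manifold):

```python
import numpy as np

def nonmonotone_backtracking(fun, x, d, df0, recent_costs,
                             suff_decr=1e-4, contraction=0.5,
                             max_halvings=25):
    """Backtrack alpha until the non-monotone Armijo condition holds:
        f(x + alpha*d) <= max(recent_costs) + suff_decr * alpha * df0,
    where df0 = <grad f(x), d> < 0 is the directional derivative.
    Euclidean sketch with illustrative parameter names."""
    fbar = max(recent_costs)   # reference over the last ls_nmsteps costs
    alpha = 1.0
    for _ in range(max_halvings):
        newx = x + alpha * d
        if fun(newx) <= fbar + suff_decr * alpha * df0:
            return alpha, newx
        alpha *= contraction
    # Give up after max_halvings contractions; return the tiny step anyway.
    return alpha, x + alpha * d
```

Because the reference value is the maximum over a window of past costs rather than the current cost, a step that temporarily increases the objective can be accepted, which is what lets the BB stepsize act unmodified on most iterations.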
function [x, cost, info, options] = barzilaiborwein(problem, x, options)
% Riemannian Barzilai-Borwein solver with non-monotone line-search.
%
% function [x, cost, info, options] = barzilaiborwein(problem)
% function [x, cost, info, options] = barzilaiborwein(problem, x0)
% function [x, cost, info, options] = barzilaiborwein(problem, x0, options)
% function [x, cost, info, options] = barzilaiborwein(problem, [], options)
%
% Apply the Barzilai-Borwein minimization algorithm to the problem defined
% in the problem structure, starting at x0 if it is provided (otherwise, at
% a random point on the manifold). To specify options whilst not specifying
% an initial guess, give x0 as [] (the empty matrix).
%
% The algorithm uses its own special non-monotone line-search strategy.
% Therefore, no line-search algorithm should be specified in the problem
% structure or in the options structure.
%
% In most of the examples bundled with the toolbox (see link below), the
% solver can be replaced by the present one if need be.
%
% The outputs x and cost are the last reached point on the manifold and its
% cost. This is not necessarily the best point generated since the method
% is not monotone. The struct-array info contains information about the
% iterations:
%   iter : the iteration number (0 for the initial guess)
%   cost : cost value
%   time : elapsed time in seconds
%   gradnorm : Riemannian norm of the gradient
%   stepsize : norm of the last tangent vector retracted
%   linesearch : information logged by the line-search algorithm
% And possibly additional information logged by options.statsfun.
% For example, type [info.gradnorm] to obtain a vector of the successive
% gradient norms reached.
%
% The options structure is used to overwrite the default values. All
% options have a default value and are hence optional. To force an option
% value, pass an options structure with a field options.optionname, where
% optionname is one of the following and the default value is indicated
% between parentheses:
%
%   tolgradnorm (1e-6)
%       The algorithm terminates if the norm of the gradient drops below
%       this.
%   maxiter (1000)
%       The algorithm terminates if maxiter iterations have been executed.
%   maxtime (Inf)
%       The algorithm terminates if maxtime seconds have elapsed.
%   minstepsize (1e-10)
%       The algorithm terminates if the linesearch returns a displacement
%       vector (to be retracted) smaller in norm than this value.
%   linesearch (@linesearch_hint)
%       This option should not be changed, as the present solver has its
%       own dedicated line-search strategy.
%   strategy ('direct')
%       The strategy used for the Barzilai-Borwein stepsize:
%       'direct', compute the direct step <s_k,s_k>/<s_k,y_k>
%       'inverse', compute the inverse step <s_k,y_k>/<y_k,y_k>
%       'alternate', alternate between the direct and inverse steps
%   lambdamax (1e3)
%       The maximum stepsize allowed by the Barzilai-Borwein method.
%   lambdamin (1e-3)
%       The minimum stepsize allowed by the Barzilai-Borwein method.
%   lambda0 (1/10)
%       The initial stepsize of the Barzilai-Borwein method.
%   ls_nmsteps (10)
%       The non-monotone line-search checks a sufficient decrease with
%       respect to the previous ls_nmsteps objective function values.
%   statsfun (none)
%       Function handle to a function that will be called after each
%       iteration to provide the opportunity to log additional statistics.
%       They will be returned in the info struct. See the generic Manopt
%       documentation about solvers for further information.
%   stopfun (none)
%       Function handle to a function that will be called at each iteration
%       to provide the opportunity to specify additional stopping criteria.
%       See the generic Manopt documentation about solvers for further
%       information.
%   verbosity (3)
%       Integer number used to tune the amount of output the algorithm
%       generates during execution (mostly as text in the command window).
%       The higher, the more output. 0 means silent.
%   storedepth (2)
%       Maximum number of different points x of the manifold for which a
%       store structure will be kept in memory in the storedb. If the
%       caching features of Manopt are not used, this is irrelevant. For
%       this algorithm, a store depth of 2 should always be sufficient.
%
%
% The implementation of the Barzilai-Borwein method is based on the paper:
%
% B. Iannazzo, M. Porcelli, "The Riemannian Barzilai-Borwein method with
% nonmonotone line-search and the matrix geometric mean computation",
% IMA Journal of Numerical Analysis, to appear,
% https://doi.org/10.1093/imanum/drx015.
%
% See also: steepestdescent conjugategradient trustregions

% This file is part of Manopt: www.manopt.org.
% Original author: Margherita Porcelli, May 31, 2017
% Contributors: Nicolas Boumal, Bruno Iannazzo
% Change log:
%
%   Aug. 2, 2018 (NB):
%       Now using storedb.remove() to keep the cache lean.


    % Verify that the problem description is sufficient for the solver.
    if ~canGetCost(problem)
        warning('manopt:getCost', ...
                'No cost provided. The algorithm will likely abort.');
    end
    if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
        % Note: we do not give a warning if an approximate gradient is
        % explicitly given in the problem description, as in that case the
        % user seems to be aware of the issue.
        warning('manopt:getGradient:approx', ...
               ['No gradient provided. Using an FD approximation instead (slow).\n' ...
                'It may be necessary to increase options.tolgradnorm.\n' ...
                'To disable this warning: warning(''off'', ''manopt:getGradient:approx'')']);
        problem.approxgrad = approxgradientFD(problem);
    end

    % Ensure options exists as a structure
    if ~exist('options', 'var') || isempty(options)
        options = struct();
    end

    % Set local defaults here
    localdefaults.minstepsize = 1e-10;
    localdefaults.maxiter = 1000;
    localdefaults.tolgradnorm = 1e-6;

    % Upper and lower bounds for the Barzilai-Borwein stepsize
    localdefaults.lambdamax = 1e3;
    localdefaults.lambdamin = 1e-3;
    % Initial Barzilai-Borwein stepsize
    localdefaults.lambda0 = 1/10;

    % Barzilai-Borwein strategy (direct, inverse or alternate)
    localdefaults.strategy = 'direct';

    % Line-search parameters
    % 1) Make sure the user didn't try to define a line search
    if canGetLinesearch(problem) || isfield(options, 'linesearch')
        error('manopt:BB:ls', ...
              ['The problem structure may not specify a line-search ' ...
               'hint for the BB solver,\nand the options structure ' ...
               'may not specify a line-search algorithm for BB.']);
    end
    % 2) Define the line-search parameters
    problem.linesearch = @(x, d, storedb, key) 1;
    options.linesearch = @linesearch_hint;
    % The Armijo sufficient decrease parameter
    localdefaults.ls_suff_decr = 1e-4;
    % The previous steps checked in the non-monotone line-search strategy
    localdefaults.ls_nmsteps = 10;


    % Merge global and local defaults, then merge w/ user options, if any.
    localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
    options = mergeOptions(localdefaults, options);


    % Shorthands for some parameters
    strategy = options.strategy;
    lambdamax = options.lambdamax;
    lambdamin = options.lambdamin;
    lambda0 = options.lambda0;

    timetic = tic();


    % If no initial point x is given by the user, generate one at random.
    if ~exist('x', 'var') || isempty(x)
        x = problem.M.rand();
    end

    % Create a store database and get a key for the current x
    storedb = StoreDB(options.storedepth);
    key = storedb.getNewKey();

    % Compute objective-related quantities for x
    [cost, grad] = getCostGrad(problem, x, storedb, key);
    gradnorm = problem.M.norm(x, grad);

    % Some variables below need to store information about iterations. We
    % preallocate for a reasonable amount of intended iterations to avoid
    % memory re-allocations.
    mem_init_size = min(10000, options.maxiter+1);

    % Store the cost values
    f = zeros(mem_init_size, 1);
    f(1) = cost;

    % Iteration counter (at any point, iter is the number of fully executed
    % iterations so far)
    iter = 0;

    % Save stats in a struct array info, and preallocate.
    stats = savestats();
    info(1) = stats;
    info(mem_init_size).iter = [];

    if options.verbosity >= 2
        fprintf(' iter\t cost val\t grad. norm\n');
    end

    % Set the initial Barzilai-Borwein stepsize
    lambda = lambda0;

    % Start iterating until stopping criterion triggers
    while true

        % Display iteration information
        if options.verbosity >= 2
            fprintf('%5d\t%+.16e\t%.8e\n', iter, cost, gradnorm);
        end

        % Start timing this iteration
        timetic = tic();

        % Run standard stopping criterion checks
        [stop, reason] = stoppingcriterion(problem, x, options, ...
                                           info, iter+1);

        % If none triggered, run specific stopping criterion check
        if ~stop && stats.stepsize < options.minstepsize
            stop = true;
            reason = sprintf(['Last stepsize smaller than minimum ' ...
                              'allowed; options.minstepsize = %g.'], ...
                              options.minstepsize);
        end

        if stop
            if options.verbosity >= 1
                fprintf([reason '\n']);
            end
            break;
        end

        % Pick the descent direction as minus the gradient (scaled)
        desc_dir = problem.M.lincomb(x, -lambda, grad);

        % Execute the non-monotone line search
        k = iter + 1;
        start = max(1, k - options.ls_nmsteps + 1);

        [stepsize, newx, newkey, lsstats] = ...
            options.linesearch(problem, x, desc_dir, max(f(start:k)), ...
                               -lambda * gradnorm^2, options, storedb, key);

        % Update the value of lambda
        lambda = lambda * lsstats.alpha;

        % Compute the new cost-related quantities for newx
        [newcost, newgrad] = getCostGrad(problem, newx, storedb, newkey);
        newgradnorm = problem.M.norm(newx, newgrad);

        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % BARZILAI-BORWEIN STRATEGY

        % Store the cost value
        f(iter+2) = newcost;

        % Transport the old gradient to newx
        grad_transp = problem.M.transp(x, newx, grad);

        % Compute the difference between gradients
        Y = problem.M.lincomb(newx, 1, newgrad, -1, grad_transp);

        % Compute the transported step
        Stransp = problem.M.lincomb(x, -lambda, grad_transp);

        % Compute the new Barzilai-Borwein step following the strategy
        % direct strategy
        if strcmp(strategy, 'direct')
            num = problem.M.norm(newx, Stransp)^2;
            den = problem.M.inner(newx, Stransp, Y);
            if den > 0
                lambda = min( lambdamax, max(lambdamin, num/den) );
            else
                lambda = lambdamax;
            end
        end

        % inverse strategy
        if strcmp(strategy, 'inverse')
            num = problem.M.inner(newx, Stransp, Y);
            den = problem.M.norm(newx, Y)^2;

            if num > 0
                lambda = min( lambdamax, max(lambdamin, num/den) );
            else
                lambda = lambdamax;
            end
        end

        % alternate strategy
        if strcmp(strategy, 'alternate')
            num = problem.M.norm(newx, Stransp)^2;
            den = problem.M.inner(newx, Stransp, Y);
            den2 = problem.M.norm(newx, Y)^2;
            if (den > 0)
                if mod(iter, 2) == 0
                    lambda = min( lambdamax, max(lambdamin, num/den) );
                else
                    lambda = min( lambdamax, max(lambdamin, den/den2) );
                end
            else
                lambda = lambdamax;
            end
        end

        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

        % Update iterate info
        storedb.removefirstifdifferent(key, newkey);
        x = newx;
        key = newkey;
        cost = newcost;
        grad = newgrad;
        gradnorm = newgradnorm;

        % iter is the number of iterations we have accomplished.
        iter = iter + 1;

        % Make sure we don't use too much memory for the store database.
        storedb.purge();

        % Log statistics for freshly executed iteration
        stats = savestats();
        info(iter+1) = stats;

    end

    info = info(1:iter+1);

    if options.verbosity >= 1
        fprintf('Total time is %f [s] (excludes statsfun)\n', ...
                info(end).time);
    end



    % Routine in charge of collecting the current iteration stats
    function stats = savestats()
        stats.iter = iter;
        stats.cost = cost;
        stats.gradnorm = gradnorm;
        if iter == 0
            stats.stepsize = NaN;
            stats.time = toc(timetic);
            stats.linesearch = [];
        else
            stats.stepsize = stepsize;
            stats.time = info(iter).time + toc(timetic);
            stats.linesearch = lsstats;
        end
        stats = applyStatsfun(problem, x, storedb, key, options, stats);
    end

end
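To see the iteration above stripped of the manifold machinery, here is a compact Euclidean analogue, sketched in Python under stated assumptions (direct strategy, no line search, default-valued parameters; the function name bb_minimize is illustrative and not part of Manopt):

```python
import numpy as np

def bb_minimize(grad, x, lambda0=0.1, lambdamin=1e-3, lambdamax=1e3,
                tol=1e-8, maxiter=200):
    """Euclidean Barzilai-Borwein iteration with the direct stepsize.
    Illustrative analogue of the solver's main loop on R^n."""
    lam = lambda0
    g = grad(x)
    for _ in range(maxiter):
        if np.linalg.norm(g) < tol:
            break                    # analogue of the tolgradnorm criterion
        s = -lam * g                 # step: scaled negative gradient
        newx = x + s
        newg = grad(newx)
        y = newg - g                 # gradient difference (no transport in R^n)
        sy = float(s @ y)
        # Direct BB step <s,s>/<s,y>, clamped; fall back to lambdamax otherwise.
        lam = min(lambdamax, max(lambdamin, s @ s / sy)) if sy > 0 else lambdamax
        x, g = newx, newg
    return x

# Minimize f(x) = 0.5 x'Ax - b'x, whose gradient is Ax - b.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
xstar = bb_minimize(lambda x: A @ x - b, np.zeros(2))
```

On the manifold, the only changes are that s and the old gradient must be transported to the new point before forming the inner products, and that x + s becomes a retraction, which is exactly what the Stransp and Y computations in the source above implement.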