Riemannian Barzilai-Borwein solver with non-monotone line-search.

function [x, cost, info, options] = barzilaiborwein(problem)
function [x, cost, info, options] = barzilaiborwein(problem, x0)
function [x, cost, info, options] = barzilaiborwein(problem, x0, options)
function [x, cost, info, options] = barzilaiborwein(problem, [], options)

Apply the Barzilai-Borwein minimization algorithm to the problem defined
in the problem structure, starting at x0 if it is provided (otherwise, at
a random point on the manifold). To specify options whilst not specifying
an initial guess, give x0 as [] (the empty matrix).

The algorithm uses its own special non-monotone line-search strategy.
Therefore, no line-search algorithm should be specified in the problem
structure or in the options structure.

In most of the examples bundled with the toolbox (see link below), the
solver can be replaced by the present one if need be.

The outputs x and cost are the last reached point on the manifold and its
cost. This is not necessarily the best point generated, since the method
is not monotone. The struct-array info contains information about the
iterations:

  iter : the iteration number (0 for the initial guess)
  cost : cost value
  time : elapsed time in seconds
  gradnorm : Riemannian norm of the gradient
  stepsize : norm of the last tangent vector retracted
  linesearch : information logged by the line-search algorithm

And possibly additional information logged by options.statsfun.
For example, type [info.gradnorm] to obtain a vector of the successive
gradient norms reached.

The options structure is used to overwrite the default values. All
options have a default value and are hence optional. To force an option
value, pass an options structure with a field options.optionname, where
optionname is one of the following and the default value is indicated
between parentheses:

  tolgradnorm (1e-6)
    The algorithm terminates if the norm of the gradient drops below this.
  maxiter (1000)
    The algorithm terminates if maxiter iterations have been executed.
  maxtime (Inf)
    The algorithm terminates if maxtime seconds have elapsed.
  minstepsize (1e-10)
    The algorithm terminates if the line-search returns a displacement
    vector (to be retracted) smaller in norm than this value.
  linesearch (@linesearch_hint)
    This option should not be changed, as the present solver has its own
    dedicated line-search strategy.
  strategy ('direct')
    The strategy used for the Barzilai-Borwein stepsize:
      'direct', compute the direct step <s_k,s_k>/<s_k,y_k>
      'inverse', compute the inverse step <s_k,y_k>/<y_k,y_k>
      'alternate', alternate between the direct and inverse steps
  lambdamax (1e3)
    The maximum stepsize allowed by the Barzilai-Borwein method.
  lambdamin (1e-3)
    The minimum stepsize allowed by the Barzilai-Borwein method.
  lambda0 (1/10)
    The initial stepsize of the Barzilai-Borwein method.
  ls_nmsteps (10)
    The non-monotone line-search checks a sufficient decrease with respect
    to the previous ls_nmsteps objective function values.
  statsfun (none)
    Function handle to a function that will be called after each iteration
    to provide the opportunity to log additional statistics. They will be
    returned in the info struct. See the generic Manopt documentation
    about solvers for further information.
  stopfun (none)
    Function handle to a function that will be called at each iteration to
    provide the opportunity to specify additional stopping criteria. See
    the generic Manopt documentation about solvers for further
    information.
  verbosity (3)
    Integer number used to tune the amount of output the algorithm
    generates during execution (mostly as text in the command window).
    The higher, the more output. 0 means silent.
  storedepth (2)
    Maximum number of different points x of the manifold for which a store
    structure will be kept in memory in the storedb. If the caching
    features of Manopt are not used, this is irrelevant. For this
    algorithm, a store depth of 2 should always be sufficient.
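The three strategy values compute a stepsize from the step s_k and the gradient difference y_k, clamped to [lambdamin, lambdamax]. A minimal Euclidean sketch of that logic, in Python for illustration (the function name bb_stepsize is hypothetical, not part of Manopt; the solver itself works with the Riemannian inner product and transported vectors):

```python
import numpy as np

def bb_stepsize(s, y, strategy='direct', k=0,
                lambdamin=1e-3, lambdamax=1e3):
    """Barzilai-Borwein stepsize from the step s = x_{k+1} - x_k and the
    gradient difference y = g_{k+1} - g_k, clamped to [lambdamin, lambdamax].
    Euclidean illustration of the 'strategy' option."""
    sy = float(np.dot(s, y))
    if sy <= 0:
        # No useful curvature information: fall back to the largest step.
        return lambdamax
    if strategy == 'direct':        # <s,s> / <s,y>
        lam = np.dot(s, s) / sy
    elif strategy == 'inverse':     # <s,y> / <y,y>
        lam = sy / np.dot(y, y)
    elif strategy == 'alternate':   # even iterations: direct; odd: inverse
        lam = np.dot(s, s) / sy if k % 2 == 0 else sy / np.dot(y, y)
    else:
        raise ValueError(strategy)
    return min(lambdamax, max(lambdamin, lam))
```

For a quadratic cost, the direct step is the reciprocal of a Rayleigh quotient of the Hessian, which is why clamping to [lambdamin, lambdamax] is needed on general problems.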
The implementation of the Barzilai-Borwein method is based on the paper:

B. Iannazzo, M. Porcelli, "The Riemannian Barzilai-Borwein method with
nonmonotone line-search and the matrix geometric mean computation",
IMA Journal of Numerical Analysis, to appear,
https://doi.org/10.1093/imanum/drx015.

See also: steepestdescent conjugategradient trustregions
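The non-monotone line-search accepts a step alpha along direction d as soon as the new cost is below the maximum of the last ls_nmsteps cost values, minus an Armijo-type decrease term; otherwise alpha is contracted. A simplified Euclidean backtracking sketch, in Python for illustration (names and the contraction factor are assumptions; the actual solver delegates to linesearch_hint on the manifold):

```python
import numpy as np

def nonmonotone_backtracking(fun, x, d, df0, recent_costs,
                             suff_decr=1e-4, contraction=0.5,
                             max_halvings=25):
    """Backtrack alpha until the non-monotone Armijo condition holds:
        f(x + alpha*d) <= max(recent_costs) + suff_decr * alpha * df0,
    where df0 = <grad f(x), d> < 0 is the directional derivative.
    Euclidean sketch with illustrative parameter names."""
    fbar = max(recent_costs)   # reference over the last ls_nmsteps costs
    alpha = 1.0
    for _ in range(max_halvings):
        newx = x + alpha * d
        if fun(newx) <= fbar + suff_decr * alpha * df0:
            return alpha, newx
        alpha *= contraction
    # Give up after max_halvings contractions; return the tiny step anyway.
    return alpha, x + alpha * d
```

Because the reference value is the maximum over a window of past costs rather than the current cost, a step that temporarily increases the objective can be accepted, which is what lets the BB stepsize act unmodified on most iterations.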
function [x, cost, info, options] = barzilaiborwein(problem, x, options)
% Riemannian Barzilai-Borwein solver with non-monotone line-search.
%
% function [x, cost, info, options] = barzilaiborwein(problem)
% function [x, cost, info, options] = barzilaiborwein(problem, x0)
% function [x, cost, info, options] = barzilaiborwein(problem, x0, options)
% function [x, cost, info, options] = barzilaiborwein(problem, [], options)
%
% Apply the Barzilai-Borwein minimization algorithm to the problem defined
% in the problem structure, starting at x0 if it is provided (otherwise, at
% a random point on the manifold). To specify options whilst not specifying
% an initial guess, give x0 as [] (the empty matrix).
%
% The algorithm uses its own special non-monotone line-search strategy.
% Therefore, no line-search algorithm should be specified in the problem
% structure or in the options structure.
%
% In most of the examples bundled with the toolbox (see link below), the
% solver can be replaced by the present one if need be.
%
% The outputs x and cost are the last reached point on the manifold and its
% cost. This is not necessarily the best point generated since the method
% is not monotone. The struct-array info contains information about the
% iterations:
%   iter : the iteration number (0 for the initial guess)
%   cost : cost value
%   time : elapsed time in seconds
%   gradnorm : Riemannian norm of the gradient
%   stepsize : norm of the last tangent vector retracted
%   linesearch : information logged by the line-search algorithm
% And possibly additional information logged by options.statsfun.
% For example, type [info.gradnorm] to obtain a vector of the successive
% gradient norms reached.
%
% The options structure is used to overwrite the default values. All
% options have a default value and are hence optional. To force an option
% value, pass an options structure with a field options.optionname, where
% optionname is one of the following and the default value is indicated
% between parentheses:
%
%   tolgradnorm (1e-6)
%       The algorithm terminates if the norm of the gradient drops below
%       this.
%   maxiter (1000)
%       The algorithm terminates if maxiter iterations have been executed.
%   maxtime (Inf)
%       The algorithm terminates if maxtime seconds have elapsed.
%   minstepsize (1e-10)
%       The algorithm terminates if the linesearch returns a displacement
%       vector (to be retracted) smaller in norm than this value.
%   linesearch (@linesearch_hint)
%       This option should not be changed, as the present solver has its
%       own dedicated line-search strategy.
%   strategy ('direct')
%       The strategy used for the Barzilai-Borwein stepsize:
%       'direct', compute the direct step <s_k,s_k>/<s_k,y_k>
%       'inverse', compute the inverse step <s_k,y_k>/<y_k,y_k>
%       'alternate', alternate between the direct and inverse steps
%   lambdamax (1e3)
%       The maximum stepsize allowed by the Barzilai-Borwein method.
%   lambdamin (1e-3)
%       The minimum stepsize allowed by the Barzilai-Borwein method.
%   lambda0 (1/10)
%       The initial stepsize of the Barzilai-Borwein method.
%   ls_nmsteps (10)
%       The non-monotone line-search checks a sufficient decrease with
%       respect to the previous ls_nmsteps objective function values.
%   statsfun (none)
%       Function handle to a function that will be called after each
%       iteration to provide the opportunity to log additional statistics.
%       They will be returned in the info struct. See the generic Manopt
%       documentation about solvers for further information.
%   stopfun (none)
%       Function handle to a function that will be called at each iteration
%       to provide the opportunity to specify additional stopping criteria.
%       See the generic Manopt documentation about solvers for further
%       information.
%   verbosity (3)
%       Integer number used to tune the amount of output the algorithm
%       generates during execution (mostly as text in the command window).
%       The higher, the more output. 0 means silent.
%   storedepth (2)
%       Maximum number of different points x of the manifold for which a
%       store structure will be kept in memory in the storedb. If the
%       caching features of Manopt are not used, this is irrelevant. For
%       this algorithm, a store depth of 2 should always be sufficient.
%
%
% The implementation of the Barzilai-Borwein method is based on the paper:
%
% B. Iannazzo, M. Porcelli, "The Riemannian Barzilai-Borwein method with
% nonmonotone line-search and the matrix geometric mean computation",
% IMA Journal of Numerical Analysis, to appear,
% https://doi.org/10.1093/imanum/drx015.
%
% See also: steepestdescent conjugategradient trustregions

% This file is part of Manopt: www.manopt.org.
% Original author: Margherita Porcelli, May 31, 2017
% Contributors: Nicolas Boumal, Bruno Iannazzo
% Change log:
%
%   Aug. 2, 2018 (NB):
%       Now using storedb.remove() to keep the cache lean.


    % Verify that the problem description is sufficient for the solver.
    if ~canGetCost(problem)
        warning('manopt:getCost', ...
                'No cost provided. The algorithm will likely abort.');
    end
    if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
        % Note: we do not give a warning if an approximate gradient is
        % explicitly given in the problem description, as in that case the
        % user seems to be aware of the issue.
        warning('manopt:getGradient:approx', ...
               ['No gradient provided. Using an FD approximation instead (slow).\n' ...
                'It may be necessary to increase options.tolgradnorm.\n' ...
                'To disable this warning: warning(''off'', ''manopt:getGradient:approx'')']);
        problem.approxgrad = approxgradientFD(problem);
    end

    % Ensure options exists as a structure
    if ~exist('options', 'var') || isempty(options)
        options = struct();
    end

    % Set local defaults here
    localdefaults.minstepsize = 1e-10;
    localdefaults.maxiter = 1000;
    localdefaults.tolgradnorm = 1e-6;

    % Upper and lower bounds for the Barzilai-Borwein stepsize
    localdefaults.lambdamax = 1e3;
    localdefaults.lambdamin = 1e-3;
    % Initial Barzilai-Borwein stepsize
    localdefaults.lambda0 = 1/10;

    % Barzilai-Borwein strategy (direct, inverse or alternate)
    localdefaults.strategy = 'direct';

    % Line-search parameters
    % 1) Make sure the user didn't try to define a line search
    if canGetLinesearch(problem) || isfield(options, 'linesearch')
        error('manopt:BB:ls', ...
              ['The problem structure may not specify a line-search ' ...
               'hint for the BB solver,\nand the options structure ' ...
               'may not specify a line-search algorithm for BB.']);
    end
    % 2) Define the line-search parameters
    problem.linesearch = @(x, d, storedb, key) 1;
    options.linesearch = @linesearch_hint;
    % The Armijo sufficient decrease parameter
    localdefaults.ls_suff_decr = 1e-4;
    % The previous steps checked in the non-monotone line-search strategy
    localdefaults.ls_nmsteps = 10;


    % Merge global and local defaults, then merge w/ user options, if any.
    localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
    options = mergeOptions(localdefaults, options);


    % Shorthands for some parameters
    strategy = options.strategy;
    lambdamax = options.lambdamax;
    lambdamin = options.lambdamin;
    lambda0 = options.lambda0;

    timetic = tic();


    % If no initial point x is given by the user, generate one at random.
    if ~exist('x', 'var') || isempty(x)
        x = problem.M.rand();
    end

    % Create a store database and get a key for the current x
    storedb = StoreDB(options.storedepth);
    key = storedb.getNewKey();

    % Compute objective-related quantities for x
    [cost, grad] = getCostGrad(problem, x, storedb, key);
    gradnorm = problem.M.norm(x, grad);

    % Some variables below need to store information about iterations. We
    % preallocate for a reasonable amount of intended iterations to avoid
    % memory re-allocations.
    mem_init_size = min(10000, options.maxiter+1);

    % Store the cost values
    f = zeros(mem_init_size, 1);
    f(1) = cost;

    % Iteration counter (at any point, iter is the number of fully executed
    % iterations so far)
    iter = 0;

    % Save stats in a struct array info, and preallocate.
    stats = savestats();
    info(1) = stats;
    info(mem_init_size).iter = [];

    if options.verbosity >= 2
        fprintf(' iter\t cost val\t grad. norm\n');
    end

    % Set the initial Barzilai-Borwein stepsize
    lambda = lambda0;

    % Start iterating until stopping criterion triggers
    while true

        % Display iteration information
        if options.verbosity >= 2
            fprintf('%5d\t%+.16e\t%.8e\n', iter, cost, gradnorm);
        end

        % Start timing this iteration
        timetic = tic();

        % Run standard stopping criterion checks
        [stop, reason] = stoppingcriterion(problem, x, options, ...
                                           info, iter+1);

        % If none triggered, run specific stopping criterion check
        if ~stop && stats.stepsize < options.minstepsize
            stop = true;
            reason = sprintf(['Last stepsize smaller than minimum ' ...
                              'allowed; options.minstepsize = %g.'], ...
                              options.minstepsize);
        end

        if stop
            if options.verbosity >= 1
                fprintf([reason '\n']);
            end
            break;
        end

        % Pick the descent direction as minus the gradient (scaled)
        desc_dir = problem.M.lincomb(x, -lambda, grad);

        % Execute the non-monotone line search
        k = iter + 1;
        start = max(1, k - options.ls_nmsteps + 1);

        [stepsize, newx, newkey, lsstats] = ...
            options.linesearch(problem, x, desc_dir, max(f(start:k)), ...
                               -lambda * gradnorm^2, options, storedb, key);

        % Update the value of lambda
        lambda = lambda * lsstats.alpha;

        % Compute the new cost-related quantities for newx
        [newcost, newgrad] = getCostGrad(problem, newx, storedb, newkey);
        newgradnorm = problem.M.norm(newx, newgrad);

        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        % BARZILAI-BORWEIN STRATEGY

        % Store the cost value
        f(iter+2) = newcost;

        % Transport the old gradient to newx
        grad_transp = problem.M.transp(x, newx, grad);

        % Compute the difference between gradients
        Y = problem.M.lincomb(newx, 1, newgrad, -1, grad_transp);

        % Compute the transported step
        Stransp = problem.M.lincomb(x, -lambda, grad_transp);

        % Compute the new Barzilai-Borwein step following the strategy
        % direct strategy
        if strcmp(strategy, 'direct')
            num = problem.M.norm(newx, Stransp)^2;
            den = problem.M.inner(newx, Stransp, Y);
            if den > 0
                lambda = min( lambdamax, max(lambdamin, num/den) );
            else
                lambda = lambdamax;
            end
        end

        % inverse strategy
        if strcmp(strategy, 'inverse')
            num = problem.M.inner(newx, Stransp, Y);
            den = problem.M.norm(newx, Y)^2;

            if num > 0
                lambda = min( lambdamax, max(lambdamin, num/den) );
            else
                lambda = lambdamax;
            end
        end

        % alternate strategy
        if strcmp(strategy, 'alternate')
            num = problem.M.norm(newx, Stransp)^2;
            den = problem.M.inner(newx, Stransp, Y);
            den2 = problem.M.norm(newx, Y)^2;
            if (den > 0)
                if mod(iter, 2) == 0
                    lambda = min( lambdamax, max(lambdamin, num/den) );
                else
                    lambda = min( lambdamax, max(lambdamin, den/den2) );
                end
            else
                lambda = lambdamax;
            end
        end

        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

        % Update iterate info
        storedb.removefirstifdifferent(key, newkey);
        x = newx;
        key = newkey;
        cost = newcost;
        grad = newgrad;
        gradnorm = newgradnorm;

        % iter is the number of iterations we have accomplished.
        iter = iter + 1;

        % Make sure we don't use too much memory for the store database.
        storedb.purge();

        % Log statistics for freshly executed iteration
        stats = savestats();
        info(iter+1) = stats;

    end

    info = info(1:iter+1);

    if options.verbosity >= 1
        fprintf('Total time is %f [s] (excludes statsfun)\n', ...
                info(end).time);
    end



    % Routine in charge of collecting the current iteration stats
    function stats = savestats()
        stats.iter = iter;
        stats.cost = cost;
        stats.gradnorm = gradnorm;
        if iter == 0
            stats.stepsize = NaN;
            stats.time = toc(timetic);
            stats.linesearch = [];
        else
            stats.stepsize = stepsize;
            stats.time = info(iter).time + toc(timetic);
            stats.linesearch = lsstats;
        end
        stats = applyStatsfun(problem, x, storedb, key, options, stats);
    end

end
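To see the iteration above stripped of the manifold machinery, here is a compact Euclidean analogue, sketched in Python under stated assumptions (direct strategy, no line search, default-valued parameters; the function name bb_minimize is illustrative and not part of Manopt):

```python
import numpy as np

def bb_minimize(grad, x, lambda0=0.1, lambdamin=1e-3, lambdamax=1e3,
                tol=1e-8, maxiter=200):
    """Euclidean Barzilai-Borwein iteration with the direct stepsize.
    Illustrative analogue of the solver's main loop on R^n."""
    lam = lambda0
    g = grad(x)
    for _ in range(maxiter):
        if np.linalg.norm(g) < tol:
            break                    # analogue of the tolgradnorm criterion
        s = -lam * g                 # step: scaled negative gradient
        newx = x + s
        newg = grad(newx)
        y = newg - g                 # gradient difference (no transport in R^n)
        sy = float(s @ y)
        # Direct BB step <s,s>/<s,y>, clamped; fall back to lambdamax otherwise.
        lam = min(lambdamax, max(lambdamin, s @ s / sy)) if sy > 0 else lambdamax
        x, g = newx, newg
    return x

# Minimize f(x) = 0.5 x'Ax - b'x, whose gradient is Ax - b.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
xstar = bb_minimize(lambda x: A @ x - b, np.zeros(2))
```

On the manifold, the only changes are that s and the old gradient must be transported to the new point before forming the inner products, and that x + s becomes a retraction, which is exactly what the Stransp and Y computations in the source above implement.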