
barzilaiborwein

PURPOSE

Riemannian Barzilai-Borwein solver with non-monotone line-search.

SYNOPSIS

function [x, cost, info, options] = barzilaiborwein(problem, x, options)

DESCRIPTION

 Riemannian Barzilai-Borwein solver with non-monotone line-search.

 function [x, cost, info, options] = barzilaiborwein(problem)
 function [x, cost, info, options] = barzilaiborwein(problem, x0)
 function [x, cost, info, options] = barzilaiborwein(problem, x0, options)
 function [x, cost, info, options] = barzilaiborwein(problem, [], options)

 Apply the Barzilai-Borwein minimization algorithm to the problem defined
 in the problem structure, starting at x0 if it is provided (otherwise, at
 a random point on the manifold). To specify options whilst not specifying
 an initial guess, give x0 as [] (the empty matrix).

 The algorithm uses its own special non-monotone line-search strategy.
 Therefore, no line-search algorithm should be specified in the problem
 structure or in the options structure.

 In most of the examples bundled with the toolbox (see link below), the
 solver can be replaced by the present one if need be.
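
 For instance, a minimal call could look as follows. This is only a sketch:
 the sphere manifold and the quadratic cost below are placeholders standing
 in for the actual problem (spherefactory is the Manopt factory for the unit
 sphere; the symmetric matrix A is made up for the example).

     % Minimize x'*A*x over the unit sphere in R^n (illustrative only).
     n = 100;
     A = randn(n); A = (A + A')/2;     % placeholder symmetric matrix

     problem.M = spherefactory(n);     % the manifold to optimize over
     problem.cost  = @(x) x'*A*x;      % cost function handle
     problem.egrad = @(x) 2*A*x;       % Euclidean gradient; Manopt converts
                                       % it to the Riemannian gradient

     [x, cost, info] = barzilaiborwein(problem);  % random initial guess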

 The outputs x and cost are the last reached point on the manifold and its
 cost. This is not necessarily the best point generated since the method
 is not monotone. The struct-array info contains information about the
 iterations:
   iter : the iteration number (0 for the initial guess)
   cost : cost value
   time : elapsed time in seconds
   gradnorm : Riemannian norm of the gradient
   stepsize : norm of the last tangent vector retracted
   linesearch : information logged by the line-search algorithm
   And possibly additional information logged by options.statsfun.
 For example, type [info.gradnorm] to obtain a vector of the successive
 gradient norms reached.
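
 As an illustration, the gradient norm history can then be plotted with a
 few lines such as these (plain MATLAB, nothing specific to this solver):

     % Plot the convergence history gathered in the info struct-array.
     semilogy([info.iter], [info.gradnorm], '.-');
     xlabel('Iteration number');
     ylabel('Norm of the gradient of f');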

 The options structure is used to overwrite the default values. All
 options have a default value and are hence optional. To force an option
 value, pass an options structure with a field options.optionname, where
 optionname is one of the following and the default value is indicated
 between parentheses:

   tolgradnorm (1e-6)
       The algorithm terminates if the norm of the gradient drops below this.
   maxiter (1000)
       The algorithm terminates if maxiter iterations have been executed.
   maxtime (Inf)
        The algorithm terminates if maxtime seconds have elapsed.
   minstepsize (1e-10)
       The algorithm terminates if the linesearch returns a displacement
       vector (to be retracted) smaller in norm than this value.
   linesearch (@linesearch_hint)
       This option should not be changed, as the present solver has its
       own dedicated line-search strategy.
   strategy ('direct')
       The strategy used for the Barzilai-Borwein stepsize
       'direct', compute the direct step <s_k,s_k>/<s_k,y_k>
       'inverse', compute the inverse step <s_k,y_k>/<y_k,y_k>
       'alternate', alternates between direct and inverse step
   lambdamax (1e3)
       The maximum stepsize allowed by the Barzilai-Borwein method
   lambdamin (1e-3)
       The minimum stepsize allowed by the Barzilai-Borwein method
   lambda0 (1/10)
       The initial stepsize of the Barzilai-Borwein method
   ls_nmsteps (10)
       The non-monotone line-search checks a sufficient decrease with respect
       to the previous ls_nmsteps objective function values.
   statsfun (none)
       Function handle to a function that will be called after each
       iteration to provide the opportunity to log additional statistics.
       They will be returned in the info struct. See the generic Manopt
       documentation about solvers for further information.
   stopfun (none)
       Function handle to a function that will be called at each iteration
       to provide the opportunity to specify additional stopping criteria.
       See the generic Manopt documentation about solvers for further
       information.
   verbosity (3)
       Integer number used to tune the amount of output the algorithm
       generates during execution (mostly as text in the command window).
       The higher, the more output. 0 means silent.
   storedepth (2)
       Maximum number of different points x of the manifold for which a
       store structure will be kept in memory in the storedb. If the
       caching features of Manopt are not used, this is irrelevant. For
       this algorithm, a store depth of 2 should always be sufficient.
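
 As a brief example of overriding options (reusing the problem structure
 from the sketch above; the particular values are arbitrary):

     % Pass [] as x0 to keep a random initial guess.
     clear options;
     options.strategy    = 'alternate';  % Barzilai-Borwein stepsize rule
     options.tolgradnorm = 1e-8;         % tighter gradient tolerance
     options.maxiter     = 500;
     [x, cost, info, options] = barzilaiborwein(problem, [], options);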
   

 The implementation of the Barzilai-Borwein method is based on the paper:

 B. Iannazzo, M. Porcelli, "The Riemannian Barzilai-Borwein method with 
 nonmonotone line-search and the matrix geometric mean computation",
 IMA Journal of Numerical Analysis, to appear, https://doi.org/10.1093/imanum/drx015.

 See also: steepestdescent conjugategradient trustregions

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

SOURCE CODE

0001 function [x, cost, info, options] = barzilaiborwein(problem, x, options)
0002 % Riemannian Barzilai-Borwein solver with non-monotone line-search.
0003 %
0004 % function [x, cost, info, options] = barzilaiborwein(problem)
0005 % function [x, cost, info, options] = barzilaiborwein(problem, x0)
0006 % function [x, cost, info, options] = barzilaiborwein(problem, x0, options)
0007 % function [x, cost, info, options] = barzilaiborwein(problem, [], options)
0008 %
0009 % Apply the Barzilai-Borwein minimization algorithm to the problem defined
0010 % in the problem structure, starting at x0 if it is provided (otherwise, at
0011 % a random point on the manifold). To specify options whilst not specifying
0012 % an initial guess, give x0 as [] (the empty matrix).
0013 %
0014 % The algorithm uses its own special non-monotone line-search strategy.
0015 % Therefore, no line-search algorithm should be specified in the problem
0016 % structure or in the options structure.
0017 %
0018 % In most of the examples bundled with the toolbox (see link below), the
0019 % solver can be replaced by the present one if need be.
0020 %
0021 % The outputs x and cost are the last reached point on the manifold and its
0022 % cost. This is not necessarily the best point generated since the method
0023 % is not monotone. The struct-array info contains information about the
0024 % iterations:
0025 %   iter : the iteration number (0 for the initial guess)
0026 %   cost : cost value
0027 %   time : elapsed time in seconds
0028 %   gradnorm : Riemannian norm of the gradient
0029 %   stepsize : norm of the last tangent vector retracted
0030 %   linesearch : information logged by the line-search algorithm
0031 %   And possibly additional information logged by options.statsfun.
0032 % For example, type [info.gradnorm] to obtain a vector of the successive
0033 % gradient norms reached.
0034 %
0035 % The options structure is used to overwrite the default values. All
0036 % options have a default value and are hence optional. To force an option
0037 % value, pass an options structure with a field options.optionname, where
0038 % optionname is one of the following and the default value is indicated
0039 % between parentheses:
0040 %
0041 %   tolgradnorm (1e-6)
0042 %       The algorithm terminates if the norm of the gradient drops below this.
0043 %   maxiter (1000)
0044 %       The algorithm terminates if maxiter iterations have been executed.
0045 %   maxtime (Inf)
0046 %       The algorithm terminates if maxtime seconds have elapsed.
0047 %   minstepsize (1e-10)
0048 %       The algorithm terminates if the linesearch returns a displacement
0049 %       vector (to be retracted) smaller in norm than this value.
0050 %   linesearch (@linesearch_hint)
0051 %       This option should not be changed, as the present solver has its
0052 %       own dedicated line-search strategy.
0053 %   strategy ('direct')
0054 %       The strategy used for the Barzilai-Borwein stepsize
0055 %       'direct', compute the direct step <s_k,s_k>/<s_k,y_k>
0056 %       'inverse', compute the inverse step <s_k,y_k>/<y_k,y_k>
0057 %       'alternate', alternates between direct and inverse step
0058 %   lambdamax (1e3)
0059 %       The maximum stepsize allowed by the Barzilai-Borwein method
0060 %   lambdamin (1e-3)
0061 %       The minimum stepsize allowed by the Barzilai-Borwein method
0062 %   lambda0 (1/10)
0063 %       The initial stepsize of the Barzilai-Borwein method
0064 %   ls_nmsteps (10)
0065 %       The non-monotone line-search checks a sufficient decrease with respect
0066 %       to the previous ls_nmsteps objective function values.
0067 %   statsfun (none)
0068 %       Function handle to a function that will be called after each
0069 %       iteration to provide the opportunity to log additional statistics.
0070 %       They will be returned in the info struct. See the generic Manopt
0071 %       documentation about solvers for further information.
0072 %   stopfun (none)
0073 %       Function handle to a function that will be called at each iteration
0074 %       to provide the opportunity to specify additional stopping criteria.
0075 %       See the generic Manopt documentation about solvers for further
0076 %       information.
0077 %   verbosity (3)
0078 %       Integer number used to tune the amount of output the algorithm
0079 %       generates during execution (mostly as text in the command window).
0080 %       The higher, the more output. 0 means silent.
0081 %   storedepth (2)
0082 %       Maximum number of different points x of the manifold for which a
0083 %       store structure will be kept in memory in the storedb. If the
0084 %       caching features of Manopt are not used, this is irrelevant. For
0085 %       this algorithm, a store depth of 2 should always be sufficient.
0086 %
0087 %
0088 % The implementation of the Barzilai-Borwein method is based on the paper:
0089 %
0090 % B. Iannazzo, M. Porcelli, "The Riemannian Barzilai-Borwein method with
0091 % nonmonotone line-search and the matrix geometric mean computation",
0092 % IMA Journal of Numerical Analysis, to appear, https://doi.org/10.1093/imanum/drx015.
0093 %
0094 % See also: steepestdescent conjugategradient trustregions
0095 
0096 % This file is part of Manopt: www.manopt.org.
0097 % Original author: Margherita Porcelli, May 31, 2017
0098 % Contributors: Nicolas Boumal, Bruno Iannazzo
0099 % Change log:
0100 
0101     
0102     % Verify that the problem description is sufficient for the solver.
0103     if ~canGetCost(problem)
0104         warning('manopt:getCost', ...
0105                 'No cost provided. The algorithm will likely abort.');  
0106     end
0107     if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
0108         % Note: we do not give a warning if an approximate gradient is
0109         % explicitly given in the problem description, as in that case the
0110         % user seems to be aware of the issue.
0111         warning('manopt:getGradient:approx', ...
0112                ['No gradient provided. Using an FD approximation instead (slow).\n' ...
0113                 'It may be necessary to increase options.tolgradnorm.\n' ...
0114                 'To disable this warning: warning(''off'', ''manopt:getGradient:approx'')']);
0115         problem.approxgrad = approxgradientFD(problem);
0116     end
0117 
0118     % Ensure options exists as a structure
0119     if ~exist('options', 'var') || isempty(options)
0120         options = struct();
0121     end
0122     
0123     % Set local defaults here
0124     localdefaults.minstepsize = 1e-10;
0125     localdefaults.maxiter = 1000;
0126     localdefaults.tolgradnorm = 1e-6;
0127 
0128     % Upper and lower bound for the Barzilai-Borwein stepsize
0129     localdefaults.lambdamax = 1e3;
0130     localdefaults.lambdamin = 1e-3;
0131     % Initial Barzilai-Borwein stepsize
0132     localdefaults.lambda0 = 1/10;
0133 
0134     % Barzilai-Borwein strategy (direct, inverse or alternate)
0135     localdefaults.strategy = 'direct';
0136 
0137     % Line-search parameters
0138     % 1) Make sure the user didn't try to define a line search
0139     if canGetLinesearch(problem) || isfield(options, 'linesearch')
0140         error('manopt:BB:ls', ...
0141               ['The problem structure may not specify a line-search ' ...
0142                'hint for the BB solver,\nand the options structure ' ...
0143                'may not specify a line-search algorithm for BB.']);
0144     end
0145     % 2) Define the line-search parameters
0146     problem.linesearch = @(x, d, storedb, key) 1;
0147     options.linesearch = @linesearch_hint;
0148     % The Armijo sufficient decrease parameter
0149     localdefaults.ls_suff_decr = 1e-4;
0150     % The previous steps checked in the non-monotone line-search strategy
0151     localdefaults.ls_nmsteps = 10;
0152     
0153     
0154     % Merge global and local defaults, then merge w/ user options, if any.
0155     localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
0156     options = mergeOptions(localdefaults, options); 
0157 
0158     
0159     % Shorthands for some parameters
0160     strategy = options.strategy;
0161     lambdamax = options.lambdamax;
0162     lambdamin = options.lambdamin;
0163     lambda0 = options.lambda0;
0164     
0165     timetic = tic();
0166     
0167     
0168     % If no initial point x is given by the user, generate one at random.
0169     if ~exist('x', 'var') || isempty(x)
0170         x = problem.M.rand();
0171     end
0172 
0173     % Create a store database and get a key for the current x
0174     storedb = StoreDB(options.storedepth);
0175     key = storedb.getNewKey();
0176     
0177     % Compute objective-related quantities for x
0178     [cost, grad] = getCostGrad(problem, x, storedb, key);
0179     gradnorm = problem.M.norm(x, grad);
0180 
0181     % Some variables below need to store information about iterations. We
0182     % preallocate for a reasonable amount of intended iterations to avoid
0183     % memory re-allocations.
0184     mem_init_size = min(10000, options.maxiter+1);
0185     
0186     % Store the cost value
0187     f = zeros(mem_init_size, 1);
0188     f(1) = cost;
0189     
0190     % Iteration counter (at any point, iter is the number of fully executed
0191     % iterations so far)
0192     iter = 0;
0193     
0194     % Save stats in a struct array info, and preallocate.
0195     stats = savestats();
0196     info(1) = stats;
0197     info(mem_init_size).iter = [];
0198     
0199     if options.verbosity >= 2
0200         fprintf(' iter\t                cost val\t     grad. norm\n');
0201     end
0202 
0203     % Set the initial Barzilai-Borwein stepsize
0204     lambda = lambda0;
0205 
0206     % Start iterating until stopping criterion triggers
0207     while true
0208 
0209         % Display iteration information
0210         if options.verbosity >= 2
0211             fprintf('%5d\t%+.16e\t%.8e\n', iter, cost, gradnorm);
0212         end
0213         
0214         % Start timing this iteration
0215         timetic = tic();
0216         
0217         % Run standard stopping criterion checks
0218         [stop, reason] = stoppingcriterion(problem, x, options, ...
0219                                                              info, iter+1);
0220         
0221         % If none triggered, run specific stopping criterion check
0222         if ~stop && stats.stepsize < options.minstepsize
0223             stop = true;
0224             reason = sprintf(['Last stepsize smaller than minimum '  ...
0225                               'allowed; options.minstepsize = %g.'], ...
0226                               options.minstepsize);
0227         end
0228     
0229         if stop
0230             if options.verbosity >= 1
0231                 fprintf([reason '\n']);
0232             end
0233             break;
0234         end
0235 
0236         % Pick the descent direction as minus the gradient (scaled)
0237         desc_dir = problem.M.lincomb(x, -lambda, grad);
0238 
0239         % Execute the nonmonotone line search
0240         k = iter + 1; 
0241         start = max(1, k - options.ls_nmsteps + 1);
0242         
0243         [stepsize, newx, newkey, lsstats] = ...
0244             options.linesearch(problem, x, desc_dir, max(f(start:k)), ...
0245                             -lambda * gradnorm^2, options, storedb, key);
0246 
0247         % Update the value of lambda
0248         lambda = lambda * lsstats.alpha;
0249 
0250         % Compute the new cost-related quantities for newx
0251         [newcost, newgrad] = getCostGrad(problem, newx, storedb, newkey);
0252         newgradnorm = problem.M.norm(newx, newgrad);
0253 
0254         %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0255         % BARZILAI-BORWEIN STRATEGY
0256 
0257         % Store the cost value
0258         f(iter+2) = newcost;
0259        
0260         % Transport the old gradient to newx
0261         grad_transp = problem.M.transp(x, newx, grad);
0262 
0263         % Compute the difference between gradients
0264         Y = problem.M.lincomb(newx, 1, newgrad, -1, grad_transp);
0265 
0266         % Compute the transported step
0267         Stransp =  problem.M.lincomb(x, -lambda, grad_transp); 
0268 
0269         % Compute the new Barzilai-Borwein step following the strategy
0270         % direct strategy
0271         if strcmp(strategy, 'direct')
0272           num = problem.M.norm(newx, Stransp)^2; 
0273           den = problem.M.inner(newx, Stransp, Y);
0274           if den > 0
0275             lambda = min( lambdamax, max(lambdamin, num/den) );
0276           else
0277             lambda = lambdamax;
0278           end
0279         end
0280 
0281         % inverse strategy
0282         if strcmp(strategy, 'inverse')
0283           num = problem.M.inner(newx, Stransp, Y);
0284           den = problem.M.norm(newx, Y)^2;
0285 
0286           if num > 0  
0287             lambda = min( lambdamax, max(lambdamin, num/den) );
0288           else
0289             lambda = lambdamax;
0290           end
0291         end
0292 
0293         % alternate strategy
0294         if strcmp(strategy, 'alternate')
0295           num = problem.M.norm(newx, Stransp)^2; 
0296           den = problem.M.inner(newx, Stransp, Y);
0297           den2 = problem.M.norm(newx, Y)^2;
0298           if (den > 0)  
0299             if mod(iter,2)==0
0300                 lambda = min( lambdamax, max(lambdamin, num/den) );
0301             else
0302                 lambda = min( lambdamax, max(lambdamin, den/den2) );
0303             end
0304           else
0305             lambda = lambdamax;
0306           end
0307         end
0308         
0309         %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0310 
0311         % Make sure we don't use too much memory for the store database
0312         storedb.purge();
0313         
0314         % Update iterate info
0315         x = newx;
0316         key = newkey;
0317         cost = newcost;
0318         grad = newgrad;
0319         gradnorm = newgradnorm;
0320 
0321         % iter is the number of iterations we have accomplished.
0322         iter = iter + 1;
0323         
0324         % Log statistics for freshly executed iteration
0325         stats = savestats();
0326         info(iter+1) = stats;
0327         
0328     end
0329     
0330     info = info(1:iter+1);
0331 
0332     if options.verbosity >= 1
0333         fprintf('Total time is %f [s] (excludes statsfun)\n', ...
0334                 info(end).time);
0335     end
0336     
0337     
0338     
0339     % Routine in charge of collecting the current iteration stats
0340     function stats = savestats()
0341         stats.iter = iter;
0342         stats.cost = cost;
0343         stats.gradnorm = gradnorm;
0344         if iter == 0
0345             stats.stepsize = NaN;
0346             stats.time = toc(timetic);
0347             stats.linesearch = [];
0348         else
0349             stats.stepsize = stepsize;
0350             stats.time = info(iter).time + toc(timetic);
0351             stats.linesearch = lsstats;
0352         end
0353         stats = applyStatsfun(problem, x, storedb, key, options, stats);
0354     end
0355     
0356 end
