
rlbfgs

PURPOSE

Riemannian limited memory BFGS solver for smooth objective functions.

SYNOPSIS

function [x, cost, info, options] = rlbfgs(problem, x0, options)

DESCRIPTION

 Riemannian limited memory BFGS solver for smooth objective functions.
 
 function [x, cost, info, options] = rlbfgs(problem)
 function [x, cost, info, options] = rlbfgs(problem, x0)
 function [x, cost, info, options] = rlbfgs(problem, x0, options)
 function [x, cost, info, options] = rlbfgs(problem, [], options)


 This is a Riemannian limited memory BFGS solver (quasi-Newton method), 
 which aims to minimize the cost function in the given problem structure.
 It requires access to the gradient of the cost function.

 The parameter options.memory specifies the number of past iterations the
 algorithm remembers and uses to approximate the inverse Hessian of the
 cost. The default value is 30.
 For unlimited memory, set options.memory = Inf.


 For a description of the algorithm and theorems offering convergence
 guarantees, see the references below.

 The initial iterate is x0 if it is provided. Otherwise, a random point on
 the manifold is picked. To specify options whilst not specifying an
 initial iterate, give x0 as [] (the empty matrix).
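
 As a quick illustration, here is a minimal usage sketch (not part of this
 file): it minimizes the Rayleigh quotient x'*A*x over the unit sphere,
 assuming the Manopt toolbox is on the path. The matrix A and the option
 values are placeholders.

     n = 100;
     A = randn(n); A = (A + A')/2;      % symmetric test matrix (example data)

     problem.M = spherefactory(n);      % manifold: unit sphere in R^n
     problem.cost  = @(x) x'*(A*x);     % smooth cost function
     problem.egrad = @(x) 2*A*x;        % Euclidean gradient (Manopt converts it)

     options.memory = 10;               % remember the 10 most recent iterations

     % Pass [] as x0 so that a random initial point is picked.
     [x, xcost, info] = rlbfgs(problem, [], options);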

 The two outputs 'x' and 'cost' are the last reached point on the manifold
 and its cost.
 
 The output 'info' is a struct-array which contains information about the
 iterations:
   iter (integer)
       The iteration number. The initial guess is 0.
   cost (double)
       The corresponding cost value.
   gradnorm (double)
       The (Riemannian) norm of the gradient.
   time (double)
       The total elapsed time in seconds to reach the corresponding cost.
   stepsize (double)
       The size of the step from the previous to the new iterate.
   accepted (Boolean)
       true if the step is accepted by the cautious update, false otherwise.
   And possibly additional information logged by options.statsfun.
 For example, type [info.gradnorm] to obtain a vector of the successive
 gradient norms reached at each iteration.
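
 For instance, the short sketch below gathers some of the logged statistics
 after a run and plots the gradient norm history (plain MATLAB plotting; the
 field names are those listed above):

     iters     = [info.iter];
     gradnorms = [info.gradnorm];
     semilogy(iters, gradnorms, '.-');
     xlabel('iteration'); ylabel('gradient norm');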

 The options structure is used to overwrite the default values. All
 options have a default value and are hence optional. To force an option
 value, pass an options structure with a field options.optionname, where
 optionname is one of the following and the default value is indicated
 between parentheses (a short sketch combining several of these options
 follows the list):

   tolgradnorm (1e-6)
       The algorithm terminates if the norm of the gradient drops below
       this. For well-scaled problems, a rule of thumb is that you can
       expect to reduce the gradient norm by 8 orders of magnitude
       (sqrt(eps)) compared to the gradient norm at a "typical" point (a
       rough initial iterate for example). Further decrease is sometimes
       possible, but inexact floating point arithmetic will eventually
       limit the final accuracy. If tolgradnorm is set too low, the
       algorithm may end up iterating forever (or at least until another
       stopping criterion triggers).
   maxiter (1000)
       The algorithm terminates if maxiter iterations were executed.
   maxtime (Inf)
       The algorithm terminates if maxtime seconds elapsed.
   minstepsize (1e-10)
     The minimum norm of the tangent vector that points from the current
     point to the next point. If the norm is less than minstepsize, the 
     program will terminate.
   memory (30)
      The number of previous iterations the program remembers. This is used
      to approximate the inverse Hessian at the current point. Because it is
      difficult to maintain a coordinate representation of operators on a
      manifold, a recursive method (a two-loop recursion) is used instead.
      The number of steps in the recursion is at most options.memory. This
      parameter can take any integer value >= 0, or Inf, in which case it is
      set to options.maxiter. If options.maxiter is also Inf, then memory is
      set to 10000 and a warning is displayed.
   strict_inc_func (@(t) 1e-4*t)
      The cautious update requires a real-valued function of t that is 0 at
      t = 0 and strictly increasing. See details in Wen Huang's paper
      "A Riemannian BFGS Method without Differentiated Retraction for
      Nonconvex Optimization Problems".
   statsfun (none)
       Function handle to a function that will be called after each
       iteration to provide the opportunity to log additional statistics.
       They will be returned in the info struct. See the generic Manopt
       documentation about solvers for further information. statsfun is
       called with the point x that was reached last.
   stopfun (none)
       Function handle to a function that will be called at each iteration
       to provide the opportunity to specify additional stopping criteria.
       See the generic Manopt documentation about solvers for further
       information.
   verbosity (2)
       Integer number used to tune the amount of output the algorithm
       generates during execution (mostly as text in the command window).
       The higher, the more output. 0 means silent. 3 and above includes a
       display of the options structure at the beginning of the execution.
   debug (false)
       Set to true to allow the algorithm to perform additional
       computations for debugging purposes. If a debugging test fails, you
       will be informed of it, usually via the command window. Be aware
       that these additional computations appear in the algorithm timings
       too, and may interfere with operations such as counting the number
       of cost evaluations, etc. (the debug calls get storedb too).
   storedepth (2)
       Maximum number of different points x of the manifold for which a
       store structure will be kept in memory in the storedb for caching.
       If memory usage is an issue, you may try to lower this number.
       Profiling may then help to investigate if a performance hit was
       incurred as a result.
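
 As a sketch, the options structure below combines several of the fields
 described above, including a statsfun and a stopfun handle. The handle
 signatures follow the generic Manopt solver documentation; the extra logged
 statistic (xnorm) and the cost target are arbitrary placeholders, and the
 problem structure is assumed to be set up as in the earlier sketch.

     options.maxiter     = 500;
     options.tolgradnorm = 1e-7;
     options.memory      = 20;

     % Log an extra statistic at each iteration; it appears in the info output.
     options.statsfun = @(problem, x, stats) ...
                            setfield(stats, 'xnorm', norm(x(:)));

     % Extra stopping criterion: stop once the cost drops below a target value.
     options.stopfun = @(problem, x, info, last) info(last).cost < 1e-10;

     [x, xcost, info] = rlbfgs(problem, [], options);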


 Please cite the Manopt paper as well as the research paper:
 @InBook{Huang2016,
   title     = {A {R}iemannian {BFGS} Method for Nonconvex Optimization Problems},
   author    = {Huang, W. and Absil, P.-A. and Gallivan, K.A.},
   year      = {2016},
   publisher = {Springer International Publishing},
   editor    = {Karas{\"o}zen, B{\"u}lent and Manguo{\u{g}}lu, Murat and Tezer-Sezgin, M{\"u}nevver and G{\"o}ktepe, Serdar and U{\u{g}}ur, {\"O}m{\"u}r},
   address   = {Cham},
   booktitle = {Numerical Mathematics and Advanced Applications ENUMATH 2015},
   pages     = {627--634},
   doi       = {10.1007/978-3-319-39929-4_60}
 }

 We point out that, at the moment, this implementation of RLBFGS can be
 slower than the implementation in ROPTLIB by Wen Huang et al. referenced
 above. For the purpose of comparing to their work, please use their
 implementation.

SUBFUNCTIONS

function dir = getDirection(M, xCur, xCurGradient, sHistory, yHistory, rhoHistory, scaleFactor, k)
function stats = savestats()

SOURCE CODE

0001 function [x, cost, info, options] = rlbfgs(problem, x0, options)
0002 % Riemannian limited memory BFGS solver for smooth objective functions.
0003 %
0004 % function [x, cost, info, options] = rlbfgs(problem)
0005 % function [x, cost, info, options] = rlbfgs(problem, x0)
0006 % function [x, cost, info, options] = rlbfgs(problem, x0, options)
0007 % function [x, cost, info, options] = rlbfgs(problem, [], options)
0008 %
0009 %
0010 % This is a Riemannian limited memory BFGS solver (quasi-Newton method),
0011 % which aims to minimize the cost function in the given problem structure.
0012 % It requires access to the gradient of the cost function.
0013 %
0014 % Parameter options.memory can be used to specify the number of iterations
0015 % the algorithm remembers and uses to approximate the inverse Hessian of
0016 % the cost. Default value is 30.
0017 % For unlimited memory, set options.memory = Inf.
0018 %
0019 %
0020 % For a description of the algorithm and theorems offering convergence
0021 % guarantees, see the references below.
0022 %
0023 % The initial iterate is x0 if it is provided. Otherwise, a random point on
0024 % the manifold is picked. To specify options whilst not specifying an
0025 % initial iterate, give x0 as [] (the empty matrix).
0026 %
0027 % The two outputs 'x' and 'cost' are the last reached point on the manifold
0028 % and its cost.
0029 %
0030 % The output 'info' is a struct-array which contains information about the
0031 % iterations:
0032 %   iter (integer)
0033 %       The iteration number. The initial guess is 0.
0034 %    cost (double)
0035 %       The corresponding cost value.
0036 %    gradnorm (double)
0037 %       The (Riemannian) norm of the gradient.
0038 %    time (double)
0039 %       The total elapsed time in seconds to reach the corresponding cost.
0040 %    stepsize (double)
0041 %       The size of the step from the previous to the new iterate.
0042 %   accepted (Boolean)
0043 %       true if the step is accepted by the cautious update, false otherwise.
0044 %   And possibly additional information logged by options.statsfun.
0045 % For example, type [info.gradnorm] to obtain a vector of the successive
0046 % gradient norms reached at each iteration.
0047 %
0048 % The options structure is used to overwrite the default values. All
0049 % options have a default value and are hence optional. To force an option
0050 % value, pass an options structure with a field options.optionname, where
0051 % optionname is one of the following and the default value is indicated
0052 % between parentheses:
0053 %
0054 %   tolgradnorm (1e-6)
0055 %       The algorithm terminates if the norm of the gradient drops below
0056 %       this. For well-scaled problems, a rule of thumb is that you can
0057 %       expect to reduce the gradient norm by 8 orders of magnitude
0058 %       (sqrt(eps)) compared to the gradient norm at a "typical" point (a
0059 %       rough initial iterate for example). Further decrease is sometimes
0060 %       possible, but inexact floating point arithmetic will eventually
0061 %       limit the final accuracy. If tolgradnorm is set too low, the
0062 %       algorithm may end up iterating forever (or at least until another
0063 %       stopping criterion triggers).
0064 %   maxiter (1000)
0065 %       The algorithm terminates if maxiter iterations were executed.
0066 %   maxtime (Inf)
0067 %       The algorithm terminates if maxtime seconds elapsed.
0068 %   minstepsize (1e-10)
0069 %     The minimum norm of the tangent vector that points from the current
0070 %     point to the next point. If the norm is less than minstepsize, the
0071 %     program will terminate.
0072 %   memory (30)
0073 %     The number of previous iterations the program remembers. This is used
0074 %     to approximate the inverse Hessian at the current point. Because of
0075 %     difficulty of maintaining a representation of operators in terms of
0076 %     coordinates, a recursive method is used. The number of steps in the
0077 %     recursion is at most options.memory. This parameter can take any
0078 %     integer value >= 0, or Inf, which is taken to be options.maxiter. If
0079 %     options.maxiter has value Inf, then it will take value 10000 and a
0080 %     warning will be displayed.
0081 %   strict_inc_func (@(t) 1e-4*t)
0082 %     The Cautious step needs a real function that has value 0 at t = 0,
0083 %     and is strictly increasing. See details in Wen Huang's paper
0084 %     "A Riemannian BFGS Method without Differentiated Retraction for
0085 %     Nonconvex Optimization Problems"
0086 %   statsfun (none)
0087 %       Function handle to a function that will be called after each
0088 %       iteration to provide the opportunity to log additional statistics.
0089 %       They will be returned in the info struct. See the generic Manopt
0090 %       documentation about solvers for further information. statsfun is
0091 %       called with the point x that was reached last.
0092 %   stopfun (none)
0093 %       Function handle to a function that will be called at each iteration
0094 %       to provide the opportunity to specify additional stopping criteria.
0095 %       See the generic Manopt documentation about solvers for further
0096 %       information.
0097 %   verbosity (2)
0098 %       Integer number used to tune the amount of output the algorithm
0099 %       generates during execution (mostly as text in the command window).
0100 %       The higher, the more output. 0 means silent. 3 and above includes a
0101 %       display of the options structure at the beginning of the execution.
0102 %   debug (false)
0103 %       Set to true to allow the algorithm to perform additional
0104 %       computations for debugging purposes. If a debugging test fails, you
0105 %       will be informed of it, usually via the command window. Be aware
0106 %       that these additional computations appear in the algorithm timings
0107 %       too, and may interfere with operations such as counting the number
0108 %       of cost evaluations, etc. (the debug calls get storedb too).
0109 %   storedepth (2)
0110 %       Maximum number of different points x of the manifold for which a
0111 %       store structure will be kept in memory in the storedb for caching.
0112 %       If memory usage is an issue, you may try to lower this number.
0113 %       Profiling may then help to investigate if a performance hit was
0114 %       incurred as a result.
0115 %
0116 %
0117 % Please cite the Manopt paper as well as the research paper:
0118 % @InBook{Huang2016,
0119 %   title     = {A {R}iemannian {BFGS} Method for Nonconvex Optimization Problems},
0120 %   author    = {Huang, W. and Absil, P.-A. and Gallivan, K.A.},
0121 %   year      = {2016},
0122 %   publisher = {Springer International Publishing},
0123 %   editor    = {Karas{\"o}zen, B{\"u}lent and Manguo{\u{g}}lu, Murat and Tezer-Sezgin, M{\"u}nevver and G{\"o}ktepe, Serdar and U{\u{g}}ur, {\"O}m{\"u}r},
0124 %   address   = {Cham},
0125 %   booktitle = {Numerical Mathematics and Advanced Applications ENUMATH 2015},
0126 %   pages     = {627--634},
0127 %   doi       = {10.1007/978-3-319-39929-4_60}
0128 % }
0129 %
0130 % We point out that, at the moment, this implementation of RLBFGS can be
0131 % slower than the implementation in ROPTLIB by Wen Huang et al. referenced
0132 % above. For the purpose of comparing to their work, please use their
0133 % implementation.
0134 %
0135 
0136 
0137 % This file is part of Manopt: www.manopt.org.
0138 % Original author: Changshuo Liu, July 19, 2017.
0139 % Contributors: Nicolas Boumal
0140 % Change log:
0141 %
0142 %   Nov. 27, 2017 (Wen Huang):
0143 %       Changed the default strict_inc_func to @(t) 1e-4*t from @(t) t.
0144 %
0145 %   Jan. 18, 2018 (NB):
0146 %       Corrected a bug related to the way the line search hint was defined
0147 %       by default.
0148 %
0149 %   Aug. 2, 2018 (NB):
0150 %       Using the new storedb.remove features to keep storedb lean, and
0151 %       reduced the default value of storedepth from 30 to 2 as a result.
0152 
0153 
0154     % Verify that the problem description is sufficient for the solver.
0155     if ~canGetCost(problem)
0156         warning('manopt:getCost', ...
0157                 'No cost provided. The algorithm will likely abort.');
0158     end
0159     if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
0160         % Note: we do not give a warning if an approximate gradient is
0161         % explicitly given in the problem description, as in that case the user
0162         % seems to be aware of the issue.
0163         warning('manopt:getGradient:approx', ...
0164            ['No gradient provided. Using an FD approximation instead (slow).\n' ...
0165             'It may be necessary to increase options.tolgradnorm.\n' ...
0166             'To disable this warning: warning(''off'', ''manopt:getGradient:approx'')']);
0167         problem.approxgrad = approxgradientFD(problem);
0168     end
0169     
0170     % This solver uses linesearch_hint as a line search algorithm. By
0171     % default, try a step size of 1, so that if the BFGS approximation of
0172     % the Hessian (or inverse Hessian) is good, then the iteration is close
0173     % to a Newton step.
0174     if ~canGetLinesearch(problem)
0175         problem.linesearch = @(x, d) 1;
0176     end
0177     
0178     % Local defaults for the program
0179     localdefaults.minstepsize = 1e-10;
0180     localdefaults.maxiter = 1000;
0181     localdefaults.tolgradnorm = 1e-6;
0182     localdefaults.memory = 30;
0183     localdefaults.strict_inc_func = @(t) 1e-4*t;
0184     localdefaults.ls_max_steps = 25;
0185     localdefaults.storedepth = 2;
0186     
0187     % Merge global and local defaults, then merge w/ user options, if any.
0188     localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
0189     if ~exist('options', 'var') || isempty(options)
0190         options = struct();
0191     end
0192     options = mergeOptions(localdefaults, options);
0193     
0194     % To make sure memory in range [0, Inf)
0195     options.memory = max(options.memory, 0);
0196     if options.memory == Inf
0197         if isinf(options.maxiter)
0198             options.memory = 10000;
0199             warning('rlbfgs:memory', ['options.memory and options.maxiter' ...
0200               ' are both Inf; options.memory has been changed to 10000.']);
0201         else
0202             options.memory = options.maxiter;
0203         end
0204     end
0205     
0206     M = problem.M;
0207     
0208     
0209     timetic = tic();
0210     
0211     
0212     % Create a random starting point if no starting point is provided.
0213     if ~exist('x0', 'var')|| isempty(x0)
0214         xCur = M.rand();
0215     else
0216         xCur = x0;
0217     end
0218     
0219     % Create a store database and get a key for the current x
0220     storedb = StoreDB(options.storedepth);
0221     key = storedb.getNewKey();
0222     
0223     % __________Initialization of variables______________
0224     % Number of iterations since the last restart
0225     k = 0;  
0226     % Total number of BFGS iterations
0227     iter = 0; 
0228     
0229     % This cell stores step vectors which point from x_{t} to x_{t+1} for t
0230     % indexing the last iterations, capped at options.memory.
0231     % That is, it stores up to options.memory of the most recent step
0232     % vectors. However, the implementation below does not need step vectors
0233     % in their respective tangent spaces at x_{t}'s. Rather, it requires
0234     % them transported to the current point's tangent space by vector
0235 % transport. For details regarding the requirements on the vector
0236 % transport, see the reference paper by Huang et al.
0237     % In this implementation, those step vectors are iteratively
0238     % transported to the current point's tangent space after every
0239     % iteration. Thus, at every iteration, vectors in sHistory are in the
0240     % current point's tangent space.
0241     sHistory = cell(1, options.memory);
0242     
0243     % This cell stores the differences for latest t's of the gradient at
0244     % x_{t+1} and the gradient at x_{t}, transported to x_{t+1}'s tangent
0245     % space. The memory is also capped at options.memory.
0246     yHistory = cell(1, options.memory);
0247     
0248     % rhoHistory{t} stores the reciprocal of the inner product between
0249     % sHistory{t} and yHistory{t}.
0250     rhoHistory = cell(1, options.memory);
0251     
0252     % Scaling of direction given by getDirection for acceptable step
0253     alpha = 1; 
0254     
0255     % Scaling of initial matrix, Barzilai-Borwein.
0256     scaleFactor = 1;
0257     
0258     % Norm of the step
0259     stepsize = 1;
0260     
0261     % Stores whether the step is accepted by the cautious update check.
0262     accepted = true;
0263     
0264     % Query the cost function and its gradient
0265     [xCurCost, xCurGradient] = getCostGrad(problem, xCur, storedb, key);
0266     
0267     xCurGradNorm = M.norm(xCur, xCurGradient);
0268     
0269     % Line-search statistics for recording in info.
0270     lsstats = [];
0271     
0272     % Flag to control restarting scheme to avoid infinite loops (see below)
0273     ultimatum = false;
0274     
0275     % Save stats in a struct array info, and preallocate.
0276     stats = savestats();
0277     info(1) = stats;
0278     info(min(10000, options.maxiter+1)).iter = [];
0279     
0280     if options.verbosity >= 2
0281         fprintf(' iter                   cost val            grad. norm           alpha\n');
0282     end
0283     
0284     % Main iteration
0285     while true
0286 
0287         % Display iteration information
0288         if options.verbosity >= 2
0289         fprintf('%5d    %+.16e        %.8e      %.4e\n', ...
0290                 iter, xCurCost, xCurGradNorm, alpha);
0291         end
0292         
0293         % Start timing this iteration
0294         timetic = tic();
0295         
0296         % Run standard stopping criterion checks
0297         [stop, reason] = stoppingcriterion(problem, xCur, options, ...
0298                                            info, iter+1);
0299         
0300         % If none triggered, run specific stopping criterion check
0301         if ~stop 
0302             if stats.stepsize < options.minstepsize
0303                 % To avoid infinite loop and to push the search further
0304                 % in case BFGS approximation of Hessian is off towards
0305                 % the end, we erase the memory by setting k = 0;
0306                 % In this way, it starts off like a steepest descent.
0307                 % If even steepest descent does not work, then it is
0308                 % hopeless and we will terminate.
0309                 if ~ultimatum
0310                     if options.verbosity >= 2
0311                         fprintf(['stepsize is too small, restarting ' ...
0312                             'the bfgs procedure at the current point.\n']);
0313                     end
0314                     k = 0;
0315                     ultimatum = true;
0316                 else
0317                     stop = true;
0318                     reason = sprintf(['Last stepsize smaller than '  ...
0319                         'minimum allowed; options.minstepsize = %g.'], ...
0320                         options.minstepsize);
0321                 end
0322             else
0323                 % We are not in trouble: lift the ultimatum if it was on.
0324                 ultimatum = false;
0325             end
0326         end  
0327         
0328         if stop
0329             if options.verbosity >= 1
0330                 fprintf([reason '\n']);
0331             end
0332             break;
0333         end
0334 
0335         
0336         % Compute BFGS direction
0337         p = getDirection(M, xCur, xCurGradient, sHistory,...
0338                 yHistory, rhoHistory, scaleFactor, min(k, options.memory));
0339 
0340         % Execute line-search
0341         [stepsize, xNext, newkey, lsstats] = ...
0342             linesearch_hint(problem, xCur, p, xCurCost, ...
0343                             M.inner(xCur, xCurGradient, p), ...
0344                             options, storedb, key);
0345         
0346         % Record the BFGS step-multiplier alpha which was effectively
0347         % selected. Toward convergence, we hope to see alpha = 1.
0348         alpha = stepsize/M.norm(xCur, p);
0349         step = M.lincomb(xCur, alpha, p);
0350         
0351         
0352         % Query cost and gradient at the candidate new point.
0353         [xNextCost, xNextGrad] = getCostGrad(problem, xNext, storedb, newkey);
0354         
0355         % Compute sk and yk
0356         sk = M.transp(xCur, xNext, step);
0357         yk = M.lincomb(xNext, 1, xNextGrad, ...
0358                              -1, M.transp(xCur, xNext, xCurGradient));
0359 
0360         % Computation of the BFGS step is invariant under scaling of sk and
0361         % yk by a common factor. For numerical reasons, we scale sk and yk
0362         % so that sk is a unit norm vector.
0363         norm_sk = M.norm(xNext, sk);
0364         sk = M.lincomb(xNext, 1/norm_sk, sk);
0365         yk = M.lincomb(xNext, 1/norm_sk, yk);
0366         
0367         inner_sk_yk = M.inner(xNext, sk, yk);
0368         inner_sk_sk = M.norm(xNext, sk)^2;    % ensures nonnegativity
0369         
0370         
0371         % If the cautious step is accepted (which is the intended
0372         % behavior), we record sk, yk and rhok and need to do some
0373         % housekeeping. If the cautious step is rejected, these are not
0374         % recorded. In all cases, xNext is the next iterate: the notion of
0375         % accept/reject here is limited to whether or not we keep track of
0376         % sk, yk, rhok to update the BFGS operator.
0377         cap = options.strict_inc_func(xCurGradNorm);
0378         if inner_sk_sk ~= 0 && (inner_sk_yk / inner_sk_sk) >= cap
0379             
0380             accepted = true;
0381             
0382             rhok = 1/inner_sk_yk;
0383             
0384             scaleFactor = inner_sk_yk / M.norm(xNext, yk)^2;
0385             
0386             % Time to store the vectors sk, yk and the scalar rhok.
0387             % Remember: we need to transport all vectors to the most
0388             % current tangent space.
0389             
0390             % If we are out of memory
0391             if k >= options.memory
0392                 
0393                 % sk and yk are saved from 1 to the end with the most
0394                 % current recorded to the rightmost hand side of the cells
0395                 % that are occupied. When memory is full, do a shift so
0396                 % that the rightmost is earliest and replace it with the
0397                 % most recent sk, yk.
0398                 for  i = 2 : options.memory
0399                     sHistory{i} = M.transp(xCur, xNext, sHistory{i});
0400                     yHistory{i} = M.transp(xCur, xNext, yHistory{i});
0401                 end
0402                 if options.memory > 1
0403                     sHistory = sHistory([2:end, 1]);
0404                     yHistory = yHistory([2:end, 1]);
0405                     rhoHistory = rhoHistory([2:end 1]);
0406                 end
0407                 if options.memory > 0
0408                     sHistory{options.memory} = sk;
0409                     yHistory{options.memory} = yk;
0410                     rhoHistory{options.memory} = rhok;
0411                 end
0412                 
0413             % If we are not out of memory
0414             else
0415                 
0416                 for i = 1:k
0417                     sHistory{i} = M.transp(xCur, xNext, sHistory{i});
0418                     yHistory{i} = M.transp(xCur, xNext, yHistory{i});
0419                 end
0420                 sHistory{k+1} = sk;
0421                 yHistory{k+1} = yk;
0422                 rhoHistory{k+1} = rhok;
0423                 
0424             end
0425             
0426             k = k + 1;
0427             
0428         % The cautious step is rejected: we do not store sk, yk, rhok but
0429         % we still need to transport stored vectors to the new tangent
0430         % space.
0431         else
0432             
0433             accepted = false;
0434             
0435             for  i = 1 : min(k, options.memory)
0436                 sHistory{i} = M.transp(xCur, xNext, sHistory{i});
0437                 yHistory{i} = M.transp(xCur, xNext, yHistory{i});
0438             end
0439             
0440         end
0441         
0442         % Update variables to new iterate.
0443         storedb.removefirstifdifferent(key, newkey);
0444         xCur = xNext;
0445         key = newkey;
0446         xCurGradient = xNextGrad;
0447         xCurGradNorm = M.norm(xNext, xNextGrad);
0448         xCurCost = xNextCost;
0449         
0450         % iter is the number of iterations we have accomplished.
0451         iter = iter + 1;
0452         
0453         % Make sure we don't use too much memory for the store database
0454         % (this is independent from the BFGS memory.)
0455         storedb.purge();
0456         
0457         
0458         % Log statistics for freshly executed iteration
0459         stats = savestats();
0460         info(iter+1) = stats; 
0461         
0462     end
0463 
0464     
0465     % Housekeeping before we return
0466     info = info(1:iter+1);
0467     x = xCur;
0468     cost = xCurCost;
0469 
0470     if options.verbosity >= 1
0471         fprintf('Total time is %f [s] (excludes statsfun)\n', ...
0472                 info(end).time);
0473     end
0474 
0475     
0476     % Routine in charge of collecting the current iteration stats
0477     function stats = savestats()
0478         stats.iter = iter;
0479         stats.cost = xCurCost;
0480         stats.gradnorm = xCurGradNorm;
0481         if iter == 0
0482             stats.stepsize = NaN;
0483             stats.time = toc(timetic);
0484             stats.accepted = NaN;
0485         else
0486             stats.stepsize = stepsize;
0487             stats.time = info(iter).time + toc(timetic);
0488             stats.accepted = accepted;
0489         end
0490         stats.linesearch = lsstats;
0491         stats = applyStatsfun(problem, xCur, storedb, key, options, stats);
0492     end
0493 
0494 end
0495 
0496 
0497 
0498 
0499 % BFGS step, see Wen's paper for details. This function takes in a tangent
0500 % vector g, and applies an approximate inverse Hessian P to it to get Pg.
0501 % Then, -Pg is returned.
0502 %
0503 % Theory requires the vector transport to be isometric and to satisfy the
0504 % locking condition (see paper), but these properties do not seem to be
0505 % crucial in practice. If your manifold provides M.isotransp, it may be
0506 % good to do M.transp = M.isotransp; after loading M with a factory.
0507 %
0508 % This implementation operates in the tangent space of the most recent
0509 % point since all vectors in sHistory and yHistory have been transported
0510 % there.
0511 function dir = getDirection(M, xCur, xCurGradient, sHistory, yHistory, ...
0512                             rhoHistory, scaleFactor, k)
0513     
0514     q = xCurGradient;
0515     
0516     inner_s_q = zeros(1, k);
0517     
0518     for i = k : -1 : 1
0519         inner_s_q(1, i) = rhoHistory{i} * M.inner(xCur, sHistory{i}, q);
0520         q = M.lincomb(xCur, 1, q, -inner_s_q(1, i), yHistory{i});
0521     end
0522     
0523     r = M.lincomb(xCur, scaleFactor, q);
0524     
0525     for i = 1 : k
0526          omega = rhoHistory{i} * M.inner(xCur, yHistory{i}, r);
0527          r = M.lincomb(xCur, 1, r, inner_s_q(1, i)-omega, sHistory{i});
0528     end
0529     
0530     dir = M.lincomb(xCur, -1, r);
0531     
0532 end
