
rlbfgs

PURPOSE

Riemannian limited memory BFGS solver for smooth objective functions.

SYNOPSIS

function [x, cost, info, options] = rlbfgs(problem, x0, options)

DESCRIPTION

 Riemannian limited memory BFGS solver for smooth objective functions.

 function [x, cost, info, options] = rlbfgs(problem)
 function [x, cost, info, options] = rlbfgs(problem, x0)
 function [x, cost, info, options] = rlbfgs(problem, x0, options)
 function [x, cost, info, options] = rlbfgs(problem, [], options)


 This is a Riemannian limited memory BFGS solver (quasi-Newton method), 
 which aims to minimize the cost function in the given problem structure.
 It requires access to the gradient of the cost function.

 Parameter options.memory can be used to specify the number of iterations
 the algorithm remembers and uses to approximate the inverse Hessian of
 the cost. Default value is 30.
 For unlimited memory, set options.memory = Inf.


 For a description of the algorithm and theorems offering convergence
 guarantees, see the references below.

 The initial iterate is x0 if it is provided. Otherwise, a random point on
 the manifold is picked. To specify options whilst not specifying an
 initial iterate, give x0 as [] (the empty matrix).

 The two outputs 'x' and 'cost' are the last reached point on the manifold
 and its cost. 
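
 For instance, a minimal call might look like the sketch below. This is
 only an illustration, not part of the solver: it assumes Manopt is on the
 path and uses the spherefactory manifold together with an arbitrary
 quadratic cost (a dominant eigenvector problem).

     n = 100;
     A = randn(n); A = (A + A')/2;      % random symmetric matrix
     problem.M = spherefactory(n);      % manifold: unit sphere in R^n
     problem.cost  = @(x) -x'*A*x;      % smooth cost to minimize
     problem.egrad = @(x) -2*A*x;       % Euclidean gradient; Manopt converts
                                        % it to the Riemannian gradient
     [x, cost, info, options] = rlbfgs(problem);   % random initial iterate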
 
 The output 'info' is a struct-array which contains information about the
 iterations:
   iter (integer)
       The iteration number. The initial guess is 0.
    cost (double)
       The corresponding cost value.
    gradnorm (double)
       The (Riemannian) norm of the gradient.
    time (double)
       The total elapsed time in seconds to reach the corresponding cost.
    stepsize (double)
       The size of the step from the previous to the new iterate.
   accepted (Boolean)
       true if the step is accepted by the cautious update, false otherwise.
   And possibly additional information logged by options.statsfun.
 For example, type [info.gradnorm] to obtain a vector of the successive
 gradient norms reached at each iteration.
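
 As a further sketch, such fields can be used to monitor convergence after
 a run; the plotting commands below are only one possibility:

     [x, cost, info] = rlbfgs(problem);
     semilogy([info.iter], [info.gradnorm], '.-');
     xlabel('Iteration number');
     ylabel('Gradient norm');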

 The options structure is used to overwrite the default values. All
 options have a default value and are hence optional. To force an option
 value, pass an options structure with a field options.optionname, where
 optionname is one of the following and the default value is indicated
 between parentheses:

   tolgradnorm (1e-6)
       The algorithm terminates if the norm of the gradient drops below
       this. For well-scaled problems, a rule of thumb is that you can
       expect to reduce the gradient norm by 8 orders of magnitude
       (sqrt(eps)) compared to the gradient norm at a "typical" point (a
       rough initial iterate for example). Further decrease is sometimes
       possible, but inexact floating point arithmetic will eventually
       limit the final accuracy. If tolgradnorm is set too low, the
       algorithm may end up iterating forever (or at least until another
       stopping criterion triggers).
   maxiter (1000)
       The algorithm terminates if maxiter iterations were executed.
   maxtime (Inf)
       The algorithm terminates if maxtime seconds elapsed.
   minstepsize (1e-10)
     The minimum norm of the tangent vector that points from the current
     point to the next point. If the norm is less than minstepsize, the 
     program will terminate.
   memory (30)
     The number of previous iterations the program remembers. This is used 
     to approximate the inverse Hessian at the current point. Because of
      the difficulty of maintaining a representation of operators in terms of
     coordinates, a recursive method is used. The number of steps in the
     recursion is at most options.memory. This parameter can take any
     integer value >= 0, or Inf, which is taken to be options.maxiter. If
     options.maxiter has value Inf, then it will take value 10000 and a
     warning will be displayed.
   linesearch (@linesearch_hint)
       Function handle to a line search function. The options structure is
       passed to the line search too, so you can pass it parameters. See
       each line search's documentation for info.
        By default, the initial multiplier tried is alpha = 1. This can be
       changed with options.linesearch: see help of linesearch_hint.
   strict_inc_func (@(t) t)
      The cautious step needs a real function that has value 0 at t = 0
      and is strictly increasing. See details in Wen Huang's paper
      "A Riemannian BFGS Method without Differentiated Retraction for
      Nonconvex Optimization Problems".
   statsfun (none)
       Function handle to a function that will be called after each
       iteration to provide the opportunity to log additional statistics.
       They will be returned in the info struct. See the generic Manopt
       documentation about solvers for further information. statsfun is
       called with the point x that was reached last.
   stopfun (none)
       Function handle to a function that will be called at each iteration
       to provide the opportunity to specify additional stopping criteria.
       See the generic Manopt documentation about solvers for further
       information.
   verbosity (2)
       Integer number used to tune the amount of output the algorithm
       generates during execution (mostly as text in the command window).
       The higher, the more output. 0 means silent. 3 and above includes a
       display of the options structure at the beginning of the execution.
   debug (false)
       Set to true to allow the algorithm to perform additional
       computations for debugging purposes. If a debugging test fails, you
       will be informed of it, usually via the command window. Be aware
       that these additional computations appear in the algorithm timings
       too, and may interfere with operations such as counting the number
       of cost evaluations, etc. (the debug calls get storedb too).
   storedepth (30)
       Maximum number of different points x of the manifold for which a
       store structure will be kept in memory in the storedb. If the
       caching features of Manopt are not used, this is irrelevant. If
       memory usage is an issue, you may try to lower this number.
       Profiling may then help to investigate if a performance hit was
       incurred as a result.
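
 To illustrate how options are passed (the particular values below are
 arbitrary and only meant to show the mechanism):

     options.tolgradnorm = 1e-8;   % tighter stopping tolerance
     options.maxiter     = 500;    % allow at most 500 iterations
     options.memory      = 10;     % shorter L-BFGS memory
     options.verbosity   = 1;      % reduce text output
     [x, cost, info] = rlbfgs(problem, [], options);   % no initial iterate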


 Please cite the Manopt paper as well as the research paper:
 @InBook{Huang2016,
   title     = {A {R}iemannian {BFGS} Method for Nonconvex Optimization Problems},
   author    = {Huang, W. and Absil, P.-A. and Gallivan, K.A.},
   year      = {2016},
   publisher = {Springer International Publishing},
   editor    = {Karas{\"o}zen, B{\"u}lent and Manguo{\u{g}}lu, Murat and Tezer-Sezgin, M{\"u}nevver and G{\"o}ktepe, Serdar and U{\u{g}}ur, {\"O}m{\"u}r},
   address   = {Cham},
   booktitle = {Numerical Mathematics and Advanced Applications ENUMATH 2015},
   pages     = {627--634},
   doi       = {10.1007/978-3-319-39929-4_60}
 }

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

function stats = savestats()
function dir = getDirection(M, xCur, xCurGradient, sHistory, yHistory, rhoHistory, scaleFactor, k)

SOURCE CODE

0001 function [x, cost, info, options] = rlbfgs(problem, x0, options)
0002 % Riemannian limited memory BFGS solver for smooth objective functions.
0003 %
0004 % function [x, cost, info, options] = rlbfgs(problem)
0005 % function [x, cost, info, options] = rlbfgs(problem, x0)
0006 % function [x, cost, info, options] = rlbfgs(problem, x0, options)
0007 % function [x, cost, info, options] = rlbfgs(problem, [], options)
0008 %
0009 %
0010 % This is a Riemannian limited memory BFGS solver (quasi-Newton method),
0011 % which aims to minimize the cost function in the given problem structure.
0012 % It requires access to the gradient of the cost function.
0013 %
0014 % Parameter options.memory can be used to specify the number of iterations
0015 % the algorithm remembers and uses to approximate the inverse Hessian of
0016 % the cost. Default value is 30.
0017 % For unlimited memory, set options.memory = Inf.
0018 %
0019 %
0020 % For a description of the algorithm and theorems offering convergence
0021 % guarantees, see the references below.
0022 %
0023 % The initial iterate is x0 if it is provided. Otherwise, a random point on
0024 % the manifold is picked. To specify options whilst not specifying an
0025 % initial iterate, give x0 as [] (the empty matrix).
0026 %
0027 % The two outputs 'x' and 'cost' are the last reached point on the manifold
0028 % and its cost.
0029 %
0030 % The output 'info' is a struct-array which contains information about the
0031 % iterations:
0032 %   iter (integer)
0033 %       The iteration number. The initial guess is 0.
0034 %    cost (double)
0035 %       The corresponding cost value.
0036 %    gradnorm (double)
0037 %       The (Riemannian) norm of the gradient.
0038 %    time (double)
0039 %       The total elapsed time in seconds to reach the corresponding cost.
0040 %    stepsize (double)
0041 %       The size of the step from the previous to the new iterate.
0042 %   accepted (Boolean)
0043 %       true if the step is accepted by the cautious update, false otherwise.
0044 %   And possibly additional information logged by options.statsfun.
0045 % For example, type [info.gradnorm] to obtain a vector of the successive
0046 % gradient norms reached at each iteration.
0047 %
0048 % The options structure is used to overwrite the default values. All
0049 % options have a default value and are hence optional. To force an option
0050 % value, pass an options structure with a field options.optionname, where
0051 % optionname is one of the following and the default value is indicated
0052 % between parentheses:
0053 %
0054 %   tolgradnorm (1e-6)
0055 %       The algorithm terminates if the norm of the gradient drops below
0056 %       this. For well-scaled problems, a rule of thumb is that you can
0057 %       expect to reduce the gradient norm by 8 orders of magnitude
0058 %       (sqrt(eps)) compared to the gradient norm at a "typical" point (a
0059 %       rough initial iterate for example). Further decrease is sometimes
0060 %       possible, but inexact floating point arithmetic will eventually
0061 %       limit the final accuracy. If tolgradnorm is set too low, the
0062 %       algorithm may end up iterating forever (or at least until another
0063 %       stopping criterion triggers).
0064 %   maxiter (1000)
0065 %       The algorithm terminates if maxiter iterations were executed.
0066 %   maxtime (Inf)
0067 %       The algorithm terminates if maxtime seconds elapsed.
0068 %   minstepsize (1e-10)
0069 %     The minimum norm of the tangent vector that points from the current
0070 %     point to the next point. If the norm is less than minstepsize, the
0071 %     program will terminate.
0072 %   memory (30)
0073 %     The number of previous iterations the program remembers. This is used
0074 %     to approximate the inverse Hessian at the current point. Because of
0075 %     the difficulty of maintaining a representation of operators in terms of
0076 %     coordinates, a recursive method is used. The number of steps in the
0077 %     recursion is at most options.memory. This parameter can take any
0078 %     integer value >= 0, or Inf, which is taken to be options.maxiter. If
0079 %     options.maxiter has value Inf, then it will take value 10000 and a
0080 %     warning will be displayed.
0081 %   linesearch (@linesearch_hint)
0082 %       Function handle to a line search function. The options structure is
0083 %       passed to the line search too, so you can pass it parameters. See
0084 %       each line search's documentation for info.
0085 %       By default, the initial multiplier tried is alpha = 1. This can be
0086 %       changed with options.linesearch: see help of linesearch_hint.
0087 %   strict_inc_func (@(t) t)
0088 %     The cautious step needs a real function that has value 0 at t = 0
0089 %     and is strictly increasing. See details in Wen Huang's paper
0090 %     "A Riemannian BFGS Method without Differentiated Retraction for
0091 %     Nonconvex Optimization Problems"
0092 %   statsfun (none)
0093 %       Function handle to a function that will be called after each
0094 %       iteration to provide the opportunity to log additional statistics.
0095 %       They will be returned in the info struct. See the generic Manopt
0096 %       documentation about solvers for further information. statsfun is
0097 %       called with the point x that was reached last.
0098 %   stopfun (none)
0099 %       Function handle to a function that will be called at each iteration
0100 %       to provide the opportunity to specify additional stopping criteria.
0101 %       See the generic Manopt documentation about solvers for further
0102 %       information.
0103 %   verbosity (2)
0104 %       Integer number used to tune the amount of output the algorithm
0105 %       generates during execution (mostly as text in the command window).
0106 %       The higher, the more output. 0 means silent. 3 and above includes a
0107 %       display of the options structure at the beginning of the execution.
0108 %   debug (false)
0109 %       Set to true to allow the algorithm to perform additional
0110 %       computations for debugging purposes. If a debugging test fails, you
0111 %       will be informed of it, usually via the command window. Be aware
0112 %       that these additional computations appear in the algorithm timings
0113 %       too, and may interfere with operations such as counting the number
0114 %       of cost evaluations, etc. (the debug calls get storedb too).
0115 %   storedepth (30)
0116 %       Maximum number of different points x of the manifold for which a
0117 %       store structure will be kept in memory in the storedb. If the
0118 %       caching features of Manopt are not used, this is irrelevant. If
0119 %       memory usage is an issue, you may try to lower this number.
0120 %       Profiling may then help to investigate if a performance hit was
0121 %       incurred as a result.
0122 %
0123 %
0124 % Please cite the Manopt paper as well as the research paper:
0125 % @InBook{Huang2016,
0126 %   title     = {A {R}iemannian {BFGS} Method for Nonconvex Optimization Problems},
0127 %   author    = {Huang, W. and Absil, P.-A. and Gallivan, K.A.},
0128 %   year      = {2016},
0129 %   publisher = {Springer International Publishing},
0130 %   editor    = {Karas{\"o}zen, B{\"u}lent and Manguo{\u{g}}lu, Murat and Tezer-Sezgin, M{\"u}nevver and G{\"o}ktepe, Serdar and U{\u{g}}ur, {\"O}m{\"u}r},
0131 %   address   = {Cham},
0132 %   booktitle = {Numerical Mathematics and Advanced Applications ENUMATH 2015},
0133 %   pages     = {627--634},
0134 %   doi       = {10.1007/978-3-319-39929-4_60}
0135 % }
0136 %
0137 
0138 
0139 % This file is part of Manopt: www.manopt.org.
0140 % Original author: Changshuo Liu, July 19, 2017.
0141 % Contributors: Nicolas Boumal
0142 % Change log:
0143 
0144 
0145     % Verify that the problem description is sufficient for the solver.
0146     if ~canGetCost(problem)
0147         warning('manopt:getCost', ...
0148             'No cost provided. The algorithm will likely abort.');
0149     end
0150     if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
0151         % Note: we do not give a warning if an approximate gradient is
0152         % explicitly given in the problem description, as in that case the user
0153         % seems to be aware of the issue.
0154         warning('manopt:getGradient:approx', ...
0155            ['No gradient provided. Using an FD approximation instead (slow).\n' ...
0156             'It may be necessary to increase options.tolgradnorm.\n' ...
0157             'To disable this warning: warning(''off'', ''manopt:getGradient:approx'')']);
0158         problem.approxgrad = approxgradientFD(problem);
0159     end
0160     
0161     % Local defaults for the program
0162     localdefaults.minstepsize = 1e-10;
0163     localdefaults.maxiter = 1000;
0164     localdefaults.tolgradnorm = 1e-6;
0165     localdefaults.memory = 30;
0166     localdefaults.strict_inc_func = @(t) t;
0167     localdefaults.ls_max_steps  = 25;
0168     localdefaults.storedepth = 30;
0169     localdefaults.linesearch = @linesearch_hint;
0170     
0171     % Merge global and local defaults, then merge w/ user options, if any.
0172     localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
0173     if ~exist('options', 'var') || isempty(options)
0174         options = struct();
0175     end
0176     options = mergeOptions(localdefaults, options);
0177     
0178     % To make sure memory in range [0, Inf)
0179     options.memory = max(options.memory, 0);
0180     if options.memory == Inf
0181         if isinf(options.maxiter)
0182             options.memory = 10000;
0183             warning('rlbfgs:memory', ['options.memory and options.maxiter' ...
0184               ' are both Inf; options.memory has been changed to 10000.']);
0185         else
0186             options.memory = options.maxiter;
0187         end
0188     end
0189     
0190     M = problem.M;
0191     
0192     % Create a random starting point if no starting point is provided.
0193     if ~exist('x0', 'var')|| isempty(x0)
0194         xCur = M.rand(); 
0195     else
0196         xCur = x0;
0197     end
0198     
0199     timetic = tic();
0200     
0201     % Create a store database and get a key for the current x
0202     storedb = StoreDB(options.storedepth);
0203     key = storedb.getNewKey();
0204     
0205     % __________Initialization of variables______________
0206     % Number of iterations since the last restart
0207     k = 0;  
0208     % Total number of BFGS iterations
0209     iter = 0; 
0210     
0211     % This cell stores step vectors which point from x_{t} to x_{t+1} for t
0212     % indexing the last iterations, capped at options.memory.
0213     % That is, it stores up to options.memory of the most recent step
0214     % vectors. However, the implementation below does not need step vectors
0215     % in their respective tangent spaces at x_{t}'s. Rather, it requires
0216     % them transported to the current point's tangent space by vector
0217     % transport. For details regarding the requirements on the vector
0218     % transport, see the reference paper by Huang et al.
0219     % In this implementation, those step vectors are iteratively
0220     % transported to the current point's tangent space after every
0221     % iteration. Thus, at every iteration, vectors in sHistory are in the
0222     % current point's tangent space.
0223     sHistory = cell(1, options.memory);
0224     
0225     % This cell stores the differences for latest t's of the gradient at
0226     % x_{t+1} and the gradient at x_{t}, transported to x_{t+1}'s tangent
0227     % space. The memory is also capped at options.memory.
0228     yHistory = cell(1, options.memory);
0229     
0230     % rhoHistory{t} stores the reciprocal of the inner product between
0231     % sHistory{t} and yHistory{t}.
0232     rhoHistory = cell(1, options.memory);
0233     
0234     % Scaling of direction given by getDirection for acceptable step
0235     alpha = 1; 
0236     
0237     % Scaling of initial matrix, Barzilai-Borwein.
0238     scaleFactor = 1;
0239     
0240     % Norm of the step
0241     stepsize = 1;
0242     
0243     % Stores whether the step is accepted by the cautious update check.
0244     accepted = true;
0245     
0246     % Query the cost function and its gradient
0247     [xCurCost, xCurGradient] = getCostGrad(problem, xCur, storedb, key);
0248     
0249     xCurGradNorm = M.norm(xCur, xCurGradient);
0250     
0251     % Line-search statistics for recording in info.
0252     lsstats = [];
0253     
0254     % Flag to control restarting scheme to avoid infinite loops (see below)
0255     ultimatum = false;
0256     
0257     % Save stats in a struct array info, and preallocate.
0258     stats = savestats();
0259     info(1) = stats;
0260     info(min(10000, options.maxiter+1)).iter = [];
0261     
0262     if options.verbosity >= 2
0263         fprintf(' iter                   cost val            grad. norm           alpha\n');
0264     end
0265     
0266     % Main iteration
0267     while true
0268 
0269         % Display iteration information
0270         if options.verbosity >= 2
0271         fprintf('%5d    %+.16e        %.8e      %.4e\n', ...
0272                 iter, xCurCost, xCurGradNorm, alpha);
0273         end
0274         
0275         % Start timing this iteration
0276         timetic = tic();
0277         
0278         % Run standard stopping criterion checks
0279         [stop, reason] = stoppingcriterion(problem, xCur, options, ...
0280                                            info, iter+1);
0281         
0282         % If none triggered, run specific stopping criterion check
0283         if ~stop 
0284             if stats.stepsize < options.minstepsize
0285                 % To avoid infinite loop and to push the search further
0286                 % in case BFGS approximation of Hessian is off towards
0287                 % the end, we erase the memory by setting k = 0;
0288                 % In this way, it starts off like a steepest descent.
0289                 % If even steepest descent does not work, then it is
0290                 % hopeless and we will terminate.
0291                 if ~ultimatum
0292                     if options.verbosity >= 2
0293                         fprintf(['stepsize is too small, restarting ' ...
0294                             'the bfgs procedure at the current point.\n']);
0295                     end
0296                     k = 0;
0297                     ultimatum = true;
0298                 else
0299                     stop = true;
0300                     reason = sprintf(['Last stepsize smaller than '  ...
0301                         'minimum allowed; options.minstepsize = %g.'], ...
0302                         options.minstepsize);
0303                 end
0304             else
0305                 % We are not in trouble: lift the ultimatum if it was on.
0306                 ultimatum = false;
0307             end
0308         end  
0309         
0310         if stop
0311             if options.verbosity >= 1
0312                 fprintf([reason '\n']);
0313             end
0314             break;
0315         end
0316 
0317         
0318         % Compute BFGS direction
0319         p = getDirection(M, xCur, xCurGradient, sHistory,...
0320                 yHistory, rhoHistory, scaleFactor, min(k, options.memory));
0321 
0322         % Execute line-search
0323         [stepsize, xNext, newkey, lsstats] = ...
0324             linesearch_hint(problem, xCur, p, xCurCost, ...
0325                             M.inner(xCur, xCurGradient, p), ...
0326                             options, storedb, key);
0327         
0328         % Record the BFGS step-multiplier alpha which was effectively
0329         % selected. Toward convergence, we hope to see alpha = 1.
0330         alpha = stepsize/M.norm(xCur, p);
0331         step = M.lincomb(xCur, alpha, p);
0332         
0333         
0334         % Query cost and gradient at the candidate new point.
0335         [xNextCost, xNextGrad] = getCostGrad(problem, xNext, storedb, newkey);
0336         
0337         % Compute sk and yk
0338         sk = M.transp(xCur, xNext, step);
0339         yk = M.lincomb(xNext, 1, xNextGrad, ...
0340                              -1, M.transp(xCur, xNext, xCurGradient));
0341 
0342         % Computation of the BFGS step is invariant under scaling of sk and
0343         % yk by a common factor. For numerical reasons, we scale sk and yk
0344         % so that sk is a unit norm vector.
0345         norm_sk = M.norm(xNext, sk);
0346         sk = M.lincomb(xNext, 1/norm_sk, sk);
0347         yk = M.lincomb(xNext, 1/norm_sk, yk);
0348         
0349         inner_sk_yk = M.inner(xNext, sk, yk);
0350         inner_sk_sk = M.norm(xNext, sk)^2;    % ensures nonnegativity
0351         
0352         
0353         % If the cautious step is accepted (which is the intended
0354         % behavior), we record sk, yk and rhok and need to do some
0355         % housekeeping. If the cautious step is rejected, these are not
0356         % recorded. In all cases, xNext is the next iterate: the notion of
0357         % accept/reject here is limited to whether or not we keep track of
0358         % sk, yk, rhok to update the BFGS operator.
0359         cap = options.strict_inc_func(xCurGradNorm);
0360         if inner_sk_sk ~= 0 && (inner_sk_yk / inner_sk_sk) >= cap
0361             
0362             accepted = true;
0363             
0364             rhok = 1/inner_sk_yk;
0365             
0366             scaleFactor = inner_sk_yk / M.norm(xNext, yk)^2;
0367             
0368             % Time to store the vectors sk, yk and the scalar rhok.
0369             % Remember: we need to transport all vectors to the most
0370             % current tangent space.
0371             
0372             % If we are out of memory
0373             if k >= options.memory
0374                 
0375                 % sk and yk are stored from oldest to most recent, with
0376                 % the most recent pair in the rightmost occupied cell.
0377                 % When memory is full, shift the entries left by one
0378                 % (circularly) so that the oldest pair lands in the
0379                 % rightmost cell, then overwrite it with the newest sk, yk.
0380                 for  i = 2 : options.memory
0381                     sHistory{i} = M.transp(xCur, xNext, sHistory{i});
0382                     yHistory{i} = M.transp(xCur, xNext, yHistory{i});
0383                 end
0384                 if options.memory > 1
0385                     sHistory = sHistory([2:end, 1]);
0386                     yHistory = yHistory([2:end, 1]);
0387                     rhoHistory = rhoHistory([2:end 1]);
0388                 end
0389                 if options.memory > 0
0390                     sHistory{options.memory} = sk;
0391                     yHistory{options.memory} = yk;
0392                     rhoHistory{options.memory} = rhok;
0393                 end
0394                 
0395             % If we are not out of memory
0396             else
0397                 
0398                 for  i = 1:k
0399                     sHistory{i} = M.transp(xCur, xNext, sHistory{i});
0400                     yHistory{i} = M.transp(xCur, xNext, yHistory{i});
0401                 end
0402                 sHistory{k+1} = sk;
0403                 yHistory{k+1} = yk;
0404                 rhoHistory{k+1} = rhok;
0405                 
0406             end
0407             
0408             k = k + 1;
0409             
0410         % The cautious step is rejected: we do not store sk, yk, rhok but
0411         % we still need to transport stored vectors to the new tangent
0412         % space.
0413         else
0414             
0415             accepted = false;
0416             
0417             for  i = 1 : min(k, options.memory)
0418                 sHistory{i} = M.transp(xCur, xNext, sHistory{i});
0419                 yHistory{i} = M.transp(xCur, xNext, yHistory{i});
0420             end
0421             
0422         end
0423         
0424         % Update variables to new iterate
0425         iter = iter + 1;
0426         xCur = xNext;
0427         key = newkey;
0428         xCurGradient = xNextGrad;
0429         xCurGradNorm = M.norm(xNext, xNextGrad);
0430         xCurCost = xNextCost;
0431         
0432         
0433         % Make sure we don't use too much memory for the store database
0434         % (this is independent from the BFGS memory.)
0435         storedb.purge();
0436         
0437         
0438         % Log statistics for freshly executed iteration
0439         stats = savestats();
0440         info(iter+1) = stats; 
0441         
0442     end
0443 
0444     
0445     % Housekeeping before we return
0446     info = info(1:iter+1);
0447     x = xCur;
0448     cost = xCurCost;
0449 
0450     if options.verbosity >= 1
0451         fprintf('Total time is %f [s] (excludes statsfun)\n', ...
0452                 info(end).time);
0453     end
0454 
0455     
0456     % Routine in charge of collecting the current iteration stats
0457     function stats = savestats()
0458         stats.iter = iter;
0459         stats.cost = xCurCost;
0460         stats.gradnorm = xCurGradNorm;
0461         if iter == 0
0462             stats.stepsize = NaN;
0463             stats.time = toc(timetic);
0464             stats.accepted = NaN;
0465         else
0466             stats.stepsize = stepsize;
0467             stats.time = info(iter).time + toc(timetic);
0468             stats.accepted = accepted;
0469         end
0470         stats.linesearch = lsstats;
0471         stats = applyStatsfun(problem, xCur, storedb, key, options, stats);
0472     end
0473 
0474 end
0475 
0476 
0477 
0478 
0479 % BFGS step, see Wen's paper for details. This function takes in a tangent
0480 % vector g, and applies an approximate inverse Hessian P to it to get Pg.
0481 % Then, -Pg is returned.
0482 %
0483 % Theory requires the vector transport to be isometric and to satisfy the
0484 % locking condition (see paper), but these properties do not seem to be
0485 % crucial in practice. If your manifold provides M.isotransp, it may be
0486 % good to do M.transp = M.isotransp; after loading M with a factory.
0487 %
0488 % This implementation operates in the tangent space of the most recent
0489 % point since all vectors in sHistory and yHistory have been transported
0490 % there.
0491 function dir = getDirection(M, xCur, xCurGradient, sHistory, yHistory, ...
0492                             rhoHistory, scaleFactor, k)
0493     
0494     q = xCurGradient;
0495     
0496     inner_s_q = zeros(1, k);
0497     
0498     for i = k : -1 : 1
0499         inner_s_q(1, i) = rhoHistory{i} * M.inner(xCur, sHistory{i}, q);
0500         q = M.lincomb(xCur, 1, q, -inner_s_q(1, i), yHistory{i});
0501     end
0502     
0503     r = M.lincomb(xCur, scaleFactor, q);
0504     
0505     for i = 1 : k
0506          omega = rhoHistory{i} * M.inner(xCur, yHistory{i}, r);
0507          r = M.lincomb(xCur, 1, r, inner_s_q(1, i)-omega, sHistory{i});
0508     end
0509     
0510     dir = M.lincomb(xCur, -1, r);
0511     
0512 end

Generated on Fri 08-Sep-2017 12:43:19 by m2html © 2005