
# arc

## PURPOSE

Adaptive regularization by cubics (ARC) minimization algorithm for Manopt

## SYNOPSIS

function [x, cost, info, options] = arc(problem, x, options)

## DESCRIPTION

Adaptive regularization by cubics (ARC) minimization algorithm for Manopt

function [x, cost, info, options] = arc(problem)
function [x, cost, info, options] = arc(problem, x0)
function [x, cost, info, options] = arc(problem, x0, options)
function [x, cost, info, options] = arc(problem, [], options)

Apply the ARC minimization algorithm to the problem defined in the
problem structure, starting at x0 if it is provided (otherwise, at a
random point on the manifold). To specify options whilst not specifying
an initial guess, give x0 as [] (the empty matrix).

In most of the examples bundled with the toolbox (see link below), the
solver can be replaced by the present one as is.
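As a minimal usage sketch (the manifold, matrix A and cost below are illustrative, assuming a standard Manopt problem built with spherefactory; any smooth cost on a Manopt manifold works the same way):

```matlab
% Minimize -x'*A*x over the unit sphere (dominant eigenvector problem).
% A is an illustrative symmetric matrix.
n = 100;
A = randn(n); A = (A + A')/2;

problem.M = spherefactory(n);
problem.cost  = @(x) -x'*(A*x);
problem.egrad = @(x) -2*A*x;        % Euclidean gradient
problem.ehess = @(x, u) -2*A*u;     % Euclidean Hessian along u

% Run ARC from a random initial point with default options.
[x, xcost, info] = arc(problem);
```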

The outputs x and cost are the last reached point on the manifold and its
cost. The struct-array info contains information about the iterations:
iter (integer)
The (outer) iteration number, i.e., number of steps considered
so far (whether accepted or rejected). The initial guess is 0.
cost (double)
The corresponding cost value.
gradnorm (double)
The (Riemannian) norm of the gradient.
hessiancalls (integer)
The number of Hessian calls issued by the subproblem solver to
compute this iterate.
time (double)
The total elapsed time in seconds to reach the corresponding cost.
rho (double)
The regularized performance ratio for the iterate.
See options.rho_regularization.
rhonum, rhoden (double)
Numerator and denominator of the performance ratio, before
regularization.
accepted (boolean)
Whether the proposed iterate was accepted or not.
stepsize (double)
The (Riemannian) norm of the vector returned by the subproblem
solver and which is retracted to obtain the proposed next iterate.
If accepted = true for the corresponding iterate, this is the size
of the step from the previous to the new iterate. If accepted is
false, the step was not executed and this is the size of the
rejected step.
sigma (double)
The cubic regularization parameter at the outer iteration.
And possibly additional information logged by options.statsfun or by
the subproblem solver.
For example, type [info.gradnorm] to obtain a vector of the successive
gradient norms reached and [info.time] to obtain a vector with the
corresponding computation times to reach that point.
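For instance, after a completed call to arc, the logged statistics can be inspected or plotted along these lines (a sketch; the plot styling is illustrative):

```matlab
% Convergence plot: gradient norm against outer iteration number.
semilogy([info.iter], [info.gradnorm], '.-');
xlabel('Outer iteration'); ylabel('Gradient norm');

% Fraction of proposed steps that were accepted
% (skip the first entry, which corresponds to the initial guess).
acceptance_rate = mean([info(2:end).accepted]);
```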

The options structure is used to overwrite the default values. All
options have a default value and are hence optional. To force an option
value, pass an options structure with a field options.optionname, where
optionname is one of the following. The default value is indicated
between parentheses. The subproblem solver may also accept options.

tolgradnorm (1e-6)
The algorithm terminates if the norm of the gradient drops below this.
maxiter (1000)
The algorithm terminates if maxiter (outer) iterations have been executed.
maxtime (Inf)
The algorithm terminates if maxtime seconds elapsed.
sigma_0 (100 / trust-regions default maximum radius)
Initial regularization parameter.
sigma_min (1e-7)
Minimum regularization parameter.
eta_1 (0.1)
If rho is below this, the step is unsuccessful (rejected).
eta_2 (0.9)
If rho exceeds this, the step is very successful.
gamma_1 (0.1)
Shrinking factor for regularization parameter if very successful.
gamma_2 (1.5)
Expansion factor for regularization parameter if unsuccessful.
subproblemsolver (@arc_lanczos)
Function handle to a subproblem solver. The subproblem solver will
also see this options structure, so that parameters can be passed
to it through here as well.
rho_regularization (1e3)
See help for the same parameter in the trustregions solver.
statsfun (none)
Function handle to a function that will be called after each
iteration to provide the opportunity to log additional statistics.
They will be returned in the info struct. See the generic Manopt
documentation about solvers for further information.
stopfun (none)
Function handle to a function that will be called at each iteration
to provide the opportunity to specify additional stopping criteria.
See the generic Manopt documentation about solvers for further
information.
verbosity (3)
Integer number used to tune the amount of output the algorithm
generates during execution (mostly as text in the command window).
The higher, the more output. 0 means silent.
storedepth (2)
Maximum number of different points x of the manifold for which a
store structure will be kept in memory in the storedb. If the
caching features of Manopt are not used, this is irrelevant. As of
version 5.0, this is not particularly important.
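A sketch of overriding a few of these defaults (the specific values are illustrative, not recommendations):

```matlab
% Tighten the gradient tolerance and cap the outer iterations.
options = struct();
options.tolgradnorm = 1e-8;    % stop earlier on small gradients
options.maxiter     = 500;
options.verbosity   = 1;       % quieter output

% The same struct is forwarded to the subproblem solver (arc_lanczos
% by default), so its options can be set here as well.
[x, cost, info, options] = arc(problem, [], options);
```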

Please cite the Manopt paper as well as the research paper:
@article{agarwal2018arcfirst,
author  = {Agarwal, N. and Boumal, N. and Bullins, B. and Cartis, C.},
title   = {Adaptive regularization with cubics on manifolds with a first-order analysis},
journal = {arXiv preprint arXiv:1806.00065},
year    = {2018}
}
```

## CROSS-REFERENCE INFORMATION

This function calls:
• StoreDB
• applyStatsfun Apply the statsfun function to a stats structure (for solvers).
• canGetApproxGradient Checks whether an approximate gradient can be computed for this problem.
• canGetApproxHessian Checks whether an approximate Hessian can be computed for this problem.
• canGetCost Checks whether the cost function can be computed for a problem structure.
• canGetGradient Checks whether the gradient can be computed for a problem structure.
• canGetHessian Checks whether the Hessian can be computed for a problem structure.
• getCostGrad Computes the cost function and the gradient at x in one call if possible.
• getGlobalDefaults Returns a structure with default option values for Manopt.
• mergeOptions Merges two options structures with one having precedence over the other.
• stoppingcriterion Checks for standard stopping criteria, as a helper to solvers.
• arc_lanczos Subproblem solver for ARC based on a Lanczos process.
• approxgradientFD Gradient approx. fnctn handle based on finite differences of the cost.
• approxhessianFD Hessian approx. fnctn handle based on finite differences of the gradient.
This function is called by:

## SOURCE CODE

```matlab
0001 function [x, cost, info, options] = arc(problem, x, options)
0002 % Adaptive regularization by cubics (ARC) minimization algorithm for Manopt
0003 %
0004 % function [x, cost, info, options] = arc(problem)
0005 % function [x, cost, info, options] = arc(problem, x0)
0006 % function [x, cost, info, options] = arc(problem, x0, options)
0007 % function [x, cost, info, options] = arc(problem, [], options)
0008 %
0009 % Apply the ARC minimization algorithm to the problem defined in the
0010 % problem structure, starting at x0 if it is provided (otherwise, at a
0011 % random point on the manifold). To specify options whilst not specifying
0012 % an initial guess, give x0 as [] (the empty matrix).
0013 %
0014 % In most of the examples bundled with the toolbox (see link below), the
0015 % solver can be replaced by the present one as is.
0016 %
0017 % The outputs x and cost are the last reached point on the manifold and its
0018 % cost. The struct-array info contains information about the iterations:
0019 %   iter (integer)
0020 %       The (outer) iteration number, i.e., number of steps considered
0021 %       so far (whether accepted or rejected). The initial guess is 0.
0022 %    cost (double)
0023 %       The corresponding cost value.
0024 %    gradnorm (double)
0025 %       The (Riemannian) norm of the gradient.
0026 %    hessiancalls (integer)
0027 %       The number of Hessian calls issued by the subproblem solver to
0028 %       compute this iterate.
0029 %    time (double)
0030 %       The total elapsed time in seconds to reach the corresponding cost.
0031 %    rho (double)
0032 %       The regularized performance ratio for the iterate.
0033 %       See options.rho_regularization.
0034 %    rhonum, rhoden (double)
0035 %       Numerator and denominator of the performance ratio, before
0036 %       regularization.
0037 %    accepted (boolean)
0038 %       Whether the proposed iterate was accepted or not.
0039 %    stepsize (double)
0040 %       The (Riemannian) norm of the vector returned by the subproblem
0041 %       solver and which is retracted to obtain the proposed next iterate.
0042 %       If accepted = true for the corresponding iterate, this is the size
0043 %       of the step from the previous to the new iterate. If accepted is
0044 %       false, the step was not executed and this is the size of the
0045 %       rejected step.
0046 %    sigma (double)
0047 %       The cubic regularization parameter at the outer iteration.
0048 %   And possibly additional information logged by options.statsfun or by
0049 %   the subproblem solver.
0050 % For example, type [info.gradnorm] to obtain a vector of the successive
0051 % gradient norms reached and [info.time] to obtain a vector with the
0052 % corresponding computation times to reach that point.
0053 %
0054 % The options structure is used to overwrite the default values. All
0055 % options have a default value and are hence optional. To force an option
0056 % value, pass an options structure with a field options.optionname, where
0057 % optionname is one of the following. The default value is indicated
0058 % between parentheses. The subproblem solver may also accept options.
0059 %
0060 %   tolgradnorm (1e-6)
0061 %       The algorithm terminates if the norm of the gradient drops below this.
0062 %   maxiter (1000)
0063 %       The algorithm terminates if maxiter (outer) iterations have been executed.
0064 %   maxtime (Inf)
0065 %       The algorithm terminates if maxtime seconds elapsed.
0066 %   sigma_0 (100 / trust-regions default maximum radius)
0067 %       Initial regularization parameter.
0068 %   sigma_min (1e-7)
0069 %       Minimum regularization parameter.
0070 %   eta_1 (0.1)
0071 %       If rho is below this, the step is unsuccessful (rejected).
0072 %   eta_2 (0.9)
0073 %       If rho exceeds this, the step is very successful.
0074 %   gamma_1 (0.1)
0075 %       Shrinking factor for regularization parameter if very successful.
0076 %   gamma_2 (1.5)
0077 %       Expansion factor for regularization parameter if unsuccessful.
0078 %   subproblemsolver (@arc_lanczos)
0079 %       Function handle to a subproblem solver. The subproblem solver will
0080 %       also see this options structure, so that parameters can be passed
0081 %       to it through here as well.
0082 %   rho_regularization (1e3)
0083 %       See help for the same parameter in the trustregions solver.
0084 %   statsfun (none)
0085 %       Function handle to a function that will be called after each
0086 %       iteration to provide the opportunity to log additional statistics.
0087 %       They will be returned in the info struct. See the generic Manopt
0088 %       documentation about solvers for further information.
0089 %   stopfun (none)
0090 %       Function handle to a function that will be called at each iteration
0091 %       to provide the opportunity to specify additional stopping criteria.
0092 %       See the generic Manopt documentation about solvers for further
0093 %       information.
0094 %   verbosity (3)
0095 %       Integer number used to tune the amount of output the algorithm
0096 %       generates during execution (mostly as text in the command window).
0097 %       The higher, the more output. 0 means silent.
0098 %   storedepth (2)
0099 %       Maximum number of different points x of the manifold for which a
0100 %       store structure will be kept in memory in the storedb. If the
0101 %       caching features of Manopt are not used, this is irrelevant. As of
0102 %       version 5.0, this is not particularly important.
0103 %
0104 %
0105 % Please cite the Manopt paper as well as the research paper:
0106 % @article{agarwal2018arcfirst,
0107 %   author  = {Agarwal, N. and Boumal, N. and Bullins, B. and Cartis, C.},
0108 %   title   = {Adaptive regularization with cubics on manifolds with a first-order analysis},
0109 %   journal = {arXiv preprint arXiv:1806.00065},
0110 %   year    = {2018}
0111 % }
0112 %
0113 %
0115
0116 % This file is part of Manopt: www.manopt.org.
0117 % Original authors: May 1, 2018,
0118 %    Naman Agarwal, Brian Bullins, Nicolas Boumal and Coralia Cartis.
0119 % Contributors:
0120 % Change log:
0121
0122     M = problem.M;
0123
0124     % Verify that the problem description is sufficient for the solver.
0125     if ~canGetCost(problem)
0126         warning('manopt:getCost', ...
0127                 'No cost provided. The algorithm will likely abort.');
0128     end
0129     if ~canGetGradient(problem) && ~canGetApproxGradient(problem)
0130         % Note: we do not give a warning if an approximate gradient is
0131         % explicitly given in the problem description, as in that case the
0132         % user seems to be aware of the issue.
0133         warning('manopt:getGradient:approx', ['No gradient provided. ' ...
0134                 'Using an FD approximation instead (slow).\n' ...
0135                 'It may be necessary to increase options.tolgradnorm.\n'...
0136                 'To disable this warning: ' ...
0137                 'warning(''off'', ''manopt:getGradient:approx'')']);
0138         problem.approxgrad = approxgradientFD(problem);
0139     end
0140     if ~canGetHessian(problem) && ~canGetApproxHessian(problem)
0141         % Note: we do not give a warning if an approximate Hessian is
0142         % explicitly given in the problem description, as in that case the
0143         % user seems to be aware of the issue.
0144         warning('manopt:getHessian:approx', ['No Hessian provided. ' ...
0145                 'Using an FD approximation instead.\n' ...
0146                 'To disable this warning: ' ...
0147                 'warning(''off'', ''manopt:getHessian:approx'')']);
0148         problem.approxhess = approxhessianFD(problem);
0149     end
0150
0151     % Set local defaults here
0152     localdefaults.tolgradnorm = 1e-6;
0153     localdefaults.maxiter = 1000;
0154     localdefaults.maxtime = inf;
0155     localdefaults.sigma_min = 1e-7;
0156     localdefaults.eta_1 = 0.1;
0157     localdefaults.eta_2 = 0.9;
0158     localdefaults.gamma_1 = 0.1;
0159     localdefaults.gamma_2 = 1.5;
0160     localdefaults.storedepth = 2;
0161     localdefaults.subproblemsolver = @arc_lanczos;
0162     localdefaults.rho_regularization = 1e3;
0163
0164     % Merge global and local defaults, then merge w/ user options, if any.
0165     localdefaults = mergeOptions(getGlobalDefaults(), localdefaults);
0166     if ~exist('options', 'var') || isempty(options)
0167         options = struct();
0168     end
0169     options = mergeOptions(localdefaults, options);
0170
0171     % Default initial sigma_0 is based on the initial Delta_bar of the
0172     % trustregions solver.
0173     if ~isfield(options, 'sigma_0')
0174         if isfield(M, 'typicaldist')
0175             options.sigma_0 = 100/M.typicaldist();
0176         else
0177             options.sigma_0 = 100/sqrt(M.dim());
0178         end
0179     end
0180
0181
0182     timetic = tic();
0183
0184     % If no initial point x is given by the user, generate one at random.
0185     if ~exist('x', 'var') || isempty(x)
0186         x = M.rand();
0187     end
0188
0189     % Create a store database and get a key for the current x.
0190     storedb = StoreDB(options.storedepth);
0191     key = storedb.getNewKey();
0192
0193     % Compute objective-related quantities for x.
0194     [cost, grad] = getCostGrad(problem, x, storedb, key);
0195     gradnorm = M.norm(x, grad);
0196
0197     % Initialize regularization parameter.
0198     sigma = options.sigma_0;
0199
0200     % Iteration counter.
0201     % At any point, iter is the number of fully executed iterations so far.
0202     iter = 0;
0203
0204     % Save stats in a struct array info, and preallocate.
0205     stats = savestats(problem, x, storedb, key, options);
0206     info(1) = stats;
0207     info(min(10000, options.maxiter+1)).iter = [];
0208
0209     if options.verbosity >= 2
0210         fprintf(' iter\t\t\t\t\tcost val\t\t grad norm       sigma   #Hess\n');
0211     end
0212
0213     % Iterate until stopping criterion triggers.
0214     while true
0215
0216         % Display iteration information.
0217         if options.verbosity >= 2
0218             fprintf('%5d\t %+.16e\t %.8e   %.2e', ...
0219                     iter, cost, gradnorm, sigma);
0220         end
0221
0222         % Start timing this iteration.
0223         timetic = tic();
0224
0225         % Run standard stopping criterion checks.
0226         [stop, reason] = stoppingcriterion(problem, x, options, ...
0227                                                              info, iter+1);
0228
0229         if stop
0230             if options.verbosity >= 1
0231                 fprintf(['\n' reason '\n']);
0232             end
0233             break;
0234         end
0235
0236         % Solve the ARC subproblem.
0237         % Outputs: eta is the tentative step (it is a tangent vector at x);
0238         % Heta is the result of applying the Hessian at x along eta (this
0239         % is often a natural by-product of the subproblem solver);
0240         % hesscalls is the number of Hessian calls issued in the solver;
0241         % stop_str is a string describing why the solver terminated; and
0242         % substats is some statistics about the solver's work to be logged.
0243         [eta, Heta, hesscalls, stop_str, substats] = ...
0244             options.subproblemsolver(problem, x, grad, gradnorm, ...
0245                                              sigma, options, storedb, key);
0246
0247         etanorm = M.norm(x, eta);
0248
0249         % Get a tentative next x by retracting the proposed step.
0250         newx = M.retr(x, eta);
0251         newkey = storedb.getNewKey();
0252
0253         % Compute the new cost-related quantities for proposal x.
0254         % We could just compute the cost here, as the gradient is only
0255         % necessary if the step is accepted; but we expect most steps are
0256         % accepted, and sometimes the gradient can be computed faster if it
0257         % is computed in conjunction with the cost.
0258         [newcost, newgrad] = getCostGrad(problem, newx, storedb, newkey);
0259
0260         % Compute a regularized ratio between actual and model improvement.
0261         rho_num = cost - newcost;
0262         vec_rho = M.lincomb(x, 1, grad, .5, Heta);
0263         rho_den = -M.inner(x, eta, vec_rho);
0264         rho_reg = options.rho_regularization*eps*max(1, abs(cost));
0265         rho = (rho_num+rho_reg) / (rho_den+rho_reg);
0266
0267         % Determine if the tentative step should be accepted or not.
0268         if rho >= options.eta_1
0269             accept = true;
0270             arc_str = 'acc ';
0271             % We accepted this step: erase cache of the previous point.
0272             storedb.removefirstifdifferent(key, newkey);
0273             x = newx;
0274             key = newkey;
0275             cost = newcost;
0276             grad = newgrad;
0277             gradnorm = M.norm(x, grad);
0278         else
0279             accept = false;
0280             arc_str = 'REJ ';
0281             % We rejected this step: erase cache of the tentative point.
0282             storedb.removefirstifdifferent(newkey, key);
0283         end
0284
0285         % Update the regularization parameter.
0286         if rho >= options.eta_2
0287             % Very successful step
0288             arc_str(4) = '-';
0289             sigma = max(options.sigma_min, options.gamma_1 * sigma);
0290         elseif rho >= options.eta_1
0291             % Successful step
0292             arc_str(4) = ' ';
0293         else
0294             % Unsuccessful step
0295             arc_str(4) = '+';
0296             sigma = options.gamma_2 * sigma;
0297         end
0298
0299         % iter is the number of iterations we have completed.
0300         iter = iter + 1;
0301
0302         % Make sure we don't use too much memory for the store database.
0303         storedb.purge();
0304
0305         % Log statistics for freshly executed iteration.
0306         stats = savestats(problem, x, storedb, key, options);
0307         info(iter+1) = stats;
0308
0309         if options.verbosity >= 2
0310             fprintf('   %5d  %s\n', hesscalls, [arc_str ' ' stop_str]);
0311         end
0312
0313     end
0314
0315     % Truncate the struct-array to the part we populated
0316     info = info(1:iter+1);
0317
0318     if options.verbosity >= 1
0319         fprintf('Total time is %f [s] (excludes statsfun)\n', info(end).time);
0320     end
0321
0322
0323     % Routine in charge of collecting the current iteration statistics
0324     function stats = savestats(problem, x, storedb, key, options)
0325
0326         stats.iter = iter;
0327         stats.cost = cost;
0328         stats.gradnorm = gradnorm;
0329         stats.sigma = sigma;
0330
0331         if iter == 0
0332             stats.hessiancalls = 0;
0333             stats.stepsize = NaN;
0334             stats.time = toc(timetic);
0335             stats.rho = inf;
0336             stats.rhonum = NaN;
0337             stats.rhoden = NaN;
0338             stats.accepted = true;
0339             stats.subproblem = struct();
0340         else
0341             stats.hessiancalls = hesscalls;
0342             stats.stepsize = etanorm;
0343             stats.time = info(iter).time + toc(timetic);
0344             stats.rho = rho;
0345             stats.rhonum = rho_num;
0346             stats.rhoden = rho_den;
0347             stats.accepted = accept;
0348             stats.subproblem = substats;
0349         end
0350
0351         % Similar to statsfun with trustregions: the x and store passed to
0352         % statsfun are that of the most recently accepted point after the
0353         % iteration fully executed.
0354         stats = applyStatsfun(problem, x, storedb, key, options, stats);
0355
0356     end
0357
0358 end
```

Generated on Mon 10-Sep-2018 11:48:06 by m2html © 2005