Computes an approx. of the Hessian w/ finite differences of the gradient. function hessfd = getHessianFD(problem, x, d) function hessfd = getHessianFD(problem, x, d, storedb) function hessfd = getHessianFD(problem, x, d, storedb, key) Returns a finite difference approximation of the Hessian at x along d of the cost function described in the problem structure. The finite difference is based on computations of the gradient. storedb is a StoreDB object, key is the StoreDB key to point x. If the gradient cannot be computed, an exception is thrown. See also: approxhessianFD
0001 function hessfd = getHessianFD(problem, x, d, storedb, key) 0002 % Computes an approx. of the Hessian w/ finite differences of the gradient. 0003 % 0004 % function hessfd = getHessianFD(problem, x, d) 0005 % function hessfd = getHessianFD(problem, x, d, storedb) 0006 % function hessfd = getHessianFD(problem, x, d, storedb, key) 0007 % 0008 % Returns a finite difference approximation of the Hessian at x along d of 0009 % the cost function described in the problem structure. The finite 0010 % difference is based on computations of the gradient. 0011 % 0012 % storedb is a StoreDB object, key is the StoreDB key to point x. 0013 % 0014 % If the gradient cannot be computed, an exception is thrown. 0015 % 0016 % See also: approxhessianFD 0017 0018 % This file is part of Manopt: www.manopt.org. 0019 % Original author: Nicolas Boumal, Dec. 30, 2012. 0020 % Contributors: 0021 % Change log: 0022 % 0023 % Feb. 19, 2015 (NB): 0024 % It is sufficient to ensure positive radial linearity to guarantee 0025 % (together with other assumptions) that this approximation of the 0026 % Hessian will confer global convergence to the trust-regions method. 0027 % Formerly, in-code comments referred to the necessity of having 0028 % complete radial linearity, and that this was harder to achieve. 0029 % This appears not to be necessary after all, which simplifies the 0030 % code. 0031 % 0032 % April 3, 2015 (NB): 0033 % Works with the new StoreDB class system. 0034 % 0035 % Nov. 1, 2016 (NB): 0036 % Removed exception in case of unavailable gradient, as getGradient 0037 % now knows to fall back to an approximate gradient if need be. 0038 % 0039 % March 17, 2020 (NB): 0040 % Following a bug report by Marco Sutti, added the instruction 0041 % storedb.remove(key1); 0042 % to avoid memory usage ramping up when many inner iterations are 0043 % needed inside of tCG for trustregions. The deciding factor is that 0044 % there is no need to cache the gradient at the artificially produced 0045 % point used here for finite differencing, as this point is not 0046 % visible outside of this function: there is no reason we would visit 0047 % it again. 0048 0049 % Allow omission of the key, and even of storedb. 0050 if ~exist('key', 'var') 0051 if ~exist('storedb', 'var') 0052 storedb = StoreDB(); 0053 end 0054 key = storedb.getNewKey(); 0055 end 0056 0057 % Step size 0058 norm_d = problem.M.norm(x, d); 0059 0060 % First, check whether the step d is not too small 0061 if norm_d < eps 0062 hessfd = problem.M.zerovec(x); 0063 return; 0064 end 0065 0066 % Parameter: how far do we look? 0067 % If you need to change this parameter, use approxhessianFD explicitly. 0068 % A power of 2 is chosen so that scaling by epsilon does not incur any 0069 % round-off error in IEEE arithmetic. 0070 epsilon = 2^-14; 0071 0072 c = epsilon/norm_d; 0073 0074 % Compute the gradient at the current point. 0075 grad = getGradient(problem, x, storedb, key); 0076 0077 % Compute a point a little further along d and the gradient there. 0078 % Since this is a new point, we need a new key for it, for the storedb. 0079 x1 = problem.M.retr(x, d, c); 0080 key1 = storedb.getNewKey(); 0081 grad1 = getGradient(problem, x1, storedb, key1); 0082 0083 % Remove any caching associated to that new point, since there is no 0084 % reason we would be visiting it again. 0085 storedb.remove(key1); 0086 0087 % Transport grad1 back from x1 to x. 0088 grad1 = problem.M.transp(x1, x, grad1); 0089 0090 % Return the finite difference of them. 0091 hessfd = problem.M.lincomb(x, 1/c, grad1, -1/c, grad); 0092 0093 % Note: if grad and grad1 are relatively large vectors, then computing 0094 % their difference to obtain hessfd can result in large errors due to 0095 % floating point arithmetic. As a result, even though grad and grad1 0096 % are supposed to be tangent up to machine precision, the resulting 0097 % vector hessfd can be significantly further from being tangent. If so, 0098 % this will show in the 'residual check' in checkhessian. Thus, when 0099 % using a finite difference approximation, the residual should be 0100 % judged as compared to the norm of the gradient at the point under 0101 % consideration. This seems not to have caused trouble. If this should 0102 % become an issue for some application, the easy fix is to project the 0103 % result of the FD approximation: hessfd = problem.M.proj(x, hessfd). 0104 0105 end