function [T_hub, T_lsq, T_cvx] = shapefit_smoothed(V, J)
% ShapeFit formulation for sensor network localization from pair directions
%
% function [T_hub, T_lsq, T_cvx] = shapefit_smoothed(V, J)
%
% This example is based on the paper http://arxiv.org/abs/1506.01437:
% ShapeFit: Exact location recovery from corrupted pairwise directions, 2015
% by Paul Hand, Choongbum Lee and Vladislav Voroninski.
%
% The problem is the following: there are n points t_1, ..., t_n in R^d
% which need to be estimated (localized). To this end, we are given
% measurements of some of the pairwise directions,
% v_ij = (t_i - t_j) / norm(t_i - t_j) + noise.
% Assume there are m such pairwise measurements, defining a graph with m
% edges over n nodes. J is the signed incidence matrix of this graph (see
% in code). To build J from lists I, J in R^m of nodes, use:
% J = sparse([I ; J], [(1:m)' ; (1:m)'], [ones(m, 1) ; -ones(m, 1)], n, m, 2*m);
%
% The measurements are arranged in the matrix V of size d x m. From V, we
% attempt to estimate t_1, ..., t_n, arranged in T, a matrix of size d x n.
% The estimation can only be done up to translation and scaling. The
% returned T's are centered: the columns sum to zero.
%
% ShapeFit is a formulation of this estimation problem which is robust to
% outliers. It is a nonsmooth, convex optimization problem over an affine
% space, i.e., a linear manifold. We smooth the cost using the pseudo-Huber
% loss and solve the problem using Manopt. This requires two ingredients:
% (1) a factory to describe the affine space, see shapefitfactory;
% (2) defining the cost and its derivative, and minimizing it while
% progressively tightening the smooth approximation (with warm-start).
%
% Simply run the example to see the results on random data. It compares the
% smoothed ShapeFit formulation against a least-squares formulation, when
% the measurements include outliers. See in code to vary the noise
% parameters, dimension d, number of nodes n, number of measurements m, ...
%
% Note: since the problem is convex, this returns the global optimum.
% This example also illustrates the use of Manopt for optimization under
% linear constraints: admittedly a simple subcase of optimization on
% manifolds.
%
% See also: shapefitfactory

% This file is part of Manopt: www.manopt.org.
% Original author: Nicolas Boumal, June 18, 2015.
% Contributors:
% Change log:
%
%   Jan.  4, 2021 (NB):
%       Changes for compatibility with Octave 6.1.0.
%
%   Aug. 23, 2021 (XJ):
%       Added AD to compute the egrad and the ehess

    % Generic useful functions
    center_cols = @(A) bsxfun(@minus, A, mean(A, 2));
    normalize_cols = @(A) bsxfun(@times, A, 1./sqrt(sum(A.^2, 1)));
    sqnorm_cols = @(A) sum(A.^2, 1);


    % DATA GENERATION
    %
    % If no inputs are specified, generate some random data for
    % illustration purposes.
    if nargin == 0

        % We estimate n points in R^d
        d = 2;
        n = 500;

        % Those points are the columns of T: they are what we need to
        % estimate, up to scaling and translation. We center T for
        % convenience.
        T_tru = center_cols(rand(d, n));
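        % Here, rand samples the points uniformly at random in the unit
        % cube [0, 1]^d. Any other point cloud would do just as well,
        % since the formulations below can only recover the points up to
        % the (unknown) global translation and scale anyway.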
        % We get a measurement of some pairs of relative directions.
        % Which pairs is encoded in this graph, with J being the (signed,
        % transposed) incidence matrix. J is n x m, sparse.
        % There are roughly edge_fraction * n * (n-1) / 2 measurements.
        edge_fraction = 0.10;
        % [ii, jj] = erdosrenyi(n, edge_fraction);
        [ii, jj] = randomgraph(n, edge_fraction*nchoosek(n, 2));
        m = length(ii);
        J = sparse([ii ; jj], [(1:m)' ; (1:m)'], ...
                   [ones(m, 1) ; -ones(m, 1)], n, m, 2*m);

        % The measurements give the directions from one point to another.
        % That is: we get the position difference, normalized. Here, with
        % Gaussian noise. Least-squares will be well-suited for this.
        sigma = .0;
        V = normalize_cols(T_tru*J + sigma*randn(d, m)); % d x m

        % Outliers: we replace some of the direction measurements by
        % uniformly random unit-norm vectors.
        outlier_fraction = .3;
        outliers = rand(1, m) < outlier_fraction;
        V(:, outliers) = normalize_cols(randn(d, sum(outliers)));

    end % done generating random data


    [d, m] = size(V);
    n = size(J, 1);
    assert(size(J, 2) == m, 'J must be n x m, with V of size d x m.');

    VJt = full(V*J');

    % This "manifold" describes the Euclidean space of matrices T of size
    % d x n such that <VJt, T> = 1 and T has centered columns: T1 = 0.
    problem.M = shapefitfactory(VJt);

    % This linear operator computes the orthogonal projection of each
    % difference ti - tj on the orthogonal space to v_ij.
    % If the alignment is compatible with the data, then this is zero.
    % A(T) is a d x m matrix.
    A = @(T) Afun(T, V, J);
    function AT = Afun(T, V, J)
        TJ = T*J;
        AT = TJ - bsxfun(@times, V, sum(V .* TJ, 1));
    end

    % Need the adjoint of A, too. Input is d x m, output is d x n.
    Astar = @(W) (W - bsxfun(@times, V, sum(V.*W, 1)))*J';


    % LEAST-SQUARES
    %
    % First, work with a least-squares formulation of the problem.
    % That is, we minimize a (very nice) convex cost over an affine space.
    % Since the smooth solvers in Manopt converge to critical points, this
    % means they converge to global optimizers.
    problem.cost  = @(T) 0.5*norm(A(T), 'fro')^2;
    problem.egrad = @(T) Astar(A(T));
    problem.ehess = @(T, Tdot) Astar(A(Tdot));
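    % Indeed, for f(T) = 0.5*norm(A(T), 'fro')^2 with A a linear operator,
    % the chain rule gives the Euclidean gradient Astar(A(T)); and since
    % that gradient is itself linear in T, the Euclidean Hessian is the
    % constant linear map Tdot -> Astar(A(Tdot)).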
    % An alternative way to compute the egrad and the ehess is to use
    % automatic differentiation, as provided by the Deep Learning Toolbox
    % (slower). Note that the function norm is not supported for AD so
    % far: replace norm(..., 'fro')^2 with cnormsqfro, described in the
    % file manoptADhelp.m. Operations between sparse matrices and
    % dlarrays are not supported so far either, so transform V, J into
    % full matrices. AD does not support bsxfun either: translate it into
    % an expression with repmat and .*. All of this makes the computation
    % much slower than specifying the egrad and the ehess by hand.
    % V_full = full(V);
    % J_full = full(J);
    % problem.cost = @(T) 0.5*cnormsqfro(A_AD(T));
    % function AT = A_AD(T)
    %     sum1 = sum(V_full .* (T*J_full), 1);
    %     repsum1 = repmat(sum1, size(T, 1), 1);
    %     AT = T*J_full - V_full.*repsum1;
    % end
    % Call manoptAD to prepare AD for the problem structure:
    % problem = manoptAD(problem);

    T_lsq = trustregions(problem);


    % PSEUDO-HUBER SMOOTHED SHAPEFIT
    %
    % Now solve the same, but with a pseudo-Huber loss instead of
    % least-squares.
    % We iteratively sharpen the Huber function, i.e., reduce delta.
    % It is important to warm start in such a fashion: trying to optimize
    % with a random initial guess and a very small delta is typically slow.
    % How fast one should decrease delta, and how accurately one should
    % optimize at each intermediate stage, is open for research.
    delta = 1;
    T_hub = []; % We could use T_lsq as initial guess, too.
    problem = rmfield(problem, 'ehess');
    warning('off', 'manopt:getHessian:approx');
    for iter = 1 : 12

        delta = delta / 2;

        % Pseudo-Huber loss, as a function of the squared norm x2.
        h = @(x2) sqrt(x2 + delta^2) - delta;

        % The cost sums h over the squared column norms of A(T). Since
        % h'(x2) = 1/(2*sqrt(x2 + delta^2)), the chain rule yields the
        % gradient below: each column of A(T) gets divided by
        % sqrt(x2 + delta^2) before applying Astar.
        problem.cost  = @(T) sum(h(sqnorm_cols(A(T))));
        problem.egrad = @(T) Astar(bsxfun(@times, A(T), ...
                                   1./sqrt(sqnorm_cols(A(T)) + delta^2)));

        % AD version
        % problem = rmfield(problem, 'egrad');
        % problem = rmfield(problem, 'autogradfunc');
        % problem = manoptAD(problem, 'egrad');

        % Solve, using the previous solution as initial guess.
        T_hub = trustregions(problem, T_hub);

    end


    % CVX SHAPEFIT
    %
    % Actual ShapeFit cost (nonsmooth), with CVX.
    % You can get CVX from http://cvxr.com/.
    use_cvx_if_available = false;
    if use_cvx_if_available && exist('cvx_version', 'file')
        T_cvx = shapefit_cvx(V, J);
    else
        T_cvx = NaN(d, n);
    end


    % VISUALIZATION
    %
    % If T_tru is available, for display, we scale the estimators to match
    % the norm of the target. The scaling factor is obtained by minimizing
    % the norm of the discrepancy: norm(T_tru - scale*T_xxx, 'fro').
    % A plot is produced if d is 2 or 3.
    if exist('T_tru', 'var') && (d == 2 || d == 3)

        T_lsq = T_lsq * trace(T_tru'*T_lsq) / norm(T_lsq, 'fro')^2;
        T_hub = T_hub * trace(T_tru'*T_hub) / norm(T_hub, 'fro')^2;
        T_cvx = T_cvx * trace(T_tru'*T_cvx) / norm(T_cvx, 'fro')^2;

        switch d
            case 2
                plot(T_tru(1, :), T_tru(2, :), 'bo', ...
                     T_lsq(1, :), T_lsq(2, :), 'rx', ...
                     T_hub(1, :), T_hub(2, :), 'k.', ...
                     T_cvx(1, :), T_cvx(2, :), 'g.');
            case 3
                plot3(T_tru(1, :), T_tru(2, :), T_tru(3, :), 'bo', ...
                      T_lsq(1, :), T_lsq(2, :), T_lsq(3, :), 'rx', ...
                      T_hub(1, :), T_hub(2, :), T_hub(3, :), 'k.', ...
                      T_cvx(1, :), T_cvx(2, :), T_cvx(3, :), 'g.');
        end

        legend('ground truth', 'least squares', ...
               sprintf('pseudo-huber, \\delta = %.1e', delta), ...
               'CVX ShapeFit');

        title(sprintf(['ShapeFit problem : d = %d, n = %d, edge ' ...
                       'fraction = %.2g, sigma = %.2g, outlier ' ...
                       'fraction = %.2g'], d, n, edge_fraction, sigma, ...
                       outlier_fraction));
        axis equal;

    end

end
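% For reference, the (nonsmooth) ShapeFit program that shapefit_cvx below
% solves exactly is:
%
%   minimize    sum_k norm( (I - v_k*v_k') * (T*J)(:, k) )   over T (d x n)
%   subject to  sum(T, 2) == 0   and   <V*J', T> == 1,
%
% where v_k is the k-th column of V: a (possibly corrupted) unit-norm
% direction measurement.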
% If CVX is available, it can be used to solve the nonsmooth problem
% directly, very elegantly.
function T_cvx = shapefit_cvx(V, J)
    d = size(V, 1);
    n = size(J, 1); %#ok<NASGU>
    VJt = full(V*J');
    cvx_begin
        variable T_cvx(d, n)
        % We want to minimize this:
        % minimize sum( norms( A(T_cvx), 2, 1 ) )
        % But unfortunately, CVX doesn't handle bsxfun. Instead, we use
        % repmat, which is slower, and hence biases the comparison
        % against CVX.
        minimize sum( norms( T_cvx*J - V .* repmat(sum(V .* (T_cvx*J), 1), [d, 1]) , 2, 1 ) )
        sum(T_cvx, 2) == zeros(d, 1); %#ok<NODEF,EQEFF>
        VJt(:).' * T_cvx(:) == 1; %#ok<EQEFF>
    cvx_end
end


function [I, J, A] = erdosrenyi(n, p) %#ok<DEFNU>
% Generate a random Erdos-Renyi graph with n nodes and edge probability p.
%
% [I, J, A] = erdosrenyi(n, p)
%
% Returns a list of edges (I(k), J(k)) for a random, undirected Erdos-Renyi
% graph with n nodes and edge probability p. A is the adjacency matrix.
%
% I(k) < J(k) for all k, i.e., all(I < J) is true.
%
% The memory requirements for this simple implementation scale as O(n^2).

    X = rand(n);
    mask = X <= p;
    X( mask) = 1;
    X(~mask) = 0;
    X = triu(X, 1);

    % A is the adjacency matrix
    A = X + X';

    [I, J] = find(X);

end


function [I, J, A] = randomgraph(n, m)
% Generates a random graph over n nodes with at most m edges.
%
% function [I, J, A] = randomgraph(n, m)
%
% Selects m (undirected) edges from a graph over n nodes, uniformly at
% random, with replacement. Self edges and repeated edges are then
% discarded. The remaining number of edges is at most m, and should be
% close to m if m is much smaller than nchoosek(n, 2).
%
% The output satisfies all(I < J). A is the corresponding adjacency matrix.
%
% Uses O(m) memory (not O(n^2)), making it fit for large, sparse graphs.

    % Generate m edges at random, with replacement, and remove repetitions.
    IJ = unique(sort(randi(n, m, 2), 2), 'rows');

    % Remove self-edges if any.
    IJ = IJ(IJ(:, 1) ~= IJ(:, 2), :);

    % Actual number of selected edges
    k = size(IJ, 1);

    I = IJ(:, 1);
    J = IJ(:, 2);

    % Build the adjacency matrix of the graph.
    A = sparse([I ; J], [J ; I], ones(2*k, 1), n, n, 2*k);

end
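
% Optional sanity check, not used above: a minimal numerical test that
% Astar, as defined in the main function, is indeed the adjoint of A, by
% comparing the inner products <A(T), W> and <T, Astar(W)> on random
% inputs. The relative error should be on the order of machine precision.
function adjoint_check(V, J) %#ok<DEFNU>

    [d, m] = size(V);
    n = size(J, 1);
    T = randn(d, n);
    W = randn(d, m);

    % Same formulas as A and Astar in the main function above.
    AT  = T*J - bsxfun(@times, V, sum(V .* (T*J), 1));
    AsW = (W - bsxfun(@times, V, sum(V .* W, 1)))*J';

    relerr = abs(AT(:).'*W(:) - T(:).'*AsW(:)) / ...
             (norm(AT, 'fro') * norm(W, 'fro'));
    fprintf('Adjoint check, relative error: %g\n', relerr);

end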