function [U, cost] = robust_pca(X, d)
% Computes a robust version of PCA (principal component analysis) on data.
%
% function [U, cost] = robust_pca(X, d)
%
% Given a matrix X of size p by n, such that each column represents a
% point in R^p, this computes U: an orthonormal basis of size p by d such
% that the column space of U captures the points X as well as possible.
% More precisely, the function attempts to compute U as the minimizer
% over the Grassmann manifold (the set of linear subspaces) of:
%
%    f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
%         = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||
%
% The output cost represents the average distance achieved with the
% returned U. Notice that the norms are not squared, for robustness.
%
% In practice, because this function is nonsmooth, it is smoothed with a
% pseudo-Huber loss function of parameter epsilon (noted e for short), and
% the smoothing parameter is iteratively reduced (with warm starts):
%
%    f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)
%
% with l_e(x) = sqrt(x^2 + e^2) - e (for e = 0, this is the absolute value).
%
% The intermediate optimization of the smooth cost over the Grassmann
% manifold is performed using the Manopt toolbox.
%
% Ideally, the non-outlier data should be centered. If it is not, this
% pre-processing centers all the data, but bear in mind that outliers will
% shift the center of mass too:
%
%    X = X - repmat(mean(X, 2), [1, size(X, 2)]);
%
% There are no guarantees that this code will return the optimal U.
% This code is distributed to illustrate one possible way of optimizing
% a nonsmooth cost function over a manifold, using Manopt with smoothing.
% For practical use, the constants in the code would need to be tuned.

% This file is part of Manopt and is copyrighted. See the license file.
%
% Main authors: Nicolas Boumal and Teng Zhang, May 2, 2014
% Contributors:
%
% Change log:
%
%   March 4, 2015 (NB):
%       Uses a pseudo-Huber loss rather than a Huber loss: this has the
%       nice advantage of being smooth and simpler to code (no if's).
%
%   April 8, 2015 (NB):
%       Built-in test data for quick tests; added comment about centering.
%
%   Aug 20, 2021 (XJ):
%       Added AD to compute the egrad.


    % If no inputs are given, generate random data for illustration.
    if nargin == 0
        % Generate some data points aligned on a subspace
        X = rand(2, 1)*(1:30) + .05*randn(2, 30).*[(1:30); (1:30)];
        % And add some random outliers to the mix
        P = randperm(size(X, 2));
        outliers = 10;
        X(:, P(1:outliers)) = 30*randn(2, outliers);
        % Center the data
        % X = X - repmat(mean(X, 2), [1, size(X, 2)]);
        d = 1;
    end


    % Prepare a Manopt problem structure for optimization of the given
    % cost (defined below) over the Grassmann manifold.
    [p, n] = size(X);
    manifold = grassmannfactory(p, d);
    problem.M = manifold;
    problem.cost = @robustpca_cost;
    problem.egrad = @robustpca_gradient;

    % Do classical PCA for the initial guess.
    % This is just one idea: it is not necessarily useful or ideal.
    % Using a random initial guess, and starting over for a few different
    % ones, is probably much better. For this example, we keep it simple.
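    % A minimal sketch of that random-restart alternative (not run here):
    % Manopt manifold factories expose a .rand() sampler, so a random
    % initial point on the Grassmann manifold could be drawn with
    %
    %   U = manifold.rand();
    %
    % and the optimization below repeated for a few such draws, keeping
    % the U that achieves the lowest final cost.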
    [U, ~, ~] = svds(X, d);


    % Iteratively reduce the smoothing constant epsilon and optimize
    % the cost function over Grassmann.
    epsilon = 1;
    n_iterations = 6;
    reduction = .5;
    options.verbosity = 2; % Change this number for more or less output
    warning('off', 'manopt:getHessian:approx');

    % An alternative way to compute the egrad is to use automatic
    % differentiation provided in the Deep Learning Toolbox (slower).
    % Call manoptAD to automatically obtain the egrad:
    % problem = manoptAD(problem, 'egrad');

    for iter = 1 : n_iterations
        U = trustregions(problem, U, options);
        epsilon = epsilon * reduction;
    end
    warning('on', 'manopt:getHessian:approx');


    % Return the cost as the actual average of distances, not smoothed.
    epsilon = 0;
    cost = robustpca_cost(U);


    % If working with the auto-generated input, plot the results.
    if nargin == 0
        figure;
        scatter(X(1, :), X(2, :));
        hold on;
        plot(U(1)*[-1, 1]*100, U(2)*[-1, 1]*100, 'r');
        % Compare to a standard PCA
        [Upca, ~, ~] = svds(X, 1);
        plot(Upca(1)*[-1, 1]*100, Upca(2)*[-1, 1]*100, 'k');
        hold off;
        xlim(1.1*[min(X(1, :)), max(X(1, :))]);
        ylim(1.1*[min(X(2, :)), max(X(2, :))]);
        legend('data points', 'Robust PCA fit', 'Standard PCA fit');
    end


    % Smoothed cost
    function value = robustpca_cost(U)

        vecs = U*(U'*X) - X;
        sqnrms = sum(vecs.^2, 1);
        vals = sqrt(sqnrms + epsilon^2) - epsilon;
        value = mean(vals);

    end

    % Euclidean gradient of the smoothed cost (it will be transformed into
    % the Riemannian gradient automatically by Manopt).
    function G = robustpca_gradient(U)

        % Note that the computation of vecs and sqnrms is redundant
        % with their computation in the cost function. To speed
        % up the code, it would be wise to use the caching capabilities
        % of Manopt (the store structure). See the online documentation.
        % It is not done here to keep the code a bit simpler.
        UtX = U'*X;
        vecs = U*UtX - X;
        sqnrms = sum(vecs.^2, 1);
        % This explicit loop is a bit slow; the vectorized code below it
        % computes the same gradient faster.
        % G = zeros(p, d);
        % for i = 1 : n
        %     G = G + (1/sqrt(sqnrms(i) + epsilon^2)) * vecs(:, i) * UtX(:, i)';
        % end
        % G = G/n;
        G = mean(multiscale(1./sqrt(sqnrms + epsilon^2), ...
                 multiprod(reshape(vecs, [p, 1, n]), ...
                           multitransp(reshape(UtX, [d, 1, n])))), 3);
    end

end
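
% Example usage (a minimal sketch; the data-generation lines are
% illustrative, not part of the function). Calling robust_pca() with no
% arguments runs the built-in 2D demo above. On user data, one might write:
%
%   p = 10; n = 200; d = 2;
%   X = randn(p, d) * randn(d, n);       % points on a d-dim subspace
%   X(:, 1:10) = 20*randn(p, 10);        % corrupt a few columns (outliers)
%   X = X - repmat(mean(X, 2), [1, n]);  % center (outliers shift the mean)
%   [U, cost] = robust_pca(X, d);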