robust_pca

PURPOSE

Computes a robust version of PCA (principal component analysis) on data.

SYNOPSIS

function [U, cost] = robust_pca(X, d)

DESCRIPTION

 Computes a robust version of PCA (principal component analysis) on data.
 
 function [U, cost] = robust_pca(X, d)

 Given a matrix X of size p by n, such that each column represents a
 point in R^p, this computes U: an orthonormal basis of size p by d such
 that the column space of U captures the points X as well as possible.
 More precisely, the function attempts to compute U as the minimizer
 over the Grassmann manifold (the set of linear subspaces) of:

  f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
       = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||

 The output cost represents the average distance achieved with the
 returned U. Notice that norms are not squared, for robustness.
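
 As a concrete illustration (a minimal sketch, with U and X as above),
 the unsmoothed cost can be evaluated in a few lines of MATLAB:

   vecs = U*(U'*X) - X;            % residuals after projecting onto span(U)
   dists = sqrt(sum(vecs.^2, 1));  % one (unsquared) distance per point
   fU = mean(dists);               % f(U): the average distance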

 In practice, because this cost function is nonsmooth, it is smoothed with
 a pseudo-Huber loss function of parameter epsilon (denoted e for short),
 and the smoothing parameter is iteratively reduced (with warm starts):

   f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)

   with l_e(x) = sqrt(x^2 + e^2) - e (for e = 0, this is absolute value).
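
 As a small illustration, l_e is easy to write as an anonymous function,
 and it approaches the absolute value as e goes to zero:

   l = @(x, e) sqrt(x.^2 + e.^2) - e;
   l(3, 0)      % = 3, exactly the absolute value
   l(3, 0.01)   % ~ 2.99002, a smooth approximation of |x|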

 The intermediate optimization of the smooth cost over the Grassmann
 manifold is performed using the Manopt toolbox.

 Ideally, the non-outlier data should be centered. If it is not, the
 following pre-processing step centers all the data, but bear in mind
 that outliers will shift the center of mass too:
 X = X - repmat(mean(X, 2), [1, size(X, 2)]);
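
 If the outliers are expected to be severe, one possible variation (a
 suggestion, not part of the original code) is to center on the
 coordinate-wise median instead, which is less sensitive to outliers
 than the mean:
 X = X - repmat(median(X, 2), [1, size(X, 2)]);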

 There are no guarantees that this code will return the optimal U.
 This code is distributed to illustrate one possible way of optimizing
 a nonsmooth cost function over a manifold, using Manopt with smoothing.
 For practical use, the constants in the code would need to be tuned.
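
 As a usage sketch (assuming Manopt is installed and on the MATLAB path),
 the function could be called as follows; the sizes here are arbitrary
 choices for illustration:

   X = randn(5, 2)*randn(2, 200);   % 200 points near a 2-D subspace of R^5
   X(:, 1:20) = 10*randn(5, 20);    % corrupt 20 columns with outliers
   [U, cost] = robust_pca(X, 2);    % U is a 5-by-2 orthonormal basis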

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

function value = robustpca_cost(U)
function G = robustpca_gradient(U)
SOURCE CODE

function [U, cost] = robust_pca(X, d)
% Computes a robust version of PCA (principal component analysis) on data.
%
% function [U, cost] = robust_pca(X, d)
%
% Given a matrix X of size p by n, such that each column represents a
% point in R^p, this computes U: an orthonormal basis of size p by d such
% that the column space of U captures the points X as well as possible.
% More precisely, the function attempts to compute U as the minimizer
% over the Grassmann manifold (the set of linear subspaces) of:
%
%  f(U) = (1/n) Sum_{i = 1:n} dist(X(:, i), the space spanned by U)
%       = (1/n) Sum_{i = 1:n} || U*U'*X(:, i) - X(:, i) ||
%
% The output cost represents the average distance achieved with the
% returned U. Notice that norms are not squared, for robustness.
%
% In practice, because this cost function is nonsmooth, it is smoothed
% with a pseudo-Huber loss function of parameter epsilon (denoted e for
% short), and the smoothing parameter is iteratively reduced (with warm
% starts):
%
%   f_e(U) = (1/n) Sum_{i = 1:n} l_e(|| U*U'*X(:, i) - X(:, i) ||)
%
%   with l_e(x) = sqrt(x^2 + e^2) - e (for e = 0, this is absolute value).
%
% The intermediate optimization of the smooth cost over the Grassmann
% manifold is performed using the Manopt toolbox.
%
% Ideally, the non-outlier data should be centered. If it is not, the
% following pre-processing step centers all the data, but bear in mind
% that outliers will shift the center of mass too:
% X = X - repmat(mean(X, 2), [1, size(X, 2)]);
%
% There are no guarantees that this code will return the optimal U.
% This code is distributed to illustrate one possible way of optimizing
% a nonsmooth cost function over a manifold, using Manopt with smoothing.
% For practical use, the constants in the code would need to be tuned.
0038 
0039 % This file is part of Manopt and is copyrighted. See the license file.
0040 %
0041 % Main author: Nicolas Boumal and Teng Zhang, May 2, 2014
0042 % Contributors:
0043 %
0044 % Change log:
0045 %
0046 %   March 4, 2015 (NB):
0047 %       Uses a pseudo-Huber loss rather than a Huber loss: this has the
0048 %       nice advantage of being smooth and simpler to code (no if's).
0049 %
0050 %   April 8, 2015 (NB):
0051 %       Built-in test data for quick tests; added comment about centering.
0052 

    % If no inputs, generate random data for illustration purposes.
    if nargin == 0
        % Generate some data points aligned on a subspace
        X = rand(2, 1)*(1:30) + .05*randn(2, 30).*[(1:30); (1:30)];
        % And add some random outliers to the mix
        P = randperm(size(X, 2));
        outliers = 10;
        X(:, P(1:outliers)) = 30*randn(2, outliers);
        % Center the data
        % X = X - repmat(mean(X, 2), [1, size(X, 2)]);
        d = 1;
    end


    % Prepare a Manopt problem structure for optimization of the given
    % cost (defined below) over the Grassmann manifold.
    [p, n] = size(X);
    manifold = grassmannfactory(p, d);
    problem.M = manifold;
    problem.cost = @robustpca_cost;
    problem.egrad = @robustpca_gradient;

    % Do classical PCA for the initial guess.
    % This is just one idea: it is not necessarily useful or ideal.
    % Using a random initial guess, and starting over for a few different
    % ones is probably much better. For this example, we keep it simple.
    [U, ~, ~] = svds(X, d);
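    % A possible alternative, sketched here as a comment (it is not part
    % of the original code): draw a random point on the Grassmann manifold,
    %   U = manifold.rand();
    % and keep the best outcome over several such random restarts.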

    % Iteratively reduce the smoothing constant epsilon and optimize
    % the cost function over Grassmann.
    epsilon = 1;
    n_iterations = 6;
    reduction = .5;
    options.verbosity = 2; % Change this number for more or less output
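    % Optional sanity check, suggested here as a comment (not part of the
    % original code): now that epsilon has a value, Manopt's checkgradient
    % tool can numerically compare the cost and gradient defined above:
    %   checkgradient(problem);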
    warning('off', 'manopt:getHessian:approx');
    for iter = 1 : n_iterations
        U = trustregions(problem, U, options);
        epsilon = epsilon * reduction;
    end
    warning('on', 'manopt:getHessian:approx');

    % Return the cost as the actual average of distances, not smoothed.
    epsilon = 0;
    cost = robustpca_cost(U);

    % If working with the auto-generated input, plot the results.
    if nargin == 0
        scatter(X(1, :), X(2, :));
        hold on;
        plot(U(1)*[-1, 1]*100, U(2)*[-1, 1]*100, 'r');
        hold off;
        % Compare to a standard PCA
        [Upca, ~, ~] = svds(X, 1);
        hold on;
        plot(Upca(1)*[-1, 1]*100, Upca(2)*[-1, 1]*100, 'k');
        hold off;
        xlim(1.1*[min(X(1, :)), max(X(1, :))]);
        ylim(1.1*[min(X(2, :)), max(X(2, :))]);
        legend('data points', 'Robust PCA fit', 'Standard PCA fit');
    end


    % Smoothed cost
    function value = robustpca_cost(U)

        vecs = U*(U'*X) - X;
        sqnrms = sum(vecs.^2, 1);
        vals = sqrt(sqnrms + epsilon^2) - epsilon;
        value = mean(vals);

    end

    % Euclidean gradient of the smoothed cost (it will be transformed into
    % the Riemannian gradient automatically by Manopt).
    function G = robustpca_gradient(U)

        % Note that the computation of vecs and sqnrms is redundant
        % with their computation in the cost function. To speed
        % up the code, it would be wise to use the caching capabilities
        % of Manopt (the store structure). See online documentation.
        % It is not done here to keep the code a bit simpler.
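        % For reference (a derivation note, not in the original file):
        % writing s_i = ||U*U'*X(:, i) - X(:, i)||^2, the Euclidean
        % gradient computed below is
        %   G = (1/n) sum_i vecs(:, i) * UtX(:, i)' / sqrt(s_i + epsilon^2),
        % exactly as spelled out by the commented loop further down.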
        UtX = U'*X;
        vecs = U*UtX - X;
        sqnrms = sum(vecs.^2, 1);
        % This explicit loop is a bit slow; the vectorized code below it
        % computes the same gradient faster.
        % G = zeros(p, d);
        % for i = 1 : n
        %     G = G + (1/sqrt(sqnrms(i) + epsilon^2)) * vecs(:, i) * UtX(:, i)';
        % end
        % G = G/n;
        G = mean(multiscale(1./sqrt(sqnrms + epsilon^2), ...
                            multiprod(reshape(vecs, [p, 1, n]), ...
                                      multitransp(reshape(UtX, [d, 1, n])))), 3);
    end

end
