Home > manopt > manifolds > fixedranktensors > fixedrankfactory_tucker_preconditioned.m

# fixedrankfactory_tucker_preconditioned

## PURPOSE

Manifold of fixed multilinear rank tensors in Tucker format.

## SYNOPSIS

function M = fixedrankfactory_tucker_preconditioned(tensor_size, tensor_rank)

## DESCRIPTION

``` Manifold of fixed multilinear rank tensors in Tucker format.

function M = fixedrankfactory_tucker_preconditioned(tensor_size, tensor_rank)

n1 = tensor_size(1);
n2 = tensor_size(2);
n3 = tensor_size(3);
r1 = tensor_rank(1);
r2 = tensor_rank(2);
r3 = tensor_rank(3);

A point X on the manifold is represented as a structure with four
fields: U1, U2, U3 and G. The matrices U1 (n1-by-r1), U2 (n2-by-r2),
and U3 (n3-by-r3) are orthogonal matrices. G (r1-by-r2-by-r3) is a
multidimensional array.

Tangent vectors are represented as a structure with four fields:
U1, U2, U3, and G.

We exploit the quotient nature of Tucker decompositions to impose a
scaled inner product on the manifold. This suits least-squares problems.
For details, refer to the technical report:
"{R}iemannian preconditioning for tensor completion",
H. Kasai and B. Mishra, Arxiv preprint arXiv:1506.02159, 2015.

Please cite the Manopt paper as well as the research paper:
@TechReport{kasai2015precon,
Title   = {{R}iemannian preconditioning for tensor completion},
Author  = {Kasai, H. and Mishra, B.},
Journal = {Arxiv preprint arXiv:1506.02159},
Year    = {2015}
}```

## CROSS-REFERENCE INFORMATION

This function calls:
• hashmd5 Computes the MD5 hash of input data.
• lincomb Computes a linear combination of tangent vectors in the Manopt framework.
This function is called by:

## SOURCE CODE

```0001 function M = fixedrankfactory_tucker_preconditioned(tensor_size, tensor_rank)
0002 % Manifold of fixed multilinear rank tensors in Tucker format.
0003 %
0004 % function M = fixedrankfactory_tucker_preconditioned(tensor_size, tensor_rank)
0005 %
0006 % n1 = tensor_size(1);
0007 % n2 = tensor_size(2);
0008 % n3 = tensor_size(3);
0009 % r1 = tensor_rank(1);
0010 % r2 = tensor_rank(2);
0011 % r3 = tensor_rank(3);
0012 %
0013 % A point X on the manifold is represented as a structure with four
0014 % fields: U1, U2, U3 and G. The matrices U1 (n1-by-r1), U2 (n2-by-r2),
0015 % and U3 (n3-by-r3) are orthogonal matrices. G (r1-by-r2-by-r3) is a
0016 % multidimensional array.
0017 %
0018 % Tangent vectors are represented as a structure with four fields:
0019 % U1, U2, U3, and G.
0020 %
0021 % We exploit the quotient nature of Tucker decompositions to impose a
0022 % scaled inner product on the manifold. This suits least-squares problems.
0023 % For details, refer to the technical report:
0024 % "{R}iemannian preconditioning for tensor completion",
0025 % H. Kasai and B. Mishra, Arxiv preprint arXiv:1506.02159, 2015.
0026 %
0027 % Paper link: http://arxiv.org/abs/1506.02159.
0028 %
0029 % Please cite the Manopt paper as well as the research paper:
0030 %     @TechReport{kasai2015precon,
0031 %       Title   = {{R}iemannian preconditioning for tensor completion},
0032 %       Author  = {Kasai, H. and Mishra, B.},
0033 %       Journal = {Arxiv preprint arXiv:1506.02159},
0034 %       Year    = {2015}
0035 %     }
0036
0037 % Original authors: Hiroyuki Kasai and Bamdev Mishra, June 5, 2015.
0038 % Contributors:
0039 % Change log:
0040
0041     if length(tensor_rank) > 3
0042         error('Bad usage of fixedrankfactory_tucker_preconditioned. Currently, only handles 3-order tensors.');
0043     end
0044
0045     % Tensor size
0046     n1 = tensor_size(1);
0047     n2 = tensor_size(2);
0048     n3 = tensor_size(3);
0049
0050     % Core size or multilinear rank
0051     r1 = tensor_rank(1);
0052     r2 = tensor_rank(2);
0053     r3 = tensor_rank(3);
0054
0055
0056     speyer1 = speye(r1); % Sparse version of identity that is used in M.proj
0057     speyer2 = speye(r2);
0058     speyer3 = speye(r3);
0059
0060
0061     M.name = @() sprintf('G x U1 x U2 x U3 quotient Tucker manifold of %d-by-%d-by-%d tensor of rank %d-by-%d-by-%d.', n1, n2, n3, r1, r2, r3);
0062
0063     M.dim = @() n1*r1-r1^2 + n2*r2-r2^2 + n3*r3-r3^2 + r1*r2*r3;
0064
0065     % Some precomputations at point X to be used in the inner product (and
0066     % pretty much everywhere else)
0067     function X = prepare(X)
0068         if ~all(isfield(X,{'G1G1t','G1',...
0069                 'G2G2t','G2', ...
0070                 'G3G3t','G3'}) == 1)
0071
0072             X.G1 =  reshape(X.G, r1, r2*r3);
0073             X.G1G1t = X.G1*X.G1'; % Positive definite
0074
0075
0076             X.G2 = reshape(permute(X.G, [2 1 3]), r2, r1*r3);
0077             X.G2G2t = X.G2*X.G2'; % Positive definite
0078
0079
0080             X.G3 = reshape(permute(X.G, [3 1 2]), r3, r1*r2);
0081             X.G3G3t = X.G3*X.G3'; % Positive definite
0082         end
0083
0084
0085     end
0086
0087     % Choice of metric is motivated by symmetry and tuned to least-squares
0088     % cost function
0089     M.inner = @iproduct;
0090     function ip = iproduct(X, eta, zeta)
0091         X = prepare(X);
0092         ip =  trace(X.G1G1t*(eta.U1'*zeta.U1)) ...
0093             + trace(X.G2G2t*(eta.U2'*zeta.U2)) ...
0094             + trace(X.G3G3t*(eta.U3'*zeta.U3)) ...
0095             + (eta.G(:)'*zeta.G(:));
0096     end
0097     M.norm = @(X, eta) sqrt(M.inner(X, eta, eta));
0098
0099     M.dist = @(x, y) error('fixedrankfactory_tucker_preconditioned.dist not implemented yet.');
0100
0101     M.typicaldist = @() 10*n1*r1; % BM: To do
0102
0103     skew = @(X) .5*(X-X');
0104     symm = @(X) .5*(X+X');
0105
0108         X = prepare(X); % Reuse already computed terms
0109
0110         SSU1 = X.G1G1t;
0111         ASU1 = 2*symm(SSU1*(X.U1' * egrad.U1));
0112
0113         SSU2 = X.G2G2t;
0114         ASU2 = 2*symm(SSU2*(X.U2' * egrad.U2));
0115
0116         SSU3 = X.G3G3t;
0117         ASU3 = 2*symm(SSU3*(X.U3' * egrad.U3));
0118
0119
0120         BU1 = lyap(SSU1, -ASU1);
0121         BU2 = lyap(SSU2, -ASU2);
0122         BU3 = lyap(SSU3, -ASU3);
0123
0124         % The lyap solutions ensure that the Riemannian gradient rgrad
0125         % is now on the tangent space. From the Riemannian submersion
0126         % theory, it also belongs to the horizontal space. Therefore,
0127         % no need to further project it on the horizontal space.
0128
0133
0134
0135     end
0136
0137
0138
0139     M.ehess2rhess = @ehess2rhess;
0140     function Hess = ehess2rhess(X, egrad, ehess, eta)
0141         X = prepare(X); % Reuse already computed terms
0142
0143         % Riemannian gradient
0144         SSU1 = X.G1G1t;
0145         ASU1 = 2*symm(SSU1*(X.U1' * egrad.U1));
0146         SSU2 = X.G2G2t;
0147         ASU2 = 2*symm(SSU2*(X.U2' * egrad.U2));
0148         SSU3 = X.G3G3t;
0149         ASU3 = 2*symm(SSU3*(X.U3' * egrad.U3));
0150
0151         BU1 = lyap(SSU1, -ASU1);
0152         BU2 = lyap(SSU2, -ASU2);
0153         BU3 = lyap(SSU3, -ASU3);
0154
0159
0160         % Directional derivative of Riemannian gradient
0161
0162         eta_G1 = reshape(eta.G, r1, r2*r3); % double(tenmat(eta.G,1));
0163         eta_G2 = reshape(permute(eta.G, [2 1 3]), r2, r1*r3); % double(tenmat(eta.G,2));
0164         eta_G3 = reshape(permute(eta.G, [3 1 2]), r3, r1*r2); % double(tenmat(eta.G,3));
0166         egrad_G2 = reshape(permute(egrad.G, [2 1 3]), r2, r1*r3); % double(tenmat(egrad.G,2));
0167         egrad_G3 = reshape(permute(egrad.G, [3 1 2]), r3, r1*r2); % double(tenmat(egrad.G,3));
0168         ehess_G1 = reshape(ehess.G, r1, r2*r3); % double(tenmat(ehess.G,1));
0169         ehess_G2 = reshape(permute(ehess.G, [2 1 3]), r2, r1*r3); % double(tenmat(ehess.G,2));
0170         ehess_G3 = reshape(permute(ehess.G, [3 1 2]), r3, r1*r2); % double(tenmat(ehess.G,3));
0172         rgrad_G2 = reshape(permute(rgrad.G, [2 1 3]), r2, r1*r3); % double(tenmat(rgrad.G,2));
0173         rgrad_G3 = reshape(permute(rgrad.G, [3 1 2]), r3, r1*r2); % double(tenmat(rgrad.G,3));
0174
0175         ASU1dot = 2*symm((2*symm(X.G1*eta_G1')*(egrad_G1*X.G1')) + X.G1G1t*(ehess_G1*X.G1' + egrad_G1*eta_G1')) - 4*symm(symm(eta_G1*X.G1')*BU1);
0176         ASU2dot = 2*symm((2*symm(X.G2*eta_G2')*(egrad_G2*X.G2')) + X.G2G2t*(ehess_G2*X.G2' + egrad_G2*eta_G2')) - 4*symm(symm(eta_G2*X.G2')*BU2);
0177         ASU3dot = 2*symm((2*symm(X.G3*eta_G3')*(egrad_G3*X.G3')) + X.G3G3t*(ehess_G3*X.G3' + egrad_G3*eta_G3')) - 4*symm(symm(eta_G3*X.G3')*BU3);
0178
0179
0180         SSU1dot = X.G1G1t;
0181         SSU2dot = X.G2G2t;
0182         SSU3dot = X.G3G3t;
0183         BU1dot = lyap(SSU1dot, -ASU1dot);
0184         BU2dot = lyap(SSU2dot, -ASU2dot);
0185         BU3dot = lyap(SSU3dot, -ASU3dot);
0186
0187
0188         Hess.U1 = (ehess.U1 - eta.U1*BU1 - X.U1*BU1dot - 2*rgrad.U1*symm(eta_G1*X.G1'))/X.G1G1t;
0189         Hess.U2 = (ehess.U2 - eta.U2*BU2 - X.U2*BU2dot - 2*rgrad.U2*symm(eta_G2*X.G2'))/X.G2G2t;
0190         Hess.U3 = (ehess.U3 - eta.U3*BU3 - X.U3*BU3dot - 2*rgrad.U3*symm(eta_G3*X.G3'))/X.G3G3t;
0191         Hess.G = ehess.G;
0192
0193
0194
0195         % BM: we need a correction factor for the non-constant metric
0196         % The correction factor owes itself to the Koszul formula.
0197         % This is the Riemannian connection in the Euclidean space with the
0198         % scaled metric.
0199
0200
0201         Hess.U1 = Hess.U1 + (eta.U1*symm(rgrad_G1*X.G1') + rgrad.U1*symm(eta_G1*X.G1'))/X.G1G1t;
0202         Hess.U2 = Hess.U2 + (eta.U2*symm(rgrad_G2*X.G2') + rgrad.U2*symm(eta_G2*X.G2'))/X.G2G2t;
0203         Hess.U3 = Hess.U3 + (eta.U3*symm(rgrad_G3*X.G3') + rgrad.U3*symm(eta_G3*X.G3'))/X.G3G3t;
0204         Hess.G = Hess.G  - permute(reshape(symm(rgrad.U1'*eta.U1)*X.G1,r1,r2,r3), [1 2 3]) ...
0205             - permute(reshape(symm(rgrad.U2'*eta.U2)*X.G2,r2,r1,r3), [2 1 3]) ...
0206             - permute(reshape(symm(rgrad.U3'*eta.U3)*X.G3,r3,r1,r2), [2 3 1]);
0207
0208         % The Riemannian connection on the quotient space is the
0209         % projection on the tangent space of the total space and then onto the horizontal
0210         % space. This is accomplished with the following operation.
0211
0212         Hess = M.proj(X, Hess);
0213
0214
0215     end
0216
0217
0218
0219
0220     M.proj = @projection;
0221     function etaproj = projection(X, eta)
0222         X = prepare(X); % Reuse already computed terms
0223
0224         % First, projection onto tangent space of total space
0225         SSU1 = X.G1G1t;
0226         ASU1 = 2*symm(X.G1G1t*(X.U1'*eta.U1)*X.G1G1t);
0227         BU1 = lyap(SSU1, -ASU1);
0228         eta.U1 = eta.U1 - X.U1*(BU1/X.G1G1t);
0229
0230         SSU2 = X.G2G2t;
0231         ASU2 = 2*symm(X.G2G2t*(X.U2'*eta.U2)*X.G2G2t);
0232         BU2 = lyap(SSU2, -ASU2);
0233         eta.U2 = eta.U2 - X.U2*(BU2/X.G2G2t);
0234
0235         SSU3 = X.G3G3t;
0236         ASU3 = 2*symm(X.G3G3t*(X.U3'*eta.U3)*X.G3G3t);
0237         BU3 = lyap(SSU3, -ASU3);
0238         eta.U3 = eta.U3 - X.U3*(BU3/X.G3G3t);
0239
0240
0241         eta_G1 = reshape(eta.G, r1, r2*r3);
0242         eta_G2 = reshape(permute(eta.G, [2 1 3]), r2, r1*r3);
0243         eta_G3 = reshape(permute(eta.G, [3 1 2]), r3, r1*r2);
0244
0245
0246         % Project onto the horizontal space.
0247         PU1 = skew((X.U1'*eta.U1)*X.G1G1t) + skew(X.G1*eta_G1');
0248         PU2 = skew((X.U2'*eta.U2)*X.G2G2t) + skew(X.G2*eta_G2');
0249         PU3 = skew((X.U3'*eta.U3)*X.G3G3t) + skew(X.G3*eta_G3');
0250
0251         % Calculate Omega1, Omega2, Omega3 that are required in finding the
0252         % horizontal component.
0253         % We use the Matlab's pcg function to solve the system efficiently.
0254         % We exploit the structure by designing a good preconditioner as well.
0255         % The preconditioner takes the block positive definite part of the
0256         % linear system.
0257
0258         % Options for PCG
0259         tol_omegax_pcg = 1e-6; % BM: standard tolerance as suggested in PCG.
0260         max_iterations_pcg = 15;% BM: fix this to 15 for simulations. In practice, it requires 7 to 10 iteraions.
0261
0262         % Preconditioner for PCG
0263         M1 = kron(speyer1,SSU1) + kron(SSU1, speyer1);
0264         M2 = kron(speyer2,SSU2) + kron(SSU2, speyer2);
0265         M3 = kron(speyer3,SSU3) + kron(SSU3, speyer3);
0266
0267         Mprecon_pcg = sparse(zeros(r1^2 + r2^2 + r3^2));
0268         Mprecon_pcg(1 : r1^2, 1 : r1^2 ) = M1;
0269         Mprecon_pcg(1 + r1^2 : r1^2 + r2^2, 1 + r1^2 : r1^2 + r2^2) = M2;
0270         Mprecon_pcg(1 + r1^2 + r2^2 : end, 1 + r1^2 + r2^2 : end) = M3;
0271
0272         % Call PCG
0273         [Omegaxsol, unused] = pcg(@compute_residual, [PU1(:); PU2(:); PU3(:)],  tol_omegax_pcg, max_iterations_pcg, Mprecon_pcg);
0274
0275         Omega1 = reshape(Omegaxsol(1:r1^2), r1, r1);
0276         Omega2 = reshape(Omegaxsol(1 + r1^2 : r1^2 + r2^2), r2, r2);
0277         Omega3 = reshape(Omegaxsol(1 + r1^2 + r2^2 : end), r3, r3);
0278
0279         function AOmegax = compute_residual(Omegax)
0280             Omegax1 = reshape(Omegax(1:r1^2), r1, r1);
0281             Omegax2 = reshape(Omegax(1 + r1^2 : r1^2 + r2^2), r2, r2);
0282             Omegax3 = reshape(Omegax(1 + r1^2 + r2^2 : end), r3, r3);
0283
0284             OffsetU1 = X.G1*((kron(speyer3,Omegax2) + kron(Omegax3, speyer2))*X.G1');
0285             OffsetU2 = X.G2*((kron(speyer3,Omegax1) + kron(Omegax3, speyer1))*X.G2');
0286             OffsetU3 = X.G3*((kron(speyer2,Omegax1) + kron(Omegax2, speyer1))*X.G3');
0287
0288             residual1 = Omegax1*SSU1 + SSU1*Omegax1 - OffsetU1;
0289             residual2 = Omegax2*SSU2 + SSU2*Omegax2 - OffsetU2;
0290             residual3 = Omegax3*SSU3 + SSU3*Omegax3 - OffsetU3;
0291
0292             AOmegax = [residual1(:); residual2(:); residual3(:)];
0293         end
0294
0295
0296         % Calculate projection along U1, U2, and U3
0297         etaproj.U1 = eta.U1 - (X.U1*Omega1);
0298         etaproj.U2 = eta.U2 - (X.U2*Omega2);
0299         etaproj.U3 = eta.U3 - (X.U3*Omega3);
0300
0301         % Calculate projection algong G
0302         GOmega1 = reshape(Omega1*X.G1, r1, r2, r3);
0303         GOmega2 = permute(reshape(Omega2*X.G2, r2, r1, r3), [2 1 3]);
0304         GOmega3 = permute(reshape(Omega3*X.G3, r3, r1, r2), [2 3 1]);
0305         etaproj.G = eta.G -(-(GOmega1+GOmega2+GOmega3));
0306
0307     end
0308
0309
0310
0311     M.tangent = M.proj;
0312     M.tangent2ambient = @(X, eta) eta;
0313
0314     M.retr = @retraction;
0315     function Y = retraction(X, eta, t)
0316         if nargin < 3
0317             t = 1.0;
0318         end
0319
0320         Y.G = (X.G + t*eta.G);
0321         Y.U1 = uf((X.U1 + t*eta.U1)); % U factor of Polar factorization
0322         Y.U2 = uf((X.U2 + t*eta.U2));
0323         Y.U3 = uf((X.U3 + t*eta.U3));
0324
0325         Y = prepare(Y);
0326     end
0327
0328     M.exp = @exponential;
0329     function Y = exponential(X, eta, t)
0330         if nargin < 3
0331             t = 1.0;
0332         end
0333         Y = retraction(X, eta, t);
0334         warning('manopt:fixedrankfactory_tucker_preconditioned:exp', ...
0335             ['Exponential for fixed rank ' ...
0336             'Tucker manifold not implemented yet. Used retraction instead.']);
0337     end
0338
0339     M.hash = @(X) ['z' hashmd5([sum(X.U1(:)) ; sum(X.U2(:)); sum(X.U3(:)); sum(X.G(:)) ])]; % Efficient, suggested by Bart Vandereycken.
0340     % M.hash = @(X) ['z' hashmd5([X.U1(:); X.U2(:); X.U3(:); X.G(:)])];
0341
0342     M.rand = @random;
0343     function X = random()
0344         %         % Random generator on the total space
0345         %         % Factors U1, U2, and U3 are on Stiefel manifolds, hence we reuse
0346         %         % their random generator.
0347         %         stiefell = stiefelfactory(n1, r1);
0348         %         stiefelm = stiefelfactory(n2, r2);
0349         %         stiefeln = stiefelfactory(n3, r3);
0350         %
0351         %         X.U1 = stiefell.rand();
0352         %         X.U2 = stiefelm.rand();
0353         %         X.U3 = stiefeln.rand();
0354         %
0355         %         % Random initialization: generalization of randn(r1, r1 = r2) in the
0356         %         % matrix case.
0357         %         X.G = randn(r1,r2,r3);
0358
0359
0360         %  Random generator on the fixed-rank space from a uniform distribution on [0, 1].
0361         [U1, R1] = qr(rand(n1, r1), 0);
0362         [U2, R2] = qr(rand(n2, r2), 0);
0363         [U3, R3] = qr(rand(n3, r3), 0);
0364         C  = rand(r1, r2, r3);
0365
0366         C1 = reshape(C, r1, r2*r3);
0367         CR1 = reshape(R1*C1, r1, r2, r3); % Multplication by R1
0368
0369         C2 = reshape(permute(CR1, [2 1 3]), r2, r1*r3);
0370         CR1R2 = permute(reshape(R2*C2, r2, r1, r3), [2 1 3]); % Multplication by R2
0371
0372         C3 = reshape(permute(CR1R2, [3 1 2]), r3, r1*r2);
0373         CR1R2R3 = permute(reshape(R3*C3, r3, r1, r2), [2 3 1]); % Multplication by R3
0374
0375         X.U1 = U1;
0376         X.U2 = U2;
0377         X.U3 = U3;
0378         X.G = CR1R2R3;
0379
0380
0381         % Compute some terms that are used subsequently.
0382         X = prepare(X);
0383
0384     end
0385
0386     M.randvec = @randomvec;
0387     function eta = randomvec(X)
0388         % A random vector on the horizontal space
0389         eta.U1 = randn(n1, r1);
0390         eta.U2 = randn(n2, r2);
0391         eta.U3 = randn(n3, r3);
0392         eta.G = randn(r1, r2, r3);
0393         eta = projection(X, eta);
0394         nrm = M.norm(X, eta);
0395         eta.U1 = eta.U1 / nrm;
0396         eta.U2 = eta.U2 / nrm;
0397         eta.U3 = eta.U3 / nrm;
0398         eta.G = eta.G / nrm;
0399     end
0400
0401     M.lincomb = @lincomb;
0402
0403     M.zerovec = @(X) struct('U1', zeros(n1, r1), 'U2', zeros(n2, r2), ...
0404         'U3', zeros(n3, r3), 'G', zeros(r1, r2, r3));
0405
0406     M.transp = @(x1, x2, d) projection(x2, d);
0407
0408     % vec and mat are not isometries, because of the scaled metric.
0409     M.vec = @(X, U1) [U1.U1(:); U1.U2(:); U1.U3(:); U1.G(:)];
0410     M.mat = @(X, u) struct ...
0411         ('U1', reshape(u(1  : n1*r1), n1, r1), ...
0412         'U2', reshape(u(n1*r1 + 1 : n1*r1 + n2*r2), n2, r2), ...
0413         'U3', reshape(u(n1*r1 + n2*r2 + 1 : n1*r1 + n2*r2 + n3*r3), n3, r3), ...
0414         'G', reshape(u(n1*r1 + n2*r2 + n3*r3 + 1 : end), r1, r2, r3));
0415     M.vecmatareisometries = @() false;
0416
0417 end
0418
0419 % Linear combination of tangent vectors
0420 function d = lincomb(X, a1, d1, a2, d2) %#ok<INUSL>
0421
0422     if nargin == 3
0423         d.U1 = a1*d1.U1;
0424         d.U2 = a1*d1.U2;
0425         d.U3 = a1*d1.U3;
0426         d.G = a1*d1.G;
0427     elseif nargin == 5
0428         d.U1 = a1*d1.U1 + a2*d2.U1;
0429         d.U2 = a1*d1.U2 + a2*d2.U2;
0430         d.U3 = a1*d1.U3 + a2*d2.U3;
0431         d.G = a1*d1.G + a2*d2.G;
0432     else
0433         error('Bad use of fixedrankfactory_tucker_preconditioned.lincomb.');
0434     end
0435
0436 end
0437
0438 function U = uf(A) % U factor of Polar factorization of a tall matrix A.
0439     [L, unused, R] = svd(A, 0); %#ok
0440     U = L*R';
0441 end
0442
0443
0444```

Generated on Fri 08-Sep-2017 12:43:19 by m2html © 2005