The set $S_+(n, k)$ of rank-$k$ symmetric positive semidefinite matrices $X$ of size $n\times n$ is parametrized as $X = YY^T$, where $Y \in \mathbb{R}^{n\times k}_*$ is a full column rank matrix of size $n\times k$. Notice that for all orthogonal matrices $O$ of size $k$, the transformation $Y \mapsto YO$ leaves $X$ unchanged. As a result, we may consider all matrices of the form $YO$ as being equivalent for our purposes. Quotienting this equivalence relation out results in a quotient space, $\mathbb{R}^{n \times k}_* / {\rm O}(k)$, with ${\rm O}(k)$ the group of orthogonal matrices. A point on the quotient space is (conceptually) an equivalence class $[Y] = \{YO : O \in {\rm O}(k)\}$. Numerically, we represent that equivalence class with (any) matrix $Y$ in $[Y]$.
A function $\phi : S_+(n,k) \to \mathbb{R}: X \mapsto \phi(X)$ induces a function $f: \mathbb{R}^{n \times k}_* \rightarrow \mathbb{R}: Y \mapsto f(Y)$ defined as $f(Y) = \phi(YY^T)$. Of course, $f$ has the appropriate invariance $f(Y) = f(YO)$ for all orthogonal $O$. Thus, $f$ is defined on the total space $\mathbb{R}^{n\times k}_*$ and descends to a well-defined function on the quotient space $\mathbb{R}^{n \times k}_* / {\rm O}(k)$. Endowing the total space with the usual Riemannian structure of a Euclidean space (the inner product $\langle H_1, H_2 \rangle = \operatorname{trace}(H_1^T H_2)$), a Riemannian structure for the quotient space follows.
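This invariance is easy to check numerically. The following NumPy sketch (an illustration only; Manopt itself is MATLAB-based) uses the arbitrary example cost $\phi(X) = \|X\|_F^2$ and a random orthogonal $O$ obtained from a QR factorization:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# A full column rank factor Y and a random orthogonal matrix O (via QR).
Y = rng.standard_normal((n, k))
O, _ = np.linalg.qr(rng.standard_normal((k, k)))

# Any function of X = Y Y^T is invariant under Y -> Y O,
# since (Y O)(Y O)^T = Y O O^T Y^T = Y Y^T.
def phi(X):
    return np.sum(X**2)  # example cost: squared Frobenius norm

f = lambda Y: phi(Y @ Y.T)
print(abs(f(Y) - f(Y @ O)))  # zero up to round-off
```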
See the paper [JBAS10] referenced below for details on this geometry, and [AMS08] for general background.
Factory call: M = symfixedrankYYfactory(n, k).
Name | Formula | Numerical representation
--- | --- | ---
Set | $S_+(n,k) = \{X \in \mathbb{R}^{n\times n} : X = X^T \succeq 0,\ {\rm rank}(X) = k\} = \{YY^T : Y \in \mathbb{R}^{n\times k},\ {\rm rank}(Y) = k\}$ | $X$ is represented by a matrix Y of size $n\times k$ such that $X = YY^T$.
Vertical space at $Y$ | $\{Y\Omega : \Omega \in \mathbb{R}^{k\times k},\ \Omega + \Omega^T = 0\}$ | A vertical vector $U$ at $Y$ is represented as a matrix U of size $n\times k$: it is tangent to the equivalence class of $Y$.
Horizontal space at $Y$ | $\{U \in \mathbb{R}^{n\times k} : U^T Y = Y^T U\}$ | A horizontal vector $U$ at $Y$ is represented as a matrix U of size $n\times k$: it is orthogonal to all vertical vectors at $Y$.
Ambient space | $\mathbb{R}^{n\times k}$ | Points and vectors in the ambient space are, naturally, represented as matrices of size $n\times k$.
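Vertical vectors are tangent to the equivalence classes in the following concrete sense: for skew-symmetric $\Omega$, the curve $t \mapsto Y\exp(t\Omega)$ stays inside $[Y]$ (since $\exp(t\Omega)$ is orthogonal), and its velocity at $t = 0$ is the vertical vector $Y\Omega$. A small NumPy/SciPy sketch of this fact (an illustration independent of Manopt's MATLAB implementation):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n, k = 5, 2
Y = rng.standard_normal((n, k))

# A skew-symmetric Omega generates a curve inside the equivalence class [Y].
B = rng.standard_normal((k, k))
Omega = B - B.T

t = 0.3
Q = expm(t * Omega)   # orthogonal, because Omega is skew-symmetric
Yt = Y @ Q            # a different representative of the same class [Y]

# The represented matrix X = Y Y^T does not move along the curve:
print(np.linalg.norm(Yt @ Yt.T - Y @ Y.T))  # zero up to round-off
# The velocity of the curve at t = 0 is the vertical vector Y @ Omega.
```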
The following table shows some of the nontrivial functions available in the structure M. The tutorial page gives more details about the functionality implemented by each function.
Name | Field usage | Formula
--- | --- | ---
Dimension | M.dim() | $\operatorname{dim} \mathcal{M} = kn - k(k-1)/2$
Metric | M.inner(Y, U, V) | $\langle U, V\rangle_Y = \operatorname{trace}(U^T V)$
Norm | M.norm(Y, U) | $\|U\|_Y = \sqrt{\langle U, U \rangle_Y}$
Horizontal space projector | M.proj(Y, H) | $P_Y(H) = H - Y\Omega$, where H represents a vector in the ambient space and $\Omega$ is the skew-symmetric matrix obtained as the unique solution of the Sylvester equation $\Omega (Y^T Y) + (Y^T Y)\Omega = Y^T H - H^T Y$.
Euclidean to Riemannian gradient | M.egrad2rgrad(Y, egrad) | $\operatorname{grad} f(Y) = \nabla f(Y)$, where egrad represents the Euclidean gradient $\nabla f(Y)$, a vector in the ambient space. There is indeed nothing to do: the Euclidean gradient of $f$ is already a horizontal vector, thanks to the invariance $f(Y) = f(YO)$ for all orthogonal $O$.
Euclidean to Riemannian Hessian | M.ehess2rhess(Y, egrad, ehess, U) | $\operatorname{Hess} f(Y)[U] = P_Y(\nabla^2 f(Y)[U])$, where egrad represents the Euclidean gradient $\nabla f(Y)$ (which happens to not be necessary in this case) and ehess represents the Euclidean Hessian $\nabla^2 f(Y)[U]$, both being vectors in the ambient space.
Retraction | M.retr(Y, U, t) | $\operatorname{Retr}_Y(tU) = Y + tU$
Random point | M.rand() | Returns a point at random: generate $Y$ with i.i.d. normal entries and return it. With high probability, $Y$ is indeed full rank.
Random vector | M.randvec(Y) | Returns a horizontal vector at $Y$ with uniformly random direction: generate $H$ with i.i.d. normal entries and return $P_Y(H)/\|P_Y(H)\|_Y$.
Vector transport | M.transp(Y, Z, U) | $\operatorname{Transp}_{Z\leftarrow Y}(U) = P_Z(U)$, where $U$ is a horizontal vector at $Y$ that is transported to the horizontal space at $Z$.
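The horizontal projector is straightforward to reproduce outside of Manopt. The NumPy/SciPy sketch below (an illustration only; in Manopt this is M.proj) solves the Sylvester equation with scipy.linalg.solve_sylvester, then checks that $\Omega$ is skew-symmetric, that $P_Y(H)$ is horizontal, and that the removed part $Y\Omega$ is orthogonal to it:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(2)
n, k = 7, 3
Y = rng.standard_normal((n, k))
H = rng.standard_normal((n, k))  # arbitrary ambient vector

# Solve Omega*(Y'Y) + (Y'Y)*Omega = Y'H - H'Y for Omega.
YtY = Y.T @ Y
Omega = solve_sylvester(YtY, YtY, Y.T @ H - H.T @ Y)

# Horizontal projection P_Y(H) = H - Y*Omega.
PH = H - Y @ Omega

print(np.linalg.norm(Omega + Omega.T))      # Omega is skew-symmetric: ~0
print(np.linalg.norm(PH.T @ Y - Y.T @ PH))  # PH is horizontal: ~0
print(np.trace((Y @ Omega).T @ PH))         # vertical part orthogonal to PH: ~0
```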
This is an example of low-rank matrix approximation. Let $A\in\mathbb{R}^{n\times n}$ be a symmetric matrix. We minimize the following cost function:
$$\phi(X) =\| X - A \|_F ^2,$$
subject to $X \in S_+(n, k)$. $X$ is factorized as $YY^T$ with $Y \in \mathbb{R}^{n \times k}_*$. The function $f:\mathbb{R}^{n \times k}_* \rightarrow \mathbb{R}$ is defined as $f(Y) = \| YY^T - A \|_F ^2$. The Euclidean gradient and Hessian of $f$ are:
$$\nabla f(Y) = 4(YY^T - A)Y,$$
$$\nabla^2 f(Y)[U] = 4(YU^T + UY^T)Y + 4(YY^T - A)U.$$
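These two formulas can be verified against central finite differences. Below is a NumPy sketch of such a check (an illustration only; within Manopt, checkgradient and checkhessian serve this purpose):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 8, 2
Y0 = rng.standard_normal((n, k))
A0 = rng.standard_normal((n, n))
A = A0 + A0.T  # symmetric data matrix

f     = lambda Y: np.linalg.norm(Y @ Y.T - A, 'fro')**2
egrad = lambda Y: 4 * (Y @ Y.T - A) @ Y
ehess = lambda Y, U: 4 * (Y @ U.T + U @ Y.T) @ Y + 4 * (Y @ Y.T - A) @ U

U = rng.standard_normal((n, k))
t = 1e-6

# Directional derivative of f vs. <egrad, U>.
fd_grad = (f(Y0 + t*U) - f(Y0 - t*U)) / (2*t)
err_grad = abs(fd_grad - np.sum(egrad(Y0) * U))

# Directional derivative of egrad vs. ehess.
fd_hess = (egrad(Y0 + t*U) - egrad(Y0 - t*U)) / (2*t)
err_hess = np.linalg.norm(fd_hess - ehess(Y0, U))

print(err_grad, err_hess)  # both small (limited by finite-difference accuracy)
```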
The Riemannian gradient and Hessian are obtained by applying the M.egrad2rgrad and M.ehess2rhess operators:
$$\operatorname{grad} f(Y) = 4(YY^T - A)Y,$$
$$\operatorname{Hess} f(Y)[U] = \nabla^2 f(Y)[U] - Y\Omega,$$ where $\Omega$ is the skew-symmetric matrix obtained by solving the Sylvester equation $$ \Omega (Y^T Y) + (Y^T Y )\Omega = Y^T \nabla^2 f(Y)[U] - \left(\nabla^2 f(Y)[U]\right)^T Y. $$
The Sylvester equation can be solved efficiently with MATLAB's lyap function. Notice that there is no need to compute these expressions explicitly: it suffices to write code for the Euclidean quantities and to apply the conversion tools to them to obtain the Riemannian quantities, as in the following code:
% Generate the problem data.
n = 1000;
k = 5;
Y_org = randn(n, k);
A = Y_org*Y_org';

% Create the problem structure.
manifold = symfixedrankYYfactory(n, k);
problem.M = manifold;

% Define the problem cost function and its derivatives.
problem.cost = @(Y) norm(Y*Y' - A, 'fro')^2;
egrad = @(Y) 4*(Y*Y' - A)*Y;
ehess = @(Y, U) 4*(Y*U' + U*Y')*Y + 4*(Y*Y' - A)*U;
problem.grad = @(Y) manifold.egrad2rgrad(Y, egrad(Y));
problem.hess = @(Y, U) manifold.ehess2rhess(Y, egrad(Y), ehess(Y, U), U);

% Numerically check the differentials.
checkgradient(problem); pause;
checkhessian(problem); pause;

% Execute the optimization.
Y = trustregions(problem);
We are still working on building a collection of examples for Manopt. The relevant ones will be referenced here when the time comes.