The oblique manifold $\mathcal{OB}(n, m)$ (the set of matrices of size nxm with unit-norm columns) is endowed with a Riemannian manifold structure by considering it as a Riemannian submanifold of the embedding Euclidean space $\mathbb{R}^{n\times m}$ endowed with the usual inner product $\langle H_1, H_2 \rangle = \operatorname{trace}(H_1^T H_2)$. Its geometry is exactly the same as that of the product manifold of spheres $\mathbb{S}^{n-1}\times \cdots \times \mathbb{S}^{n-1}$ ($m$ copies), see the sphere manifold.

Factory call: M = obliquefactory(n, m).

 Name Formula Numerical representation Set $\mathcal{OB}(n, m) = \{ X \in \mathbb{R}^{n\times m} : (X^TX)_{ii} = 1, i = 1:m \}$ $X$ is represented as a matrix X of size nxm whose columns have unit 2-norm, i.e., X(:, i).'*X(:, i) = 1 for i = 1:m. Tangent space at $X$ $T_X \mathcal{OB}(n, m) = \{ U \in \mathbb{R}^{n\times m} : (X^TU)_{ii} = 0, i = 1:m \}$ A tangent vector $U$ at $X$ is represented as a matrix U of size nxm such that each column of U is orthogonal to the corresponding column of X, i.e., X(:, i).'*U(:, i) = 0 for i = 1:m. Ambient space $\mathbb{R}^{n\times m}$ Points and vectors in the ambient space are, naturally, represented as matrices of size nxm.

The following table shows some of the nontrivial available functions in the structure M. The norm $\|\cdot\|$ refers to the norm in the ambient space, which is the Frobenius norm. The tutorial page gives more details about the functionality implemented by each function.

 Name Field usage Formula Dimension M.dim()  $\operatorname{dim}\mathcal{M} = m(n-1)$ Metric M.inner(X, U, V)  $\langle U, V\rangle_X = \operatorname{trace}(U^T V)$ Norm M.norm(X, U)  $\|U\|_X = \sqrt{\langle U, U \rangle_X}$ Distance M.dist(X, Y)  $\operatorname{dist}(X, Y) = \sqrt{\sum_{i=1}^m \arccos^2((X^T Y)_{ii})}$ Typical distance M.typicaldist()  $\pi\sqrt{m}$ Tangent space projector M.proj(X, H)  $P_X(H) = H - X\operatorname{ddiag}(X^T H)$, where H represents a vector in the ambient space and $\operatorname{ddiag}$ sets all off-diagonal entries of a matrix to zero. Euclidean to Riemannian gradient M.egrad2rgrad(X, egrad) $\operatorname{grad} f(X) = P_X(\nabla f(X))$, where egrad represents the Euclidean gradient $\nabla f(X)$, which is a vector in the ambient space. Euclidean to Riemannian Hessian M.ehess2rhess(X, egrad, ehess, U) $\operatorname{Hess} f(X)[U] = P_X(\nabla^2 f(X)[U]) - U \operatorname{ddiag}(X^T \nabla f(X))$, where egrad represents the Euclidean gradient $\nabla f(X)$ and ehess represents the Euclidean Hessian $\nabla^2 f(X)[U]$, both being vectors in the ambient space. Exponential map M.exp(X, U, t)  See the sphere manifold: the same exponential map is applied column-wise. Retraction M.retr(X, U, t)  $\operatorname{Retr}_X(tU) = \operatorname{normalize}(X+tU)$, where $\operatorname{normalize}$ scales each column of the input matrix to have norm 1. Logarithmic map M.log(X, Y)  See the sphere manifold: the same logarithmic map is applied column-wise. Random point M.rand()  Returns a point uniformly at random w.r.t. the natural measure as follows: generate $X$ with i.i.d. normal entries; return $\operatorname{normalize}(X)$. Random vector M.randvec(X)  Returns a unit-norm tangent vector at $X$ with uniformly random direction, obtained as follows: generate $H$ with i.i.d. normal entries; return: $U = P_X(H) / \|P_X(H)\|$. Vector transport M.transp(X, Y, U)  $\operatorname{Transp}_{Y\leftarrow X}(U) = P_Y(U)$, where $U$ is a tangent vector at $X$ that is transported to the tangent space at $Y$. Pair mean M.pairmean(X, Y)  $\operatorname{mean}(X, Y) = \operatorname{normalize}(X+Y)$

Let $A\in\mathbb{R}^{n\times m}$ be any matrix. We search for the matrix with unit-norm columns that is closest to $A$ according to the Frobenius norm. Of course, this problem has an obvious solution (simply normalize the columns of $A$). We treat it merely for the sake of example. We minimize the following cost function:

$$f(X) = \frac{1}{2} \|X-A\|^2,$$

such that $X \in \mathcal{OB}(n, m)$. Compute the Euclidean gradient and Hessian of $f$:

$$\nabla f(X) = X-A,$$

$$\nabla^2 f(X)[U] = U.$$

The Riemannian gradient and Hessian are obtained by applying the M.egrad2rgrad and M.ehess2rhess operators. Notice that there is no need to compute these explicitly: it suffices to write code for the Euclidean quantities and to apply the conversion tools on them to obtain the Riemannian quantities, as in the following code:

% Generate the problem data.
n = 5;
m = 8;
A = randn(n, m);

% Create the problem structure.
manifold = obliquefactory(n, m);
problem.M = manifold;

% Define the problem cost function and its derivatives.
problem.cost = @(X) .5*norm(X-A, 'fro')^2;
ehess = @(X, U) U;
problem.hess = @(X, U) manifold.ehess2rhess(X, egrad(X), ehess(X, U), U);

% Numerically check the differentials.
checkhessian(problem); pause;


Of course, this is not efficient: there are redundant computations. See the various ways of describing the cost function for better alternatives.

Let us consider a second, more interesting, example. A correlation matrix $C \in \mathbb{R}^{n\times n}$ is a symmetric, positive semidefinite matrix with 1's on the diagonal. If $C$ is of rank $k$, there always exists a matrix $X \in \mathcal{OB}(k, n)$ such that $C = X^TX$. In fact, there exist many such matrices: given such an $X$, a whole manifold of equivalent matrices is obtained by considering $QX$ with $Q$ an orthogonal matrix of size $k$. Disregarding this equivalence relation (see help elliptopefactory), we can address the problem of nearest low-rank correlation matrix as follows:

Let $A \in \mathbb{R}^{n\times n}$ be a given symmetric matrix. We wish to find the correlation matrix $C = X^TX$ of rank at most $k$ which is closest to $A$, according to the Frobenius norm [Hig01]. That is, we wish to minimize:

$$f(X) = \frac{1}{4} \|X^TX - A\|^2$$

with $X \in \mathcal{OB}(k, n)$.The Euclidean gradient and Hessian are given by:

$$\nabla f(X) = X(X^TX - A),$$

$$\nabla^2 f(X)[U] = X(U^TX + X^TU) + U(X^TX-A).$$

In Manopt code, this yields:

% Generate the problem data.
n = 10;
k = 3;
A = randn(n);
A = (A + A.')/2;

% Create the problem structure.
manifold = obliquefactory(k, n);
problem.M = manifold;

% Define the problem cost function and its derivatives.
problem.cost = @(X) .25*norm(X.'*X-A, 'fro')^2;
ehess = @(X, U) X*(U.'*X+X.'*U) + U*(X.'*X-A);
problem.hess = @(X, U) manifold.ehess2rhess(X, egrad(X), ehess(X, U), U);

% Numerically check the differentials.
checkhessian(problem); pause;

% Solve
X = trustregions(problem);
C = X.'*X;
% C is a rank k (at most) symmetric, positive semidefinite matrix with ones on the diagonal:
disp(C);
disp(eig(C));


Again, there is a fair bit of redundant computations in this formulation. See the various ways of describing the cost function for better alternatives.

For theory on Riemannian submanifolds, see [AMS08], section 3.6.1 (first-order derivatives) and section 5.3.3 (second-order derivatives, i.e., connections).

For content specifically about the oblique manifold with applications, see [AG06].