Trivialization¶
arxiv-link Trivializations for Gradient-Based Optimization on Manifolds
arxiv-link Geometric Optimisation on Manifolds with Applications to Deep Learning
Trivialization: given a manifold $\mathcal{M}$, a trivialization is a surjective map from Euclidean space onto the manifold:
$$ \phi:\mathbb{R}^n\to \mathcal{M} $$
A constrained optimization problem over some manifold $\mathcal{M}$
$$ \min_{x\in\mathcal{M}} f(x) $$
can be converted to an unconstrained optimization problem via a trivialization:
$$ \min_{\theta\in\mathbb{R}^n} f(\phi(\theta)) $$
In this notebook, we will list some common manifolds and their trivializations.
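As a warm-up, here is a minimal sketch of the whole pattern in plain torch (not using numqi; the learning rate and step count are arbitrary choices): minimize $f(x)=x+1/x$ over $\mathbb{R}_+$ using the softplus trivialization introduced below. The unconstrained parameter $\theta$ is optimized by vanilla gradient descent and $\phi(\theta)$ converges to the constrained minimizer $x^*=1$.
import torch

# minimize f(x) = x + 1/x over R_+; the constrained minimizer is x* = 1
theta = torch.tensor(0.3, requires_grad=True)  # unconstrained parameter in R
optimizer = torch.optim.SGD([theta], lr=0.1)
for _ in range(200):
    optimizer.zero_grad()
    x = torch.nn.functional.softplus(theta)  # trivialization phi: R -> R_+
    loss = x + 1/x
    loss.backward()
    optimizer.step()
print('x:', torch.nn.functional.softplus(theta).item())  # close to 1.0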
Math notation¶
- $\mathbb{R}$: real numbers
- $\mathbb{R}_+$: positive real numbers
- $\mathbb{R}^d$: $d$-dimensional real vectors
- $x\in\mathbb{R}^d,x> 0$: $x$ is a $d$-dimensional real vector and all elements are positive
- $x\in\mathbb{R}^{d},x\succeq 0$: $x$ is a $d$-dimensional real vector and all elements are non-negative
- $\mathbb{R}^{m\times n}$: $m\times n$ real matrices
- $x\in\mathbb{R}^{m\times m},x\succ 0$: $x$ is an $m\times m$ real positive definite matrix (all eigenvalues are positive)
- $x\in\mathbb{R}^{m\times m},x\succeq 0$: $x$ is an $m\times m$ real positive semi-definite matrix (all eigenvalues are non-negative)
import numpy as np
import torch
try:
    import numqi
except ImportError:
    %pip install numqi
    import numqi
Positive Real Numbers¶
$$ \mathbb{R}_+ = \{x\in\mathbb{R}:x>0\} $$
One trivialization is the SoftPlus function:
$$ \phi(\theta) = \log(1+\exp(\theta)):\mathbb{R}\to\mathbb{R}_+ $$
Another trivialization is the Exponential function:
$$ \phi(\theta) = \exp(\theta):\mathbb{R}\to\mathbb{R}_+ $$
manifold = numqi.manifold.PositiveReal(batch_size=5, method='softplus')
point = manifold().detach().numpy() #random point
print('softplus:', point)
manifold = numqi.manifold.PositiveReal(batch_size=5, method='exp')
point = manifold().detach().numpy() #random point
print('exp:', point)
softplus: [0.5348025  0.88817727 0.61018362 0.51604116 0.61954426]
exp: [1.01374331 0.85613584 0.77393395 0.98120348 0.63255657]
Discrete Probability Simplex¶
$$ \Delta^{n-1}_+ = \{x\in\mathbb{R}^n:x_i>0,x_1+x_2+\cdots +x_n = 1\} $$
Trivializations can be composed: let $g$ be any trivialization of $\mathbb{R}_+$; then
$$ \phi(\theta)=(x_1,x_2,\cdots,x_n):\mathrm{dom}(g)^n\to\Delta_+^{n-1} $$
with
$$ x_i = \frac{g(\theta_i)}{\sum_j g(\theta_j)} $$
gives a trivialization of $\Delta_+^{n-1}$. Specifically, the SoftMax function corresponds to $g(\theta) = \exp(\theta)$.
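For the $g(\theta)=\exp(\theta)$ case this is exactly `torch.softmax`; a minimal check in plain torch (independent of numqi):
import torch

theta = torch.randn(5)
x = torch.softmax(theta, dim=0)  # x_i = exp(theta_i) / sum_j exp(theta_j)
print('positive:', bool((x > 0).all()))
print('sum:', x.sum().item())  # 1.0 up to machine precision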
manifold = numqi.manifold.DiscreteProbability(5)
point = manifold().detach().numpy()
print('point:', point)
print('sum(point):', np.sum(point))
point: [0.16649143 0.172548   0.21787707 0.27329013 0.16979337]
sum(point): 1.0
Sphere¶
The sphere $S^n$ is defined as
$$ S^n = \{x\in\mathbb{R}^{n+1}:\lVert x\rVert_2=1\} $$
The following quotient map is a trivialization of the sphere:
$$\phi(\theta)=\frac{\theta}{\lVert \theta\rVert}: \mathbb{R}^{n+1}\to S^n$$
PS: the map is undefined (divergent) at the origin, so we should be careful with initialization. We simply hope the optimization never jumps to the origin; if it does, we reinitialize (such cases seem rare if random initialization is used).
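A classic use case, sketched in plain torch (the matrix $A$, the optimizer, and the step count are arbitrary choices): minimizing the Rayleigh quotient $x^TAx$ over the sphere recovers the smallest eigenvalue of $A$.
import torch

torch.manual_seed(0)
A = torch.randn(6, 6)
A = (A + A.T) / 2  # random symmetric matrix

theta = torch.randn(6, requires_grad=True)  # unconstrained parameter
optimizer = torch.optim.Adam([theta], lr=0.03)
for _ in range(1000):
    optimizer.zero_grad()
    x = theta / torch.linalg.norm(theta)  # quotient-map trivialization
    loss = x @ (A @ x)
    loss.backward()
    optimizer.step()
print('optimized value:', loss.item())
print('smallest eigenvalue:', torch.linalg.eigvalsh(A)[0].item())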
dim = 5
manifold = numqi.manifold.Sphere(dim, method='quotient')
point = manifold().detach().numpy()
print('point:', point)
print('norm:', np.linalg.norm(point))
point: [-0.17008636 -0.47148146 -0.23835158  0.5384511   0.63406214]
norm: 1.0
Stiefel Manifold¶
The Stiefel manifold $\mathrm{St}(n,r)$ is defined as
$$ \mathrm{St}(n,r) = \{X\in\mathbb{R}^{n\times r}:X^TX=I_r\} $$
QR decomposition is a trivialization of the Stiefel manifold:
$$ \phi(\theta)=Q: \mathbb{R}^{n\times r}\to \mathrm{St}(n,r) $$
where $\theta=QR$ is the QR decomposition of $\theta$, and $Q\in\mathbb{R}^{n\times r}$ has orthonormal columns.
PS: if the rank of $\theta$ is smaller than $r$, the QR decomposition fails to produce a point on the manifold. As with the sphere, we can only hope the optimization avoids this singular set, and reinitialize if it does not (such cases seem rare if random initialization is used).
TODO doi-link A Global Cayley Parametrization of Stiefel Manifold for Direct Utilization of Optimization Mechanisms Over Vector Spaces
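A minimal sketch of the QR trivialization in plain torch (the sign fix below is one common convention for making the map single-valued; numqi's internal convention may differ):
import torch

theta = torch.randn(5, 3)
Q, R = torch.linalg.qr(theta)
Q = Q * torch.sign(torch.diagonal(R))  # flip column signs so diag(R) > 0
print(Q.T @ Q)  # identity up to machine precision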
manifold = numqi.manifold.Stiefel(dim=5, rank=3, method='qr')
point = manifold().detach().numpy()
print('point:', point, sep='\n')
print('\nX^T X:', point.T @ point, sep='\n')
point:
[[-0.69681231  0.61859369 -0.2822582 ]
 [-0.19427928 -0.2126762   0.45589068]
 [-0.540945   -0.37250132 -0.03307628]
 [-0.33462373 -0.64933924 -0.26582702]
 [ 0.26853983 -0.108222   -0.80045984]]

X^T X:
[[ 1.00000000e+00  6.78308274e-17 -3.16196540e-17]
 [ 6.78308274e-17  1.00000000e+00  8.72700945e-17]
 [-3.16196540e-17  8.72700945e-17  1.00000000e+00]]
Symmetric and Hermitian Matrices¶
The set of symmetric matrices $\mathrm{Sym}^n$ is defined as
$$ \mathrm{Sym}^n = \{X\in\mathbb{R}^{n\times n}:X=X^T\} $$
The set of Hermitian matrices $\mathrm{Herm}^n$ is defined as
$$ \mathrm{Herm}^n = \{X\in\mathbb{C}^{n\times n}:X=X^\dagger\} $$
Both of them are vector spaces, so we can choose a basis $\{E_i\}$. E.g., for Hermitian matrices, a basis is given by the Gell-Mann matrices (see tutorial/gellmann) together with the identity matrix. The trivialization is then simply the linear expansion in this basis:
$$\phi(\theta)=\sum_i\theta_iE_i $$
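A minimal sketch for the real symmetric case in plain torch, using the elementary-matrix basis of the upper triangle instead of Gell-Mann matrices (any basis of the vector space works):
import torch

n = 3
theta = torch.randn(n * (n + 1) // 2)  # one parameter per independent entry
rows, cols = torch.triu_indices(n, n)
X = torch.zeros(n, n)
X[rows, cols] = theta  # fill the upper triangle (including the diagonal)
X = X + X.T - torch.diag(X.diagonal())  # mirror without doubling the diagonal
print(torch.allclose(X, X.T))  # True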
manifold_sym = numqi.manifold.SymmetricMatrix(3)
point = manifold_sym().detach().numpy()
print('point:', point, sep='\n')
point:
[[-0.60857755  0.17283328 -0.1575151 ]
 [ 0.17283328  0.91546264 -0.03090367]
 [-0.1575151  -0.03090367  0.42752979]]
manifold_herm = numqi.manifold.SymmetricMatrix(3, dtype=torch.complex128)
point = manifold_herm().detach().numpy()
print('point:', point, sep='\n')
point:
[[-0.81167828+0.j         -0.25937764+0.27170628j -0.03662005+0.28889551j]
 [-0.25937764-0.27170628j -0.5829548 +0.j         -0.22975185-0.32249453j]
 [-0.03662005-0.28889551j -0.22975185+0.32249453j -0.81029508+0.j        ]]
Rank-$r$ Positive Semi-Definite Matrices¶
The set of real rank-$r$ positive semi-definite matrices $\mathrm{Sym}^{(n,r)}_+$ is defined as
$$ \mathrm{Sym}^{(n,r)}_+ = \{X\in\mathbb{R}^{n\times n}:X\succeq 0,\mathrm{rank}(X)=r\} $$
To make the set bounded, numqi additionally constrains the trace of the matrix to be 1. The trivialization is the reverse of the Cholesky decomposition, with a trace normalization:
$$ \phi(\theta)=\frac{g(\theta)g(\theta)^T}{\mathrm{Tr}[g(\theta)g(\theta)^T]}: \mathrm{dom}(g)\to \mathrm{Sym}^{(n,r)}_+ $$
where $g$ is a trivialization map of the set of lower-trapezoidal matrices with positive diagonal, $\mathrm{image}(g)=L^{(n,r)}_+$:
$$ L^{(n,r)}_+=\{X\in\mathbb{R}^{n\times r}: X_{ii}>0,\;X_{ij}=0\;\mathrm{for}\;j>i\} $$
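A minimal sketch of this construction in plain torch (softplus on the diagonal is one possible choice inside $g$; numqi's actual parametrization may differ):
import torch

n, r = 4, 2
theta = torch.randn(n, r)
# g(theta): lower-trapezoidal matrix with strictly positive diagonal
L = torch.tril(theta, diagonal=-1) + torch.eye(n, r) * torch.nn.functional.softplus(theta)
X = L @ L.T              # positive semi-definite with rank r
X = X / torch.trace(X)   # normalize to trace 1
print(torch.linalg.eigvalsh(X))  # r positive eigenvalues, the rest ~0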
manifold = numqi.manifold.Trace1PSD(dim=4, rank=2, method='cholesky')
point = manifold().detach().numpy()
print(point)
print('eigenvalue:', np.linalg.eigvalsh(point)) #eigenvalues are positive (up to machine precision)
print('trace(point):', np.trace(point))
[[ 0.42083295  0.00695834 -0.17905983  0.24755122]
 [ 0.00695834  0.31096244 -0.08273368 -0.08567651]
 [-0.17905983 -0.08273368  0.0966602  -0.08229269]
 [ 0.24755122 -0.08567651 -0.08229269  0.17154441]]
eigenvalue: [-1.23139719e-16 -4.77305432e-17  3.56353880e-01  6.43646120e-01]
trace(point): 0.9999999999999998
Similarly, we can define the set of trace-1 complex rank-$r$ positive semi-definite matrices $\mathrm{Herm}^{(n,r)}_+$
$$ \mathrm{Herm}^{(n,r)}_+ = \{X\in\mathbb{C}^{n\times n}:X\succeq 0,\mathrm{rank}(X)=r,\mathrm{Tr}(X)=1\} $$
manifold = numqi.manifold.Trace1PSD(dim=3, rank=2, method='cholesky', dtype=torch.complex128)
point = manifold().detach().numpy()
print(point)
print('eigenvalue:', np.linalg.eigvalsh(point))
print('trace(point):', np.trace(point))
[[ 0.33199846+0.j          0.25246373+0.00519178j -0.11691709+0.26132899j]
 [ 0.25246373-0.00519178j  0.40687444+0.j         -0.10650486+0.25145544j]
 [-0.11691709-0.26132899j -0.10650486-0.25145544j  0.2611271 +0.j        ]]
eigenvalue: [-4.02306919e-17  1.22801064e-01  8.77198936e-01]
trace(point): (1.0000000000000002+0j)
Special Orthogonal Group¶
The special orthogonal group $\mathrm{SO}(n)$ is defined as
$$ \mathrm{SO}(n) = \{X\in\mathbb{R}^{n\times n}:X^TX=I_n,\det(X)=1\} $$
The trivialization can be the Cayley transform wiki-link $\phi(A)=(I-A)^{-1}(I+A)$ or the matrix exponential $\phi(A)=e^{A}$, where $A$ ranges over the skew-symmetric matrices ($A^T=-A$).
Similarly, for the special unitary group $\mathrm{SU}(n)$ (with $A$ ranging over traceless anti-Hermitian matrices)
$$ \mathrm{SU}(n) = \{X\in\mathbb{C}^{n\times n}:X^\dagger X=I_n,\det(X)=1\} $$
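A minimal sketch of the matrix-exponential trivialization for $\mathrm{SO}(n)$ in plain torch (how numqi parametrizes the skew-symmetric matrix is not shown here):
import torch

n = 4
theta = torch.randn(n, n)
A = theta - theta.T          # skew-symmetric matrix, A^T = -A
Q = torch.matrix_exp(A)      # matrix exponential lands in SO(n)
print(Q.T @ Q)               # identity up to machine precision
print(torch.linalg.det(Q))   # 1.0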
manifold_so = numqi.manifold.SpecialOrthogonal(dim=4)
point = manifold_so().detach().numpy()
print('point:', point, sep='\n')
print('point.T @ point:', point.T @ point, sep='\n')
point:
[[ 0.80338859 -0.46371606 -0.03703177  0.37170262]
 [ 0.30677981  0.85514863 -0.15406215  0.38842223]
 [-0.03434166  0.06930311  0.96340062  0.25666512]
 [-0.50918913 -0.22110025 -0.21622385  0.80317393]]
point.T @ point:
[[ 1.00000000e+00 -4.74862492e-17  5.35387492e-17 -2.37106426e-16]
 [-4.74862492e-17  1.00000000e+00  3.02221161e-18  6.07411330e-17]
 [ 5.35387492e-17  3.02221161e-18  1.00000000e+00 -7.73839678e-17]
 [-2.37106426e-16  6.07411330e-17 -7.73839678e-17  1.00000000e+00]]
manifold_su = numqi.manifold.SpecialOrthogonal(dim=3, dtype=torch.complex128)
point = manifold_su().detach().numpy()
print('point:', point, sep='\n')
print('point.H @ point:', point.T.conj() @ point, sep='\n')
point:
[[ 0.76377358+0.07724118j  0.09369224+0.44669632j  0.29201585-0.34219093j]
 [-0.19586652+0.41313087j  0.81761993+0.28939664j -0.11466684+0.15986853j]
 [-0.41994296-0.15903091j  0.18410145-0.07441815j  0.80501837-0.33297312j]]
point.H @ point:
[[1.00000000e+00+0.00000000e+00j 5.55111512e-17-2.77555756e-16j 0.00000000e+00+5.55111512e-17j]
 [5.55111512e-17+2.77555756e-16j 1.00000000e+00+0.00000000e+00j 2.77555756e-17-4.85722573e-17j]
 [0.00000000e+00-5.55111512e-17j 2.77555756e-17+4.85722573e-17j 1.00000000e+00+0.00000000e+00j]]
Connection with Quantum Information¶
TODO
- pure quantum states
- Hamiltonian
- density matrix
- quantum gate
- quantum channel
fig,ax = numqi.manifold.plot_qobject_trivialization_map()
fig,ax = numqi.manifold.plot.plot_cha_trivialization_map()
fig,ax = numqi.manifold.plot.plot_tensor_rank_sigmar_trivialization_map()
fig,ax = numqi.manifold.plot.plot_pureb_trivialization_map()
fig,ax = numqi.manifold.plot.plot_uda_trivialization_map()
fig,ax = numqi.manifold.plot.plot_udp_trivialization_map()