Nearest Semi-Orthonormal Matrix Using the Entry-Wise $\ell_1$ Norm
Given an $m \times n$ matrix $M$ ($m \geq n$), the nearest semi-orthonormal matrix problem, over $m \times n$ matrices $R$, is
$$\begin{array}{ll} \text{minimize} & \| M - R \|_F\\ \text{subject to} & R^T R = I_n\end{array}$$
A solution can be found via Lagrange multipliers or the polar decomposition, and is known to be
$$\hat{R} := M(M^TM)^{-1/2}$$
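For concreteness, $\hat{R}$ is the orthogonal factor of the polar decomposition of $M$: if $M = U\Sigma V^T$ is a thin SVD and $M$ has full column rank, then $M(M^TM)^{-1/2} = UV^T$. A minimal NumPy sketch (variable names are illustrative, not from the original post):

import numpy as np

m, n = 30, 3
M = np.random.rand(m, n)

# thin SVD: M = U @ np.diag(s) @ Vt, with U of shape (m, n) and Vt of shape (n, n)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
R_hat = U @ Vt  # equals M (M^T M)^{-1/2}: the nearest semi-orthonormal matrix in the Frobenius norm

print(np.allclose(R_hat.T @ R_hat, np.eye(n)))  # True: the columns of R_hat are orthonormal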
If $\|\cdot\|_F$ is replaced by the entry-wise $1$-norm
$$\|A\|_1 := \|\operatorname{vec}(A)\|_1 = \sum_{i,j} |A_{i,j}|$$
the problem becomes
$$\boxed{\begin{array}{ll} \text{minimize} & \| M - R \|_1\\ \text{subject to} & R^T R = I_n\end{array}}$$
What is known about the solutions in this case? Is $\hat{R}$ still a solution? If the solution is something else, do analytic forms or approximations exist? Any insight or pointers to the literature are appreciated.
optimization norm orthogonal-matrices non-convex-optimization stiefel-manifolds
asked Nov 2 '17 at 5:32 by Francis; edited Jan 24 at 12:56 by Royi
1 Answer
We have the following non-convex optimization problem in (non-fat, i.e., $m \geq n$) $\mathrm X \in \mathbb R^{m \times n}$
$$\begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \mathrm X^\top \mathrm X = \mathrm I_n\end{array}$$
where (non-fat) $\mathrm A \in \mathbb R^{m \times n}$ is given. The feasible region, defined by $\mathrm X^\top \mathrm X = \mathrm I_n$, is a Stiefel manifold whose convex hull is defined by $\mathrm X^\top \mathrm X \preceq \mathrm I_n$, or, equivalently, by the spectral-norm inequality $\| \mathrm X \|_2 \leq 1$. Hence, a convex relaxation of the original optimization problem is
$$\boxed{\qquad \begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \| \mathrm X \|_2 \leq 1\end{array} \qquad}$$
Via the Schur complement test for positive semidefiniteness, $\mathrm X^\top \mathrm X \preceq \mathrm I_n$ can be rewritten as the following linear matrix inequality (LMI)
$$\begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}$$
and, thus,
$$\begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$
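The Schur-complement equivalence is easy to check numerically. A quick sanity check of mine (not part of the original answer): scale a random $\mathrm X$ to unit spectral norm and confirm that the block matrix sits exactly on the boundary of the positive semidefinite cone.

import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 3
X = rng.standard_normal((m, n))
X /= np.linalg.norm(X, 2)  # scale so that the spectral norm equals 1

# block matrix [[I_m, X], [X^T, I_n]]; its eigenvalues are 1 and 1 +/- sigma_i(X)
K = np.block([[np.eye(m), X], [X.T, np.eye(n)]])
print(np.linalg.eigvalsh(K).min())  # ~0: the LMI holds with equality on the boundary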
Introducing the variable $\mathrm Y \in \mathbb R^{m \times n}$, we obtain the following semidefinite program in $\mathrm X, \mathrm Y \in \mathbb R^{m \times n}$
$$\begin{array}{ll} \text{minimize} & \langle 1_m 1_n^\top, \mathrm Y \rangle\\ \text{subject to} & - \mathrm Y \leq \mathrm X - \mathrm A \leq \mathrm Y\\ & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$
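This explicit SDP form can be handed to a solver directly. A sketch of mine in CVXPY (assuming cp.bmat and the >> 0 semidefinite constraint accept the symmetric block expression; this is my translation, not the code from the original answer):

import cvxpy as cp
import numpy as np

m, n = 30, 3
A = np.random.rand(m, n)

X = cp.Variable((m, n))
Y = cp.Variable((m, n))

# <1_m 1_n^T, Y> is simply the sum of the entries of Y
objective = cp.Minimize(cp.sum(Y))
constraints = [
    -Y <= X - A, X - A <= Y,                           # |X - A| <= Y entry-wise
    cp.bmat([[np.eye(m), X], [X.T, np.eye(n)]]) >> 0,  # the LMI
]
prob = cp.Problem(objective, constraints)
prob.solve()
print(prob.status, prob.value)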
Numerical experiments
The following Python + NumPy + CVXPY code solves the convex relaxation of the original problem:
import cvxpy as cp
import numpy as np

m, n = 30, 3
A = np.random.rand(m, n)
X = cp.Variable((m, n))

# define the optimization problem: entry-wise 1-norm objective,
# spectral-norm ball constraint (convex hull of the Stiefel manifold)
prob = cp.Problem(cp.Minimize(cp.sum(cp.abs(X - A))), [cp.norm(X, 2) <= 1])

# solve the optimization problem
print(prob.solve())
print(prob.status)

# print the results
print("X =\n", X.value)
print("Spectral norm of X =", np.linalg.norm(X.value, 2))
print("X.T @ X =\n", X.value.T @ X.value)
print("Round(X.T @ X) =\n", np.round(X.value.T @ X.value, 3))
Unfortunately, it produces
X.T @ X =
[[ 0.50912521 0.22478268 0.25341844]
[ 0.22478268 0.4961892 0.2654229 ]
[ 0.25341844 0.2654229 0.46896397]]
which is quite "far" from the $3 \times 3$ identity matrix. This is disappointing.
However, using the Frobenius norm instead of the entry-wise $1$-norm:
prob = cp.Problem(cp.Minimize(cp.norm(X - A, 'fro')), [cp.norm(X, 2) <= 1])
we obtain the following output
3.85187827124
optimal
X =
[[ 0.13146134 -0.14000011 0.32874275]
[-0.01234834 0.11837379 0.06604536]
[-0.10490978 0.10542352 0.25211362]
[ 0.25263062 0.14707194 0.03884215]
[ 0.00181182 0.4702248 -0.20126385]
[ 0.13013444 -0.07199484 0.08727077]
[ 0.19077548 -0.01423872 -0.05960523]
[ 0.2865637 -0.00202074 0.02844798]
[ 0.01602302 0.04395754 0.00154713]
[ 0.17932924 0.30926775 -0.05940074]
[ 0.42908676 -0.21953956 0.12394825]
[ 0.1255695 0.16415755 0.14634119]
[ 0.28144817 0.08592836 0.08426443]
[ 0.18209884 0.25983065 -0.02550957]
[ 0.11077068 -0.10874038 0.23649308]
[ 0.01565326 0.14043185 0.01186364]
[-0.04374642 0.20360714 0.01079417]
[-0.00440237 0.17746665 0.12931808]
[ 0.18899948 0.08389032 0.23493301]
[ 0.25802202 0.16171055 0.09263858]
[-0.01921053 0.26863496 0.13077382]
[-0.11175044 0.06137184 0.32758781]
[-0.20321302 0.37803842 0.17629377]
[-0.09301606 -0.0783033 0.36603431]
[-0.00572361 0.17620931 -0.04822991]
[ 0.24944001 0.18830197 -0.05522287]
[-0.2049578 0.1091767 0.38943593]
[ 0.36823908 -0.10026829 0.04425441]
[ 0.03582131 0.03531081 0.08337656]
[-0.07315648 -0.0739467 0.35741916]]
Spectral norm of X = 1.00018499995
X.T @ X =
[[ 0.99843209 -0.00168429 -0.0019862 ]
[-0.00168429 0.99833293 -0.00222193]
[-0.0019862 -0.00222193 0.99790575]]
Round(X.T @ X) =
[[ 0.998 -0.002 -0.002]
[-0.002 0.998 -0.002]
[-0.002 -0.002 0.998]]
where $\mathrm X^\top \mathrm X$ is now quite "close" to the $3 \times 3$ identity matrix.
I performed many numerical experiments, with several values of $m$ and $n$. My conclusions:
- Minimizing the Frobenius norm worked rather satisfactorily for very tall matrices ($m \gg n$).
- Unfortunately, minimizing the entry-wise $1$-norm failed in every experiment I performed.
By "worked" and "failed" I mean whether the solution of the relaxed convex problem is also a solution of the original (non-convex) optimization problem.
answered Apr 4 '18 at 23:02 by Rodrigo de Azevedo; edited Apr 6 '18 at 11:09
Comments:
- Your notation is inconsistent with the original question. Your $A$ is the original poster's $M$. (Brian Borchers, Apr 5 '18 at 1:13)
- My comment was merely there to help any reader who was confused by the change in notation. (Brian Borchers, Apr 5 '18 at 1:17)
- I'm with Brian on this one: the notation change is unnecessarily distracting. As for using $m$ and $M$ at the same time, well, that would throw me off every time I need to formulate a big-$M$ method! ;-) (Michael Grant, Apr 5 '18 at 1:17)
- In the original poster's notation: $$\begin{array}{ll} \text{minimize} & \langle 1_m 1_n^\top, \mathrm Y \rangle\\ \text{subject to} & - \mathrm Y \leq \mathrm R - \mathrm M \leq \mathrm Y\\ & \begin{bmatrix} \mathrm I_m & \mathrm R\\ \mathrm R^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$ (Rodrigo de Azevedo, Apr 5 '18 at 1:25)
- @RodrigodeAzevedo, is there a projection onto the Stiefel manifold (also the space of semi-orthogonal matrices)? (Royi, Feb 3 at 4:29)