Nearest Semi-Orthonormal Matrix Using the Entry-Wise $\ell_1$ Norm


























Given an $m \times n$ matrix $M$ ($m \geq n$), the nearest semi-orthonormal matrix problem in the $m \times n$ matrix $R$ is



$$\begin{array}{ll} \text{minimize} & \| M - R \|_F\\ \text{subject to} & R^T R = I_n\end{array}$$



A solution can be found using Lagrange multipliers or the polar decomposition, and is known to be



$$\hat{R} := M(M^TM)^{-1/2}$$
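
For concreteness, $\hat{R}$ can be computed from the thin SVD $M = U \Sigma V^T$, since $M(M^TM)^{-1/2} = U V^T$ whenever $M$ has full column rank. A minimal NumPy sketch (the function name is only illustrative):

    import numpy as np

    def nearest_semi_orthonormal(M):
        # Frobenius-norm projection onto the Stiefel manifold:
        # M (M^T M)^{-1/2} = U V^T for the thin SVD M = U S V^T
        # (assumes M has full column rank)
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ Vt

    M = np.random.rand(30, 3)
    R_hat = nearest_semi_orthonormal(M)
    print(np.round(R_hat.T @ R_hat, 6))   # approximately the 3 x 3 identity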



If $\|\cdot\|_F$ is replaced by the entry-wise $1$-norm



$$\|A\|_1 := \|\operatorname{vec}(A)\|_1 = \sum_{i,j} |A_{i,j}|$$



the problem becomes



$$\boxed{\begin{array}{ll} \text{minimize} & \| M - R \|_1\\ \text{subject to} & R^T R = I_n\end{array}}$$



What do we know about the solutions in this case? Is $\hat{R}$ still a solution? If the solution is something else, do analytic forms or approximations exist? Any insight or pointers to the literature would be appreciated.










Tags: optimization, norm, orthogonal-matrices, non-convex-optimization, stiefel-manifolds

asked Nov 2 '17 at 5:32 by Francis; edited Jan 24 at 12:56 by Royi


1 Answer


Answer by Rodrigo de Azevedo (answered Apr 4 '18 at 23:02, edited Apr 6 '18 at 11:09):

          We have the following non-convex optimization problem in (non-fat) $\mathrm X \in \mathbb R^{m \times n}$



          $$\begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \mathrm X^\top \mathrm X = \mathrm I_n\end{array}$$



          where (non-fat) $\mathrm A \in \mathbb R^{m \times n}$ is given. The feasible region, defined by $\mathrm X^\top \mathrm X = \mathrm I_n$, is a Stiefel manifold whose convex hull is defined by $\mathrm X^\top \mathrm X \preceq \mathrm I_n$ or, equivalently, by the spectral-norm inequality $\| \mathrm X \|_2 \leq 1$. Hence, a convex relaxation of the original optimization problem is



          $$\boxed{\qquad \begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \| \mathrm X \|_2 \leq 1\end{array} \qquad}$$



          Via the Schur complement test for positive semidefiniteness, $\mathrm X^\top \mathrm X \preceq \mathrm I_n$ can be rewritten as the following linear matrix inequality (LMI)



          $$\begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}$$



          and, thus,



          $$\begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$
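
          For completeness, the Schur-complement step above is the standard equivalence (valid because the top-left block $\mathrm I_m$ is positive definite):

          $$\begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n} \iff \mathrm I_n - \mathrm X^\top \mathrm I_m^{-1} \mathrm X = \mathrm I_n - \mathrm X^\top \mathrm X \succeq \mathrm O_n$$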



          Introducing variable $\mathrm Y \in \mathbb R^{m \times n}$, we obtain the following semidefinite program in $\mathrm X, \mathrm Y \in \mathbb R^{m \times n}$



          $$\begin{array}{ll} \text{minimize} & \langle 1_m 1_n^\top, \mathrm Y \rangle\\ \text{subject to} & -\mathrm Y \leq \mathrm X - \mathrm A \leq \mathrm Y\\ & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$

          (At any optimum, $\mathrm Y = |\mathrm X - \mathrm A|$ entry-wise, so the objective $\langle 1_m 1_n^\top, \mathrm Y \rangle$ equals $\| \mathrm X - \mathrm A \|_1$.)





          Numerical experiments



          The following Python + NumPy + CVXPY code solves the convex relaxation of the original problem:



          import numpy as np
          import cvxpy as cp

          (m, n) = (30, 3)

          A = np.random.rand(m, n)

          X = cp.Variable((m, n))

          # define optimization problem: entry-wise 1-norm objective, spectral-norm ball constraint
          prob = cp.Problem(cp.Minimize(cp.sum(cp.abs(X - A))), [cp.sigma_max(X) <= 1])

          # solve optimization problem
          print(prob.solve())
          print(prob.status)

          # print results
          print("X = \n", X.value)
          print("Spectral norm of X = ", np.linalg.norm(X.value, 2))
          print("X.T * X = \n", X.value.T @ X.value)
          print("Round(X.T * X) = \n", np.round(X.value.T @ X.value, 3))


          Unfortunately, it produces



          X.T * X = 
          [[ 0.50912521 0.22478268 0.25341844]
          [ 0.22478268 0.4961892 0.2654229 ]
          [ 0.25341844 0.2654229 0.46896397]]


          which is quite "far" from the $3 \times 3$ identity matrix. This is disappointing.



          However, using the Frobenius norm instead of the entry-wise $1$-norm:



          prob = cp.Problem(cp.Minimize(cp.norm(X - A, 'fro')), [cp.sigma_max(X) <= 1])


          we obtain the following output



          3.85187827124
          optimal
          X =
          [[ 0.13146134 -0.14000011 0.32874275]
          [-0.01234834 0.11837379 0.06604536]
          [-0.10490978 0.10542352 0.25211362]
          [ 0.25263062 0.14707194 0.03884215]
          [ 0.00181182 0.4702248 -0.20126385]
          [ 0.13013444 -0.07199484 0.08727077]
          [ 0.19077548 -0.01423872 -0.05960523]
          [ 0.2865637 -0.00202074 0.02844798]
          [ 0.01602302 0.04395754 0.00154713]
          [ 0.17932924 0.30926775 -0.05940074]
          [ 0.42908676 -0.21953956 0.12394825]
          [ 0.1255695 0.16415755 0.14634119]
          [ 0.28144817 0.08592836 0.08426443]
          [ 0.18209884 0.25983065 -0.02550957]
          [ 0.11077068 -0.10874038 0.23649308]
          [ 0.01565326 0.14043185 0.01186364]
          [-0.04374642 0.20360714 0.01079417]
          [-0.00440237 0.17746665 0.12931808]
          [ 0.18899948 0.08389032 0.23493301]
          [ 0.25802202 0.16171055 0.09263858]
          [-0.01921053 0.26863496 0.13077382]
          [-0.11175044 0.06137184 0.32758781]
          [-0.20321302 0.37803842 0.17629377]
          [-0.09301606 -0.0783033 0.36603431]
          [-0.00572361 0.17620931 -0.04822991]
          [ 0.24944001 0.18830197 -0.05522287]
          [-0.2049578 0.1091767 0.38943593]
          [ 0.36823908 -0.10026829 0.04425441]
          [ 0.03582131 0.03531081 0.08337656]
          [-0.07315648 -0.0739467 0.35741916]]
          Spectral norm of X = 1.00018499995
          X.T * X =
          [[ 0.99843209 -0.00168429 -0.0019862 ]
          [-0.00168429 0.99833293 -0.00222193]
          [-0.0019862 -0.00222193 0.99790575]]
          Round(X.T * X) =
          [[ 0.998 -0.002 -0.002]
          [-0.002 0.998 -0.002]
          [-0.002 -0.002 0.998]]


          where $\mathrm X^\top \mathrm X$ is now quite "close" to the $3 \times 3$ identity matrix.



          I performed many numerical experiments, with several values of $m$ and $n$. My conclusions:




          • Minimizing the Frobenius norm worked rather satisfactorily for very tall matrices ($m \gg n$).


          • Unfortunately, minimizing the entry-wise $1$-norm failed in all experiments I performed.



          By "worked" and "failed" I mean, respectively, that the solution of the relaxed convex problem is or is not a solution of the original (non-convex) optimization problem.
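
          A natural follow-up (not part of the original answer): round the relaxed solution back onto the Stiefel manifold with the Frobenius-norm projection $\hat{R} = U V^\top$ from the question, and compare the entry-wise $1$-norm objective before and after. A minimal sketch, assuming X and A from the CVXPY code above:

          # hypothetical rounding step: project the relaxed solution onto the Stiefel manifold
          U, _, Vt = np.linalg.svd(X.value, full_matrices=False)
          R = U @ Vt                                    # feasible: R.T @ R is (numerically) I_n
          print("||R - A||_1 =", np.abs(R - A).sum())   # objective value of the rounded, feasible point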















          Comments:

          • Your notation is inconsistent with the original question. Your $A$ is the original poster's $M$. – Brian Borchers, Apr 5 '18 at 1:13

          • My comment was merely there to help any reader who was confused by the change in notation. – Brian Borchers, Apr 5 '18 at 1:17

          • I'm with Brian on this one—the notation change is unnecessarily distracting. As for using $m$ and $M$ at the same time, well, that would throw me off every time I need to formulate a big-$M$ method! ;-) – Michael Grant, Apr 5 '18 at 1:17

          • In the original poster's notation: $$\begin{array}{ll} \text{minimize} & \langle 1_m 1_n^\top, \mathrm Y \rangle\\ \text{subject to} & -\mathrm Y \leq \mathrm R - \mathrm M \leq \mathrm Y\\ & \begin{bmatrix} \mathrm I_m & \mathrm R\\ \mathrm R^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$ – Rodrigo de Azevedo, Apr 5 '18 at 1:25

          • @RodrigodeAzevedo, is there a projection onto the Stiefel manifold (also the space of semi-orthogonal matrices)? – Royi, Feb 3 at 4:29












