Nearest Semi-Orthonormal Matrix Using the Entry-Wise $\ell_1$ Norm


























Given an $m \times n$ matrix $M$ ($m \geq n$), the nearest semi-orthonormal matrix problem in the $m \times n$ matrix $R$ is



$$\begin{array}{ll} \text{minimize} & \| M - R \|_F\\ \text{subject to} & R^T R = I_n\end{array}$$



A solution can be found using Lagrange multipliers or the polar decomposition, and is known to be



$$\hat{R} := M(M^TM)^{-1/2}$$
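
For concreteness, $\hat{R}$ can be computed from the thin SVD $M = U \Sigma V^T$, since $M(M^TM)^{-1/2} = U V^T$ whenever $M$ has full column rank. A minimal NumPy sketch (the function name is only illustrative):

    import numpy as np

    def nearest_semi_orthonormal(M):
        # Frobenius-norm projection onto the Stiefel manifold:
        # M (M^T M)^{-1/2} = U V^T for the thin SVD M = U S V^T
        # (assumes M has full column rank)
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ Vt

    M = np.random.rand(30, 3)
    R_hat = nearest_semi_orthonormal(M)
    print(np.round(R_hat.T @ R_hat, 6))   # approximately the 3 x 3 identity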



If $\|\cdot\|_F$ is replaced by the entry-wise $1$-norm



$$\|A\|_1 := \|\operatorname{vec}(A)\|_1 = \sum_{i,j} |A_{i,j}|$$



the problem becomes



$$\boxed{\begin{array}{ll} \text{minimize} & \| M - R \|_1\\ \text{subject to} & R^T R = I_n\end{array}}$$



What do we know about the solutions in this case? Is $\hat{R}$ still a solution? If the solution is something else, do analytic forms or approximations exist? Any insight or pointers to the literature would be appreciated.










Tags: optimization, norm, orthogonal-matrices, non-convex-optimization, stiefel-manifolds

asked Nov 2 '17 at 5:32 by Francis; edited Jan 24 at 12:56 by Royi


1 Answer


Answer by Rodrigo de Azevedo (answered Apr 4 '18 at 23:02, edited Apr 6 '18 at 11:09):

          We have the following non-convex optimization problem in (non-fat) $\mathrm X \in \mathbb R^{m \times n}$



          $$\begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \mathrm X^\top \mathrm X = \mathrm I_n\end{array}$$



          where (non-fat) $\mathrm A \in \mathbb R^{m \times n}$ is given. The feasible region, defined by $\mathrm X^\top \mathrm X = \mathrm I_n$, is a Stiefel manifold whose convex hull is defined by $\mathrm X^\top \mathrm X \preceq \mathrm I_n$ or, equivalently, by the spectral-norm inequality $\| \mathrm X \|_2 \leq 1$. Hence, a convex relaxation of the original optimization problem is



          $$\boxed{\qquad \begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \| \mathrm X \|_2 \leq 1\end{array} \qquad}$$



          Via the Schur complement test for positive semidefiniteness, $\mathrm X^\top \mathrm X \preceq \mathrm I_n$ can be rewritten as the following linear matrix inequality (LMI)



          $$\begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}$$



          and, thus,



          $$\begin{array}{ll} \text{minimize} & \| \mathrm X - \mathrm A \|_1\\ \text{subject to} & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$
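
          For completeness, the Schur-complement step above is the standard equivalence (valid because the top-left block $\mathrm I_m$ is positive definite):

          $$\begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n} \iff \mathrm I_n - \mathrm X^\top \mathrm I_m^{-1} \mathrm X = \mathrm I_n - \mathrm X^\top \mathrm X \succeq \mathrm O_n$$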



          Introducing variable $\mathrm Y \in \mathbb R^{m \times n}$, we obtain the following semidefinite program in $\mathrm X, \mathrm Y \in \mathbb R^{m \times n}$



          $$\begin{array}{ll} \text{minimize} & \langle 1_m 1_n^\top, \mathrm Y \rangle\\ \text{subject to} & -\mathrm Y \leq \mathrm X - \mathrm A \leq \mathrm Y\\ & \begin{bmatrix} \mathrm I_m & \mathrm X\\ \mathrm X^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$

          (At any optimum, $\mathrm Y = |\mathrm X - \mathrm A|$ entry-wise, so the objective $\langle 1_m 1_n^\top, \mathrm Y \rangle$ equals $\| \mathrm X - \mathrm A \|_1$.)





          Numerical experiments



          The following Python + NumPy + CVXPY code solves the convex relaxation of the original problem:



          import numpy as np
          import cvxpy as cp

          (m, n) = (30, 3)

          A = np.random.rand(m, n)

          X = cp.Variable((m, n))

          # define optimization problem: entry-wise 1-norm objective, spectral-norm ball constraint
          prob = cp.Problem(cp.Minimize(cp.sum(cp.abs(X - A))), [cp.sigma_max(X) <= 1])

          # solve optimization problem
          print(prob.solve())
          print(prob.status)

          # print results
          print("X = \n", X.value)
          print("Spectral norm of X = ", np.linalg.norm(X.value, 2))
          print("X.T * X = \n", X.value.T @ X.value)
          print("Round(X.T * X) = \n", np.round(X.value.T @ X.value, 3))


          Unfortunately, it produces



          X.T * X = 
          [[ 0.50912521 0.22478268 0.25341844]
          [ 0.22478268 0.4961892 0.2654229 ]
          [ 0.25341844 0.2654229 0.46896397]]


          which is quite "far" from the $3 \times 3$ identity matrix. This is disappointing.



          However, using the Frobenius norm instead of the entry-wise $1$-norm:



          prob = cp.Problem(cp.Minimize(cp.norm(X - A, 'fro')), [cp.sigma_max(X) <= 1])


          we obtain the following output



          3.85187827124
          optimal
          X =
          [[ 0.13146134 -0.14000011 0.32874275]
          [-0.01234834 0.11837379 0.06604536]
          [-0.10490978 0.10542352 0.25211362]
          [ 0.25263062 0.14707194 0.03884215]
          [ 0.00181182 0.4702248 -0.20126385]
          [ 0.13013444 -0.07199484 0.08727077]
          [ 0.19077548 -0.01423872 -0.05960523]
          [ 0.2865637 -0.00202074 0.02844798]
          [ 0.01602302 0.04395754 0.00154713]
          [ 0.17932924 0.30926775 -0.05940074]
          [ 0.42908676 -0.21953956 0.12394825]
          [ 0.1255695 0.16415755 0.14634119]
          [ 0.28144817 0.08592836 0.08426443]
          [ 0.18209884 0.25983065 -0.02550957]
          [ 0.11077068 -0.10874038 0.23649308]
          [ 0.01565326 0.14043185 0.01186364]
          [-0.04374642 0.20360714 0.01079417]
          [-0.00440237 0.17746665 0.12931808]
          [ 0.18899948 0.08389032 0.23493301]
          [ 0.25802202 0.16171055 0.09263858]
          [-0.01921053 0.26863496 0.13077382]
          [-0.11175044 0.06137184 0.32758781]
          [-0.20321302 0.37803842 0.17629377]
          [-0.09301606 -0.0783033 0.36603431]
          [-0.00572361 0.17620931 -0.04822991]
          [ 0.24944001 0.18830197 -0.05522287]
          [-0.2049578 0.1091767 0.38943593]
          [ 0.36823908 -0.10026829 0.04425441]
          [ 0.03582131 0.03531081 0.08337656]
          [-0.07315648 -0.0739467 0.35741916]]
          Spectral norm of X = 1.00018499995
          X.T * X =
          [[ 0.99843209 -0.00168429 -0.0019862 ]
          [-0.00168429 0.99833293 -0.00222193]
          [-0.0019862 -0.00222193 0.99790575]]
          Round(X.T * X) =
          [[ 0.998 -0.002 -0.002]
          [-0.002 0.998 -0.002]
          [-0.002 -0.002 0.998]]


          where $\mathrm X^\top \mathrm X$ is now quite "close" to the $3 \times 3$ identity matrix.



          I performed many numerical experiments, with several values of $m$ and $n$. My conclusions:




          • Minimizing the Frobenius norm worked rather satisfactorily for very tall matrices ($m \gg n$).


          • Unfortunately, minimizing the entry-wise $1$-norm failed in all experiments I performed.



          By "worked" and "failed" I mean, respectively, that the solution of the relaxed convex problem is or is not a solution of the original (non-convex) optimization problem.
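
          A natural follow-up (not part of the original answer): round the relaxed solution back onto the Stiefel manifold with the Frobenius-norm projection $\hat{R} = U V^\top$ from the question, and compare the entry-wise $1$-norm objective before and after. A minimal sketch, assuming X and A from the CVXPY code above:

          # hypothetical rounding step: project the relaxed solution onto the Stiefel manifold
          U, _, Vt = np.linalg.svd(X.value, full_matrices=False)
          R = U @ Vt                                    # feasible: R.T @ R is (numerically) I_n
          print("||R - A||_1 =", np.abs(R - A).sum())   # objective value of the rounded, feasible point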















          Comments:

          • Your notation is inconsistent with the original question. Your $A$ is the original poster's $M$. – Brian Borchers, Apr 5 '18 at 1:13

          • My comment was merely there to help any reader who was confused by the change in notation. – Brian Borchers, Apr 5 '18 at 1:17

          • I'm with Brian on this one—the notation change is unnecessarily distracting. As for using $m$ and $M$ at the same time, well, that would throw me off every time I need to formulate a big-$M$ method! ;-) – Michael Grant, Apr 5 '18 at 1:17

          • In the original poster's notation: $$\begin{array}{ll} \text{minimize} & \langle 1_m 1_n^\top, \mathrm Y \rangle\\ \text{subject to} & -\mathrm Y \leq \mathrm R - \mathrm M \leq \mathrm Y\\ & \begin{bmatrix} \mathrm I_m & \mathrm R\\ \mathrm R^\top & \mathrm I_n\end{bmatrix} \succeq \mathrm O_{m+n}\end{array}$$ – Rodrigo de Azevedo, Apr 5 '18 at 1:25

          • @RodrigodeAzevedo, is there a projection onto the Stiefel manifold (also the space of semi-orthogonal matrices)? – Royi, Feb 3 at 4:29












