What kind of matrices are non-diagonalizable?














I'm trying to build an intuitive geometric picture of diagonalization.
Let me show what I have so far.



An eigenvector of a linear operator marks a direction in which the operator acts simply as a stretching; in other words, the operator preserves the direction of its eigenvector. The corresponding eigenvalue just tells us by how much the operator stretches that eigenvector (a negative stretch means a flip to the opposite direction).
When we limit ourselves to real vector spaces, it's intuitively clear that rotations don't preserve the direction of any non-zero vector. (I'm thinking about 2D and 3D spaces as I write, which is why I say "rotations"; for $n$-dimensional spaces it would be better to speak of "operators which act like rotations on some 2D subspace".)



But there are non-diagonalizable matrices that aren't rotations: all non-zero nilpotent matrices. My intuitive view of nilpotent matrices is that they "gradually collapse all dimensions / gradually lose all the information" (if we apply them over and over again), so it's clear to me why they can't be diagonalizable.



But again, there are non-diagonalizable matrices that are neither rotations nor nilpotent, for example:



$$
\begin{pmatrix}
1 & 1 \\
0 & 1
\end{pmatrix}
$$



So, what's the deal with them? Is there some intuitive geometric reasoning that would help me grasp why matrices like this one exist? What characteristic stops them from being diagonalizable?
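(A quick numerical illustration, added here for concreteness and not part of the original question: NumPy confirms that this shear matrix has the repeated eigenvalue $1$ but only a one-dimensional eigenspace, which is exactly what blocks diagonalization.)

    # Sketch: check that [[1, 1], [0, 1]] is defective (illustration only).
    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    eigvals, eigvecs = np.linalg.eig(A)
    print(eigvals)    # [1. 1.] -- eigenvalue 1 with algebraic multiplicity 2
    print(eigvecs)    # both columns are (numerically) parallel to (1, 0)

    # A - I has rank 1, so its null space (the eigenspace of 1) is only 1-dimensional.
    print(np.linalg.matrix_rank(A - np.eye(2)))   # 1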










Tags: linear-algebra, matrices, soft-question, vector-spaces, eigenvalues-eigenvectors






asked Aug 21 '13 at 16:22 by ante.ceperic (edited Aug 21 '13 at 17:41)
2 Answers



















          I think a very useful notion here is the idea of a "generalized eigenvector".



An eigenvector of a matrix $A$ is a vector $v$ with associated value $\lambda$ such that
$$
(A-\lambda I)v=0.
$$
A generalized eigenvector, on the other hand, is a vector $w$ with the same associated value such that
$$
(A-\lambda I)^k w=0,
$$
that is, $(A-\lambda I)$ is nilpotent on $w$. Or, in other words:
$$
(A - \lambda I)^{k-1}w=v
$$
for some eigenvector $v$ with the same associated value.





Now, let's see how this definition helps us with a non-diagonalizable matrix such as
$$
A = \pmatrix{
2 & 1\\
0 & 2
}
$$
For this matrix, we have $\lambda=2$ as the unique eigenvalue, and $v=\pmatrix{1\\0}$ as the associated eigenvector, which I will let you verify. $w=\pmatrix{0\\1}$ is our generalized eigenvector. Notice that
$$
(A - 2I) = \pmatrix{
0 & 1\\
0 & 0}
$$
is a nilpotent matrix of order $2$. Note that $(A - 2I)v=0$, and $(A- 2I)w=v$, so that $(A-2I)^2w=0$. But what does this mean for what the matrix $A$ does? The behavior of $v$ is fairly obvious, but with $w$ we have
$$
Aw = \pmatrix{1\\2}=2w + v
$$
So $w$ behaves kind of like an eigenvector, but not really. In general, a generalized eigenvector, when acted upon by $A$, gives another vector in the generalized eigenspace.
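(If you like to verify such relations numerically, here is a small NumPy sketch; it is my addition for illustration, not part of the original answer.)

    # Check the chain (A - 2I)v = 0, (A - 2I)w = v, (A - 2I)^2 w = 0, and Aw = 2w + v.
    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 2.0]])
    N = A - 2.0 * np.eye(2)      # the nilpotent part (A - 2I)

    v = np.array([1.0, 0.0])     # ordinary eigenvector
    w = np.array([0.0, 1.0])     # generalized eigenvector

    print(N @ v)          # [0. 0.]       -> (A - 2I)v = 0
    print(N @ w)          # [1. 0.] = v   -> (A - 2I)w = v
    print(N @ (N @ w))    # [0. 0.]       -> (A - 2I)^2 w = 0
    print(A @ w)          # [1. 2.] = 2w + v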





          An important related notion is Jordan Normal Form. That is, while we can't always diagonalize a matrix by finding a basis of eigenvectors, we can always put the matrix into Jordan normal form by finding a basis of generalized eigenvectors/eigenspaces.
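(As a concrete illustration, added by the editor rather than part of the answer: SymPy's `jordan_form` does exactly this, returning a matrix $P$ of generalized eigenvectors and the Jordan form $J$ with $A = PJP^{-1}$.)

    # Sketch: Jordan normal form of the example matrix via SymPy.
    from sympy import Matrix

    A = Matrix([[2, 1],
                [0, 2]])
    P, J = A.jordan_form()   # A == P * J * P**-1
    print(J)                 # a single 2x2 Jordan block with eigenvalue 2
    print(P)                 # its columns are (generalized) eigenvectors of A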



          I hope that helps. I'd say that the most important thing to grasp from the idea of generalized eigenvectors is that every transformation can be related to the action of a nilpotent over some subspace.






answered Aug 21 '13 at 17:26 by Omnomnomnom (edited Sep 3 '14 at 14:06)

Edit: The algebra I speak of here is not actually the Grassmann numbers at all -- they are $\mathbb{R}[X]/(X^n)$, whose generators don't satisfy the anticommutativity relation even though they satisfy all the nilpotency relations. The dual-number stuff for 2 by 2 is still correct, just ignore my use of the word "Grassmann".





            Non-diagonalisable 2 by 2 matrices can be diagonalised over the dual numbers -- and the "weird cases" like the Galilean transformation are not fundamentally different from the nilpotent matrices.



The intuition here is that the Galilean transformation is sort of a "boundary case" between real-diagonalisability (skews) and complex-diagonalisability (rotations), which you can sort of think of in terms of discriminants. In the case of the Galilean transformation $\left[\begin{array}{cc}1 & v\\ 0 & 1\end{array}\right]$, it's a small perturbation away from being diagonalisable, i.e. it sort of has "repeated eigenvectors" (you can visualise this with MatVis). So one may imagine that the two eigenvectors are only an "epsilon" away, where $\varepsilon$ is the unit dual satisfying $\varepsilon^2=0$ (called the "soul"). Indeed, its characteristic polynomial is:



$$(\lambda-1)^2=0$$



Its solutions among the dual numbers are $\lambda=1+k\varepsilon$ for real $k$, so one may "diagonalise" the Galilean transformation over the dual numbers as, e.g.:



$$\left[\begin{array}{cc}1 & 0\\ 0 & 1+v\varepsilon\end{array}\right]$$



Granted, this is not unique: this one comes from the change-of-basis matrix $\left[\begin{array}{cc}1 & 1\\ 0 & \varepsilon\end{array}\right]$, but any vector of the form $(1,k\varepsilon)$ is a valid eigenvector. You could, if you like, consider this a canonical or "principal value" of the diagonalisation, and in general each diagonalisation corresponds to a limit you can take of real/complex-diagonalisable transformations. Another way of thinking about this is that there is an entire eigenspace spanned by $(1,0)$ and $(1,\varepsilon)$ in that little gap of multiplicity. In this sense, the geometric multiplicity is forced to be equal to the algebraic multiplicity*.
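(The following Python sketch is an editorial addition, not the answerer's: a tiny dual-number class that checks the relation $AP = PD$ for the Galilean matrix above, and for the nilpotent matrix discussed just below. Note, as the comment below the answer points out, this really only gives $AP = PD$, since $P$ is not invertible over the dual numbers.)

    # Dual numbers a + b*eps with eps**2 = 0, just enough to check AP = PD.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Dual:
        re: float = 0.0   # real part
        du: float = 0.0   # coefficient of eps (the "soul")

        def __add__(self, o):
            return Dual(self.re + o.re, self.du + o.du)

        def __mul__(self, o):
            # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0
            return Dual(self.re * o.re, self.re * o.du + self.du * o.re)

    def matmul(X, Y):
        """2x2 matrix product with Dual entries."""
        return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
                for i in range(2)]

    one, zero, eps = Dual(1.0), Dual(0.0), Dual(0.0, 1.0)
    v = 3.0   # an arbitrary "velocity" parameter, just for the check

    A = [[one, Dual(v)], [zero, one]]        # Galilean transformation [[1, v], [0, 1]]
    P = [[one, one], [zero, eps]]            # change-of-basis matrix from the answer
    D = [[one, zero], [zero, Dual(1.0, v)]]  # diag(1, 1 + v*eps)
    print(matmul(A, P) == matmul(P, D))      # True: AP = PD

    N = [[zero, one], [zero, zero]]          # the nilpotent matrix [[0, 1], [0, 0]]
    D2 = [[zero, zero], [zero, eps]]         # diag(0, eps)
    print(matmul(N, P) == matmul(P, D2))     # True again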



Then a nilpotent matrix with characteristic polynomial $\lambda^2=0$ has solutions $\lambda=k\varepsilon$, and is simply diagonalised as:



$$\left[\begin{array}{cc}0 & 0\\ 0 & \varepsilon\end{array}\right]$$



(Think about this.) Indeed, the resulting matrix has minimal polynomial $\lambda^2=0$, and the eigenvectors are as before.





            What about higher dimensional matrices? Consider:



$$\left[\begin{array}{ccc}0 & v & 0\\ 0 & 0 & w\\ 0 & 0 & 0\end{array}\right]$$



This is a nilpotent matrix $A$ satisfying $A^3=0$ (but not $A^2=0$). The characteristic polynomial is $\lambda^3=0$. Although $\varepsilon$ might seem like a sensible choice, it doesn't really do the trick -- if you try a diagonalisation of the form $\mathrm{diag}(0,v\varepsilon,w\varepsilon)$, it has minimal polynomial $\lambda^2=0$, which is wrong. Indeed, you won't be able to find three linearly independent eigenvectors to diagonalise the matrix this way -- they'll all take the form $(a+b\varepsilon,0,0)$.



Instead, you need to consider a generalisation of the dual numbers, called the Grassmann numbers, with the soul satisfying $\epsilon^n=0$. Then the diagonalisation takes for instance the form:



$$\left[\begin{array}{ccc}0 & 0 & 0\\ 0 & v\epsilon & 0\\ 0 & 0 & w\epsilon\end{array}\right]$$
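(To make the minimal-polynomial point checkable, here is a short Python sketch of truncated-polynomial arithmetic in $\mathbb{R}[X]/(X^n)$; it is an editorial illustration, not part of the answer.)

    # Truncated polynomials: a list [c0, c1, ..., c_{n-1}] represents
    # c0 + c1*eps + ... modulo eps**n.  Used to compare eps**2 = 0 vs eps**3 = 0.
    def mul(p, q, n):
        out = [0.0] * n
        for i, a in enumerate(p):
            for j, b in enumerate(q):
                if i + j < n:
                    out[i + j] += a * b
        return out

    def power(p, k, n):
        out = [1.0] + [0.0] * (n - 1)   # the constant polynomial 1
        for _ in range(k):
            out = mul(out, p, n)
        return out

    v = 2.0

    # Dual numbers (eps**2 = 0): (v*eps)**2 is already 0, so the diagonal
    # matrix diag(0, v*eps, w*eps) would satisfy D**2 = 0 -- the wrong
    # minimal polynomial for the 3x3 nilpotent matrix above.
    print(power([0.0, v], 2, 2))          # [0.0, 0.0]

    # In R[X]/(X**3) (eps**3 = 0): (v*eps)**2 != 0 but (v*eps)**3 = 0,
    # matching the minimal polynomial lambda**3.
    print(power([0.0, v, 0.0], 2, 3))     # [0.0, 0.0, 4.0]
    print(power([0.0, v, 0.0], 3, 3))     # [0.0, 0.0, 0.0]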





            *Over the reals and complexes, when one defines algebraic multiplicity (as "the multiplicity of the corresponding factor in the characteristic polynomial"), there is a single eigenvalue corresponding to that factor. This is of course no longer true over the Grassmann numbers, because they are not a field, and $ab=0$ no longer implies "$a=0$ or $b=0$".



In general, if you want to prove things about these numbers, the way to formalise them is by constructing them as the quotient $\mathbb{R}[X]/(X^n)$, so you actually have something clear to work with.



            (Perhaps relevant: Grassmann numbers as eigenvalues of nilpotent operators? -- discussing the fact that the Grassmann numbers are not a field).



You might wonder if this sort of approach is applicable to LTI differential equations with repeated roots -- after all, their characteristic matrices are exactly of this Grassmann form. As pointed out in the comments, however, this diagonalisation is still not via an invertible change-of-basis matrix: it's only of the form $PD=AP$, not $D=P^{-1}AP$. I don't see any way to bypass this. See my posts "All matrices can be diagonalised" (a re-post of this answer) and "Repeated roots of differential equations" for ideas, I guess.






answered Feb 2 at 22:17 by Abhimanyu Pallavi Sudhir (edited Feb 21 at 17:19)
Comment (darij grinberg, Feb 5 at 16:56): This is an interesting idea, but I'm not sure how literally it can be taken. The matrices you are using for the "diagonalization" are not invertible over the dual (or, respectively, Grassmann) numbers, so you don't get a diagonalization of the form $A = SDS^{-1}$ but merely $AS = SD$. I don't know which properties of diagonalization still hold for this kind.











