Geometric intuition of conjugate function












16












$begingroup$


I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.



$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$










share|cite|improve this question











$endgroup$












  • $begingroup$
    You can improve this post by asking a more specific question.
    $endgroup$
    – Pedro Tamaroff
    Jul 30 '16 at 3:22










  • $begingroup$
    Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
    $endgroup$
    – Lorenzo Stella
    Aug 1 '16 at 9:02
















16












$begingroup$


I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.



$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$










share|cite|improve this question











$endgroup$












  • $begingroup$
    You can improve this post by asking a more specific question.
    $endgroup$
    – Pedro Tamaroff
    Jul 30 '16 at 3:22










  • $begingroup$
    Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
    $endgroup$
    – Lorenzo Stella
    Aug 1 '16 at 9:02














16












16








16


6



$begingroup$


I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.



$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$










share|cite|improve this question











$endgroup$




I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.



$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$







optimization convex-analysis convex-optimization






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited Sep 20 '18 at 14:04









Lorenzo Stella

582311




582311










asked Jul 29 '16 at 0:31









Abhishek BhatiaAbhishek Bhatia

3401822




3401822












  • $begingroup$
    You can improve this post by asking a more specific question.
    $endgroup$
    – Pedro Tamaroff
    Jul 30 '16 at 3:22










  • $begingroup$
    Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
    $endgroup$
    – Lorenzo Stella
    Aug 1 '16 at 9:02


















  • $begingroup$
    You can improve this post by asking a more specific question.
    $endgroup$
    – Pedro Tamaroff
    Jul 30 '16 at 3:22










  • $begingroup$
    Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
    $endgroup$
    – Lorenzo Stella
    Aug 1 '16 at 9:02
















$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff
Jul 30 '16 at 3:22




$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff
Jul 30 '16 at 3:22












$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02




$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02










3 Answers
3






active

oldest

votes


















14












$begingroup$

I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.






share|cite|improve this answer









$endgroup$













  • $begingroup$
    Great recommendation.
    $endgroup$
    – Michael Grant
    Aug 1 '16 at 14:12










  • $begingroup$
    @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
    $endgroup$
    – rsp1984
    Feb 1 at 23:40










  • $begingroup$
    "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
    $endgroup$
    – Lorenzo Stella
    Feb 3 at 14:16



















4












$begingroup$

You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
Assume:



$x in mathbb{R}^n$
, $x_{n+1}= f(x) in mathbb{R}$



I denote the axis that measures the output of $f$ as $x_{n+1}$.



Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:



$ inf(f) = sup(S = { 0 - f(x) | x in dom f})$



But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:



$f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$



As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.






share|cite|improve this answer











$endgroup$





















    2












    $begingroup$

    I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.



    The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.



    Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.



    Here's the Legendre transformation:



    $f^*(y)=sup(yx - f(x))$



    The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.



    $f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.



    Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.



    The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.



    To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.



    The Wikipedia Article on Legendre transformations also has some more good information, such as:



    If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.



    So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.



    It's not too hard to imagine why that particular trick only works for convex functions.



    PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:



    $f^*(f'(x))=x*f'(x)-f(x)$



    In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.






    share|cite|improve this answer











    $endgroup$














      Your Answer








      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "69"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      noCode: true, onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f1874482%2fgeometric-intuition-of-conjugate-function%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      14












      $begingroup$

      I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.






      share|cite|improve this answer









      $endgroup$













      • $begingroup$
        Great recommendation.
        $endgroup$
        – Michael Grant
        Aug 1 '16 at 14:12










      • $begingroup$
        @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
        $endgroup$
        – rsp1984
        Feb 1 at 23:40










      • $begingroup$
        "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
        $endgroup$
        – Lorenzo Stella
        Feb 3 at 14:16
















      14












      $begingroup$

      I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.






      share|cite|improve this answer









      $endgroup$













      • $begingroup$
        Great recommendation.
        $endgroup$
        – Michael Grant
        Aug 1 '16 at 14:12










      • $begingroup$
        @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
        $endgroup$
        – rsp1984
        Feb 1 at 23:40










      • $begingroup$
        "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
        $endgroup$
        – Lorenzo Stella
        Feb 3 at 14:16














      14












      14








      14





      $begingroup$

      I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.






      share|cite|improve this answer









      $endgroup$



      I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.







      share|cite|improve this answer












      share|cite|improve this answer



      share|cite|improve this answer










      answered Aug 1 '16 at 9:00









      Lorenzo StellaLorenzo Stella

      582311




      582311












      • $begingroup$
        Great recommendation.
        $endgroup$
        – Michael Grant
        Aug 1 '16 at 14:12










      • $begingroup$
        @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
        $endgroup$
        – rsp1984
        Feb 1 at 23:40










      • $begingroup$
        "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
        $endgroup$
        – Lorenzo Stella
        Feb 3 at 14:16


















      • $begingroup$
        Great recommendation.
        $endgroup$
        – Michael Grant
        Aug 1 '16 at 14:12










      • $begingroup$
        @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
        $endgroup$
        – rsp1984
        Feb 1 at 23:40










      • $begingroup$
        "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
        $endgroup$
        – Lorenzo Stella
        Feb 3 at 14:16
















      $begingroup$
      Great recommendation.
      $endgroup$
      – Michael Grant
      Aug 1 '16 at 14:12




      $begingroup$
      Great recommendation.
      $endgroup$
      – Michael Grant
      Aug 1 '16 at 14:12












      $begingroup$
      @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
      $endgroup$
      – rsp1984
      Feb 1 at 23:40




      $begingroup$
      @Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
      $endgroup$
      – rsp1984
      Feb 1 at 23:40












      $begingroup$
      "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
      $endgroup$
      – Lorenzo Stella
      Feb 3 at 14:16




      $begingroup$
      "Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
      $endgroup$
      – Lorenzo Stella
      Feb 3 at 14:16











      4












      $begingroup$

      You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
      Assume:



      $x in mathbb{R}^n$
      , $x_{n+1}= f(x) in mathbb{R}$



      I denote the axis that measures the output of $f$ as $x_{n+1}$.



      Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:



      $ inf(f) = sup(S = { 0 - f(x) | x in dom f})$



      But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:



      $f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$



      As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.






      share|cite|improve this answer











      $endgroup$


















        4












        $begingroup$

        You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
        Assume:



        $x in mathbb{R}^n$
        , $x_{n+1}= f(x) in mathbb{R}$



        I denote the axis that measures the output of $f$ as $x_{n+1}$.



        Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:



        $ inf(f) = sup(S = { 0 - f(x) | x in dom f})$



        But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:



        $f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$



        As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.






        share|cite|improve this answer











        $endgroup$
















          4












          4








          4





          $begingroup$

          You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
          Assume:



          $x in mathbb{R}^n$
          , $x_{n+1}= f(x) in mathbb{R}$



          I denote the axis that measures the output of $f$ as $x_{n+1}$.



          Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:



          $ inf(f) = sup(S = { 0 - f(x) | x in dom f})$



          But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:



          $f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$



          As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.






          share|cite|improve this answer











          $endgroup$



          You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
          Assume:



          $x in mathbb{R}^n$
          , $x_{n+1}= f(x) in mathbb{R}$



          I denote the axis that measures the output of $f$ as $x_{n+1}$.



          Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:



          $ inf(f) = sup(S = { 0 - f(x) | x in dom f})$



          But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:



          $f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$



          As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited Mar 6 at 12:27

























          answered Apr 6 '18 at 21:10









          NKSHELLNKSHELL

          414




          414























              2












              $begingroup$

              I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.



              The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.



              Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.



              Here's the Legendre transformation:



              $f^*(y)=sup(yx - f(x))$



              The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.



              $f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.



              Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.



              The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.



              To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.



              The Wikipedia Article on Legendre transformations also has some more good information, such as:



              If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.



              So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.



              It's not too hard to imagine why that particular trick only works for convex functions.



              PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:



              $f^*(f'(x))=x*f'(x)-f(x)$



              In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.






              share|cite|improve this answer











              $endgroup$


















                2












                $begingroup$

                I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.



                The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.



                Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.



                Here's the Legendre transformation:



                $f^*(y)=sup(yx - f(x))$



                The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.



                $f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.



                Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.



                The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.



                To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.



                The Wikipedia Article on Legendre transformations also has some more good information, such as:



                If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.



                So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.



                It's not too hard to imagine why that particular trick only works for convex functions.



                PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:



                $f^*(f'(x))=x*f'(x)-f(x)$



                In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.






                share|cite|improve this answer











                $endgroup$
















                  2












                  2








                  2





                  $begingroup$

                  I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.



                  The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.



                  Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.



                  Here's the Legendre transformation:



                  $f^*(y)=sup(yx - f(x))$



                  The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.



                  $f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.



                  Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.



                  The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.



                  To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.



                  The Wikipedia Article on Legendre transformations also has some more good information, such as:



                  If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.



                  So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.



                  It's not too hard to imagine why that particular trick only works for convex functions.



                  PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:



                  $f^*(f'(x))=x*f'(x)-f(x)$



                  In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.






                  share|cite|improve this answer











                  $endgroup$



                  I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.



                  The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.



                  Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.



                  Here's the Legendre transformation:



                  $f^*(y)=sup(yx - f(x))$



                  The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.



                  $f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.



                  Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.



                  The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.



                  To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.



                  The Wikipedia Article on Legendre transformations also has some more good information, such as:



                  If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.



                  So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.



                  It's not too hard to imagine why that particular trick only works for convex functions.



                  PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:



                  $f^*(f'(x))=x*f'(x)-f(x)$



                  In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.







                  share|cite|improve this answer














                  share|cite|improve this answer



                  share|cite|improve this answer








                  edited Mar 14 at 13:05

























                  answered Feb 2 at 16:13









                  rsp1984rsp1984

                  1315




                  1315






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Mathematics Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      Use MathJax to format equations. MathJax reference.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f1874482%2fgeometric-intuition-of-conjugate-function%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      MongoDB - Not Authorized To Execute Command

                      How to fix TextFormField cause rebuild widget in Flutter

                      in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith