Geometric intuition of conjugate function
$begingroup$
I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.
$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$
optimization convex-analysis convex-optimization
$endgroup$
add a comment |
$begingroup$
I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.
$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$
optimization convex-analysis convex-optimization
$endgroup$
$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff♦
Jul 30 '16 at 3:22
$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02
add a comment |
$begingroup$
I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.
$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$
optimization convex-analysis convex-optimization
$endgroup$
I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.
$$ f^*(y)= sup_{x in operatorname{dom} f } (y^Tx-f(x))$$
optimization convex-analysis convex-optimization
optimization convex-analysis convex-optimization
edited Sep 20 '18 at 14:04
Lorenzo Stella
582311
582311
asked Jul 29 '16 at 0:31
Abhishek BhatiaAbhishek Bhatia
3401822
3401822
$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff♦
Jul 30 '16 at 3:22
$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02
add a comment |
$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff♦
Jul 30 '16 at 3:22
$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02
$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff♦
Jul 30 '16 at 3:22
$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff♦
Jul 30 '16 at 3:22
$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02
$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02
add a comment |
3 Answers
3
active
oldest
votes
$begingroup$
I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.
$endgroup$
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
add a comment |
$begingroup$
You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
Assume:
$x in mathbb{R}^n$
, $x_{n+1}= f(x) in mathbb{R}$
I denote the axis that measures the output of $f$ as $x_{n+1}$.
Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:
$ inf(f) = sup(S = { 0 - f(x) | x in dom f})$
But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:
$f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$
As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.
$endgroup$
add a comment |
$begingroup$
I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.
The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.
Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.
Here's the Legendre transformation:
$f^*(y)=sup(yx - f(x))$
The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.
$f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.
Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.
The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.
To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.
The Wikipedia Article on Legendre transformations also has some more good information, such as:
If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.
So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.
It's not too hard to imagine why that particular trick only works for convex functions.
PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:
$f^*(f'(x))=x*f'(x)-f(x)$
In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.
$endgroup$
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f1874482%2fgeometric-intuition-of-conjugate-function%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.
$endgroup$
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
add a comment |
$begingroup$
I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.
$endgroup$
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
add a comment |
$begingroup$
I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.
$endgroup$
I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.
answered Aug 1 '16 at 9:00
Lorenzo StellaLorenzo Stella
582311
582311
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
add a comment |
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
Great recommendation.
$endgroup$
– Michael Grant
Aug 1 '16 at 14:12
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
@Lorenzo Stella I appreciate your answer but good lord, these notes are horrible! It was asked for a "geometric and intuitive explanation" of the convex conjugate, not a 340 pages long book filled with theorems and proofs.
$endgroup$
– rsp1984
Feb 1 at 23:40
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
$begingroup$
"Horrible" seems a bit exaggerated to me. There's no need to go through all 340 pages: reading page 7 is sufficient.
$endgroup$
– Lorenzo Stella
Feb 3 at 14:16
add a comment |
$begingroup$
You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
Assume:
$x in mathbb{R}^n$
, $x_{n+1}= f(x) in mathbb{R}$
I denote the axis that measures the output of $f$ as $x_{n+1}$.
Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:
$ inf(f) = sup(S = { 0 - f(x) | x in dom f})$
But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:
$f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$
As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.
$endgroup$
add a comment |
$begingroup$
You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
Assume:
$x in mathbb{R}^n$
, $x_{n+1}= f(x) in mathbb{R}$
I denote the axis that measures the output of $f$ as $x_{n+1}$.
Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:
$ inf(f) = sup(S = { 0 - f(x) | x in dom f})$
But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:
$f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$
As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.
$endgroup$
add a comment |
$begingroup$
You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
Assume:
$x in mathbb{R}^n$
, $x_{n+1}= f(x) in mathbb{R}$
I denote the axis that measures the output of $f$ as $x_{n+1}$.
Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:
$ inf(f) = sup(S = { 0 - f(x) | x in dom f})$
But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:
$f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$
As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.
$endgroup$
You can think of conjugate function as an extended infimum function. For $y=0$,$ f^*(y)$ is simply the infimum of $f$. Here I try to explain what it means geometrically.
Assume:
$x in mathbb{R}^n$
, $x_{n+1}= f(x) in mathbb{R}$
I denote the axis that measures the output of $f$ as $x_{n+1}$.
Note that we always measure the infimum of a function according to the line $x_{n+1}=0$. Think of finding the infimum of $f$ as finding the supremum of a set $S$:
$ inf(f) = sup(S = { 0 - f(x) | x in dom f})$
But what if we wanted to measure the infimum of $f$ according to an inclined line that passes through the origin, say, $y^Tx=0$? This is where we can use conjugate function. It takes the slope of the line as its input, and outputs $f$'s infimum according to the line $y^Tx=0$. So its formula should make sense now. It looks like the case above:
$f^*(y) = sup(S = {y^Tx - f(x)|xin dom f})$
As an example of its usage, we sometimes need to measure this extended infimum function in Lagrange dual function for finding a lower bound on the optimal value of an optimization problem.
edited Mar 6 at 12:27
answered Apr 6 '18 at 21:10


NKSHELLNKSHELL
414
414
add a comment |
add a comment |
$begingroup$
I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.
The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.
Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.
Here's the Legendre transformation:
$f^*(y)=sup(yx - f(x))$
The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.
$f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.
Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.
The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.
To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.
The Wikipedia Article on Legendre transformations also has some more good information, such as:
If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.
So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.
It's not too hard to imagine why that particular trick only works for convex functions.
PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:
$f^*(f'(x))=x*f'(x)-f(x)$
In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.
$endgroup$
add a comment |
$begingroup$
I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.
The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.
Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.
Here's the Legendre transformation:
$f^*(y)=sup(yx - f(x))$
The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.
$f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.
Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.
The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.
To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.
The Wikipedia Article on Legendre transformations also has some more good information, such as:
If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.
So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.
It's not too hard to imagine why that particular trick only works for convex functions.
PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:
$f^*(f'(x))=x*f'(x)-f(x)$
In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.
$endgroup$
add a comment |
$begingroup$
I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.
The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.
Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.
Here's the Legendre transformation:
$f^*(y)=sup(yx - f(x))$
The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.
$f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.
Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.
The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.
To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.
The Wikipedia Article on Legendre transformations also has some more good information, such as:
If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.
So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.
It's not too hard to imagine why that particular trick only works for convex functions.
PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:
$f^*(f'(x))=x*f'(x)-f(x)$
In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.
$endgroup$
I'm going to attempt a very basic and intuitively understandable explanation. Of course this will be grossly oversimplifying things but I understand it's what was asked for.
The point of the convex conjugate is to represent a function $f$ as a set of tangent hyperplanes. The parameters of all the tangent hyperplanes are encoded in the convex conjugate function $f^*$.
Let's make things easier to understand by covering the 1D case. In that case our $f^*$ is called the Legendre transformation and our hyperplanes become simple lines. The generalization to multidimensional functionals is then relatively straightforward.
Here's the Legendre transformation:
$f^*(y)=sup(yx - f(x))$
The domain of $f^*$ is slope values, the co-domain is y-offsets (as in a simple line equation $ax + b$). Hence $f^*$ encodes a set of lines. First we're going to evaluate $f^*$ at a single point $y$.
$f^*$ at a point $y$ is the largest difference value between $f$ and a line through the origin with slope $y$.
Note that this value might be still negative. In plain English, it's the degree to which $yx$ "surpasses" $f(x)$ in value. If your function $f$ goes "above" $yx$ (for example $f(x) = x^2+1$ and $y=0.5$) then the value $f^*(y)$ will be negative. If your function goes "below" $yx$ your value will be positive.
The "surpass value" is important because in the convex case it is one parameter of a tangent line to $f$: the y-offset.
To imagine $f^*(y)$ not just for a single value $y$ but for the entire domain, picture $f(x)$ and a line through the origin that's rotating around the origin, like a propeller. For each incremental rotation we plot the largest difference value between $f$ and the line into a new coordinate system (where $y$ may go along the x-axis and $f^*(y)$ may go along the y-axis). The resulting graph then shows the whole Legendre transformation of $f$.
The Wikipedia Article on Legendre transformations also has some more good information, such as:
If the convex function $f$ is defined on the whole line and is everywhere differentiable, then $f^*(y)$ can be interpreted as the negative of the y-intercept of the tangent line to the graph of f that has slope $y$.
So $f^*$ can be seen as a map where the input is a slope and the output is a y-offset in the coordinate system of $f$. Thus $f^*$ holds the information how much I need to offset a line with a particular slope such that it becomes a tangent to $f$. The set of all the lines offset in y by their respective amounts covers the area at and below $f$.
It's not too hard to imagine why that particular trick only works for convex functions.
PS: If $f$ is convex there is a much easier definition of the Legendre transformation which avoids the whole supremum thing:
$f^*(f'(x))=x*f'(x)-f(x)$
In plain English: For $x$ We compute $f'(x)$ and define $f^*$ at $f'(x)$ as the negative y intercept of the tangent line.
edited Mar 14 at 13:05
answered Feb 2 at 16:13
rsp1984rsp1984
1315
1315
add a comment |
add a comment |
Thanks for contributing an answer to Mathematics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f1874482%2fgeometric-intuition-of-conjugate-function%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
You can improve this post by asking a more specific question.
$endgroup$
– Pedro Tamaroff♦
Jul 30 '16 at 3:22
$begingroup$
Yes, maybe change the question to "Geometric intuition of conjugate function" or something similar.
$endgroup$
– Lorenzo Stella
Aug 1 '16 at 9:02