Proximal gradient method justification

If $f$ and $g$ are respectively a differentiable function and a convex, lower semi-continuous function, then the algorithm defined by:



$$ x^{k+1} = \text{prox}_{\gamma g}\left[x^{k} - \gamma \nabla f(x^{k})\right] $$



converges to $\text{argmin}\,[f+g]$.



This is justified by the fact that if $x^{*}$ is a minimizer of $f+g$, then:
$$ x^{*} = \text{prox}_{\gamma g}\left[x^{*} - \gamma \nabla f(x^{*})\right] $$



But I do not understand this relation. Why is it true?



That is, why does $x^{*} = \text{argmin}\,[f+g] \;\Leftrightarrow\; x^{*} = \text{prox}_{\gamma g}\left[x^{*} - \gamma \nabla f(x^{*})\right]$ hold?
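To make the update concrete, here is a minimal sketch of one iteration in Python, assuming purely for illustration that $f(x) = \tfrac12\|Ax - b\|^2$ and $g(x) = \lambda\|x\|_1$, so that $\text{prox}_{\gamma g}$ is coordinatewise soft-thresholding (the function names below are illustrative, not from any particular library):

```python
import numpy as np

def soft_threshold(v, tau):
    # prox of tau * ||.||_1: shrink each coordinate toward zero by tau
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_grad_step(x, A, b, lam, gamma):
    # One update x^{k+1} = prox_{gamma*g}(x^k - gamma * grad f(x^k))
    grad_f = A.T @ (A @ x - b)   # gradient of f(x) = 0.5*||Ax - b||^2
    return soft_threshold(x - gamma * grad_f, gamma * lam)
```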










      optimization convex-analysis gradient-descent semicontinuous-functions






          2 Answers

I will show below that if $x^* = \text{prox}_{\gamma g}[x^{*} - \gamma \nabla f(x^{*})]$, then $x^* \in \text{argmin}\,\{f(x)+g(x)\}$.

Plugging in the definition of the proximal operator, we have
$$\text{prox}_{\gamma g}[x^{*} - \gamma \nabla f(x^{*})] = \text{argmin}_{x} \left\{\gamma g(x) + \frac{1}{2}\big\|x - \big(x^* - \gamma \nabla f(x^*)\big)\big\|^2\right\}.$$
Since $x^* = \text{prox}_{\gamma g}[x^{*} - \gamma \nabla f(x^{*})]$, i.e. $x^*$ minimizes the objective on the right, Fermat's rule gives
$$0 \in \partial(\gamma g)(x) + \big(x - (x^* - \gamma \nabla f(x^*))\big) \quad \text{at } x = x^*.$$
Substituting $x = x^*$, we get
$$0 \in \gamma\,\partial g(x^*) + \gamma\,\nabla f(x^*).$$
This is equivalent to saying that $0 \in \partial F(x^*)$, where $F(x) = f(x) + g(x)$, so $x^*$ is a minimizer. The other direction of the proof is very similar.

Note: the last step, $0 \in \partial F(x^*)$, uses the sum rule $\partial F = \nabla f + \partial g$, which need not always hold (check the qualification conditions for subdifferentials).
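As a sanity check, the fixed-point relation can be verified numerically. The sketch below assumes, purely for illustration, a lasso-type problem with $f(x) = \tfrac12\|Ax - b\|^2$ and $g(x) = \lambda\|x\|_1$ (so $\text{prox}_{\gamma g}$ is soft-thresholding) and a step size $\gamma = 1/L$ with $L = \|A\|_2^2$:

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 10)), rng.standard_normal(40)
lam = 0.1
gamma = 1.0 / np.linalg.norm(A, 2) ** 2   # step 1/L, L = Lipschitz constant of grad f

# Run the proximal gradient iteration long enough to reach (approximately) x*.
x = np.zeros(10)
for _ in range(5000):
    x = soft_threshold(x - gamma * A.T @ (A @ x - b), gamma * lam)

# A minimizer should be a fixed point: one more step leaves x essentially unchanged.
x_next = soft_threshold(x - gamma * A.T @ (A @ x - b), gamma * lam)
print(np.linalg.norm(x - x_next))   # ~0, up to round-off and remaining iteration error
```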







Here's an explanation which assumes that we already understand that the prox-operator of $g$ with parameter $t > 0$ is the operator $(I + t\,\partial g)^{-1}$, where $\partial g$ is the subdifferential of $g$.

I'll assume that $f$ is convex as well as differentiable, and that $g$ is convex and closed. Let $t > 0$. A point $x$ is a minimizer of $f + g$ if and only if
\begin{align}
&0 \in \nabla f(x) + \partial g(x) \\
\iff &x \in x + t\,\nabla f(x) + t\,\partial g(x) \\
\iff &x - t\,\nabla f(x) \in (I + t\,\partial g)(x) \\
\iff &x = (I + t\,\partial g)^{-1}\big(x - t\,\nabla f(x)\big).
\end{align}

The final equation is another way of saying that
$$
x = \text{prox}_{tg}\big(x - t\,\nabla f(x)\big).
$$

We can then solve this equation by fixed-point iteration, which yields the proximal gradient method.

By the way, if this derivation doesn't seem intuitive, there are other, more direct ways to discover the proximal gradient method. The viewpoint given here has the advantage of showing that the proximal gradient method is a fixed-point iteration, which helps with convergence proofs.
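In code, the fixed-point viewpoint amounts to repeatedly applying $T(x) = \text{prox}_{tg}(x - t\,\nabla f(x))$ until $x \approx T(x)$. A minimal sketch, with the gradient and prox supplied as callables (the names are illustrative, not a particular library's API):

```python
import numpy as np

def proximal_gradient(grad_f, prox_g, x0, t, tol=1e-10, max_iter=10_000):
    # Fixed-point iteration for T(x) = prox_{t*g}(x - t * grad_f(x)).
    # prox_g(v, t) should return prox_{t*g}(v).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = prox_g(x - t * grad_f(x), t)   # one application of T
        if np.linalg.norm(x_new - x) <= tol:   # x ~ T(x): approximate minimizer of f + g
            return x_new
        x = x_new
    return x
```

The stopping test is exactly the fixed-point condition from the derivation: once $x = T(x)$ (to tolerance), $x$ satisfies the optimality condition $0 \in \nabla f(x) + \partial g(x)$.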





