Second-order approximation of log det X
I'm trying to follow the derivation of the second-order approximation of $\log\det X$ in Boyd's *Convex Optimization*, p. 658 (http://www.stanford.edu/~boyd/cvxbook/). How is the last step derived? That is, where does the trace expression come from?

linear-algebra convex-analysis
edited Jan 8 at 19:32 by Glorfindel
asked Nov 29 '12 at 5:07 by Yaroslav Bulatov
This should be just the trace of log[x]. A relevant question is here: math.stackexchange.com/questions/38701/…
– Bombyx mori, Nov 29 '12 at 6:23
1 Answer
Short answer: The trace gives the scalar product on the space of matrices: $\langle X,Y\rangle = \mathrm{tr}(X^\top Y)$. Since you're working with symmetric matrices, you can forget the transposition: $\langle X,Y\rangle = \mathrm{tr}(XY)$.
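As a quick sanity check (a NumPy sketch of my own, not from the book): the trace pairing is just the entrywise dot product of the matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))

# <X, Y> = tr(X^T Y) equals the sum of entrywise products X_ij * Y_ij.
assert np.isclose(np.trace(X.T @ Y), np.sum(X * Y))
```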
Long answer, with all the gory details: Given a function $f:\mathrm S_n^{++}\to\mathbf R$, the link between the gradient $\nabla_Xf$ of the function $f$ at $X$ (which is a vector) and its differential $d_Xf$ at $X$ (which is a linear form) is that for any $U$ in the ambient space $\mathrm M_n$,
$$ d_Xf(U) = \langle \nabla_Xf, U \rangle. $$
For your function $f$, since you know the gradient ($\nabla_Xf = X^{-1}$), you can write the differential:
$$ d_Xf(U) = \langle X^{-1}, U \rangle = \mathrm{tr}(X^{-1}U). $$
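You can verify this identity numerically with a central finite difference (again my own NumPy sketch, not part of the original derivation):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
X = A @ A.T + 5 * np.eye(5)        # a symmetric positive definite point
U = rng.standard_normal((5, 5))
U = (U + U.T) / 2                  # a symmetric direction

f = lambda M: np.linalg.slogdet(M)[1]   # log det, computed stably

t = 1e-5
fd = (f(X + t * U) - f(X - t * U)) / (2 * t)   # central finite difference
exact = np.trace(np.linalg.solve(X, U))        # tr(X^{-1} U)
assert np.isclose(fd, exact, rtol=1e-6)
```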
What about the second-order differential? Well, it's the differential of the differential. Let's take it slow. The differential of $f$ is the function $df:\mathrm S_n^{++}\to\mathrm L(\mathrm M_n,\mathbf R)$, defined by $df(X) = V\mapsto \mathrm{tr}(X^{-1}V)$. To find the differential of $df$ at $X$, we look at $df(X+\Delta X)$, and take the part that varies linearly in $\Delta X$. Since $df(X+\Delta X)$ is a function $\mathrm M_n\to\mathbf R$, if we hope to ever understand anything we should apply it to some matrix $V$:
$$ df(X+\Delta X)(V) = \mathrm{tr}\left[ (X+\Delta X)^{-1} V \right], $$
and use the first-order approximation $(X+\Delta X)^{-1} \simeq X^{-1} - X^{-1}(\Delta X)X^{-1}$ from the passage you cited:
\begin{align*}
df(X+\Delta X)(V)
&\simeq \mathrm{tr}\left[ \left(X^{-1} - X^{-1}(\Delta X)X^{-1}\right) V \right]\\
&= \mathrm{tr}(X^{-1}V) - \mathrm{tr}(X^{-1}(\Delta X)X^{-1}V)\\
&= df(X)(V) - \mathrm{tr}(X^{-1}(\Delta X)X^{-1}V).
\end{align*}
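(Aside: a quick NumPy check, not from the book, that this expansion of the inverse really has second-order error:)

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4 * np.eye(4)        # positive definite, hence invertible
Xinv = np.linalg.inv(X)

for eps in [1e-2, 1e-3, 1e-4]:
    dX = eps * rng.standard_normal((4, 4))
    exact = np.linalg.inv(X + dX)
    approx = Xinv - Xinv @ dX @ Xinv             # first-order expansion
    print(eps, np.linalg.norm(exact - approx))   # shrinks like eps**2
```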
Back to the derivation: the part that varies linearly in $\Delta X$ is the $-\mathrm{tr}(\cdots)$ term. So the differential of $df$ at $X$ is the bilinear map $d^2_Xf \in \mathrm L(\mathrm M_n, \mathrm L(\mathrm M_n,\mathbf R))$ defined by
$$ d^2_Xf(U)(V) = -\mathrm{tr}(X^{-1}UX^{-1}V). $$
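Putting the pieces together gives the second-order approximation $\log\det(X+tU) \simeq \log\det X + t\,\mathrm{tr}(X^{-1}U) - \tfrac{t^2}{2}\,\mathrm{tr}\big((X^{-1}U)^2\big)$, which you can also check numerically (my own NumPy sketch; the error should shrink like $t^3$):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
X = A @ A.T + 5 * np.eye(5)
U = rng.standard_normal((5, 5))
U = (U + U.T) / 2

f = lambda M: np.linalg.slogdet(M)[1]   # log det
Z = np.linalg.solve(X, U)               # X^{-1} U

for t in [1e-1, 1e-2, 1e-3]:
    taylor = f(X) + t * np.trace(Z) - 0.5 * t**2 * np.trace(Z @ Z)
    print(t, abs(f(X + t * U) - taylor))    # error is O(t**3)
```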
answered Nov 29 '12 at 8:19 by jathd
It's better to consider $d^2f_X$ as a symmetric bilinear form: $d^2f_X(U,V) = -\mathrm{tr}(X^{-1}UX^{-1}V)$. When you write the Taylor formula, you use $d^2f_X(U,U) = -\mathrm{tr}((X^{-1}U)^2)$.
– loup blanc, Jan 9 at 14:52