Compute the gradient of $f(x)=\|\text{diag}(x)\|$ with the chain rule














Consider the function $f:\mathbb{R}^n\to\mathbb{R}$ given by $f(x)=\|\text{diag}(x)\|$, where $\text{diag}(x)\in\mathbb{R}^{n\times n}$ is the diagonal matrix with diagonal entries $x_1,x_2,\dots,x_n$, and $\|\cdot\|$ is the spectral norm (matrix 2-norm).



Since the spectral norm of a matrix is its largest singular value, and the singular values of a (square) diagonal matrix are the absolute values of the diagonal entries, we see that $f(x)=\|x\|_\infty$, where $\|\cdot\|_\infty$ is the (vector) sup-norm. In this form, it is easier to deduce the properties of $f$--in particular, it is differentiable at any point $x\in\mathbb{R}^n$ where the largest element of $x$ (in absolute value) is unique. At such a point, the gradient of $f$ is given by
$$
\nabla{f(x)}=\text{sgn}(x_k)e_k,
$$

where $k$ is the index of the (unique) largest entry of $x$ (in absolute value), $e_k$ is the $k^\text{th}$ standard basis vector in $\mathbb{R}^n$, and $\text{sgn}(\cdot)$ is the sign function.
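As a numerical sanity check on this formula (a sketch of my own in NumPy, not part of the original question; the test vector is arbitrary), the closed-form gradient can be compared against central finite differences:

```python
import numpy as np

def f(x):
    # f(x) = ||diag(x)||, spectral norm (largest singular value)
    return np.linalg.norm(np.diag(x), ord=2)

def grad_f_closed_form(x):
    # sgn(x_k) e_k, where k is the index of the largest entry in absolute value
    k = np.argmax(np.abs(x))
    g = np.zeros_like(x)
    g[k] = np.sign(x[k])
    return g

def grad_f_numeric(x, eps=1e-6):
    # central finite differences, entry by entry
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.array([0.3, -2.0, 1.1])   # unique largest-magnitude entry at index 1
print(grad_f_closed_form(x))     # [ 0. -1.  0.]
print(grad_f_numeric(x))         # ≈ [ 0. -1.  0.]
```

The check only makes sense at points where the largest-magnitude entry is unique, since $f$ is not differentiable elsewhere.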



I want to deduce the above expression for the gradient using the chain rule applied to $f(x)=(g\circ h)(x)$, where $g:\mathbb{R}^{n\times n}\to\mathbb{R}$ is given by $g(A)=\|A\|$, and $h:\mathbb{R}^n\to\mathbb{R}^{n\times n}$ is given by $h(x)=\text{diag}(x)$.



The "Jacobian" of $h$ is a three-dimensional object, where
$$
\frac{\partial[h(x)]_{ij}}{\partial x_k}=\begin{cases}1,&i=j=k,\\0,&\text{else.}\end{cases}
$$
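Concretely (a sketch of my own, not from the post), stacking the matrix entries columnwise turns this three-dimensional object into an ordinary $n^2\times n$ Jacobian with a single $1$ per column:

```python
import numpy as np

def jacobian_of_diag(n):
    # J[p, k] = d vec(diag(x))_p / d x_k, stacking columns so that entry
    # (i, j) of the matrix gets flat index p = j*n + i.  The derivative is
    # 1 exactly when i = j = k, i.e. at flat index k*n + k in column k.
    J = np.zeros((n * n, n))
    for k in range(n):
        J[k * n + k, k] = 1.0
    return J

J = jacobian_of_diag(3)
print(J.shape)   # (9, 3)
print(J[:, 0])   # derivative of vec(diag(x)) with respect to x_0
```

The choice of flattening order (column-major here) is arbitrary, as long as the same convention is used for both Jacobians.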

The function $g$ is differentiable at any point $A$ where $A$ has a unique largest singular value, in which case the gradient(?) is given by
$$
\nabla{g(A)}=uv^\text{T},
$$

where $u$ and $v$ are the left and right singular vectors (respectively) corresponding to the (unique) largest singular value of $A$.
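This formula can also be checked numerically (my own sketch; the matrix $A$ below is just an example with a unique largest singular value). Note that $uv^\text{T}$ is well defined despite the sign ambiguity of singular vectors, since $(u,v)$ and $(-u,-v)$ give the same outer product:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, -2.0]])

# claimed gradient of the spectral norm: outer product of the top singular pair
U, s, Vt = np.linalg.svd(A)
u, v = U[:, 0], Vt[0, :]
G = np.outer(u, v)

# central finite differences, entry by entry
eps = 1e-6
G_num = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        E = np.zeros_like(A)
        E[i, j] = eps
        G_num[i, j] = (np.linalg.norm(A + E, 2) - np.linalg.norm(A - E, 2)) / (2 * eps)

print(np.max(np.abs(G - G_num)))  # small
```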



So I essentially have a three-dimensional object and a 2-dimensional object, and I want to apply the chain rule to get the gradient, a 1-dimensional object (i.e. a vector). A straightforward application suggests "multiplying them together" (not sure that concept is even defined), which seems like it would produce a matrix. What simple thing am I missing here?










      derivatives matrix-calculus chain-rule






      asked Jan 22 at 1:23









David M.























          1 Answer

          A hint rather than a complete answer



It might help to think of $\Bbb R^{n \times n}$ as $\Bbb R^{n^2}$.



Now $h$ is a function from $\Bbb R^n$ to $\Bbb R^{n^2}$ and $g$ is a function from
$\Bbb R^{n^2}$ to $\Bbb R$, so the composite goes from $\Bbb R^n$ to $\Bbb R$. Can you work out the $k$th partial derivative of this?



          When you do, you may find out that the result you get is surprisingly closely related to matrix multiplication (of some sort) of the derivatives of $h$ and $g$...or you may not.



Just to get you started, $$\nabla g(A)_{ni + j} = u_i v_j,$$
where I'm working with indices that go from $0$ to $n-1$ here.



Can you now work out the $ni + j$th entry of the $k$th partial derivative of $g \circ h$?
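Carrying out the hint in code (a sketch of my own, using a row-major flattening so that entry $(i,j)$ gets flat index $ni+j$ as in the answer): once both derivatives are flattened, the chain rule is an ordinary matrix product.

```python
import numpy as np

def gradient_via_chain_rule(x):
    n = len(x)
    # Jacobian of h(x) = diag(x): d[h(x)]_{ij}/dx_k = 1 iff i = j = k,
    # so column k of J_h has a single 1 at flat index n*k + k.
    J_h = np.zeros((n * n, n))
    for k in range(n):
        J_h[n * k + k, k] = 1.0
    # Jacobian of g(A) = ||A|| at A = diag(x): flatten u v^T into a 1 x n^2 row
    U, s, Vt = np.linalg.svd(np.diag(x))
    J_g = np.outer(U[:, 0], Vt[0, :]).reshape(1, n * n)  # row-major flatten
    # chain rule: J_{g∘h} = J_g @ J_h is 1 x n; the gradient is its transpose
    return (J_g @ J_h).ravel()

x = np.array([0.3, -2.0, 1.1])
print(gradient_via_chain_rule(x))  # [ 0. -1.  0.]
```

This agrees with the closed form $\nabla f(x)=\text{sgn}(x_k)e_k$ at any point whose largest-magnitude entry is unique.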






• Thought I might have to resort to this--was being lazy and trying to avoid "index counting" from stacking matrices into vectors. Indeed this works!
  – David M., Jan 22 at 18:53










• And when you were done, did you discover that there was a matrix multiplication hidden in there? (I'm asking seriously...I didn't bother doing it myself, and wonder what the result was...)
  – John Hughes, Jan 22 at 19:24










• Yes it works out following the conventions of the usual multi-dimensional chain rule. The resulting gradient vector is $J_h^\text{T}J_g^\text{T}$, where $J_g\in\mathbb{R}^{1\times n^2}$ is the outer product of $u$ and $v$ stacked columnwise, and $J_h\in\mathbb{R}^{n^2\times n}$, where each row of $J_h$ is the gradient of a component of the matrix $\text{diag}(x)$ stacked columnwise (lots of words, but basically what you would expect)
  – David M., Jan 22 at 19:27













          answered Jan 22 at 1:40









John Hughes
