Show that $x, y \to -\log(\operatorname{sigmoid}(x) - \operatorname{sigmoid}(y))$ is convex for $x > y$
I've managed to essentially brute-force the problem by calculating the Hessian of the function and showing that its determinant and trace are non-negative.
This was done by using a change of variables to reduce the problem to showing that two polynomials are non-negative over a subset of $[0,1]^2$: proving that each is non-negative in a neighborhood of its zeros, and numerically checking that it is positive away from them.
This solution feels too messy to me, so I was wondering whether there is a cleaner approach. (I'm aware we could use Sylvester's criterion to simplify the numerical step, but I'd like to avoid that as well if possible.)
For reference, up to the positive factor $1/(s-t)^2$, the Hessian is
$$H(x,y) = \begin{bmatrix}
-s(1-s)(1-2s)(s-t) + s^2(1-s)^2 & -s(1-s)t(1-t) \\
-s(1-s)t(1-t) & t(1-t)(1-2t)(s-t) + t^2(1-t)^2
\end{bmatrix},$$
where $s=\operatorname{sigmoid}(x)$ and $t=\operatorname{sigmoid}(y)$.
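As a quick sanity check of the matrix above, the following sketch compares it, divided by $(s-t)^2$, against a central finite-difference Hessian of $-\log(\operatorname{sigmoid}(x) - \operatorname{sigmoid}(y))$. The function names, the step size `h`, and the sample points are illustrative choices, not part of the original question.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def f(x, y):
    # Objective from the question; only defined for x > y.
    return -math.log(sigmoid(x) - sigmoid(y))

def scaled_hessian(x, y):
    # Closed-form matrix H(x, y) from the question, written in s and t.
    s, t = sigmoid(x), sigmoid(y)
    h11 = -s * (1 - s) * (1 - 2 * s) * (s - t) + (s * (1 - s)) ** 2
    h12 = -s * (1 - s) * t * (1 - t)
    h22 = t * (1 - t) * (1 - 2 * t) * (s - t) + (t * (1 - t)) ** 2
    return h11, h12, h22

def fd_hessian(x, y, h=1e-4):
    # Central second differences of f (step size h is an assumption).
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

def max_relative_error(points):
    # Largest mismatch between H / (s - t)^2 and the finite-difference Hessian.
    worst = 0.0
    for x, y in points:
        scale = (sigmoid(x) - sigmoid(y)) ** 2
        exact = [v / scale for v in scaled_hessian(x, y)]
        approx = fd_hessian(x, y)
        for e, a in zip(exact, approx):
            worst = max(worst, abs(e - a) / max(1.0, abs(e)))
    return worst

points = [(1.0, 0.0), (0.5, -0.5), (2.0, -1.0), (0.3, -2.0)]
print(max_relative_error(points))
```

Since $(s-t)^2 > 0$ on the domain, the scaling does not affect positive semidefiniteness, which is why the polynomial matrix alone is enough for the convexity argument.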
convex-analysis
asked Jan 21 at 17:49 by Kitegi
edited Jan 21 at 20:09 by Kitegi
Do you have a particular sigmoid function? [I ask because the difference of sigmoids might be negative, in which case you can't take the log. Maybe you need the absolute value of the difference...]
– coffeemath, Jan 21 at 17:58
@coffeemath $\operatorname{sigmoid}(x) = \frac{1}{1+\exp(-x)}$ in this case.
– Kitegi, Jan 21 at 18:12
Does that mean you impose one of $x<y$, $y<x$ to ensure that the input to the log is positive?
– coffeemath, Jan 21 at 18:14
@coffeemath Yes, the domain is the set of $(x,y) \in \mathbb R^2$ such that $x > y$.
– Kitegi, Jan 21 at 18:18
2 Answers
Assume $0 \leq t < s \leq 1$.
Consider $f(s,t)=-s(1-s)(1-2s)(s-t) + s^2(1-s)^2 = s(1-s)(s^2+t-2st)$.
Each factor is nonnegative (the infimum over $s$ of the third factor is attained at $s=t$, where it equals $t(1-t) \geq 0$), so the (1,1) entry of the Hessian is nonnegative. Analogously, the (2,2) entry is nonnegative, so the trace is nonnegative.
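One way to see both signs without taking an infimum is to complete the square. The following is a sketch of that step, using only $s$ and $t$ as defined in the question; the second line expands the (2,2) entry in the same way as the (1,1) entry:
$$\begin{align}
s^2 + t - 2st &= (s-t)^2 + t(1-t) \geq 0, \\
t(1-t)(1-2t)(s-t) + t^2(1-t)^2 &= t(1-t)\left(t^2 + s - 2st\right) = t(1-t)\left((s-t)^2 + s(1-s)\right) \geq 0.
\end{align}$$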
The determinant is
$$\begin{align}g(s,t) &= -s(1-s)(1-2s)t(1-t)(1-2t)(s-t)^2 \\
& \qquad + s^2(1-s)^2t(1-t)(1-2t)(s-t) - t^2(1-t)^2s(1-s)(1-2s)(s-t) \\
&= -s(1-s)t(1-t)(s-t)^2(2st-s-t).
\end{align}$$
For the first expression I wrote the determinant as $(a+b)(c+d)-H_{12}^2$, where $a+b$ and $c+d$ are the two diagonal entries split into their two terms, and noticed that $bd=H_{12}^2$. Then I used this tool to simplify the expression (click more forms to see the one I copied). Now $s(1-s) \geq 0$, $t(1-t) \geq 0$, $(s-t)^2 \geq 0$, so for the determinant to be nonnegative, it remains to prove that $2st-s-t\leq0$. We have:
$$\sup_{s,t}\{2st-s-t\} = \sup_t \sup_s\{ 2st-s-t \}= \sup_t \begin{cases}t-1 & \text{if } 2t-1\geq 0 \\ -2t(1-t) & \text{otherwise.}\end{cases}$$
When $2t-1\geq 0$, the derivative with respect to $s$ is nonnegative, so the supremum is attained at the largest possible value of $s$ (which is $s=1$). Conversely, in the second branch you plug in the smallest possible value ($s=t$). Both branches are nonpositive.
Et voilà!
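The factorizations above can be spot-checked in exact rational arithmetic. The sketch below evaluates both sides on a grid of fractions with $0 \leq t < s \leq 1$; the grid size `n` and the helper names are illustrative assumptions, and a grid check is a sanity test rather than a proof.

```python
from fractions import Fraction

def hessian_entries(s, t):
    # Entries of the (scaled) Hessian from the question.
    h11 = -s * (1 - s) * (1 - 2 * s) * (s - t) + s**2 * (1 - s) ** 2
    h12 = -s * (1 - s) * t * (1 - t)
    h22 = t * (1 - t) * (1 - 2 * t) * (s - t) + t**2 * (1 - t) ** 2
    return h11, h12, h22

def check_grid(n=20):
    """Check the identities at every grid point with 0 <= t < s <= 1.

    Returns the number of points checked; raises AssertionError on failure.
    """
    checked = 0
    for i in range(n + 1):
        for j in range(i):  # j < i guarantees t < s
            s, t = Fraction(i, n), Fraction(j, n)
            h11, h12, h22 = hessian_entries(s, t)
            det = h11 * h22 - h12**2
            # Factorization of the (1,1) entry.
            assert h11 == s * (1 - s) * (s**2 + t - 2 * s * t)
            # Factorization of the determinant.
            assert det == -s * (1 - s) * t * (1 - t) * (s - t) ** 2 * (2 * s * t - s - t)
            # Sign conditions for positive semidefiniteness of a 2x2 matrix.
            assert h11 >= 0 and h22 >= 0 and det >= 0
            checked += 1
    return checked

print(check_grid())
```

Using `Fraction` avoids floating-point round-off, so the equality assertions compare the two polynomial expressions exactly at each grid point.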
answered Jan 24 at 22:20 by LinAlg
Well, that was anticlimactic, but I have no complaints. Note that you can simplify the last part by writing $2st-s-t = -((1-t)s + (1-s)t)$, which is clearly nonpositive.
The simple fact I was overlooking was that instead of trying to show that the trace is nonnegative, I could just handle the diagonal terms separately, since their nonnegativity is also a necessary condition for the matrix to be positive semidefinite ($H_{i,i} = e_i^\top H e_i \geq 0$).
– Kitegi, Jan 24 at 22:54
A possible approach is to use the fact that $\log \det X$ is concave for $X$ positive definite. For a proof of this statement see Boyd & Vandenberghe, page 74.
Set $$ X = \begin{pmatrix} e^x & e^y \\ (1 + e^y)^{-1} & (1 + e^x)^{-1}\end{pmatrix}$$ so that $\det X = \text{sigmoid}(x) - \text{sigmoid}(y)$, and substitute $a = e^x$ and $b=e^y$. The characteristic polynomial is quadratic, and it is a straightforward calculation to show that both eigenvalues of $X$ are positive if $\frac{a}{1+a} > \frac{b}{1+b}$.
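The two computational claims here, the determinant identity and the positivity of the eigenvalues, can be checked numerically. The sketch below does only that, using the closed-form roots of the quadratic characteristic polynomial; the helper names and sample points are illustrative, and the check says nothing by itself about convexity in $(x, y)$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matrix_x(x, y):
    # Entries of X in row-major order: X11, X12, X21, X22.
    a, b = math.exp(x), math.exp(y)
    return a, b, 1.0 / (1.0 + b), 1.0 / (1.0 + a)

def det_and_eigenvalues(x, y):
    x11, x12, x21, x22 = matrix_x(x, y)
    det = x11 * x22 - x12 * x21
    tr = x11 + x22
    # Discriminant of lambda^2 - tr*lambda + det, rewritten as
    # (x11 - x22)^2 + 4*x12*x21, which is positive because all entries are.
    disc = (x11 - x22) ** 2 + 4 * x12 * x21
    root = math.sqrt(disc)
    return det, ((tr - root) / 2, (tr + root) / 2)

samples = [(1.0, 0.0), (0.5, -0.5), (3.0, 2.5)]
for x, y in samples:
    det, (lo, hi) = det_and_eigenvalues(x, y)
    print(x, y, det, sigmoid(x) - sigmoid(y), lo, hi)
```

Because the discriminant is positive, the eigenvalues are real, and with positive trace and positive determinant (for $x > y$) both must be positive.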
answered Jan 24 at 23:05 by g g
edited Jan 25 at 22:27
An elegant solution, but I prefer the other answer since I wanted something more elementary in this case.
– Kitegi, Jan 24 at 23:17