A measure of similarity of real vectors independent of their dimension
$begingroup$
I am trying to find a measure of similarity between two vectors that works for any pair of vectors $v, w \in \mathbb{R}^n$ (for any $n$).
For example:
$v_1=(1,2,4)$, $v_2=(-2,4,4)$ -> $\operatorname{sim}(v_1,v_2) \in \mathbb{R}$
$v_1'=(0,0,2,0,3)$, $v_2'=(2,4,6,1,2)$ -> $\operatorname{sim}(v_1',v_2') \in \mathbb{R}$
I want to be able to compare the results $\operatorname{sim}(v_1,v_2)$ and $\operatorname{sim}(v_1',v_2')$, so that for any two pairs $(v_1,v_2)$ and $(v_1',v_2')$ I can tell which pair is more "similar".
Naturally, I first tried the Euclidean distance, i.e. the standard norm of the difference. But I found that the results are not comparable when one distance is measured in $\mathbb{R}^2$ and the other in $\mathbb{R}^5$. It seems to penalise component-wise differences less as the dimension grows (see the example below).
I am wondering if there is any alternative.
**Clarification on why I don't like the standard norm of the Euclidean distance**
PAIR 1) $v_1 = (0)$, $v_2 = (1)$ ---> $|v_1 - v_2| = 1$
PAIR 2) $v_1' = (0,0)$, $v_2' = (1,1)$ ---> $|v_1' - v_2'| = \sqrt{2} \approx 1.41$
PAIR 3) $v_1'' = (0,0,0)$, $v_2'' = (1,1,1)$ ---> $|v_1'' - v_2''| = \sqrt{3} \approx 1.73$
Which pair is more "alike"? I am not sure the Euclidean distance is an appropriate metric here. I think the vectors in each pair are as different as two vectors in their respective spaces can be, yet the three distances differ. It seems to me that the Euclidean distance is not "scaled" properly across dimensions.
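To make the three numbers above easy to reproduce, here is a minimal Python sketch. The helper name `euclidean_distance` is only an illustrative choice, not something from the question. It computes the distance between the all-zeros and all-ones vectors for $n = 1, 2, 3$, which comes out as $\sqrt{n}$.

```python
import math


def euclidean_distance(v, w):
    """Euclidean norm of the difference v - w (vectors of equal length)."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))


# Distance between (0, ..., 0) and (1, ..., 1) in dimensions 1, 2 and 3.
distances = {n: euclidean_distance([0] * n, [1] * n) for n in (1, 2, 3)}

for n, d in distances.items():
    print(n, round(d, 2))
# prints:
# 1 1.0
# 2 1.41
# 3 1.73
```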
Any ideas on how to compare?
vectors norm
$endgroup$
$begingroup$
You appear to be familiar with TeX formatting. It works here, too. Just surround the mathematics with dollar signs (\$) as you would in a normal TeX / LaTeX document.
$endgroup$
– Xander Henderson
Jan 9 at 14:57
$begingroup$
I'm not sure what you mean about the Euclidean norm. Example: $v_1 = (2, 4)$, $v_2 = (5, 3)$. Then $w = v_1 - v_2 = (-3, 1)$ has squared length $10$. Put these in 5-space, and you get $v_1 = (2, 4, 0, 0, 0)$, $v_2 = (5, 3, 0, 0, 0)$. Then $w = v_1 - v_2 = (-3, 1, 0, 0, 0)$ also has squared length $10$.
$endgroup$
– John Hughes
Jan 9 at 14:58
$begingroup$
Expanding on what @JohnHughes says: the Euclidean distance between two points in $n$-space is the Euclidean distance between them in the plane (or possibly line) they span. How is that sensitive to dimension? Perhaps edit the question to tell us more about where it comes from and just why Euclidean distance does not serve your needs.
$endgroup$
– Ethan Bolker
Jan 9 at 15:02
$begingroup$
To do what Xander suggested, you can click on the word "edit" just below your question.
$endgroup$
– John Hughes
Jan 9 at 15:23
$begingroup$
I edited the post to be more clear! Thanks for the comments. Let me know if now the problem is easier to understand.
$endgroup$
– Jeremías Rodríguez
Jan 9 at 18:26
asked Jan 9 at 14:50, edited Jan 9 at 18:32
Jeremías Rodríguez
1 Answer
$begingroup$
A standard sort-of solution is the "cosine similarity" (although this is usually defined for unit vectors): You compute the angle between the two vectors, thus:
$$
d(v_1, v_2) = \cos^{-1} \frac{v_1 \cdot v_2}{\|v_1\| \, \|v_2\|}
$$
If $v_1, v_2$ are unit vectors, then you can skip dividing by the lengths, of course. The downside? If $v_1, v_2$ point in the same direction, but have different lengths, this "distance" still returns the value $0$.
The upside? If $v_1, v_2 \in \Bbb R^2 \subset \Bbb R^5$, and you compute the distance, you get the same answer whether you think of them as being in $\Bbb R^2$ or in $\Bbb R^5$.
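As a rough illustration of the formula above, here is a minimal Python sketch using only the standard library. The function name `angle_between` and the clamping step are my own assumptions, not part of the answer. It evaluates the angle for the two pairs from the question, checks that padding with zeros leaves the result unchanged, and shows the downside for vectors that point in the same direction.

```python
import math


def angle_between(v, w):
    """Angle in radians between two nonzero vectors of equal length."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    if norm_v == 0 or norm_w == 0:
        raise ValueError("the angle is undefined for the zero vector")
    # Clamp to [-1, 1] so floating-point rounding cannot push acos out of range.
    cos_theta = max(-1.0, min(1.0, dot / (norm_v * norm_w)))
    return math.acos(cos_theta)


# The two pairs from the question, living in R^3 and R^5.
pair_3d = angle_between((1, 2, 4), (-2, 4, 4))
pair_5d = angle_between((0, 0, 2, 0, 3), (2, 4, 6, 1, 2))

# The same two vectors viewed in R^2 and, padded with zeros, in R^5.
in_plane = angle_between((2, 4), (5, 3))
padded = angle_between((2, 4, 0, 0, 0), (5, 3, 0, 0, 0))

# Same direction, different lengths: the angle is (numerically) zero.
same_direction = angle_between((1, 1), (3, 3))

print(pair_3d < pair_5d)                 # the R^3 pair has the smaller angle
print(math.isclose(in_plane, padded))    # padding with zeros changes nothing
```

Because every value is an angle in $[0, \pi]$, results from different dimensions can be compared directly. A smaller angle means the pair is more similar in direction, but length differences are ignored.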
$endgroup$
edited Jan 9 at 15:23
community wiki (2 revs), John Hughes