A measure of similarity of real vectors independent of their dimension














I am trying to find a measure of similarity between two vectors that works for any pair of vectors $v, w \in \mathbb{R}^n$ (for any $n$).



For example:



$v_1=(1,2,4)$, $v_2=(-2,4,4)$ $\;\rightarrow\;$ $\mathrm{sim}(v_1,v_2) \in \mathbb{R}$



$v_1'=(0,0,2,0,3)$, $v_2'=(2,4,6,1,2)$ $\;\rightarrow\;$ $\mathrm{sim}(v_1',v_2') \in \mathbb{R}$



I want to be able to compare the results $\mathrm{sim}(v_1,v_2)$ and $\mathrm{sim}(v_1',v_2')$, so that for any pairs $(v_1,v_2)$ and $(v_1',v_2')$ I can tell which pair is more "similar".



Obviously I tried the standard Euclidean distance (the norm of $v_1 - v_2$). But I found that it does not really work when comparing a distance in $\mathbb{R}^2$ with a distance in $\mathbb{R}^5$: it penalizes the component-wise differences less as the dimension grows (see the example below).



I am wondering if there is any alternative.



**Clarification on why I don't like the standard Euclidean distance**



Pair 1) $v_1 = (0)$, $v_2 = (1)$ $\;\rightarrow\;$ $|v_1 - v_2| = 1$



Pair 2) $v_1' = (0,0)$, $v_2' = (1,1)$ $\;\rightarrow\;$ $|v_1' - v_2'| = \sqrt{2} \approx 1.41$



Pair 3) $v_1'' = (0,0,0)$, $v_2'' = (1,1,1)$ $\;\rightarrow\;$ $|v_1'' - v_2''| = \sqrt{3} \approx 1.73$
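
For concreteness, these numbers can be reproduced with a short Python sketch (shown only as an illustration); the `rms_distance` helper, which divides by $\sqrt{n}$, is merely one guess at what a "properly scaled" comparison might look like, not a claim that it is the right measure:

```python
import math

def euclidean(v, w):
    """Standard Euclidean distance |v - w|."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

def rms_distance(v, w):
    """Hypothetical scaling: |v - w| / sqrt(n).

    Only one guess at a dimension-independent comparison,
    not an established answer to the question."""
    return euclidean(v, w) / math.sqrt(len(v))

pairs = [((0,), (1,)),
         ((0, 0), (1, 1)),
         ((0, 0, 0), (1, 1, 1))]

for v, w in pairs:
    print(len(v), round(euclidean(v, w), 2), round(rms_distance(v, w), 2))
# n=1: 1.0  and 1.0
# n=2: 1.41 and 1.0
# n=3: 1.73 and 1.0   -> the raw Euclidean distance grows like sqrt(n)
```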



Which pair is more "alike"? I am not sure the Euclidean distance is an appropriate measure here: I think each pair is as different as two vectors in its respective space can be, so the Euclidean distance does not seem to be "scaled" properly across dimensions.



Any ideas on how to compare?










Tags: vectors, norm

asked Jan 9 at 14:50 by Jeremías Rodríguez, edited Jan 9 at 18:32












  • You appear to be familiar with TeX formatting. It works here, too. Just surround the mathematics with dollar signs $ as you would in a normal TeX / LaTeX document.
    – Xander Henderson, Jan 9 at 14:57










  • I'm not sure what you mean about the Euclidean norm. Example: $v_1 = (2, 4)$, $v_2 = (5, 3)$. $w = v_1 - v_2 = (-3, 1)$ has squared length $10$. Put these in 5-space, and you get $v_1 = (2, 4, 0, 0, 0)$, $v_2 = (5, 3, 0, 0, 0)$, and $w = v_1 - v_2 = (-3, 1, 0, 0, 0)$, which also has squared length $10$.
    – John Hughes, Jan 9 at 14:58










  • Expanding on what @JohnHughes says: the Euclidean distance between two points in $n$-space is the Euclidean distance between them in the plane (or possibly line) they span. How is that sensitive to dimension? Perhaps edit the question to tell us more about where it comes from and just why Euclidean distance does not serve your needs.
    – Ethan Bolker, Jan 9 at 15:02












  • To do what Xander suggested, you can click on the word "edit" just below your question.
    – John Hughes, Jan 9 at 15:23










  • I edited the post to be clearer! Thanks for the comments. Let me know if the problem is easier to understand now.
    – Jeremías Rodríguez, Jan 9 at 18:26
















1 Answer

A standard sort-of solution is the "cosine similarity" (although this is usually defined for unit vectors): You compute the angle between the two vectors, thus:
$$
d(v_1, v_2) = \cos^{-1}\frac{v_1 \cdot v_2}{\|v_1\|\,\|v_2\|}
$$



If $v_1, v_2$ are unit vectors, then you can skip dividing by the lengths, of course. The downside? If $v_1, v_2$ point in the same direction, but have different lengths, this "distance" still returns the value $0$.



The upside? If $v_1, v_2 \in \Bbb R^2 \subset \Bbb R^5$ and you compute the distance, you get the same answer whether you think of them as being in $\Bbb R^2$ or in $\Bbb R^5$.
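
To make the formula concrete, here is a minimal Python sketch of the computation described above (an illustration, not part of the original answer; the name `angle_distance` is chosen arbitrarily). It also demonstrates the downside and the upside just mentioned:

```python
import math

def angle_distance(v, w):
    """d(v, w) = arccos( v.w / (|v| |w|) ), the angle in radians."""
    dot = sum(a * b for a, b in zip(v, w))
    nv = math.sqrt(sum(a * a for a in v))
    nw = math.sqrt(sum(b * b for b in w))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, dot / (nv * nw))))

# The two vectors from the question:
print(angle_distance((1, 2, 4), (-2, 4, 4)))             # some angle in [0, pi]

# Downside: same direction, different lengths -> "distance" 0.
print(angle_distance((1, 1), (3, 3)))                    # 0.0

# Upside: padding with zeros (R^2 viewed inside R^5) changes nothing.
print(angle_distance((1, 2), (3, 1)))
print(angle_distance((1, 2, 0, 0, 0), (3, 1, 0, 0, 0)))  # same value as above
```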






– John Hughes (community wiki, 2 revs), edited Jan 9 at 15:23












