What is Vector Offset?











I am reading a paper on computational linguistics, and I am having trouble with some basic terms it uses. I searched the internet, but I could not find satisfactory definitions for them.



Here is the Paper link



Reading the abstract, these terms look fundamental to the paper:




  • Vector Offset

  • Baseline

  • A consistent vector offset?


This is the abstract:




The offset method for solving word analogies
has become a standard evaluation tool
for vector-space semantic models: it is
considered desirable for a space to represent
semantic relations as consistent vector
offsets. We show that the method’s reliance
on cosine similarity conflates offset
consistency with largely irrelevant neighborhood
structure, and propose simple
baselines that should be used to improve
the utility of the method in vector space
evaluation.




I do not understand exactly what these terms mean. And what exactly does it mean for a vector offset to be consistent?



Thank you in advance.










      linear-algebra vector-spaces vectors computational-algebra






asked yesterday by horotat






















          1 Answer


















          Vector offset




Let vector $a$ correspond to the word "debug", and vector $a^\star$ to the word "debugging". Their difference, $a^\star - a$, is the vector offset that corresponds to the linguistic relationship the "-ing" suffix denotes.




          A consistent vector offset




If the vector offset is consistent, then if vector $b$ corresponds to the word "scream", the vector $b + (a^\star - a) = b + a^\star - a$ is likely to correspond to the word "screaming". Similarly for any other word that has a related word with the same linguistic "-ing" relationship.



If the vector offset is not consistent, then the vector $b^\star$ corresponding to the word "screaming" is more likely to differ: $b^\star \ne b + (a^\star - a)$.
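To make this concrete, here is a tiny NumPy sketch. The words are from the example above, but the 2-D vectors are made up purely for illustration; real embeddings are learned and have hundreds of dimensions.

```python
import numpy as np

# Hypothetical 2-D "word vectors" (illustrative values only).
a      = np.array([1.0, 0.0])  # "debug"
a_star = np.array([1.0, 1.0])  # "debugging"
b      = np.array([3.0, 1.0])  # "scream"

# The vector offset encoding the "-ing" relationship:
offset = a_star - a            # [0, 1]

# If offsets are consistent, b + offset should land on (or near)
# the vector for "screaming":
b_star_estimate = b + offset
print(b_star_estimate)         # [3. 2.]
```

With a consistent offset, the actual vector for "screaming" would lie at (or very close to) `b_star_estimate`; with an inconsistent one, it could be anywhere.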




          Baseline




It is their name for a set of alternative methods, or helper functions, for finding words that have a specific linguistic relationship to a known word (not relying on the properties of the vector difference between them). They call them "baselines" because they use them as standards against which to compare the results of the vector offset method.



The paper lists six baselines: "vanilla", "add", "only-b", "ignore-a", "add-opposite", and "multiply". The "vanilla" one is the direct vector offset method:
$$x^\star = \operatorname*{argmax}_{x^\prime} \frac{x^\prime \cdot ( a^\star - a + x )}{\left\lVert x^\prime \right\rVert \; \left\lVert a^\star - a + x \right\rVert}$$
i.e., $x^\star$ is the one among the candidates $x^\prime$ that maximizes the cosine of the angle between $x^\prime$ and the vector estimated from the known $x$ using the vector offset ($a^\star - a$, which corresponds to the linguistic relationship between the known word $x$ and the word $x^\star$ we are looking for). The five others are variations.
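A minimal NumPy sketch of this argmax, assuming a small hypothetical vocabulary matrix with one word vector per row (the words and numbers are invented for illustration):

```python
import numpy as np

def vanilla_offset(a, a_star, x, vocab):
    """Return the index of the vocabulary row with the highest cosine
    similarity to the estimated vector a_star - a + x."""
    y = a_star - a + x
    # Cosine similarity of every candidate row with y:
    sims = vocab @ y / (np.linalg.norm(vocab, axis=1) * np.linalg.norm(y))
    return int(np.argmax(sims))

# Tiny hypothetical vocabulary (rows are word vectors):
vocab = np.array([
    [1.0, 0.0],  # 0: "debug"      (a)
    [1.0, 1.0],  # 1: "debugging"  (a*)
    [3.0, 1.0],  # 2: "scream"     (x)
    [3.0, 2.0],  # 3: "screaming"  (the x* we hope to find)
])
best = vanilla_offset(vocab[0], vocab[1], vocab[2], vocab)
print(best)  # → 3, the "screaming" vector
```

Note this sketch, like the "vanilla" formula, ranks every vocabulary word, including the input words themselves; whether the inputs are excluded from the candidates is exactly the kind of detail in which the variants differ.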



The two reverse ones ("reverse (add)" and "reverse (multiply)") are the same as "add" and "multiply", respectively, except that the expression is used to find $x$ when $x^\star$ is known instead.





          It might be a little clearer to write the baselines using better variable names.



For example, let's say $y = x + a^\star - a$ is the estimated vector for the word we are looking for. It is related to $x$ the same way $a^\star$ is related to $a$, assuming the vectors are defined in such a way that vector offsets are consistent. If the $w_i$ are all the known word vectors, then
$$x^\star = \operatorname*{argmax}_{w_i} \frac{w_i \cdot y}{\left\lVert w_i \right\rVert \; \left\lVert y \right\rVert}$$
i.e., $x^\star$ is the $w_i$ with the largest cosine similarity to our estimated vector $y$.



          The other baselines just change how the right side is calculated.
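Several of the variants can then be seen as ranking candidates against a different estimated vector $y$. The sketch below illustrates that idea; the particular formulas for "only-b", "ignore-a", and "add-opposite" are my illustrative guesses in this notation, not quotations from the paper, and the toy vectors are invented.

```python
import numpy as np

def nearest_by_cosine(vocab, y):
    """Index of the vocabulary row with the largest cosine similarity to y."""
    sims = vocab @ y / (np.linalg.norm(vocab, axis=1) * np.linalg.norm(y))
    return int(np.argmax(sims))

# Each variant ranks candidates against a different estimated vector.
# Illustrative guesses at the formulas, not quoted from the paper:
methods = {
    "vanilla":      lambda a, a_star, x: a_star - a + x,
    "only-b":       lambda a, a_star, x: x,                 # ignore the offset entirely
    "ignore-a":     lambda a, a_star, x: a_star + x,        # drop the "- a" term
    "add-opposite": lambda a, a_star, x: x - (a_star - a),  # flip the offset
}

# Tiny hypothetical vocabulary (rows are word vectors):
vocab = np.array([
    [1.0, 0.0],  # 0: "debug"      (a)
    [1.0, 1.0],  # 1: "debugging"  (a*)
    [3.0, 1.0],  # 2: "scream"     (x)
    [3.0, 2.0],  # 3: "screaming"
])
a, a_star, x = vocab[0], vocab[1], vocab[2]
results = {name: nearest_by_cosine(vocab, est(a, a_star, x))
           for name, est in methods.items()}
print(results)
```

On this toy data, "only-b" simply retrieves $x$ itself (index 2), which hints at why the paper insists on comparing the offset method against such degenerate baselines: a method is only interesting if it beats them.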






answered yesterday by Nominal Animal (edited yesterday)






























                     
