How to use ELMo word embeddings with the original pre-trained model (5.5B) in interactive mode

I am trying to learn how to use ELMo embeddings via this tutorial:



https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md



I am specifically trying to use the interactive mode, as described here:



$ ipython
> from allennlp.commands.elmo import ElmoEmbedder
> elmo = ElmoEmbedder()
> tokens = ["I", "ate", "an", "apple", "for", "breakfast"]
> vectors = elmo.embed_sentence(tokens)

> assert(len(vectors) == 3)  # one for each layer in the ELMo output
> assert(len(vectors[0]) == len(tokens))  # the vector elements correspond with the input tokens

> import scipy.spatial.distance  # note: a plain "import scipy" would not expose scipy.spatial.distance
> vectors2 = elmo.embed_sentence(["I", "ate", "a", "carrot", "for", "breakfast"])
> scipy.spatial.distance.cosine(vectors[2][3], vectors2[2][3])  # cosine distance between "apple" and "carrot" in the last layer
0.18020617961883545


My overall question: how do I make sure I am using the ELMo model pre-trained on the original 5.5B dataset (described here: https://allennlp.org/elmo)?



I don't quite understand why we have to call "assert" or why we use the [2][3] indexing on the vector output.



My ultimate purpose is to average all the word embeddings in order to get a sentence embedding, so I want to make sure I do it right!



Thanks for your patience, as I am pretty new to all this.

Tags: python machine-learning nlp artificial-intelligence

asked Jan 2 at 2:29 by somethingstrang

1 Answer

By default, ElmoEmbedder uses the "Original" weights and options, pretrained on the 1 Billion Word Benchmark (about 800 million tokens). To make sure you are using the largest (5.5B) model, look at the arguments of the ElmoEmbedder class; from there you can see that you can set the options and weights of the model explicitly:



elmo = ElmoEmbedder(
    options_file='https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_options.json',
    weight_file='https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5'
)


          I got these links from the pretrained models table provided by AllenNLP.





assert is a convenient way to test that a variable has the value you expect. For example, the first assert statement ensures the embedding has three output matrices.
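For intuition, here is a minimal, ELMo-free illustration of assert semantics (plain Python; the variable is hypothetical):

    x = [1, 2, 3]
    assert len(x) == 3    # condition holds, so execution continues silently
    # assert len(x) == 4  # condition fails, so this line would raise AssertionError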





Going off of that, we index with [i][j] because the model outputs three layer matrices (we pick the i-th layer), and each matrix holds n token vectors (we pick the j-th token), each of length 1024. Notice how the code compares the similarity of "apple" and "carrot": both are the 4th token, i.e. index j=3. From the example documentation, i selects one of:




          The first layer corresponds to the context insensitive token
          representation, followed by the two LSTM layers. See the ELMo paper or
          follow up work at EMNLP 2018 for a description of what types of
          information is captured in each layer.




          The paper provides the details on those two LSTM layers.
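To make the shapes concrete, here is a short sketch, assuming the elmo object constructed above (embed_sentence returns a numpy array of shape (3, num_tokens, 1024)):

    vectors = elmo.embed_sentence(["I", "ate", "an", "apple", "for", "breakfast"])
    print(vectors.shape)    # (3, 6, 1024): 3 layers, 6 tokens, 1024 dimensions each
    apple = vectors[2][3]   # i=2 -> last LSTM layer, j=3 -> 4th token, "apple"
    print(apple.shape)      # (1024,)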





Lastly, if you have a set of sentences, you don't need to feed ELMo one sentence at a time. The model builds token representations from characters and works perfectly fine on whole tokenized sentences; use one of the methods designed for sets of sentences: embed_sentences(), embed_batch(), etc. More in the code!
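That said, if you do want one fixed-size vector per sentence (your averaging plan), here is a minimal sketch along those lines, assuming the elmo object above; averaging the top layer over tokens is one common pooling choice, not the only one:

    sentences = [
        ["I", "ate", "an", "apple", "for", "breakfast"],
        ["I", "ate", "a", "carrot", "for", "breakfast"],
    ]
    # embed_sentences yields one (3, num_tokens, 1024) array per sentence
    sentence_vectors = [
        emb[2].mean(axis=0)   # mean of the top layer over tokens -> shape (1024,)
        for emb in elmo.embed_sentences(sentences)
    ]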

answered Jan 3 at 22:13 by Alex L

• Does embed_sentences() do straightforward vector averaging? – somethingstrang, Jan 4 at 15:03