How to use ELMo word embeddings with the original pre-trained model (5.5B) in interactive mode
I am trying to learn how to use ELMo embeddings via this tutorial:
https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
I am specifically trying to use the interactive mode, as described like this:

$ ipython
> from allennlp.commands.elmo import ElmoEmbedder
> elmo = ElmoEmbedder()
> tokens = ["I", "ate", "an", "apple", "for", "breakfast"]
> vectors = elmo.embed_sentence(tokens)
> assert(len(vectors) == 3)  # one for each layer in the ELMo output
> assert(len(vectors[0]) == len(tokens))  # the vector elements correspond with the input tokens
> import scipy
> vectors2 = elmo.embed_sentence(["I", "ate", "a", "carrot", "for", "breakfast"])
> scipy.spatial.distance.cosine(vectors[2][3], vectors2[2][3])  # cosine distance between "apple" and "carrot" in the last layer
0.18020617961883545
My overall question is: how do I make sure I am using the pre-trained ELMo model trained on the original 5.5B dataset (described here: https://allennlp.org/elmo)?
I also don't quite understand why we have to call "assert", or why we use the [2][3] indexing on the vector output.
My ultimate purpose is to average all the word embeddings in order to get a sentence embedding, so I want to make sure I do it right!
Thanks for your patience as I am pretty new in all this.
python machine-learning nlp artificial-intelligence
asked Jan 2 at 2:29 by somethingstrang
1 Answer
By default, ElmoEmbedder uses the "Original" weights and options, pretrained on the 1 Billion Word Benchmark (about 800 million tokens). To make sure you are using the largest (5.5B) model, look at the arguments of the ElmoEmbedder class; from there you can see that you can set the options and weights of the model explicitly:
elmo = ElmoEmbedder(
options_file='https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_options.json',
weight_file='https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway_5.5B/elmo_2x4096_512_2048cnn_2xhighway_5.5B_weights.hdf5'
)
I got these links from the pretrained models table provided by AllenNLP.
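As a quick sanity check that the larger model loaded, you can inspect the output dimensions. This is a minimal sketch, assuming both files downloaded successfully; embed_sentence returns a NumPy array of shape (3, n_tokens, 1024) for this model:

vectors = elmo.embed_sentence(["Hello", "world"])
print(vectors.shape)  # expect (3, 2, 1024): 3 layers, 2 tokens, 1024 dimensions each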
assert is a convenient way to test and enforce expected values of variables: if the condition is false, Python raises an AssertionError. For example, the first assert statement ensures the embedding has three output matrices, one per layer.
Going off of that, we index with [i][j] because the model outputs 3 layer matrices (we choose the i-th), and each matrix contains n token vectors (we choose the j-th), each of length 1024. Notice how the code compares the similarity of "apple" and "carrot", both of which are the 4th token, at index j=3. Per the example documentation, the layer index i corresponds to:
The first layer corresponds to the context insensitive token representation, followed by the two LSTM layers. See the ELMo paper or follow up work at EMNLP 2018 for a description of what types of information is captured in each layer.
The paper provides the details on those two LSTM layers.
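To make the indexing concrete, here is a minimal sketch (reusing the elmo embedder from above) that compares "apple" and "carrot" at every layer, not just the top one:

from scipy.spatial.distance import cosine

apple = elmo.embed_sentence(["I", "ate", "an", "apple", "for", "breakfast"])
carrot = elmo.embed_sentence(["I", "ate", "a", "carrot", "for", "breakfast"])
# layer i=0 is the character-based (context-insensitive) layer; i=1, 2 are the LSTM layers
for i in range(3):
    # token j=3 is "apple" in one sentence and "carrot" in the other
    print(i, cosine(apple[i][3], carrot[i][3]))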
Lastly, if you have a set of sentences, with ELMo you don't need to average token vectors just to handle them: the model builds its token representations from characters (a character CNN feeding two bidirectional LSTM layers), so it works fine on tokenized whole sentences. Use one of the methods designed for sets of sentences, such as embed_sentences() or embed_batch(). More details are in the code.
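That said, since your stated goal is a sentence embedding, one common recipe (an assumption about what you want, not the only option) is to embed a batch of sentences and then mean-pool the token vectors of the top layer:

sentences = [["I", "ate", "an", "apple"], ["I", "ate", "a", "carrot"]]
# embed_sentences yields one (3, n_tokens, 1024) NumPy array per sentence
sentence_vecs = [emb[2].mean(axis=0) for emb in elmo.embed_sentences(sentences)]
print(sentence_vecs[0].shape)  # (1024,)

Note that embed_sentences() itself does no averaging; it returns per-token embeddings, so any pooling is up to you.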
answered Jan 3 at 22:13 by Alex L

Does embed_sentences() do straightforward vector averaging? – somethingstrang Jan 4 at 15:03