Meaningful State Representations with Autoencoders & Embedding Layers

As I understand it, embedding layers are simply lookup matrices whose weights are learned during optimisation.
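To make the lookup-matrix view concrete, here is a minimal NumPy sketch (not Keras itself; the sizes and token ids are made-up values for illustration). It shows that picking rows out of a matrix gives the same result as multiplying a one-hot encoding by that matrix:

```python
import numpy as np

# Hypothetical sizes, for illustration only.
vocab_size, embed_dim = 99, 4

rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))  # stands in for the learnable weights

token_ids = np.array([3, 17, 42])  # one short "sentence" of word indices

# An embedding layer is a row lookup ...
looked_up = embedding_matrix[token_ids]  # shape (3, embed_dim)

# ... which is equivalent to one-hot encoding followed by a matrix product.
one_hot = np.eye(vocab_size)[token_ids]  # shape (3, vocab_size)
via_matmul = one_hot @ embedding_matrix

print(looked_up.shape)                     # (3, 4)
print(np.allclose(looked_up, via_matmul))  # True
```

Because the lookup is just a matrix product in disguise, gradients flow into the selected rows like into any other weight.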

Suppose, for this example, my dataset contains a single categorical variable. For example, I would like to autoencode a sentence of words to itself, to learn a sentence representation.

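# example model (sequence length, embedding size and LSTM units are placeholder values)
import tensorflow as tf

seq_len, embed_dim, units = 10, 16, 32

inputs = tf.keras.layers.Input(shape=(seq_len,))  # integer word indices
embed = tf.keras.layers.Embedding(input_dim=99, output_dim=embed_dim)(inputs)  # lookup over 99 words
encoder = tf.keras.layers.LSTM(units)(embed)  # sentence representation
repeated = tf.keras.layers.RepeatVector(seq_len)(encoder)  # feed it to every decoder step
decoder = tf.keras.layers.LSTM(embed_dim, return_sequences=True)(repeated)  # same shape as embed
model = tf.keras.models.Model(inputs, decoder)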

The loss would minimise the difference between the embed and decoder outputs.

However, since the embeddings are themselves learned under the same objective, I think I will end up learning trivial representations. For example, the embedding matrix could map every word to a vector of ones (or even zeros) and the autoencoder could simply return ones, giving me 100% accuracy in training.
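The concern can be checked numerically. The NumPy sketch below is an illustration with made-up sizes, not the Keras model above. It compares two objectives under a collapsed all-ones embedding: mean squared error against the embedding output, and cross-entropy against the original word indices. It assumes, for the second case, that a decoder which cannot tell the words apart falls back to a uniform distribution over the vocabulary:

```python
import numpy as np

vocab_size, embed_dim = 99, 4          # hypothetical sizes
token_ids = np.array([3, 17, 42, 8])   # made-up sentence

# Collapsed embedding: every word maps to the same vector of ones.
collapsed = np.ones((vocab_size, embed_dim))
target_vectors = collapsed[token_ids]  # all ones, shape (4, embed_dim)

# Objective A: reconstruct the embedding vectors themselves.
decoder_vectors = np.ones_like(target_vectors)  # decoder that always outputs ones
mse = np.mean((target_vectors - decoder_vectors) ** 2)

# Objective B: predict the word indices.
# Assumption: the decoder outputs a uniform distribution at every position.
probs = np.full((len(token_ids), vocab_size), 1.0 / vocab_size)
picked = probs[np.arange(len(token_ids)), token_ids]
cross_entropy = -np.mean(np.log(picked))

print(mse)                                            # 0.0
print(np.isclose(cross_entropy, np.log(vocab_size)))  # True
```

Under objective A the collapsed solution reaches zero loss, which matches the worry above. Under objective B the same collapse leaves a loss of log(99), so the optimiser has a reason to keep the word vectors distinguishable.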

What I would like to do is to learn a meaningful representation of categorical variables.
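One possible direction, sketched here under assumptions rather than as a confirmed solution, is to keep the same encoder but have the decoder predict the original word indices with a softmax, so that the target no longer depends on the learned embedding. All sizes, the random training data and the variable names below are hypothetical:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only to keep the sketch small.
vocab_size, seq_len, embed_dim, latent_dim = 99, 10, 16, 32

inputs = tf.keras.layers.Input(shape=(seq_len,), dtype="int32")
embed = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)(inputs)

# Encoder: compress the whole sentence into one vector.
sentence_vector = tf.keras.layers.LSTM(latent_dim)(embed)

# Decoder: repeat the sentence vector for every time step and unroll it.
repeated = tf.keras.layers.RepeatVector(seq_len)(sentence_vector)
decoded = tf.keras.layers.LSTM(latent_dim, return_sequences=True)(repeated)

# Predict a distribution over the vocabulary at each position.
outputs = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(vocab_size, activation="softmax")
)(decoded)

model = tf.keras.models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random integer "sentences" stand in for real data; the input is its own target.
x = np.random.randint(0, vocab_size, size=(64, seq_len))
model.fit(x, x, epochs=1, batch_size=16, verbose=0)

# The encoder half can be reused to extract sentence representations.
encoder = tf.keras.models.Model(inputs, sentence_vector)
print(encoder.predict(x[:2], verbose=0).shape)  # (2, 32)
```

With fixed integer targets, an embedding that maps every word to the same vector can no longer reach zero loss, because the decoder would then have nothing to distinguish the words by.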

edited Jan 1 at 16:48

asked Jan 1 at 15:06

user10430178

856

python tensorflow keras
