How to use keras embedding layer with 3D tensor input?

I am having difficulty using the Keras Embedding layer with one-hot encoded input data.

The following toy code reproduces the problem.

Import packages

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.embeddings import Embedding
from keras.optimizers import Adam
import matplotlib.pyplot as plt
import numpy as np
import openpyxl
import pandas as pd
from keras.callbacks import ModelCheckpoint
from keras.callbacks import ReduceLROnPlateau

The input data is text-based, as follows.

Train and Test data

X_train_orignal = np.array(['OC(=O)C1=C(Cl)C=CC=C1Cl', 'OC(=O)C1=C(Cl)C=C(Cl)C=C1Cl',
                            'OC(=O)C1=CC=CC(=C1Cl)Cl', 'OC(=O)C1=CC(=CC=C1Cl)Cl',
                            'OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=O'])

X_test_orignal = np.array(['OC(=O)C1=CC=C(Cl)C=C1Cl', 'CCOC(N)=O',
                           'OC1=C(Cl)C(=C(Cl)C=C1Cl)Cl'])

Y_train = np.array([[2.33],
                    [2.59],
                    [2.59],
                    [2.54],
                    [4.06]])

Y_test = np.array([[2.20],
                   [2.81],
                   [2.00]])

Creating dictionaries

Now I create two dictionaries: one mapping characters to indices and one mapping indices back to characters. The number of unique characters is len(charset), and the maximum string length plus 5 additional positions is stored in embed. Each string is prefixed with ! as a start marker and padded at the end with E.

charset = set("".join(list(X_train_orignal)) + "!E")
char_to_int = dict((c, i) for i, c in enumerate(charset))
int_to_char = dict((i, c) for i, c in enumerate(charset))
embed = max([len(smile) for smile in X_train_orignal]) + 5
print(str(charset))
print(len(charset), embed)
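One thing worth noting about the dictionaries: a Python set of strings has no guaranteed iteration order between runs, so the character indices can change from one run to the next (which matters once weights are saved and reloaded). A minimal sketch of a reproducible mapping, assuming it is acceptable to sort the characters first; `charset_sorted` is a name introduced here for illustration only:

```python
# Same training strings as in the question.
smiles = [
    'OC(=O)C1=C(Cl)C=CC=C1Cl',
    'OC(=O)C1=C(Cl)C=C(Cl)C=C1Cl',
    'OC(=O)C1=CC=CC(=C1Cl)Cl',
    'OC(=O)C1=CC(=CC=C1Cl)Cl',
    'OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=O',
]

# sorted() fixes the order, so every run assigns the same index to each character.
charset_sorted = sorted(set("".join(smiles) + "!E"))
char_to_int = {c: i for i, c in enumerate(charset_sorted)}
int_to_char = {i: c for c, i in char_to_int.items()}

print(len(charset_sorted), char_to_int["!"], char_to_int["E"])  # 14 0 8
```

The rest of the code can stay the same, since it only goes through char_to_int and int_to_char.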

One hot encoding

I convert all the data into a one-hot encoding as follows.

def vectorize(smiles):
    one_hot = np.zeros((smiles.shape[0], embed, len(charset)), dtype=np.int8)
    for i, smile in enumerate(smiles):
        # encode the start char
        one_hot[i, 0, char_to_int["!"]] = 1
        # encode the rest of the chars
        for j, c in enumerate(smile):
            one_hot[i, j + 1, char_to_int[c]] = 1
        # encode the end char (padding)
        one_hot[i, len(smile) + 1:, char_to_int["E"]] = 1
    # drop the last position, so the sequence length is embed - 1
    return one_hot[:, 0:-1, :]

X_train = vectorize(X_train_orignal)
print(X_train.shape)
X_test = vectorize(X_test_orignal)
print(X_test.shape)

After one-hot encoding, the shape of the data is (5, 44, 14) for train and (3, 44, 14) for test. For train, there are 5 examples, 44 is the sequence length (the maximum length, including padding), and 14 is the number of unique characters. Examples with fewer characters are padded with E up to the maximum length.

Verifying the correct padding
The following code verifies that the padding was done correctly.

mol_str_train = []
mol_str_test = []
for x in range(5):
    mol_str_train.append("".join([int_to_char[idx] for idx in np.argmax(X_train[x, :, :], axis=1)]))

for x in range(3):
    mol_str_test.append("".join([int_to_char[idx] for idx in np.argmax(X_test[x, :, :], axis=1)]))

Let's see what the train set looks like.

mol_str_train

['!OC(=O)C1=C(Cl)C=CC=C1ClEEEEEEEEEEEEEEEEEEEE',
 '!OC(=O)C1=C(Cl)C=C(Cl)C=C1ClEEEEEEEEEEEEEEEE',
 '!OC(=O)C1=CC=CC(=C1Cl)ClEEEEEEEEEEEEEEEEEEEE',
 '!OC(=O)C1=CC(=CC=C1Cl)ClEEEEEEEEEEEEEEEEEEEE',
 '!OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=OEEE']

Now it is time to build the model.

Model

model = Sequential()
model.add(Embedding(len(charset), 10, input_length=embed))
model.add(Flatten())
model.add(Dense(1, activation='linear'))

def coeff_determination(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return (1 - SS_res / (SS_tot + K.epsilon()))

def get_lr_metric(optimizer):
    def lr(y_true, y_pred):
        return optimizer.lr
    return lr

optimizer = Adam(lr=0.00025)
lr_metric = get_lr_metric(optimizer)
model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])

callbacks_list = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto', cooldown=0),
    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]

history = model.fit(x=X_train, y=Y_train,
                    batch_size=1,
                    epochs=10,
                    validation_data=(X_test, Y_test),
                    callbacks=callbacks_list)

Error

ValueError: Error when checking input: expected embedding_3_input to have 2 dimensions, but got array with shape (5, 44, 14)

The embedding layer expects a two-dimensional array. How can I deal with this issue so that it accepts the one-hot encoded data?

All of the code above is runnable.

edited Nov 21 '18 at 23:15

Noman Dilawar

78931142

asked Nov 21 '18 at 9:01

Abdul Karim Khan

323111

python machine-learning keras nlp word2vec


2 Answers

Your input shape was not defined properly in the embedding layer. The following code works for me: instead of adding steps to convert your data to 2D, you can pass the 3D input directly to the embedding layer by specifying input_shape.

# THE MISSING STUFF
# _________________________________________
Y_train = Y_train.reshape(5)  # the Dense layer has a single unit, so the targets are passed as a 1-D array
max_len = len(charset)
max_features = embed - 1
inputshape = (max_features, max_len)  # the input shape was not defined; with input_shape the Embedding layer accepts the 3D input
# __________________________________________

model = Sequential()
# model.add(Embedding(len(charset), 10, input_length=14))

model.add(Embedding(max_features, 10, input_shape=inputshape))  # instead of input_length=max_len
model.add(Flatten())
model.add(Dense(1, activation='linear'))
print(model.summary())

optimizer = Adam(lr=0.00025)
lr_metric = get_lr_metric(optimizer)
model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])

callbacks_list = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto', cooldown=0),
    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]

history = model.fit(x=X_train, y=Y_train,
                    batch_size=10,
                    epochs=10,
                    validation_data=(X_test, Y_test),
                    callbacks=callbacks_list)
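To see what the lookup computes when it is fed the one-hot array directly, here is a NumPy sketch. It assumes the Embedding layer simply indexes its weight table with every integer it receives; `W` is a random stand-in for that table, not trained weights, and the one-hot content is dummy data with the same shape as X_train.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(44, 10))                    # stand-in table with 44 rows, as in Embedding(44, 10)

one_hot = np.zeros((5, 44, 14), dtype=np.int8)   # same shape as X_train
one_hot[:, :, 0] = 1                             # dummy content: the array holds only 0s and 1s

out = W[one_hot]                                 # element-wise lookup, one vector per entry
print(out.shape)                                 # (5, 44, 14, 10)
print(np.unique(one_hot))                        # [0 1]: only rows 0 and 1 of W are ever read
```

Under that assumption the model runs, but every 0/1 entry gets its own vector rather than every character, and Flatten then sees 44 * 14 * 10 = 6160 features per example.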

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142


The Keras embedding layer works with indices, not directly with one-hot encodings.
So you don't need to have (5,44,14), just (5,44) works fine.
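As a rough illustration of why indices are enough, the NumPy sketch below compares a table lookup on integer indices with multiplying the one-hot version by the same table. `W` is a random stand-in for the layer's weight matrix and the index values are made up; this is not Keras code, only the arithmetic behind it.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 14, 10                    # 14 characters, 10-dimensional vectors
W = rng.normal(size=(vocab_size, embed_dim))      # stand-in for the embedding weights

idx = np.array([[0, 3, 3, 7], [5, 1, 13, 13]])    # (batch=2, seq_len=4) integer indices
one_hot = np.eye(vocab_size)[idx]                 # (2, 4, 14) one-hot version of idx

looked_up = W[idx]                                # row lookup: shape (2, 4, 10)
multiplied = one_hot @ W                          # one-hot times table: shape (2, 4, 10)

print(looked_up.shape, np.allclose(looked_up, multiplied))  # (2, 4, 10) True
```

Both routes give the same vectors, so the one-hot axis carries no extra information for this layer.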

E.g. get indices with argmax:

X_test = np.argmax(X_test, axis=2)

X_train = np.argmax(X_train, axis=2)

Although it's probably better to not one-hot encode it first =)
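A minimal sketch of encoding straight to indices, skipping the one-hot step. The function name `vectorize_indices`, the `seq_len` parameter and the two sample strings are chosen here for illustration; the start and padding characters follow the question.

```python
import numpy as np

def vectorize_indices(smiles, char_to_int, seq_len):
    """Encode each string as '!' + characters + 'E' padding, as integer indices."""
    # Start with every position set to the padding index.
    out = np.full((len(smiles), seq_len), char_to_int["E"], dtype=np.int32)
    out[:, 0] = char_to_int["!"]
    for i, smile in enumerate(smiles):
        for j, c in enumerate(smile, start=1):
            out[i, j] = char_to_int[c]
    return out

smiles = ['OC(=O)C1=CC=C(Cl)C=C1Cl', 'CCOC(N)=O']
charset = sorted(set("".join(smiles) + "!E"))
char_to_int = {c: i for i, c in enumerate(charset)}
int_to_char = {i: c for c, i in char_to_int.items()}

X = vectorize_indices(smiles, char_to_int, seq_len=28)
decoded = "".join(int_to_char[i] for i in X[1])
print(X.shape)      # (2, 28)
print(decoded)
```

The resulting 2D array has the (samples, sequence length) layout that the embedding layer expects.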

Besides that, your 'embed' variable says size 45, while your data is size 44.

If you change those, your model runs fine:

model = Sequential()
model.add(Embedding(len(charset), 10, input_length=44))
model.add(Flatten())
model.add(Dense(1, activation='linear'))

def coeff_determination(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return (1 - SS_res / (SS_tot + K.epsilon()))

def get_lr_metric(optimizer):
    def lr(y_true, y_pred):
        return optimizer.lr
    return lr

optimizer = Adam(lr=0.00025)
lr_metric = get_lr_metric(optimizer)
model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])

callbacks_list = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto', cooldown=0),
    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]

# X_train and X_test are still the one-hot arrays here;
# drop the argmax calls if you already converted them above.
history = model.fit(x=np.argmax(X_train, axis=2), y=Y_train,
                    batch_size=1,
                    epochs=10,
                    validation_data=(np.argmax(X_test, axis=2), Y_test),
                    callbacks=callbacks_list)
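A side note on the coeff_determination metric with batch_size=1: the formula is evaluated on whatever y_true it is handed, and for a single sample the total sum of squares is exactly zero. The NumPy sketch below mirrors the formula, assuming an epsilon of 1e-7 and some made-up predictions; it is an illustration of the arithmetic, not of how Keras aggregates metrics.

```python
import numpy as np

def r2(y_true, y_pred, eps=1e-7):
    # Same formula as coeff_determination: 1 - SS_res / (SS_tot + eps)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / (ss_tot + eps)

y_true = np.array([2.33, 2.59, 2.59, 2.54, 4.06])   # Y_train from the question
y_pred = np.array([2.40, 2.50, 2.60, 2.60, 3.90])   # made-up predictions for illustration

full = r2(y_true, y_pred)              # all five samples at once
single = r2(y_true[:1], y_pred[:1])    # one sample: ss_tot is 0, so the ratio blows up
print(full, single)
```

On the full set the value is close to 1, while on a single sample it becomes a large negative number, so the reported metric is easier to interpret with a larger batch size.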

answered Nov 21 '18 at 10:21

Torec

362

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53408453%2fhow-to-use-keras-embedding-layer-with-3d-tensor-input%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

#THE MISSING STUFF

#_________________________________________

Y_train = Y_train.reshape(5) #Dense layer contains a single unit so need to input single dimension array

max_len = len(charset)

max_features = embed-1

inputshape = (max_features, max_len) #input shape didn't define. Embedding layer can accept 3D input by using input_shape

#__________________________________________



model = Sequential()

#model.add(Embedding(len(charset), 10, input_length=14))



model.add(Embedding(max_features, 10, input_shape=inputshape))#input_length=max_len))

model.add(Flatten())

model.add(Dense(1, activation='linear'))

print(model.summary())



optimizer = Adam(lr=0.00025)

lr_metric = get_lr_metric(optimizer)

model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])





callbacks_list = [

    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto',cooldown=0),

    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]



history =model.fit(x=X_train, y=Y_train,

                              batch_size=10,

                              epochs=10,

                              validation_data=(X_test,Y_test),

                              callbacks=callbacks_list)

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

add a comment |

#THE MISSING STUFF

#_________________________________________

Y_train = Y_train.reshape(5) #Dense layer contains a single unit so need to input single dimension array

max_len = len(charset)

max_features = embed-1

inputshape = (max_features, max_len) #input shape didn't define. Embedding layer can accept 3D input by using input_shape

#__________________________________________



model = Sequential()

#model.add(Embedding(len(charset), 10, input_length=14))



model.add(Embedding(max_features, 10, input_shape=inputshape))#input_length=max_len))

model.add(Flatten())

model.add(Dense(1, activation='linear'))

print(model.summary())



optimizer = Adam(lr=0.00025)

lr_metric = get_lr_metric(optimizer)

model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])





callbacks_list = [

    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto',cooldown=0),

    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]



history =model.fit(x=X_train, y=Y_train,

                              batch_size=10,

                              epochs=10,

                              validation_data=(X_test,Y_test),

                              callbacks=callbacks_list)

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

add a comment |

#THE MISSING STUFF

#_________________________________________

Y_train = Y_train.reshape(5) #Dense layer contains a single unit so need to input single dimension array

max_len = len(charset)

max_features = embed-1

inputshape = (max_features, max_len) #input shape didn't define. Embedding layer can accept 3D input by using input_shape

#__________________________________________



model = Sequential()

#model.add(Embedding(len(charset), 10, input_length=14))



model.add(Embedding(max_features, 10, input_shape=inputshape))#input_length=max_len))

model.add(Flatten())

model.add(Dense(1, activation='linear'))

print(model.summary())



optimizer = Adam(lr=0.00025)

lr_metric = get_lr_metric(optimizer)

model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])





callbacks_list = [

    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto',cooldown=0),

    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]



history =model.fit(x=X_train, y=Y_train,

                              batch_size=10,

                              epochs=10,

                              validation_data=(X_test,Y_test),

                              callbacks=callbacks_list)

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

#THE MISSING STUFF

#_________________________________________

Y_train = Y_train.reshape(5) #Dense layer contains a single unit so need to input single dimension array

max_len = len(charset)

max_features = embed-1

inputshape = (max_features, max_len) #input shape didn't define. Embedding layer can accept 3D input by using input_shape

#__________________________________________



model = Sequential()

#model.add(Embedding(len(charset), 10, input_length=14))



model.add(Embedding(max_features, 10, input_shape=inputshape))#input_length=max_len))

model.add(Flatten())

model.add(Dense(1, activation='linear'))

print(model.summary())



optimizer = Adam(lr=0.00025)

lr_metric = get_lr_metric(optimizer)

model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])





callbacks_list = [

    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto',cooldown=0),

    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]



history =model.fit(x=X_train, y=Y_train,

                              batch_size=10,

                              epochs=10,

                              validation_data=(X_test,Y_test),

                              callbacks=callbacks_list)

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

answered Nov 21 '18 at 10:43

Noman Dilawar

78931142

add a comment |

The Keras embedding layer works with indices, not directly with one-hot encodings.
So you don't need to have (5,44,14), just (5,44) works fine.

E.g. get indices with argmax:

X_test = np.argmax(X_test, axis=2)

X_train = np.argmax(X_train, axis=2)

Although it's probably better to not one-hot encode it first =)

Besides that, your 'embed' variable says size 45, while your data is size 44.

If you change those, your model runs fine:

model = Sequential()

model.add(Embedding(len(charset), 10, input_length=44))

model.add(Flatten())

model.add(Dense(1, activation='linear'))



def coeff_determination(y_true, y_pred):

    from keras import backend as K

    SS_res =  K.sum(K.square( y_true-y_pred ))

    SS_tot = K.sum(K.square( y_true - K.mean(y_true) ) )

    return ( 1 - SS_res/(SS_tot + K.epsilon()) )



def get_lr_metric(optimizer):

    def lr(y_true, y_pred):

        return optimizer.lr

    return lr





optimizer = Adam(lr=0.00025)

lr_metric = get_lr_metric(optimizer)

model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination,     lr_metric])







callbacks_list = [

    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15,     verbose=1, mode='auto',cooldown=0),

    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss',         save_best_only=True, verbose=1, mode='auto')]





history =model.fit(x=np.argmax(X_train, axis=2), y=Y_train,

                              batch_size=1,

                              epochs=10,

                              validation_data=(np.argmax(X_test, axis=2),Y_test),

                              callbacks=callbacks_list)

answered Nov 21 '18 at 10:21

Torec

362

add a comment |
