Closed
Hello everyone, and happy new year!
I am trying to create an LSTM autoencoder, as shown in the image below.
The encoder consumes the input "the cat sat"
and creates a vector, depicted as the big red arrow.
The decoder takes this vector and tries to reconstruct the sequence
given the position in the sentence.
I would like to save this vector (the big red arrow) to use it in another model.
The code I have written so far is the following:
```python
from keras.layers import containers
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, AutoEncoder
from keras.layers.recurrent import LSTM
import numpy as np

train_x = np.array([
    [[1, 3], [1, 3]],
    [[2, 4], [2, 4]],
    [[3, 5], [3, 5]],
])

encoder = containers.Sequential([LSTM(output_dim=5, input_dim=2, activation='tanh', return_sequences=True)])
decoder = containers.Sequential([LSTM(output_dim=2, input_dim=5, activation='tanh', return_sequences=True)])

autoencoder = Sequential()
autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder, output_reconstruction=False))
autoencoder.compile(loss='mean_squared_error', optimizer='sgd')
autoencoder.fit(train_x, train_x, nb_epoch=10)
```
It is not clear to me whether the code above does what I described.
If I do not use return_sequences=True,
it yields an error.
Should I use a graph model to do exactly what I described?
Thank you in advance for your help.
fchollet commented on Jan 4, 2016
What you posted does do what your figure describes.
You don't need an AutoEncoder layer to achieve this. You could simply do:
Also, don't train RNNs with SGD. Use RMSprop instead.
dpappas commented on Jan 5, 2016
Thank you very much.
I assume that I can then save the weights of the first LSTM's final state using
m.layers[0].get_weights()
lemuriandezapada commented on Jan 6, 2016
In my experience, Adam does better than RMSprop.
dpappas commented on Jan 15, 2016
Hello again.
I would like to get the outputs of the first layer of the following model.
When I type
I get
Is there a way to get the final output as a NumPy array instead of the weights?
I believe the output is something like this:
Thank you in advance.
jgc128 commented on Jan 15, 2016
Hi,
you can save the weights of the first LSTM, create a separate model with only one LSTM layer, and set the weights of this LSTM to your saved weights. After that you can use the predict method to get the output of the first LSTM.

dpappas commented on Jan 16, 2016
Truthfully, this is not what I want.
I do not want to use the trained LSTM
as an input to another neural net.
I want to use the output of the LSTM as an embedding.
So I do not want 4 matrices (the trained weights of the LSTM)
but 1 matrix with dimensionality n×1, where n is the number of nodes in the LSTM.
This matrix is the output of the 1st LSTM, which was used as input to the
second LSTM, as shown with the red arrow in the picture.
jgc128 commented on Jan 16, 2016
If you feed the same data to this new network with one LSTM layer, you will get exactly what you want as the result of the predictions. You can save these results and use them anywhere you want.
dpappas commented on Jan 18, 2016
If there are 100 instances, there will be 100 autoencoders.
I want an autoencoder to overtrain on a specific instance and extract an embedding.
Think of it as compressing all the information for a text into a vector of size 10.
I want to use these 100 embeddings as input to another network (size: 100 × 10).
I cannot connect all LSTMs at the same time and feed the original data once more.
Neither could I connect one LSTM at a time.
I just want the output of the 1st layer as a NumPy array.
How can I get it?
lemuriandezapada commented on Jan 18, 2016
http://keras.io/faq/#how-can-i-visualize-the-output-of-an-intermediate-layer
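The FAQ recipe at that link used a Keras-0.x backend function; in the current `tf.keras` API, the equivalent way to get the first layer's output as a NumPy array (rather than its weights) is to wrap that layer's output in a new `Model` and call `predict`. The model below is a stand-in with assumed shapes, not the thread's trained network:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the trained autoencoder from the thread (shapes assumed).
m = keras.Sequential([
    keras.Input(shape=(2, 2)),
    layers.LSTM(5, return_sequences=True),  # encoder
    layers.LSTM(2, return_sequences=True),  # decoder
])

# A model that stops at the encoder layer: its predict() returns the
# encoder *activations* as a NumPy array, not the weight matrices.
encoder_only = keras.Model(inputs=m.inputs, outputs=m.layers[0].output)

x = np.zeros((3, 2, 2), dtype="float32")
codes = encoder_only.predict(x, verbose=0)  # shape (3, 2, 5)
```

`codes` can then be saved and fed to any other model, which is exactly the "save the big red arrow" use case from the original question.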
dpappas commented on Jan 18, 2016
Thank you.
This is what I was searching for!
MdAsifKhan commented on Apr 21, 2016
@dpappas, I am also facing the same issue. I tried the above link and I get the attribute error:
'LSTM' object has no attribute 'initial_weights'
Could you please post the exact snippet of what you did?
GUR9000 commented on Dec 24, 2016
@fchollet, using "return_sequences=True" does NOT produce what is described in the figure!
This will cause the "decoder" LSTM layer to use the output vector of the encoder at each time step instead of only the single final vector after processing the whole sequence (as shown in the figure). Or am I mistaken here?
ypxie commented on May 3, 2017
@GUR9000 I think you are right. At every time step, the decoder needs to take the output from the last step of the decoder, rather than the output of the encoder.
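The difference these two comments describe can be shown with plain NumPy (all shapes illustrative): with `return_sequences=True` the decoder sees the encoder's per-timestep outputs, whereas the figure calls for only the final encoder state, repeated at every step — which is what Keras' `RepeatVector` layer does:

```python
import numpy as np

T, n = 4, 5  # timesteps, encoder units (illustrative)
rng = np.random.default_rng(0)
H = rng.standard_normal((T, n))  # hypothetical encoder outputs, one row per timestep

# What return_sequences=True hands to the decoder: the full sequence H.
decoder_in_seq = H

# What the figure describes: only the final state, repeated T times
# (this is exactly what a RepeatVector layer would produce).
decoder_in_fig = np.repeat(H[-1:], T, axis=0)

# Same shape, very different content: every row equals the last state.
assert decoder_in_fig.shape == decoder_in_seq.shape == (T, n)
assert np.allclose(decoder_in_fig, H[-1])
```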
retkowski commented on Jun 19, 2017
I want to build an LSTM autoencoder.
My data looks like this, with shape (1200, 10, 5), which is (training_size, timesteps, input_dim):
Code:
But for the decoded step it returns:
ValueError: Input 0 is incompatible with layer lstm_5: expected ndim=3, found ndim=2
Thank you in advance.
dpappas commented on Jun 19, 2017
@ScientiaEtVeritas
Your encoded LSTM returns only the last output of the LSTM;
you need to change the encoded line to
encoded = LSTM(3, input_shape=(10, 5), return_sequences=True)
Finally, your decoded LSTM needs a proper number of nodes.
You give a tuple as the size.
You need to change this to
decoded = LSTM(10, return_sequences=True)
or change the number 10 to the size you want.
I suggest you read the documentation and some explanations of LSTMs:
Understanding LSTM Networks
retkowski commented on Jun 19, 2017
@dpappas: Thank you for your answer.
But for clarification, and as mentioned before, using return_sequences=True is not what is shown in the picture. My goal is actually a single, final vector that represents the whole sequence as well as possible. I don't think this is the case when using return_sequences=True.
There is another issue I found that actually describes why, and that return_sequences=False for the encoder is the actual way to go: #5138 (at the bottom).

dpappas commented on Jun 19, 2017
@ScientiaEtVeritas
You are right.
I have not managed to do that with Keras.
You could do it with TensorFlow using seq2seq.
Maybe in Keras you could do it with the step function of the LSTM, or a callback function, to use the output of the decoder from the previous timestep as input to the next timestep.
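The feedback wiring dpappas sketches here (each decoder step consumes the previous step's output) can be illustrated with a toy NumPy cell. All parameter names and sizes below are made up for the sketch, and a single tanh linear map stands in for a real LSTM cell:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, out_dim, T = 5, 2, 4  # illustrative sizes

# Hypothetical stand-ins for trained decoder parameters.
W_h = rng.standard_normal((latent_dim, latent_dim)) * 0.1
W_x = rng.standard_normal((out_dim, latent_dim)) * 0.1
W_o = rng.standard_normal((latent_dim, out_dim)) * 0.1

h = rng.standard_normal(latent_dim)   # state initialized from the encoder vector
x = np.zeros(out_dim)                 # start-of-sequence token
outputs = []
for _ in range(T):
    h = np.tanh(h @ W_h + x @ W_x)    # update state from previous output
    x = h @ W_o                       # emit one timestep...
    outputs.append(x)                 # ...and feed it back on the next iteration
decoded = np.stack(outputs)           # shape (T, out_dim)
```

This is the autoregressive decoding loop that seq2seq frameworks implement for you; Keras at the time had no built-in layer for it, which is why the thread falls back on `RepeatVector`-style workarounds.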
NeilYager commented on Aug 23, 2017
@dpappas @ScientiaEtVeritas
Perhaps this has the desired effect:
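The snippet that followed did not survive extraction. A pattern consistent with the thread's conclusion (encoder with `return_sequences=False`, final vector repeated for the decoder via `RepeatVector`) would look like the sketch below, with dimensions borrowed from retkowski's data; this is an assumed reconstruction, not necessarily NeilYager's exact code:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

timesteps, input_dim, latent_dim = 10, 5, 3

inputs = keras.Input(shape=(timesteps, input_dim))
encoded = layers.LSTM(latent_dim)(inputs)            # single final vector (the "big red arrow")
decoded = layers.RepeatVector(timesteps)(encoded)    # repeat it once per timestep
decoded = layers.LSTM(input_dim, return_sequences=True)(decoded)

autoencoder = keras.Model(inputs, decoded)
encoder = keras.Model(inputs, encoded)               # extracts the embedding directly
autoencoder.compile(optimizer="rmsprop", loss="mse")

x = np.random.default_rng(0).standard_normal(
    (8, timesteps, input_dim)).astype("float32")
# autoencoder.predict(x) reconstructs (8, 10, 5);
# encoder.predict(x) yields the (8, 3) embeddings.
```

After fitting `autoencoder` on the data, `encoder.predict` gives exactly the per-sequence vectors dpappas wanted to feed into another network.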