
Online Learning in Keras? #1868

Closed
ashwinnaresh opened this issue Mar 2, 2016 · 31 comments

Comments

@ashwinnaresh

I wanted to implement online learning for an LSTM RNN. Does Keras support online learning as of now?
If not, can someone direct us to any source on how online learning can be implemented for an RNN (at least at a conceptual level)?

@tboquet
Contributor

tboquet commented Mar 2, 2016

When you say online learning, do you mean this?
If that's the case, you could take a look at this example.

@sjayakum

sjayakum commented Mar 3, 2016

Say I train a model with a training set.
Can I then 're-fit' [update, NOT re-train] the model with new data, such that the model parameters are just updated and not re-initialized?

Example:
The model below is loaded after being trained on a training dataset.


from keras.models import model_from_json

model = model_from_json(open('lstm.json').read())
model.load_weights('lstm_weights.h5')

Now, when I do


model.fit(new_data_input, new_data_output, verbose=1, nb_epoch=j)

Does the above statement retrain the model from scratch, or do the model parameters just get updated?

@pasky
Contributor

pasky commented Mar 3, 2016 via email

@ashwinnaresh
Author

Okay, thank you!

@marcj

marcj commented Mar 3, 2016

The parameters are not only updated; what was learned during the initial training will also be overwritten. The more new_data_input you have, the more the old training is overwritten, and the accuracy on the original training data will probably go down. I suspect that by online learning you mean that every new input is added to the total of all inputs the network has ever seen, rather than overwriting it, but I don't think this is the case.

@pasky
Contributor

pasky commented Mar 3, 2016

I'm not sure I follow exactly what Marc is trying to say (maybe he meant
to use "diminished" rather than "overwritten").

But it bears noting that when you restart the fit(), any kind
of learning rate schedule your optimizer is following (e.g. SGD+decay
or Adam) is restarted too! That means you may be updating your model
on a new small sample using a huge learning rate, which might not
be okay.

Fixing this (setting the initial #iterations > 0) would require some
small changes to keras.optimizers. But I don't know if there's any
literature on what's actually a good way to fix this, because just
setting the #iterations to the number of iterations run before does not
sound right either. So perhaps the safest solution is to just
use plain SGD with a static learning rate for these follow-up fit()s.
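
A minimal sketch of that suggestion (the file name, data variables, loss, and learning rate below are illustrative assumptions, not from this thread):

from keras.models import load_model
from keras.optimizers import SGD

# reload the previously trained model
model = load_model('lstm.h5')

# re-compile with plain SGD at a small, fixed learning rate so the
# follow-up fit() cannot restart a decay schedule at a huge step size
model.compile(optimizer=SGD(lr=0.001), loss='mse')
model.fit(new_x, new_y, nb_epoch=1, batch_size=32)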

@ashwinnaresh
Author

Yeah, I want the LSTM to learn from newer data. If it forgets what it has learnt from the older data, that's fine. It needs to update itself depending on the trend in the new data.
It seems that .fit() does take care of the problem I had.
Thanks @tboquet, @pasky and @marcj!

@anujgupta82

@ashwin123: I am also looking for online learning in Keras. What I understand is that
for every datapoint, you update the model, and once a datapoint is used, you never use it again. Is that correct?

Also, from your experience, how did your model perform?

@ashwinnaresh
Author

Yes, I was updating the model with a new batch of data points. The model seemed to perform well.

@anujgupta82

anujgupta82 commented Apr 22, 2016

@ashwin123
@fchollet
Is there a way to incorporate the Passive-Aggressive algorithm [Crammer06] into updating the weights for an online variant? To be honest, I am not even sure if the two can be connected.

I'm looking for good practices for building models using online deep learning.

@anujgupta82

anujgupta82 commented Apr 25, 2016

I have put up basic code for online deep learning in Keras:
https://github.com/anujgupta82/DeepNets/blob/master/Online_Learnin/Online_Learning_DeepNets.ipynb
The key difference is the way training is done - refer to cells 9 and 17 in the notebook.
In cell 17, I take one datapoint (as if it were streaming data) and fit the model on that datapoint, then take the next datapoint, and so on. Each datapoint is considered exactly once and no more.

There is a difference in the outcomes of offline and online: on test data, offline training gave 97.98% accuracy, while online learning gave 93.96%.

Is this the right way to implement online learning in Keras?
Any suggestions on what changes I can make to bring the online accuracy closer to the offline accuracy?
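
For reference, the one-datapoint-at-a-time loop described above looks roughly like this (a sketch; it assumes `model` is already compiled and `X`, `y` are NumPy arrays):

for i in range(len(X)):
    x_i = X[i:i+1]  # slicing keeps the batch dimension
    y_i = y[i:i+1]
    # one gradient update per datapoint; each sample is seen exactly once
    model.train_on_batch(x_i, y_i)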

@toluwajosh

@anujgupta82 the link you gave is broken. Could you please share another link that works? It would be highly appreciated. Thanks!

@BoltzmannBrain

@ashwin123 would you be willing to share your code for this issue?

@BoltzmannBrain

Thank you for the example code @anujgupta82.

As one would assume, training an online LSTM can be prohibitively slow. I would like to train my network on mini-batches and test (run prediction) online. If anyone can help, please take a look at my question on SO.

@snlpatel001213

@fchollet
Please provide an example of how I can train a model on new data while preserving previously trained information.
Thanks

@patyork
Contributor

patyork commented Jan 12, 2017

# create model
...
...
# train model on available data
model.fit(.....)
...
# save model
model.save('yourfileName.h5')

Some time passes

from keras.models import load_model
model = load_model('yourfileName.h5')

# continue training
model.fit(...)

@smhoang

smhoang commented Feb 16, 2017

The example flow from @patyork is only good for transfer learning or weight initialization, not for online learning, as the optimizer's parameters (the huge learning rate, decay, ...) are restarted by model.fit(...). What I did was set a small learning rate and a reasonable number of epochs, and combine the online training data with part of the previous training data to fit after loading the model. The experimental result is not too bad, but I don't think this is a good way to go. Is there any other approach to online deep learning?
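
A rough sketch of that mixed-data update (all names are placeholders; `old_x`/`old_y` stand for a retained sample of the original training set, and the sample size, epochs, and batch size are arbitrary):

import numpy as np
from keras.models import load_model

model = load_model('model.h5')

# mix the incoming data with a small random sample of the old data
# to limit how much of the old training is forgotten
idx = np.random.choice(len(old_x), size=256, replace=False)
mixed_x = np.concatenate([new_x, old_x[idx]])
mixed_y = np.concatenate([new_y, old_y[idx]])

model.fit(mixed_x, mixed_y, nb_epoch=2, batch_size=32)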

@i3v

i3v commented May 20, 2017

@smhoang,
Yep, that's true. I'm facing similar issues and, after some digging into the code, it looks like this is indeed not supported "out of the box"; see #6697. But it is still possible to save and restore the "comprehensive model state" by using custom callbacks (by manually saving and restoring their state).
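
A minimal sketch of that idea (the class name, file name, and tracked state are hypothetical; the point is that the callback persists its own state so a later session can restore it):

import pickle
from keras.callbacks import Callback

class StatefulCallback(Callback):
    # accumulates state across fit() calls and can persist it to disk
    def __init__(self, state_path='callback_state.pkl'):
        super(StatefulCallback, self).__init__()
        self.state_path = state_path
        self.iterations = 0  # example of state worth preserving

    def on_batch_end(self, batch, logs=None):
        self.iterations += 1

    def save_state(self):
        with open(self.state_path, 'wb') as f:
            pickle.dump({'iterations': self.iterations}, f)

    def load_state(self):
        with open(self.state_path, 'rb') as f:
            self.iterations = pickle.load(f)['iterations']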

@snowde

snowde commented Nov 1, 2017

This baby should work

from keras.models import model_from_json
from keras.optimizers import SGD

def sav_model(model):
    # save the architecture as JSON and the weights as HDF5
    json_string = model.to_json()
    open('model.json', 'w').write(json_string)
    model.save_weights('weights.h5', overwrite=True)

def lad_model():
    # restore the architecture and weights, then re-compile before training
    model = model_from_json(open('model.json').read())
    model.load_weights('weights.h5')
    sgd = SGD(lr=0.1, momentum=0.9, decay=0, nesterov=False)
    model.compile(optimizer=sgd, loss='binary_crossentropy')
    return model
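
Usage would look something like this (a sketch; `new_x`/`new_y` are placeholders for newly arrived data):

# after the initial training run
sav_model(model)

# later, in a new session: restore and continue training on the new data
model = lad_model()
model.fit(new_x, new_y, nb_epoch=1, batch_size=32)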

@MLDSBigGuy

@snowde aren't you just loading the model and compiling it? Where are you adding new data?

@and-rewsmith

@snowde @MLDSBigGuy How does this:

from keras.models import model_from_json
from keras.optimizers import SGD

def sav_model(model):
    # save the architecture as JSON and the weights as HDF5
    json_string = model.to_json()
    open('model.json', 'w').write(json_string)
    model.save_weights('weights.h5', overwrite=True)

def lad_model():
    # restore the architecture and weights, then re-compile before training
    model = model_from_json(open('model.json').read())
    model.load_weights('weights.h5')
    sgd = SGD(lr=0.1, momentum=0.9, decay=0, nesterov=False)
    model.compile(optimizer=sgd, loss='binary_crossentropy')
    return model

Differ from:

model.save('filename.txt')

@snegas

snegas commented May 20, 2018

@toluwajosh here is the working link: https://github.com/anujgupta82/DeepNets/blob/master/Online_Learning/Online_Learning_DeepNets.ipynb

In case somebody hasn't found the repository yet.

@Metalkiler

Metalkiler commented Jun 5, 2018

Hi, regarding online learning, I believe the issue is something like this:
today we get a dataset with 3 different values (e.g. A, B, C); if at any given moment we get a new letter (named D), we could just update the model...

For instance, I believe that online learning keeps the weights of the hidden layers and "dumps" the input layer and the output layer (since they will be different), so the code I developed should help with online learning :) (PS: X is a sparse matrix)

# To update the model with new data we need to
# throw away the first layer and the last layer;
# the remaining layers keep their learned weights.
model2 = Sequential()
model2.add(Dense(1000, input_dim=len(train_X.columns), activation='relu', name='new_Inputs'))

# reuse the trained hidden layers from the old model
for layer in model.layers[1:-1]:
    model2.add(layer)

model2.add(Dense(1, activation='sigmoid'))

model2.compile(loss='binary_crossentropy',
               optimizer='AdaDelta',
               metrics=['accuracy'])

history = model2.fit(X, train_Y.target,
                     epochs=10,
                     batch_size=2**10, callbacks=callbacks_list, validation_split=0.2)

Do you agree with this idea as a way to update the model with volatile data (as is the case with Big Data)?

PS2: I saw two posts of this; sorry for the repost, all.

EDIT: This usually happens when you one-hot encode data.

@anujgupta82

Guys, the link is working perfectly for me:
https://github.com/anujgupta82/DeepNets/blob/master/Online_Learning/Online_Learning_DeepNets.ipynb
Are you still not able to access it?

@Metalkiler

I suggest looking at this paper when building reusable neural networks as well

https://ieeexplore.ieee.org/abstract/document/8851888

@GF-Huang

So what’s the solution?

@Metalkiler

Basically, according to the paper at https://ieeexplore.ieee.org/abstract/document/8851888 (published at the IJCNN conference), you can memorize the structure at a certain point in time (in this case, at each U). If an input differs from the previously learnt structure, you can assign a random weight to that connection (between the input and the first layer); if the weight is known, just assign the same learnt weight!
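
As a rough illustration of that idea (a sketch under my own assumptions, not the paper's implementation): when a new input feature appears, the first layer's kernel gains a row that is randomly initialized for the new feature, while the rows for known features keep their learnt weights.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# old model trained on 3 input features (A, B, C)
model = Sequential()
model.add(Dense(8, input_dim=3, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
kernel, bias = model.layers[0].get_weights()

# a new feature D appears: append a randomly initialized kernel row for it,
# keeping the learnt weights for the already-known features
new_row = np.random.normal(scale=0.05, size=(1, kernel.shape[1]))
model2 = Sequential()
model2.add(Dense(8, input_dim=4, activation='relu'))
model2.add(Dense(1, activation='sigmoid'))
model2.layers[0].set_weights([np.vstack([kernel, new_row]), bias])
for old, new in zip(model.layers[1:], model2.layers[1:]):
    new.set_weights(old.get_weights())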

@dileepkumarg-sa

I have a similar problem. What is the final solution for it?

@phdrsch

phdrsch commented Jan 19, 2021

@dileepkumarg-sa were you able to get a solution?

@phara23

phara23 commented Mar 9, 2021

According to this SO thread, as of March 1st, 2021, the following is the case:

When you restart training after using model.save, it trains with the learning rate it had when you saved the model. To make sure, I wrote a simple callback using the learning rate scheduler callback. The code for the callback is shown below. I then trained a model for 5 epochs, saved the model, loaded the model, and trained again. The callback prints the value of the learning rate at the beginning of each epoch and shows that when training resumes, the learning rate was preserved.

import tensorflow as tf

def scheduler(epoch, lr):
    # keep the learning rate for the first two epochs, then halve it
    lr_in = lr
    lr_out = lr if epoch < 2 else lr * 0.5
    print('At the start of epoch', epoch + 1, 'lr is', lr_in,
          'and will be set to', lr_out)
    return lr_out

lrs = tf.keras.callbacks.LearningRateScheduler(scheduler)

Put this in your code before you call model.fit, then in model.fit include:

callbacks=[lrs]

So online learning can be accomplished in Keras: if you save the model and then load it, you can just continue training it with the .fit() method.
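
Putting it together (a sketch; the file name and data variables are placeholders):

import tensorflow as tf

model.fit(x, y, epochs=5, callbacks=[lrs])
model.save('model.h5')  # saves weights, architecture, and optimizer state

model = tf.keras.models.load_model('model.h5')
# training resumes with the preserved learning rate
model.fit(new_x, new_y, epochs=5, callbacks=[lrs])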
