
Tensorflow backend - bug in model._make_predict_function(...) #2397

Closed
Froskekongen opened this issue Apr 19, 2016 · 108 comments

@Froskekongen

There appears to be a bug in _make_predict_function for the TensorFlow backend. The following error message appears for me when trying to call model.predict(...):

self._make_predict_function()
  File "/usr/local/lib/python3.4/dist-packages/keras/engine/training.py", line 679, in _make_predict_function
    **self._function_kwargs)
  File "/usr/local/lib/python3.4/dist-packages/keras/backend/tensorflow_backend.py", line 615, in function
    return Function(inputs, outputs, updates=updates)
  File "/usr/local/lib/python3.4/dist-packages/keras/backend/tensorflow_backend.py", line 589, in __init__
    with tf.control_dependencies(self.outputs):
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 3192, in control_dependencies
    return get_default_graph().control_dependencies(control_inputs)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 2993, in control_dependencies
    c = self.as_graph_element(c)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 2291, in as_graph_element
    raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("Sigmoid_2:0", shape=(?, 17), dtype=float32) is not an element of this graph.

This does not happen when using the Theano backend.

Notes: the model is loaded from JSON and is defined as follows:

    from keras.models import Model
    from keras.layers import (Input, Embedding, Convolution1D, Lambda,
                              Dropout, Dense, Highway, merge)
    from keras import backend as K

    # max_features, embedding_dims, nb_filter, labsHovedKat and labsUnderKat
    # are defined elsewhere in the script.
    seq1=Input(dtype='int32',shape=(400,),name='input_text')
    seq2=Input(dtype='int32',shape=(20,),name='input_titles')

    embeddeding=Embedding(max_features,embedding_dims,dropout=0.3)

    encoding_1=embeddeding(seq1)
    encoding_2=embeddeding(seq2)

    filter_lengths = [1,3,6]


    def max_1d(X):
        return K.max(X, axis=1)
    convs1=[]
    convs2=[]
    for fl in filter_lengths:

        conv1=Convolution1D(nb_filter=nb_filter,
                        filter_length=fl,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1)(encoding_1)
        conv1=Lambda(max_1d, output_shape=(nb_filter,))(conv1)
        convs1.append(conv1)

        conv2=Convolution1D(nb_filter=nb_filter,
                        filter_length=fl,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1)(encoding_2)
        conv2=Lambda(max_1d, output_shape=(nb_filter,))(conv2)
        convs2.append(conv2)

    m=merge([*convs1,*convs2],mode='concat')
    m=Highway(activation='relu')(m)
    m=Highway(activation='relu')(m)
    m=Dropout(0.5)(m)
    hovedkategori_loss=Dense(labsHovedKat.shape[1],activation='sigmoid',name='hovedkategori')(m)

    m1=merge([hovedkategori_loss,m],mode='concat')
    underkategori_loss=Dense(labsUnderKat.shape[1],activation='sigmoid',name='underkategori')(m1)

    model=Model(input=[seq1,seq2],output=[hovedkategori_loss,underkategori_loss])
    model.compile(optimizer='adam',loss='binary_crossentropy',metrics={'hovedkategori':'accuracy','underkategori':'accuracy'})
  • Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
  • If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
    pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
@Froskekongen (Author)

I would appreciate any comments on this issue, as I want to deploy the model ASAP and need to know whether I can use it or should code something else.

@fchollet (Member)

Do you have a code snippet to reproduce this issue? I can guarantee you that predict does in fact work, including with TensorFlow.

@Froskekongen (Author)

It appears this bug had nothing to do with either Keras or TensorFlow, but rather with how async events were handled by the webserver I am using.

@jstypka (Contributor)

jstypka commented May 9, 2016

@Froskekongen could you describe how you fixed this in more detail? I'm getting exactly the same error, although in a different program.

It seems to work when I do it manually in a REPL; however, when I deploy it as a webservice it breaks.

@pxlong

pxlong commented May 13, 2016

I also have the same error under the TensorFlow backend; however, it works using the Theano backend.
@jstypka @Froskekongen Have you found a solution to fix it?

@jstypka (Contributor)

jstypka commented May 13, 2016

@pxlong it also works on Theano for me; I think it's exactly the same problem. I didn't manage to solve it though, and was hoping for some hints from @Froskekongen

@rkempter

Same here, same issue! Works fine in the REPL, but breaks when running behind a webservice.

@rkempter

Running the webservice with gunicorn in sync mode solved the issue.

@gladuo

gladuo commented Aug 2, 2016

Hey everybody, I'm still not sure what's wrong with this combination.
But I used meinheld instead and it works even better than gevent.
Hope this helps.

@AbhishekAshokDubey

Same problem (model.predict breaking) for me too, but it worked when I switched to the Theano backend from TensorFlow.

@Nr90

Nr90 commented Aug 26, 2016

Same problem here.
Seems to work fine normally, but when deployed as a webservice using Flask I get this error.

@Nr90

Nr90 commented Aug 27, 2016

Works when using Theano as the backend; doesn't work with TensorFlow.

@avital

avital commented Oct 19, 2016

I had this problem when doing inference in a different thread than where I loaded my model. Here's how I fixed the problem:

Right after loading or constructing your model, save the TensorFlow graph:

graph = tf.get_default_graph()

In the other thread (or perhaps in an asynchronous event handler), do:

global graph
with graph.as_default():
    (... do inference here ...)

I learned about this from https://www.tensorflow.org/versions/r0.11/api_docs/python/framework.html#get_default_graph
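
A minimal end-to-end sketch of this fix in a Flask app (TF 1.x-era Keras; the model path, route name, and request format below are hypothetical):

import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request
from keras.models import load_model

app = Flask(__name__)

# Load the model and capture its graph once, at startup.
model = load_model('my_model.h5')
graph = tf.get_default_graph()

@app.route('/predict', methods=['POST'])
def predict():
    # Flask may serve this request on a different thread than the one
    # that loaded the model, so restore the captured graph first.
    x = np.array(request.get_json()['instances'])
    with graph.as_default():
        preds = model.predict(x)
    return jsonify(predictions=preds.tolist())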

@Walid-Ahmed

Walid-Ahmed commented Nov 4, 2016

Thanks a lot.
It worked for me.

@moinudeen

Thanks so much! @avital
Works like a charm!

@nipe0324

nipe0324 commented Jan 2, 2017

Thanks a lot! @avital It worked.
env: keras with tensorflow on flask

@grantwwoodford

Thank you @avital, that did the trick! This issue really should not be closed; it should be fixed in the Keras library.

@shashwat14

You are a savior! Thanks a lot, @avital.

@tlids

tlids commented Mar 31, 2017

Hi all, I followed @avital's code but got 'AttributeError: __exit__' after the with statement. Does anyone know how to deal with this? Thanks!!

@shengyuzhang

Thanks a million, it works!!!! @avital

@ghost

ghost commented Apr 13, 2017

@avital
Thanks, worked for me!

@George-Zhu

George-Zhu commented Apr 27, 2017

@avital
Thanks! It works well with TensorFlow, but my model was built with Keras. How do I fix this problem in Keras?

@adityareddy

adityareddy commented May 5, 2017

Worked like a charm!! :) @avital

@justttry

justttry commented May 24, 2017

I had the same problem and solved it. Thanks @avital.
My solution is https://justttry.github.io/justttry.github.io/not-an-element-of-Tensor-graph/

@SunnerLi

SunnerLi commented Jun 5, 2017

Amazing solution!!
I also encountered this problem.
In my case, I define the model in model.py but create the model instance in main.py.
Just apply the fix in main.py!!!

@jglwiz

jglwiz commented Jun 7, 2017

avital's solution works!

Keras with TensorFlow backend.

Details:
main thread:

        self.model = load_model(model_path)
        self.model._make_predict_function()  # build the predict function up front
        self.graph = tf.get_default_graph()  # capture the graph for other threads

another thread:

        with self.graph.as_default():
            labels = self.model.predict(data)

@SiddhardhaSaran

@shaoeChen but gunicorn also does the same thing, right? It loads the model for each process. Anyway, good for you.

@shaoeChen

@SiddhardhaSaran hi.
I know what you mean, but in my case, even with one process and one thread configured on uWSGI it still didn't work, so I use gunicorn.
Anyway, thanks for the good advice.

@blackwool

@jglwiz
Thank you, it solved my problem.

@nithintkv

nithintkv commented Jun 7, 2019

I faced the same issue recently when deploying the model as a webservice using Django. I ended up creating a singleton class that holds the model and the tf.Graph, i.e. it is instantiated only once. That solved the problem.
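
A minimal sketch of what I mean (TF 1.x-era graph API; the class name and model path are hypothetical):

import tensorflow as tf
from keras.models import load_model

class ModelSingleton:
    _instance = None

    def __new__(cls, model_path='my_model.h5'):
        # Build the model and capture its graph only once per process.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.model = load_model(model_path)
            cls._instance.graph = tf.get_default_graph()
        return cls._instance

    def predict(self, x):
        # Always run inference against the graph captured at load time.
        with self.graph.as_default():
            return self.model.predict(x)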

@jurukode

Hi @shaoeChen, you might want to try my approach: #2397 (comment)

It saves both the session and the graph.

@eliadl

eliadl commented Aug 7, 2019

Not sure if it's relevant to the original question, but maybe it'll be useful to others:

Based on this answer, the following resolved tf's multithreading compatibility for me:

import tensorflow as tf
import keras as k

# on thread 1
session = tf.Session(graph=tf.Graph())
with session.graph.as_default():
    k.backend.set_session(session)
    model = k.models.load_model(filepath)

# on thread 2
with session.graph.as_default():
    k.backend.set_session(session)
    model.predict(x, **kwargs)

The novelty here is keeping both the Session and the Graph for other threads.
The model is loaded in their "context" (instead of the default ones) and kept for other threads to use.
(By default, the model is loaded into the default Session and the default Graph.)
Another plus is that they're kept in the same object, which makes them easier to handle.

@Ai-is-light

@eliadl @emesha92 how about multi-model prediction and multiprocessing under one main process? Would you mind giving me some advice?
Looking forward to any replies.
Thanks

@eliadl

eliadl commented Sep 13, 2019

@Ai-is-light by "multiprocessing under one main process" what exactly do you mean?

@keshavatgithub

> @SunnerLi: Amazing solution!! ... Just apply the fix in main.py!!! (quoted in full above)

Could you give a code snippet?

@keshavatgithub

> lazy-apps = true

How do I use this?

@gustavz

gustavz commented Oct 10, 2019

For me this is only solvable if I load the model inside the Flask @app.route POST method,
which means I reload the model on every request; this is very inefficient.
Loading the model as a global, either at the beginning of the Flask app script or as a global in main(), does not work.

Any ideas on how to solve this?

@Chancetc

Chancetc commented Dec 2, 2019

> @jglwiz: avital's solution works! ... (quoted in full above)

Thanks a million!!

@Aparnamaurya

> @avital: I had this problem when doing inference in a different thread than where I loaded my model. ... (quoted in full above)

I think I have fixed this bug, thanks.

If multithreaded performance is not a necessity, you can also run TensorFlow on a single thread.

(In my case, I came across the same issue and none of the mentioned methods worked; it would be great if someone could help figure it out, but I'm posting this just in case.)

To run TensorFlow on a single thread:

import tensorflow as tf
from keras import backend as K

session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1)
session = tf.Session(config=session_conf)
K.set_session(session)  # point Keras at this single-threaded session

# for inference
with session.graph.as_default():
    (... do inference here ...)

@isaaclok

I solved this by upgrading to TensorFlow 2 and implementing a singleton pattern for my model, like so:

class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class MyModel(object, metaclass=Singleton):
    ...  # model loading and prediction methods go here

Should work for both Keras alone and TensorFlow 2.
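
A quick usage sketch: both calls below resolve to the same cached instance, so the model is only constructed once per process.

m1 = MyModel()
m2 = MyModel()
assert m1 is m2  # Singleton.__call__ returns the cached instance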

hunterhector added a commit to hunterhector/forte that referenced this issue Jul 4, 2020
 - This is mainly done by using `graph.as_default()`.
 - Detailed discussions can be found at: keras-team/keras#2397 (comment)
 - Without explicitly using `as_default`, running this will fail in Django (unless you specify --nothreading)
hunterhector added a commit to asyml/forte that referenced this issue Jul 20, 2020
@Minqi824

Minqi824 commented Aug 7, 2020

Well, currently my tf is 2.2.0 and Keras is 2.3.1. I added "_make_predict_function()" after loading the model.

However, I get the error "RuntimeError: Attempting to capture an EagerTensor without building a function.", and the command "tf.compat.v1.disable_eager_execution()" doesn't seem to work and generates a new error.

Can anybody help me solve this problem? Thanks a lot!

@neilmario70

neilmario70 commented Oct 25, 2020

I am not able to follow avital's instructions. Could someone share example code using one of the pretrained models like ResNet50 or VGGNet? My Flask app only works on the development server and stops running as soon as I use nginx with uWSGI in production with multithreading. I am currently using this image https://hub.docker.com/r/tiangolo/uwsgi-nginx/ in production, and here is my code for making predictions.

I import the predict_class function into main.py to make predictions on uploaded files.

from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input as resnet_preprocess_input
from PIL import Image
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.preprocessing.image import load_img
import pickle
import numpy as np
import os

folder_path = os.path.dirname(os.path.abspath(__file__))
model_path = os.path.join(folder_path, 'Logistic_model_resnet.sav')
res_model_path = os.path.join(folder_path, 'resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5')
resnet_base_model = ResNet50(weights=res_model_path, include_top=False, pooling='max')
logistic_model = pickle.load(open(model_path, 'rb'))

def predict_class(img_path):
    img = load_img(img_path, target_size=(224, 224), color_mode='grayscale')
    img_data = img_to_array(img)
    img_data = np.repeat(img_data, repeats=3, axis=-1)  # grayscale -> 3 channels
    img_data = np.expand_dims(img_data, axis=0)
    img_data = resnet_preprocess_input(img_data)
    feature_vector = resnet_base_model.predict(img_data)[0]
    prediction_result = logistic_model.predict([feature_vector])[0]
    return prediction_result
