I am using tf.keras.models.load_model() to load the model; I also tried the tf.saved_model module to load the SavedModel format, but in both cases loading takes too long. When I timed the code further, I found that a single instance of my code loads the model in ~4.10 seconds, but when I run multiple instances using the subprocess module with Popen, loading the model takes ~17.2 seconds.
First, why does loading get slower when I run multiple instances, and how can I overcome it?
Second, I need a much faster loading time, even faster than 4 seconds. So even if I get the same loading time with multiple instances, it still has to be under 4 seconds.
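For reference, a minimal sketch of the measurement setup described above (the SavedModel path and the worker script name are hypothetical):

import subprocess
import sys
import time

import tensorflow as tf

# Time a single in-process load of the model.
start = time.perf_counter()
model = tf.keras.models.load_model("saved_model_dir")  # hypothetical path
print(f"load_model: {time.perf_counter() - start:.2f} s")

# Launch several instances of the same script; each one loads the model
# independently, which is where the ~17 s figure is observed.
procs = [subprocess.Popen([sys.executable, "worker.py"]) for _ in range(4)]
for p in procs:
    p.wait()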

2 Answers
Perhaps the overhead of the Popen call itself is too large, so part of the time you attribute to loading the model is actually subprocess startup. A related thread on Stack Overflow:
Python subprocess module much slower than commands (deprecated)
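One way to check this hypothesis is to time a Popen round-trip that does no TensorFlow work at all; a minimal sketch:

import subprocess
import sys
import time

# Spawn a trivial child process that does nothing, to measure bare
# process-startup overhead separately from model loading.
start = time.perf_counter()
p = subprocess.Popen([sys.executable, "-c", "pass"])
p.wait()
print(f"bare Popen round-trip: {time.perf_counter() - start:.2f} s")

If this round-trip is cheap, the slowdown lies in the model loading itself rather than in Popen.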

Yeah, but the problem is still in load_model rather than in Popen. When I time the Popen lines and the model-loading lines separately, loading the model is the bigger cost. – foxel Aug 15 '22 at 14:18
I am facing a similar problem. My segmentation model takes nearly 3 to 4 seconds just to load. The suggestion I got was to convert the TensorFlow core model to TensorFlow Lite, an optimized FlatBuffer format identified by the .tflite file extension. TensorFlow Lite has multi-threaded kernels for many operators and optimized versions of most operators. The conversion follows the method from the official TensorFlow page and is basically a few lines of code, which I put hereunder:
import tensorflow as tf

# Convert the model
saved_model_dir = "model_head"  # path to the SavedModel directory
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()

# Save the converted model to disk.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
But you need to build a TensorFlow Lite interpreter that suits your specific platform requirements. The details are too much to cover here, but they are available in the official pages: convert to Lite Model and here-build interpreter; support is available for most platforms. As always, such alternatives come with compromises, and TensorFlow Lite is no exception: some operators may have to be refactored in case TensorFlow Lite doesn't support them.
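For completeness, a minimal sketch of loading and running the converted model with the Python tf.lite.Interpreter (the model path and dummy input are assumptions for illustration):

import numpy as np
import tensorflow as tf

# Load the converted model; num_threads enables the multi-threaded kernels.
interpreter = tf.lite.Interpreter(model_path='model.tflite', num_threads=4)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run inference on a dummy input matching the model's expected shape and dtype.
dummy = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])

Loading the small .tflite file and allocating tensors is typically much cheaper than deserializing a full SavedModel, which is the point of the suggestion above.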
