TensorFlow Serving model versioning


I'm evaluating the option of hosting a saved TensorFlow model using the tensorflow/serving Docker image.

https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/docker.md

If I mount a directory with the following structure:

test-model/
  1/
    saved model v1
  2/
    saved model v2

using the following command:

docker run -p 8501:8501 \
-v "$(pwd)/test-model:/models/test-model" \
-e MODEL_NAME=test-model \
tensorflow/serving

Will I be able to invoke different versions of the model using different endpoints?

I tried the following:

http://localhost:8501/v1/models/test-model/1 and

http://localhost:8501/v1/models/test-model/v1

but got:

{
    "error": "Malformed request: POST /v1/models/test-model/v1:predict"
}

Answer by senjin.hajrulahovic:

It turns out that both the HTTP and gRPC APIs support invoking a specific version of the model: the REST API takes the version in the URL path, and the gRPC API takes it via the model_spec field of the request.

HTTP API:

The REST endpoint follows the pattern /v1/models/${MODEL_NAME}/versions/${VERSION}:predict, so version 1 of the model above is reachable at:

POST http://localhost:8501/v1/models/test-model/versions/1:predict

{
  "inputs": ...
}
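
As a quick sketch in Python (the payload below is hypothetical; the expected input format depends on the model's serving signature):

import requests

# Hypothetical payload; keys and shape depend on the model's signature.
payload = {"instances": [[1.0, 2.0]]}

# Pin to version 1 via the URL path.
r = requests.post(
    "http://localhost:8501/v1/models/test-model/versions/1:predict",
    json=payload,
)
print(r.json())

# Omitting /versions/<n> serves the highest available version.
r = requests.post(
    "http://localhost:8501/v1/models/test-model:predict",
    json=payload,
)
print(r.json())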

gRPC API:

message ModelSpec {
  // Required servable name.
  string name = 1;

  // When left unspecified, the system will serve the best available version.
  // This is typically the latest version, though during version transitions,
  // notably when serving on a fleet of instances, may be either the previous or
  // new version.
  oneof version_choice {
    // Use this specific version number.
    google.protobuf.Int64Value version = 2;

    ...
  }
}


message PredictRequest {
  // Model Specification. If version is not specified, will use the latest
  // (numerical) version.
  ModelSpec model_spec = 1;

  ...
}
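
For completeness, a minimal Python client sketch for the gRPC API, assuming the tensorflow-serving-api package is installed and the gRPC port is also published (e.g. -p 8500:8500); the input tensor name "input" and the payload are hypothetical and depend on the model's serving signature:

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# gRPC endpoint exposed by the tensorflow/serving image (port 8500 by default).
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "test-model"
request.model_spec.version.value = 1  # pin to version 1; omit to serve the latest

# "input" is a hypothetical tensor name; use the one from your model's signature.
request.inputs["input"].CopyFrom(tf.make_tensor_proto([[1.0, 2.0]]))

response = stub.Predict(request, 10.0)  # 10-second deadline
print(response)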

The version number corresponds to the numeric name of the model subdirectory (1 and 2 in the layout above).
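
Note that by default TensorFlow Serving loads only the latest version found under the base path, so requests pinned to version 1 will fail once version 2 is loaded. To keep both versions available side by side, the server needs a model config file with an explicit model_version_policy, roughly like this:

model_config_list {
  config {
    name: "test-model"
    base_path: "/models/test-model"
    model_platform: "tensorflow"
    # Keep both versions loaded instead of only the latest.
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
  }
}

The file is passed to the server via --model_config_file, e.g. as a trailing argument to the docker run command above; the file name and mount location are up to you.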