Export a KMeans model using export_savedmodel to deploy on ml-engine


I'm doing a K-means clustering using tensorflow.contrib.learn.KMeansClustering.

I can use the default model to predict locally, but since I want to use ml-engine online prediction, I must export it with export_savedmodel.

I have googled a lot, but since the KMeansClustering class needs no feature columns, I don't know how to build the correct serving_input_fn for export_savedmodel.

Here is my code

# Imports (TF 1.x / contrib APIs)
import pandas as pd
import tensorflow as tf
from tensorflow.contrib.learn import KMeansClustering
from tensorflow.python.lib.io import file_io

# Generate input_fn
def gen_input(data):
    return tf.constant(data.as_matrix(), tf.float32, data.shape), None

# Declare dataset + export model path
TRAIN = 'train.csv'
MODEL = 'model'

# Read dataset
body = pd.read_csv(
    file_io.FileIO(TRAIN, mode='r'),
    delimiter=',',
    header=None,
    engine='python'
)

# Declare K-Means
km = KMeansClustering(
    num_clusters=2,
    model_dir=MODEL,
    relative_tolerance=0.1
)

est = km.fit(input_fn=lambda: gen_input(body))

# This place is where I stuck
fcols = [tf.contrib.layers.real_valued_column('x', dimension=5)]
fspec = tf.contrib.layers.create_feature_spec_for_parsing(fcols)
serving_input_fn = tf.contrib.learn.python.learn.\
                   utils.input_fn_utils.build_parsing_serving_input_fn(fspec)
est.export_savedmodel(MODEL, serving_input_fn)

Here is my toy train.csv

1,2,3,4,5
2,3,4,5,6
3,4,5,6,7
5,4,3,2,1
7,6,5,4,3
8,7,6,5,4

The exported model has the expected layout: a saved_model.pb file alongside its variables folder.

Deploying the model to ml-engine was successful, but when predicting with the same train.csv I got the following error:

{"error": "Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"Name: <unknown>, Feature: x (data type: float) is required but could not be found.\n\t [[Node: ParseExample/ParseExample = ParseExample[Ndense=1, Nsparse=0, Tdense=[DT_FLOAT], _output_shapes=-1,5, dense_shapes=5, sparse_types=[], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](_arg_input_example_tensor_0_0, ParseExample/ParseExample/names, ParseExample/ParseExample/dense_keys_0, ParseExample/Const)]]\")"}

I have struggled with this for a month, and all the documents I have found cover only the standard Estimator APIs.

I'm looking forward to your advice

Thanks in advance


Answered by rhaertel80 (accepted):

The Census sample shows how to set up the serving_input_fn for CSV. Adjusted for your example:

CSV_COLUMNS = ['feat1', 'feat2', 'feat3', 'feat4', 'feat5']
CSV_COLUMN_DEFAULTS = [[0.0],[0.0],[0.0],[0.0],[0.0]] 

def parse_csv(rows_string_tensor):
  """Takes the string input tensor and returns a dict of rank-2 tensors."""

  # Takes a rank-1 tensor and converts it into rank-2 tensor
  # Example if the data is ['csv,line,1', 'csv,line,2', ..] to
  # [['csv,line,1'], ['csv,line,2']] which after parsing will result in a
  # tuple of tensors: [['csv'], ['csv']], [['line'], ['line']], [[1], [2]]
  row_columns = tf.expand_dims(rows_string_tensor, -1)
  columns = tf.decode_csv(row_columns, record_defaults=CSV_COLUMN_DEFAULTS)
  features = dict(zip(CSV_COLUMNS, columns))

  return features

def csv_serving_input_fn():
  """Build the serving inputs."""
  csv_row = tf.placeholder(
      shape=[None],
      dtype=tf.string
  )
  features = parse_csv(csv_row)
  return tf.contrib.learn.InputFnOps(features, None, {'csv_row': csv_row})

# No need for fcols/fspec; pass the serving fn defined above
est.export_savedmodel(MODEL, csv_serving_input_fn)
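Once deployed, each online-prediction instance maps the serving alias ('csv_row' in the InputFnOps above) to one raw CSV line. A minimal sketch of building the JSON request body, assuming the ml-engine REST convention of an "instances" list:

```python
import json

# Each instance feeds the 'csv_row' serving alias one raw CSV line,
# matching the placeholder exposed by csv_serving_input_fn above.
rows = ["1,2,3,4,5", "2,3,4,5,6"]
request_body = {"instances": [{"csv_row": row} for row in rows]}
print(json.dumps(request_body))
```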

TensorFlow 1.4 will simplify at least some of this.

Also, consider using JSON, as that is the more standard approach for serving. Happy to provide details upon request.
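For reference, a per-feature JSON instance might look like the sketch below. The field names mirror CSV_COLUMNS from the answer and are an assumption: they would only match a model exported with a serving input fn exposing one tensor per column, which is not what the CSV-based code above does.

```python
import json

# Hypothetical per-feature instance; these names mirror CSV_COLUMNS and
# assume a serving input fn with one float placeholder per column.
instance = {"feat1": 1.0, "feat2": 2.0, "feat3": 3.0, "feat4": 4.0, "feat5": 5.0}
request_body = {"instances": [instance]}
print(json.dumps(request_body))
```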