TensorFlow Serving is a flexible, high-performance serving system for
machine learning models, designed for production environments. It deals with
the inference aspect of machine learning, taking models after training and
managing their lifetimes, providing clients with versioned access via
a high-performance, reference-counted lookup table.
TensorFlow Serving provides out-of-the-box integration with TensorFlow models,
but can be easily extended to serve other types of models and data.
To note a few features:

- Can serve multiple models, or multiple versions of the same model simultaneously
- Exposes both gRPC as well as HTTP inference endpoints
- Allows deployment of new model versions without changing any client code
- Supports canarying new versions and A/B testing experimental models
- Adds minimal latency to inference time due to efficient, low-overhead implementation
- Features a scheduler that groups individual inference requests into batches for joint execution on GPU, with configurable latency controls
- Supports many servables: TensorFlow models, embeddings, vocabularies, feature transformations, and even non-TensorFlow-based machine learning models

Serve a TensorFlow model in 60 seconds with Docker:
```bash
# Download the TensorFlow Serving Docker image and repo
docker pull tensorflow/serving

git clone https://github.com/tensorflow/serving
# Location of demo models
TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"

# Start TensorFlow Serving container and open the REST API port
docker run -t --rm -p 8501:8501 \
    -v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
    -e MODEL_NAME=half_plus_two \
    tensorflow/serving &

# Query the model using the predict API
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
    -X POST http://localhost:8501/v1/models/half_plus_two:predict

# Returns => { "predictions": [2.5, 3.0, 4.5] }
```
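The same predict call can be made from any HTTP client. As a minimal sketch, here is the equivalent request in Python using the third-party `requests` library, assuming the quick-start container above is still listening on port 8501:

```python
# Equivalent of the curl command above: query half_plus_two over the
# REST API. Assumes the quick-start container is running and that
# `requests` is installed (pip install requests).
import requests

response = requests.post(
    "http://localhost:8501/v1/models/half_plus_two:predict",
    json={"instances": [1.0, 2.0, 5.0]},
)
response.raise_for_status()
print(response.json())  # => {'predictions': [2.5, 3.0, 4.5]}
```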
Refer to the official TensorFlow documentation site for a complete tutorial on training and serving a TensorFlow model.
The easiest and most straightforward way of using TensorFlow Serving is with
Docker images. We highly recommend this route unless you have specific needs
that are not addressed by running in a container.
In order to serve a TensorFlow model, simply export a SavedModel from your
TensorFlow program. SavedModel is a language-neutral, recoverable, hermetic
serialization format that enables higher-level systems and tools to produce,
consume, and transform TensorFlow models. Please refer to the TensorFlow
documentation for detailed instructions on how to export SavedModels.
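As an illustrative sketch (the module and paths below are placeholders that mirror the half_plus_two demo above, not the official tutorial code), a TF2 export can be as simple as:

```python
# Minimal sketch: export a toy model as a SavedModel that TensorFlow
# Serving can load. Module name and paths are placeholders.
import tensorflow as tf

class HalfPlusTwo(tf.Module):
    """Toy model computing y = 0.5 * x + 2, like the demo servable."""

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def __call__(self, x):
        return 0.5 * x + 2.0

module = HalfPlusTwo()
# TensorFlow Serving expects numeric version subdirectories under a
# model's base path (e.g. /models/half_plus_two/1) and serves the
# highest version by default.
tf.saved_model.save(
    module,
    "/tmp/half_plus_two/1",
    signatures=module.__call__.get_concrete_function(),
)
```

The exported version directory can then be mounted into the serving image exactly like `saved_model_half_plus_two_cpu` in the Docker example above.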
TensorFlow Serving's architecture is highly modular. You can use some parts
individually (e.g. batch scheduling) and/or extend it to serve new use cases.
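To make the batching idea concrete, here is a toy Python sketch of the pattern, not TensorFlow Serving's actual API (the real scheduler is part of its C++ batching library): buffer individual requests and run the model once per batch, flushing when the batch is full or a latency timeout expires.

```python
# Conceptual sketch only -- NOT TensorFlow Serving's API. It illustrates
# the pattern its batch scheduler implements: group individual inference
# requests and execute them jointly, under a size cap and a timeout.
import queue
import threading

class ToyBatchScheduler:
    def __init__(self, process_batch, max_batch_size=32, timeout_s=0.005):
        self._requests = queue.Queue()
        self._process_batch = process_batch  # e.g. one model forward pass
        self._max_batch_size = max_batch_size
        self._timeout_s = timeout_s
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, request):
        self._requests.put(request)

    def _loop(self):
        while True:
            batch = [self._requests.get()]  # block until work arrives
            try:
                # Keep filling until the batch is full...
                while len(batch) < self._max_batch_size:
                    batch.append(self._requests.get(timeout=self._timeout_s))
            except queue.Empty:
                pass  # ...or the timeout expires: run a partial batch.
            self._process_batch(batch)
```

In the real server, batching is enabled with the `--enable_batching` flag, with the size cap, timeout, and thread count configurable.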
If you’d like to contribute to TensorFlow Serving, be sure to review the
contribution guidelines.
Please refer to the official TensorFlow website for
more information.