Deploy a TensorFlow Model to Production
Build and train the model
We are going to build a simple TensorFlow Sequential
model that recognizes handwritten digits from 0 to 9, trained on the MNIST dataset.
First, we load the data:
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Then we define the model:
model = tf.keras.models.Sequential([
    tf.keras.layers.Rescaling(1./255, input_shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
The output is a Dense layer with 10 units and a softmax activation, so the model returns a vector of probabilities, one for each class.
predictions = model(x_train[:1]).numpy()
print(predictions)
"""
array([[0.10152689, 0.17136204, 0.12844075, 0.06615458, 0.11072428,
0.11157966, 0.04291946, 0.10779996, 0.08779941, 0.07169289]],
dtype=float32)
"""
We can compile the model and train it:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
After training, we evaluate our model on the test set to see how well it performs on data it has never seen before:
model.evaluate(x_test, y_test, verbose=2)
# [0.0701991468667984, 0.9783000349998474]
The model achieves 97.8% accuracy on the test set. Not bad!
Save the model
We export the model in TensorFlow's SavedModel format:
import os
model_version = "0001"
model_name = "my_mnist_model"
model_path = os.path.join(model_name, model_version)
tf.saved_model.save(model, model_path)
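As a sanity check, you can reload the exported model and confirm it still predicts correctly. A minimal sketch (in TF 2.x, Keras can load a SavedModel directory directly):
# Reload the SavedModel and predict on a few test images
loaded_model = tf.keras.models.load_model(model_path)
print(loaded_model.predict(x_test[:3]).argmax(axis=1))  # should match y_test[:3]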
Deploy on Google Cloud Platform
Now we are going to deploy our model on Google Cloud Platform, using Vertex AI, to make it available for online predictions.
You need the gcloud CLI (which ships with gsutil) installed on your computer to run the following commands.
Define the variables
Open a terminal and declare some variables for convenience:
export PROJECT_ID="project-123"
export BUCKET_NAME="my-bucket"
export REGION="europe-west4"
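These are placeholder values: replace them with your own project ID, a globally unique bucket name, and your preferred region. If you haven't authenticated yet, log in and point gcloud at your project first:
gcloud auth login
gcloud config set project ${PROJECT_ID}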
Create a Storage bucket
gsutil mb -p ${PROJECT_ID} -l ${REGION} -b on "gs://${BUCKET_NAME}"
Upload the model folder to your bucket
gsutil cp -r ./my_mnist_model "gs://${BUCKET_NAME}/"
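You can verify the upload before going further:
gsutil ls "gs://${BUCKET_NAME}/my_mnist_model/0001/"
# should list saved_model.pb plus the assets/ and variables/ folders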
Create the Vertex AI model
gcloud ai models upload \
--region=${REGION} \
--display-name="mnist" \
--container-image-uri="europe-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest" \
--artifact-uri="gs://${BUCKET_NAME}/my_mnist_model/0001"
export MODEL_ID=$(gcloud ai models list --region=${REGION} --filter=display_name="mnist" --format="value(MODEL_ID)")
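You can confirm the model was registered and the variable is set:
echo ${MODEL_ID}
gcloud ai models describe ${MODEL_ID} --region=${REGION}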
Create the endpoint
gcloud ai endpoints create \
--region=${REGION} \
--display-name="mnist-endpoint"
export ENDPOINT_ID=$(gcloud ai endpoints list --region=${REGION} --filter=display_name="mnist-endpoint" --format="value(ENDPOINT_ID)")
Deploy the model
gcloud ai endpoints deploy-model ${ENDPOINT_ID} \
--region=${REGION} \
--model=${MODEL_ID} \
--display-name="mnist" \
--machine-type="n1-standard-2" \
--traffic-split="0=100"
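Deployment can take several minutes. You can check its progress by describing the endpoint and waiting for the model to appear under deployedModels:
gcloud ai endpoints describe ${ENDPOINT_ID} --region=${REGION}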
Use the model endpoint
Once our model has been deployed on GCP, we can call the endpoint to get online predictions.
First, let's take a few images from the test set and save them in a JSON file:
import json

# Build the request payload from the first 5 test images
pred_file = {
    'instances': x_test[0:5].tolist(),
}

# Write the payload to a file
with open('pred_file.json', 'w') as f:
    json.dump(pred_file, f)
We can now send a simple curl request to get predictions:
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/endpoints/${ENDPOINT_ID}:predict \
-d "@pred_file.json"
Clean up GCP resources (optional)
To avoid incurring charges, delete the resources once you are done.
Undeploy the model
First retrieve the deployed model ID (this assumes a single model is deployed on the endpoint), then undeploy it:
export DEPLOYED_MODEL_ID=$(gcloud ai endpoints describe ${ENDPOINT_ID} --project=${PROJECT_ID} --region=${REGION} --format="value(deployedModels.id)")
gcloud ai endpoints undeploy-model ${ENDPOINT_ID} --deployed-model-id=${DEPLOYED_MODEL_ID} --project=${PROJECT_ID} --region=${REGION}
Delete the endpoint
gcloud ai endpoints delete ${ENDPOINT_ID} --project=${PROJECT_ID} --region=${REGION}
Delete the model
gcloud ai models delete ${MODEL_ID} --project=${PROJECT_ID} --region=${REGION}
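Delete the bucket
If you no longer need the exported model artifacts either, you can remove the Storage bucket; careful, this deletes the bucket and everything in it:
gsutil rm -r "gs://${BUCKET_NAME}"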