In this lab, you will use the What-if Tool to analyze and compare two different models deployed on Cloud AI Platform.

What you learn

You'll learn how to:

- Train and deploy a TensorFlow (Keras) regression model on Cloud AI Platform
- Train and deploy a Scikit Learn regression model on Cloud AI Platform
- Use the What-if Tool to compare the two deployed models

The total cost to run this lab on Google Cloud is about $1.

You'll need a Google Cloud Platform project with billing enabled to run this codelab. To create a project, follow the instructions here.

Step 1: Enable the Cloud AI Platform Models API

Navigate to the AI Platform Models section of your Cloud Console and click Enable if it isn't already enabled.

Step 2: Enable the Compute Engine API

Navigate to Compute Engine and select Enable if it isn't already enabled. You'll need this to create your notebook instance.

Step 3: Create an AI Platform Notebooks instance

Navigate to the AI Platform Notebooks section of your Cloud Console and click New Instance. Then select the latest TF 1.x instance type without GPUs:

Use the default options and then click Create. Once the instance has been created, select Open JupyterLab:

Step 4: Import Python packages

In the first cell of your notebook, add the following imports and run the cell. You can run it by clicking the right arrow button in the top menu or by pressing command-enter:

import pandas as pd
import numpy as np
import tensorflow as tf
import witwidget
import os
import pickle

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

from sklearn.utils import shuffle
from sklearn.linear_model import LinearRegression
from witwidget.notebook.visualization import WitWidget, WitConfigBuilder

# This should be version 1.14
print(tf.__version__)

Make sure your notebook is using TensorFlow 1.14; you should see the version logged in the cell output.

The models you'll build will predict the quality score of a wine given 11 numerical data points about that wine. You'll train your models on the UCI wine quality dataset.

Step 1: Download UCI wine quality data

You can download the data directly from the UCI machine learning website:

!wget 'http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv'

Step 2: Create a Pandas DataFrame

Next we'll read the data into a Pandas DataFrame to see what we'll be working with. It's important to shuffle our data in case the original dataset is ordered in a specific way. We use an sklearn utility called shuffle to do this, which we imported in the first cell:

data = pd.read_csv('winequality-white.csv', index_col=False, delimiter=';')
data = shuffle(data, random_state=4)

data.head()

data.head() lets us preview the first five rows of our dataset in Pandas. You should see something like this after running the cell above:

The quality column is what our model will predict. This is the score associated with a particular wine. To see the distribution of wine scores in the dataset, run the following:

labels = data['quality']
print(labels.value_counts())

Since the score of a wine in this dataset can be anything from 3 to 9, we'll build a regression model. That means our model will predict a numerical value for a given wine, rather than assign it a particular class.

Step 3: Splitting data into train and test sets

An important concept in machine learning is train / test split. We'll take the majority of our data and use it to train our model, and we'll set aside the rest for testing our model on data it's never seen before.

Add the following code to your notebook, which drops the label column from our dataset, and splits our data and labels into training and test sets:

data = data.drop(columns=['quality'])

train_size = int(len(data) * 0.8)
train_data = data[:train_size]
train_labels = labels[:train_size]

test_data = data[train_size:]
test_labels = labels[train_size:]
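
The slicing above gives an 80/20 split. As a quick sketch of the arithmetic (assuming the white wine dataset's 4,898 rows; confirm with len(data) in your own notebook):

```python
# Sketch of the 80/20 split arithmetic. The row count of 4,898 is an
# assumption about the white wine dataset; check len(data) in your notebook.
n_rows = 4898
train_size = int(n_rows * 0.8)
test_rows = n_rows - train_size
print(train_size, test_rows)
```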

Now you're ready to build and train your first model!

We'll use TensorFlow's Keras API to build our first model. To do this we'll create a 4-layer deep neural network. We imported Keras from TensorFlow at the beginning of our notebook.

Step 1: Build TensorFlow model

The input to our model, or features, will be the remaining columns in our Pandas dataframe (everything except the quality column). Run the code below to get the size of the arrays we'll be feeding into our model for each example:

input_size = len(train_data.iloc[0])
print(input_size)

Next we'll define our model using the Keras Sequential model API. The Sequential API lets us define our model as a stack of layers. Our model will take our 11-element feature array as input, pass it through a series of hidden layers, and output a 1-element array for each example. This output array will contain the predicted quality score for each wine input. Run the following code to build your model:

model = Sequential()
model.add(Dense(200, input_shape=(input_size,), activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(1))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

The activation function is how Keras calculates the output of each layer. Since we want our model to output a numerical value for the wine score, we don't need an activation function on the last layer. Before we can use our model we must compile it, passing in the loss and optimizer functions it should use. Mean squared error is often used with regression models; you can read more about it here.
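
To make the loss concrete, here's a minimal sketch of computing mean squared error by hand on a few hypothetical wine scores (the numbers are made up for illustration):

```python
import numpy as np

# Hypothetical predicted and actual wine scores for three examples
predicted = np.array([5.8, 6.2, 7.1])
actual = np.array([6.0, 6.0, 7.0])

# Mean squared error: the average of the squared differences
mse = np.mean((predicted - actual) ** 2)
print(mse)
```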

Keras has a handy summary() function that you can use to understand the number of trainable parameters (weights and biases) in each layer of your model. Run the following in your next cell to get a summary:

model.summary()
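
For a Dense layer, the trainable parameter count is inputs * units weights plus units biases. As a sketch, you can reproduce the counts summary() reports for the four layers above:

```python
# (inputs, units) for each Dense layer in the model above
layer_sizes = [(11, 200), (200, 50), (50, 25), (25, 1)]

# Each Dense layer has inputs * units weights plus units biases
params = [n_in * n_out + n_out for n_in, n_out in layer_sizes]
print(params, sum(params))
```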

Step 2: Train and evaluate the model

We can train our model with one function call using the fit() method, passing it a few parameters:

- our training data and labels
- epochs: the number of full passes over the training set
- batch_size: the number of examples processed per gradient update
- validation_split: the fraction of training data held out to validate after each epoch

Run the following to train your model:

model.fit(
  train_data.values,
  train_labels.values, 
  epochs=4, 
  batch_size=32, 
  validation_split=0.1
)

Now you can evaluate your model on your test set:

model.evaluate(
  test_data.values,
  test_labels.values, 
  batch_size=32
)
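
Because the loss is mean squared error, taking its square root gives an error in the same units as the wine score, which is easier to interpret. A sketch, using a hypothetical MSE value (your evaluate() output will differ):

```python
import math

# Hypothetical mean squared error from evaluate(); substitute your own
mse = 0.6
rmse = math.sqrt(mse)   # error expressed in wine-score units
print(rmse)
```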

And to make sure things are working correctly, you can run a prediction on the local model. We'll pass the first example from our test set to the model for a prediction:

test_prediction = model.predict(test_data.values[0:1])
print('Predicted wine score:', test_prediction[0][0])
print('Actual wine score:', test_labels.values[0:1][0])

In the output you should see your model's prediction, along with the actual score for the first example, 6.

We've got our model working locally, but it would be nice if we could make predictions on it from anywhere (not just this notebook!). In this step we'll deploy it to the cloud.

Step 1: Create a Cloud Storage bucket for our model

Let's first define some environment variables that we'll be using throughout the rest of the codelab. Fill in the values below with the name of your Google Cloud project, the name of the cloud storage bucket you'd like to create (must be globally unique), and the version name for the first version of your model:

# Update these to your own GCP project, model, and version names
GCP_PROJECT = 'your-gcp-project'
KERAS_MODEL_BUCKET = 'gs://storage_bucket_name'
KERAS_VERSION_NAME = 'v1'

Now we're ready to create a storage bucket to store our Keras model file. We'll point Cloud AI Platform at this file when we deploy. Run this gsutil command from within your notebook to create a bucket with the name you set in KERAS_MODEL_BUCKET above:

!gsutil mb $KERAS_MODEL_BUCKET

Step 2: Prepare our Keras model for serving

In order to serve our tf.keras model on Cloud AI Platform, we'll need to add a serving layer to it and export it to the TensorFlow SavedModel format. Run the following code in your notebook to do that:

# Add the serving input layer below in order to serve our model on AI Platform
class ServingInput(tf.keras.layers.Layer):
  # the important detail in this boilerplate code is "trainable=False"
  def __init__(self, name, dtype, batch_input_shape=None):
    super(ServingInput, self).__init__(trainable=False, name=name, dtype=dtype, batch_input_shape=batch_input_shape)
  def get_config(self):
    return {'batch_input_shape': self._batch_input_shape, 'dtype': self.dtype, 'name': self.name }

restored_model = model

serving_model = tf.keras.Sequential()
serving_model.add(ServingInput('serving', tf.float32, (None, input_size)))
serving_model.add(restored_model)
tf.contrib.saved_model.save_keras_model(serving_model, os.path.join(KERAS_MODEL_BUCKET, 'keras_export'))  # export the model to your GCS bucket
export_path = KERAS_MODEL_BUCKET + '/keras_export'

This should have exported your Keras model and uploaded it to the Cloud Storage bucket you just created. Head over to the Storage browser in your Cloud Console to confirm these files have been copied to your bucket. You should see something like this:

Step 3: Create and deploy the model

We're almost ready to deploy the model! First, run the command below to configure the gcloud CLI to use your current project:

!gcloud config set project $GCP_PROJECT

The following ai-platform gcloud command will create a new model in your project. We'll call this one keras_wine:

!gcloud ai-platform models create keras_wine

Now it's time to deploy the model. We can do that with this gcloud command:

!gcloud beta ai-platform versions create $KERAS_VERSION_NAME --model keras_wine \
--origin=$export_path \
--python-version=3.5 \
--runtime-version=1.14 \
--framework='TENSORFLOW'

While this is running, check the models section of your AI Platform console. You should see your new version deploying there:

The deploy should take 2-3 minutes. When it completes successfully, you'll see a green check mark where the loading spinner was.

Step 4: Test the deployed model

To make sure your deployed model is working, test it out using gcloud to make a prediction. First, save a JSON file with one test instance for prediction:

%%writefile predictions.json
[7.8, 0.21, 0.49, 1.2, 0.036, 20.0, 99.0, 0.99, 3.05, 0.28, 12.1]
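
The --json-instances flag expects one JSON instance per line. If you'd rather build the file from Python than with the %%writefile magic, a minimal sketch using the same test instance:

```python
import json

# The same 11 feature values as the instance above
instance = [7.8, 0.21, 0.49, 1.2, 0.036, 20.0, 99.0, 0.99, 3.05, 0.28, 12.1]
line = json.dumps(instance)

# Writing it out produces a file equivalent to the %%writefile cell above
with open('predictions.json', 'w') as f:
    f.write(line + '\n')
```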

Test your model by running this code:

prediction = !gcloud ai-platform predict --model=keras_wine --json-instances=predictions.json --version=$KERAS_VERSION_NAME
print(prediction[1])

You should see your model's prediction in the output.

Our goal here is to use the What-if Tool to compare two different models trained on the same dataset. In order to do that, we've got one more model to build. We'll build a Scikit Learn model in this section. Since we've already processed and split our data, this step will be pretty fast.

Step 1: Create a Storage Bucket for your model

First, define some new environment variables for your Scikit Learn model. Be sure to replace your_sklearn_bucket with a unique bucket name:

SKLEARN_VERSION_NAME = 'v1'
SKLEARN_MODEL_BUCKET = 'gs://your_sklearn_bucket'

Then create that bucket using gsutil:

!gsutil mb $SKLEARN_MODEL_BUCKET

Step 2: Train and export the model

You can train a Scikit Learn regression model in one line of code, using the built-in LinearRegression model which we imported at the beginning of the notebook:

scikit_model = LinearRegression().fit(
  train_data.values, 
  train_labels.values
)

When the model is trained, export it to a local file using pickle:

pickle.dump(scikit_model, open('model.pkl', 'wb'))
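
Before uploading, it's worth sanity-checking that a pickled object round-trips cleanly. A minimal sketch with a stand-in object (in the notebook you'd instead load model.pkl and call .predict on a test row):

```python
import pickle

# Stand-in for the trained model: a hypothetical dict of fitted values
obj = {'coef': [0.1, -0.2], 'intercept': 5.9}

# Serialize and restore; the restored copy should compare equal
blob = pickle.dumps(obj)
restored = pickle.loads(blob)
print(restored == obj)
```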

Step 1: Copy your model to Cloud Storage

Copy the model you just exported to your Storage Bucket:

!gsutil cp ./model.pkl $SKLEARN_MODEL_BUCKET/model.pkl

Step 2: Create and deploy the model

Create your model using gcloud. We'll call this one sklearn_wine:

!gcloud ai-platform models create sklearn_wine

Then deploy the model:

!gcloud beta ai-platform versions create $SKLEARN_VERSION_NAME --model=sklearn_wine \
--origin=$SKLEARN_MODEL_BUCKET \
--runtime-version=1.14 \
--python-version=3.5 \
--framework='SCIKIT_LEARN'

Step 3: Test the deployed model

Finally, make sure your new Scikit Learn model is working correctly by passing it the same test JSON file as above:

!gcloud ai-platform predict --model=sklearn_wine --json-instances=predictions.json --version=$SKLEARN_VERSION_NAME

With both models deployed, we're ready to compare them using the What-if Tool!

Step 1: Create the What-if Tool visualization

To connect the What-if Tool to your AI Platform models, you need to pass it a subset of your test examples along with the ground truth values for those examples. Let's create a Numpy array of 200 of our test examples along with their wine score values:

test_examples = np.hstack((test_data[:200].values,test_labels[:200].values.reshape(-1,1)))
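
As a sanity check on the shapes involved: 200 rows of 11 features stacked with a (200, 1) label column should give a (200, 12) array. A sketch with zero-filled stand-ins:

```python
import numpy as np

# Stand-ins with the same shapes as test_data[:200] and test_labels[:200]
features = np.zeros((200, 11))
labels_col = np.zeros(200).reshape(-1, 1)

combined = np.hstack((features, labels_col))
print(combined.shape)
```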

Instantiating the What-if Tool is as simple as creating a WitConfigBuilder object and passing it the two AI Platform models we'd like to compare. Here we chain the set_predict_output_tensor('sequential') and set_uses_predict_api(True) calls when creating the visualization because our tf.keras model returns results inside a dict with the key sequential:

config_builder = (WitConfigBuilder(test_examples.tolist(), data.columns.tolist() + ['quality'])
  .set_ai_platform_model(GCP_PROJECT, 'keras_wine', KERAS_VERSION_NAME).set_predict_output_tensor('sequential').set_uses_predict_api(True)
  .set_target_feature('quality')
  .set_model_type('regression')
  .set_compare_ai_platform_model(GCP_PROJECT, 'sklearn_wine', SKLEARN_VERSION_NAME))
WitWidget(config_builder, height=800)

Note that it'll take a minute to load the visualization. When it loads, you should see the following:

Step 2: Explore individual data points

The default view in the What-if Tool is the Datapoint editor tab. Here you can click on any individual data point to see its features, compare each model's prediction, and even change feature values:

Let's change the alcohol percentage to 8 on this example and see how that affects the prediction. You can edit the value right in the tool and then select Run inference again:

Looks like decreasing the alcohol percentage decreased the wine's score for each model, though it had more of an effect on the Keras model.

Step 3: Look at partial dependence plots

To see how each feature affects each model's prediction, check the Partial dependence plots box:

Here we can see that alcohol percentage affects both models, but it affects the Keras model slightly more. A wine's density, on the other hand, has no effect on the Keras model's predictions, but has an inverse relationship with our Scikit Learn model's predictions.

Step 4: Compare model performance

Navigate to the Performance tab in the What-if Tool. Choose quality as the ground truth feature, and slice by a different feature. You can compare the error for each model, and for individual feature values:

Step 5: Explore feature distribution

Finally, navigate to the Features tab in the What-if Tool. This shows you the distribution of values for each feature in your dataset:

You can use this tab to make sure your dataset is balanced. For example, if we only had wines with 8% alcohol or pH values around 3, the model's predictions wouldn't necessarily reflect real world data. This tab gives us a good opportunity to see where our dataset might fall short, so that we can go back and collect more data to make it balanced.

We've described just a few What-if Tool exploration ideas here. Feel free to keep playing around with the tool; there are plenty more areas to explore!

If you'd like to continue using this notebook, it is recommended that you turn it off when not in use. From the Notebooks UI in your Cloud Console, select the notebook and then select Stop:

If you'd like to delete all resources you've created in this lab, simply delete the notebook instance instead of stopping it.

Using the Navigation menu in your Cloud Console, browse to Storage and delete both buckets you created to store your model assets.