1. Overview
This lab walks you through deploying a LIT demo quickly. The objective is to familiarize you with LIT as a tool for exploring model behavior. You will run a sentiment analysis and use LIT's Counterfactual feature to find the importance of specific words. The demo includes a dataset for the analysis. The lab provides steps to deploy LIT on Google Cloud Platform, analyze the data, and delete the deployed services.
What is the Learning Interpretability Tool (LIT)?
🔥LIT is a visual, interactive ML model-understanding tool that supports text, image, and tabular data. It can be run as a standalone server, or inside of notebook environments such as Colab, Jupyter, and Google Cloud Vertex AI notebooks.
LIT is built to answer questions such as:
- What kind of examples does my model perform poorly on?
- Why did my model make this prediction? Can this prediction be attributed to adversarial behavior, or to undesirable priors in the training set?
- Does my model behave consistently if I change things like textual style, verb tense, or pronoun gender?
The tool is available in the LIT GitHub repo. A paper describing it is also available on arXiv.
What you will do
You will build a Docker image from the LIT Git repo, push it to a registry, and deploy it to Cloud Run. You will then explore the LIT tool. This lab uses Cloud Run to deploy the container, but you can also deploy the image to Google Kubernetes Engine.
You will use Google Cloud Shell and Google Cloud Workbench as launch pads to run the commands. You will use the latter specifically to build, push, and deploy the container image. Vertex AI Workbench instances are Jupyter notebook-based development environments for the entire data science workflow. You can interact with Vertex AI and other Google Cloud services from within a Vertex AI Workbench instance's Jupyter notebook.
You will use Google Artifact Registry, which lets you centrally store artifacts and build dependencies as part of an integrated Google Cloud experience. You will upload the Docker image to Artifact Registry. You can learn more about it in the Google Cloud Artifact Registry documentation.
You will use Google Cloud Run, a managed Knative service, to deploy the Docker image. It is a managed compute platform that lets you run containers directly on top of Google's scalable infrastructure.
You could also deploy the image to Google Kubernetes Engine, a managed Kubernetes container orchestration service. You can use it to deploy and operate containerized applications at scale using Google's infrastructure.
Dataset
The demo uses the Stanford Sentiment Treebank dataset.
"The Stanford Sentiment Treebank is the first corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language. The corpus is based on the dataset introduced by Pang and Lee (2005) and consists of 11,855 single sentences extracted from movie reviews. It was parsed with the Stanford parser (Klein and Manning, 2003) and includes a total of 215,154 unique phrases from those parse trees, each annotated by 3 human judges." Reference paper
Before you begin
For this reference guide, you need a Google Cloud project. You can create a new one, or select a project you already created.
2. Launch Google Cloud Console and a Cloud Shell
You will launch a Google Cloud Console and use the Google Cloud Shell in this step.
2-a: Launch a Google Cloud Console
Launch a browser and go to Google Cloud Console.
The Google Cloud Console is a powerful, secure web admin interface that lets you manage your Google Cloud resources quickly. It's a DevOps tool on the go.
2-b: Launch a Google Cloud Shell
Launch a Google Cloud Shell. See the picture below for reference.
You should see something like this:
You will be using the command prompt in the next steps.
Cloud Shell is an online development and operations environment accessible anywhere with your browser. You can manage your resources with its online terminal, which comes preloaded with utilities such as the gcloud command-line tool, kubectl, and more. You can also develop, build, debug, and deploy your cloud-based apps using the online Cloud Shell Editor. Cloud Shell provides a developer-ready online environment with a preinstalled tool set and 5GB of persistent storage space.
2-c: Set Google Cloud Project
Set the Google Cloud project and the location where you will create the Google Cloud services. You will use this information to create a Google Cloud Workbench instance and an Artifact Registry repository. You will use the former to build and push a container image, and the latter to store it.
PROJECT_ID is the only variable you must set. You can modify the other variables, but the default values are sufficient to run the lab. The gcloud command uses the project you set here.
# Set your GCP Project ID.
export PROJECT_ID=[Your project ID]
# Set Google Cloud Location.
export GCP_LOCATION=us-central1
# Set image container artifact repo name.
export ARTIFACT_REPO=lit-demo
# Set lit demo name.
export DEMO_NAME=demo1
# Set Google cloud to use your project.
gcloud config set project $PROJECT_ID
# You will see the below confirmation message.
Updated property [core/project].
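Because later steps silently expand empty variables into broken commands, it can help to verify the exports before continuing. The following is a minimal sketch, not part of the original lab: `require_vars` is a hypothetical helper name, and it assumes each argument is a valid shell variable name.

```shell
# Hypothetical helper (not part of the lab): report every listed
# variable that is unset or empty, and return non-zero if any is missing.
require_vars() {
  _missing=0
  for _name in "$@"; do
    # Look up the value of the variable whose name is stored in $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "Missing variable: $_name" >&2
      _missing=1
    fi
  done
  return "$_missing"
}

# Usage after the exports above:
# require_vars PROJECT_ID GCP_LOCATION ARTIFACT_REPO DEMO_NAME \
#   && echo "All variables are set."
```

The helper only checks that the variables are non-empty; it does not confirm that the project ID or location actually exists.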
3. Create a Google Cloud Artifact Registry
You will store the container image of the LIT demo in the registry.
# Use below cmd to list all Google Cloud Artifact locations:
# gcloud artifacts locations list
# Create a repo to upload the docker container images.
gcloud artifacts repositories create $ARTIFACT_REPO \
--repository-format=docker \
--location=$GCP_LOCATION \
--description="LIT Demos"
# Validate the repo creation.
gcloud artifacts repositories describe $ARTIFACT_REPO \
--location=$GCP_LOCATION
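If you rerun the lab, the create command fails because the repository already exists. A possible way to make this step repeatable is sketched below. `ensure_repo` is a hypothetical helper name, and the sketch assumes that a failing describe call means the repository is absent, which may not hold if the failure is caused by missing permissions or a wrong project.

```shell
# Hypothetical helper (not part of the lab): create the Docker repository
# only when "describe" cannot find it.
# Args: repository name, location.
ensure_repo() {
  _repo="$1"
  _location="$2"
  if gcloud artifacts repositories describe "$_repo" \
      --location="$_location" >/dev/null 2>&1; then
    echo "Repository $_repo already exists in $_location."
  else
    gcloud artifacts repositories create "$_repo" \
      --repository-format=docker \
      --location="$_location" \
      --description="LIT Demos"
  fi
}

# Usage:
# ensure_repo "$ARTIFACT_REPO" "$GCP_LOCATION"
```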
4. Create a Google Cloud Workbench
You will create a Workbench instance. You will use the Workbench Terminal to build and push a container image of the LIT demo.
4-a: Set a few variables
Set shell variables that are used to create a Workbench instance.
export MACHINE_TYPE=e2-standard-16
export INSTANCE_NAME=lit-demo
# You need to choose a zone for the Workbench instance.
# The command below lists the available zones for this region:
# gcloud compute zones list \
# --format="value(selfLink.scope())" | grep $GCP_LOCATION
# This lab uses the first zone from that output.
export LOCATION_ZONE=$(gcloud compute zones list --format="value(selfLink.scope())" | grep "$GCP_LOCATION" | head -n 1)
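The zone selection above is a plain text filter over the list of zone names. As an illustration only, the sketch below isolates that filter in a hypothetical `pick_zone` function. It anchors the match to the start of the name and fails when no zone matches, so an empty LOCATION_ZONE is noticed early. It assumes zone names have the form region, hyphen, suffix (for example us-central1-a).

```shell
# Hypothetical helper (not part of the lab): read zone names on stdin and
# print the first one that belongs to the given region.
# Returns non-zero when no zone matches.
pick_zone() {
  _zone=$(grep "^$1-" | head -n 1)
  [ -n "$_zone" ] && echo "$_zone"
}

# Usage:
# export LOCATION_ZONE=$(gcloud compute zones list \
#   --format="value(selfLink.scope())" | pick_zone "$GCP_LOCATION")
```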
4-b: Create a Google Cloud Workbench
Create a Workbench instance. You will use this instance to build, push, and deploy the container image with LIT demo.
# Create a Google Cloud Workbench instance.
gcloud workbench instances create $INSTANCE_NAME \
--project=$PROJECT_ID \
--location=$LOCATION_ZONE \
--machine-type=$MACHINE_TYPE
# Gather the Workbench URI.
export WB_URL=$(gcloud workbench instances describe $INSTANCE_NAME \
--location=$LOCATION_ZONE \
--format="value(proxyUri)")
export WB_URL="https://${WB_URL}/lab"
echo $WB_URL
Click the URL printed by the echo command to launch the Workbench in a web browser.
You have used the --format flag to extract a specific value from the gcloud response. You can do a lot more. You can read about formats, filters, and more in the following blog post: Filter, format, and transform data with gcloud, Google Cloud's command line interface.
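The describe call may return an empty proxyUri if the instance has not finished starting, in which case the URL built from it is unusable. The sketch below wraps the same describe call and --format expression in a hypothetical `workbench_url` function that reports this case instead of printing a broken URL. The behavior of proxyUri during startup is an assumption here, not something the lab states.

```shell
# Hypothetical helper (not part of the lab): print the JupyterLab URL of a
# Workbench instance, or fail if gcloud returns no proxy URI.
# Args: instance name, zone.
workbench_url() {
  _proxy=$(gcloud workbench instances describe "$1" \
    --location="$2" \
    --format="value(proxyUri)")
  if [ -z "$_proxy" ]; then
    echo "No proxyUri yet for $1; the instance may still be starting." >&2
    return 1
  fi
  echo "https://${_proxy}/lab"
}

# Usage:
# export WB_URL=$(workbench_url "$INSTANCE_NAME" "$LOCATION_ZONE") && echo "$WB_URL"
```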
5. Launch a Terminal in the Workbench (WB)
Launch a Terminal from the Workbench (WB) web page. See the picture below for reference.
You should see something like this:
You will work in this terminal in the next few steps.
6. Git clone LIT repo
You will use the Workbench terminal to build and push a container image of the LIT demo.
6-a: Set variables in the WB Terminal
Set the Google Cloud project and the location where you will create the Google Cloud services.
# Set your GCP Project ID.
export PROJECT_ID=[Your project ID]
# Set Google Cloud Location.
export GCP_LOCATION=us-central1
6-b: Authenticate with Google Cloud
Set up the proper Google Cloud security context. The account you sign in with provides the credentials needed to continue the lab.
# Authorize gcloud to access the Cloud Platform with Google user credentials
# and follow the instructions.
gcloud auth login
There are various ways to sign in to gcloud. You can learn about them on our documentation page.
6-c: Set Google Cloud project
Set the correct Google Cloud Project. This is used by the gcloud command.
# Set Google cloud to use your project.
gcloud config set project $PROJECT_ID
# You will see the below confirmation message:
Updated property [core/project].
Create a LIT directory for your work
export LIT_DEMO_DIR=litdemo
mkdir -p ~/$LIT_DEMO_DIR && cd ~/$LIT_DEMO_DIR
Clone the LIT Git Repo
git clone https://github.com/PAIR-code/lit.git .
# List the files that you cloned.
# You will use the Dockerfile to build an image soon.
ls -l ~/$LIT_DEMO_DIR
7. Build a container image and push it to the Artifact Registry
In the Workbench terminal, run the commands below. You can optionally change a few shell variables, which are already set to their default values.
# Set image container artifact repo name.
export ARTIFACT_REPO=lit-demo
# Set lit demo name.
export DEMO_NAME=demo1
# Build a container image of the LIT Demo. Takes a while...be patient.
docker build -t \
${GCP_LOCATION}-docker.pkg.dev/${PROJECT_ID}/$ARTIFACT_REPO/$DEMO_NAME:v1 .
# Check the container image.
docker image ls \
${GCP_LOCATION}-docker.pkg.dev/${PROJECT_ID}/$ARTIFACT_REPO/$DEMO_NAME:v1
# Authenticate the registry.
gcloud auth configure-docker \
${GCP_LOCATION}-docker.pkg.dev
# Push the image to the registry. Takes a while...be patient.
docker push \
${GCP_LOCATION}-docker.pkg.dev/${PROJECT_ID}/$ARTIFACT_REPO/$DEMO_NAME:v1
# Check the container image.
gcloud artifacts docker images list \
${GCP_LOCATION}-docker.pkg.dev/${PROJECT_ID}/$ARTIFACT_REPO/$DEMO_NAME
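The commands above repeat the same image path four times, so a typo in one copy makes the build, push, and list steps disagree. One way to avoid this is to build the path once. The sketch below uses a hypothetical `image_uri` function that follows the path layout used in this step (location, project, repository, image name, tag).

```shell
# Hypothetical helper (not part of the lab): compose the Artifact Registry
# image path used by docker build, docker push, and gcloud run deploy.
# Args: location, project ID, repository, image name, tag (defaults to v1).
image_uri() {
  echo "$1-docker.pkg.dev/$2/$3/$4:${5:-v1}"
}

# Usage:
# IMAGE_URI=$(image_uri "$GCP_LOCATION" "$PROJECT_ID" "$ARTIFACT_REPO" "$DEMO_NAME")
# docker build -t "$IMAGE_URI" .
# docker push "$IMAGE_URI"
```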
8. Create a Cloud Run service with the above Docker image
Deploy the LIT demo container image that you built above to a Cloud Run service. In the Workbench terminal, run the commands below.
# Gather the default compute engine account.
export DEFAULT_SA=`gcloud iam service-accounts list --format="value(email)" | grep '\-compute@developer.gserviceaccount.com'`
echo $DEFAULT_SA
# Deploy the image to the Cloud Run. Takes a while...be patient.
gcloud run deploy $DEMO_NAME \
--image=${GCP_LOCATION}-docker.pkg.dev/${PROJECT_ID}/$ARTIFACT_REPO/$DEMO_NAME:v1 \
--allow-unauthenticated \
--port=5432 \
--service-account=$DEFAULT_SA \
--cpu=8 \
--memory=32Gi \
--cpu-boost \
--region=$GCP_LOCATION \
--project=$PROJECT_ID
After creating the service, that is, after running the gcloud run deploy command above, you can watch the logs in the Google Cloud Console.
Navigation: Top Bar in the Google Cloud Console → Cloud Run (in the search bar) → select the demo1 application → select LOGS. You can also check METRICS, etc.
In the Google Cloud Console, use the Search and type 'Cloud Run'. See the picture below for reference.
Select the 'demo1' service that you just created. See the picture below for your reference.
You can check the LOGS section. See the picture below for your reference.
You can check the METRICS section. See the picture below for your reference.
9. Gather the LIT Service deployed URL
Either gather the URL using the following command or click the 'URL' in the Google Cloud Console Cloud Run service.
See the picture below for your reference.
Use the command below to gather the LIT service url:
# Gather the demo url from the Cloud Run service.
gcloud run services describe \
$DEMO_NAME \
--region=$GCP_LOCATION \
--format="value(status.url)"
The LIT demo takes a while to render the web page; after all, it loads a large image and downloads the model it uses. You can watch the Cloud Run LOGS for these activities.
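Since the page can take a while to become available, you may prefer to poll the URL from the terminal instead of refreshing the browser. The sketch below is an assumption-laden illustration: `wait_for_url` is a hypothetical helper, the retry count and delay are arbitrary defaults, and it treats any successful HTTP response from curl as "ready", which does not guarantee that the model has finished loading.

```shell
# Hypothetical helper (not part of the lab): poll a URL until curl gets a
# successful HTTP response, or give up after a number of attempts.
# Args: URL, max attempts (default 30), delay in seconds (default 10).
wait_for_url() {
  _url="$1"
  _tries="${2:-30}"
  _delay="${3:-10}"
  _i=1
  while [ "$_i" -le "$_tries" ]; do
    # -s hides progress output, -f turns HTTP errors into a non-zero exit,
    # -o /dev/null discards the response body.
    if curl -sf -o /dev/null --max-time 30 "$_url"; then
      echo "Ready after $_i attempt(s): $_url"
      return 0
    fi
    sleep "$_delay"
    _i=$((_i + 1))
  done
  echo "Gave up after $_tries attempts: $_url" >&2
  return 1
}

# Usage:
# LIT_URL=$(gcloud run services describe $DEMO_NAME \
#   --region=$GCP_LOCATION --format="value(status.url)")
# wait_for_url "$LIT_URL"
```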
10. Browse the URL and play with the LIT demo
The LIT demo looks like the screenshot below:
You will run a sentiment analysis on the Stanford Sentiment Treebank dataset. Follow the steps below:
- Use the search function in LIT's data table to find the 56 data points containing the word 'not'.
- Check the BERT model accuracy in the Metrics Table. The BERT model's accuracy is high.
- Select individual data points and look for explanations. Search for the word 'depression'.
- Select "It's not the ultimate depression-era gangster movie." Check the Salience Map. Salience maps suggest that "not" and "ultimate" are important to the prediction.
There are many LIT features that you can try. Our short YouTube video and the LIT arXiv paper explain the LIT features.
11. Congratulations
Well done on completing the codelab! Time to chill!
Clean up
To clean up the lab, delete all the Google Cloud Services created for the lab. Use Google Cloud Shell to run the following commands.
If the Google Cloud connection is lost because of inactivity, reset the variables. Follow 2-c and 4-a to set the shell variables and the Google Cloud project.
# Delete the Cloud Run Service.
gcloud run services delete $DEMO_NAME \
--region=$GCP_LOCATION
# Delete the Artifact Registry.
gcloud artifacts repositories delete $ARTIFACT_REPO \
--location=$GCP_LOCATION
# Delete the Workbench.
gcloud workbench instances delete $INSTANCE_NAME \
--location=$LOCATION_ZONE
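If you want to run the three deletions in one go, a possible approach is sketched below. `cleanup_lab` is a hypothetical function name. It uses the global gcloud --quiet flag to skip the confirmation prompts, and it continues with the remaining deletions when one fails so that a single error does not leave billable resources behind. It assumes the variables from 2-c and 4-a are set in the current shell.

```shell
# Hypothetical helper (not part of the lab): delete all lab resources
# without prompting, and return non-zero if any deletion failed.
cleanup_lab() {
  _failed=0
  gcloud run services delete "$DEMO_NAME" \
    --region="$GCP_LOCATION" --quiet || _failed=1
  gcloud artifacts repositories delete "$ARTIFACT_REPO" \
    --location="$GCP_LOCATION" --quiet || _failed=1
  gcloud workbench instances delete "$INSTANCE_NAME" \
    --location="$LOCATION_ZONE" --quiet || _failed=1
  return "$_failed"
}

# Usage:
# cleanup_lab || echo "Some resources were not deleted; check the output above."
```

Because --quiet removes the confirmation step, double-check the variable values before calling the function.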
Further reading
Continue learning the LIT tool features with the below materials:
License
This work is licensed under a Creative Commons Attribution 2.0 Generic License.