1. Overview
This lab walks you through a quick deployment of a LIT demo. The objective is to familiarize you with using LIT to explore model behavior. You will run a sentiment analysis and use LIT's counterfactual feature to gauge the importance of specific words. The demo includes a dataset for the analysis. The lab provides steps to deploy LIT on Google Cloud Platform, analyze the data, and delete the deployed services.
What is the Learning Interpretability Tool (LIT)?
🔥LIT is a visual, interactive ML model-understanding tool that supports text, image, and tabular data. It can be run as a standalone server, or inside of notebook environments such as Colab, Jupyter, and Google Cloud Vertex AI notebooks.
LIT is built to answer questions such as:
- What kind of examples does my model perform poorly on?
- Why did my model make this prediction? Can this prediction be attributed to adversarial behavior, or to undesirable priors in the training set?
- Does my model behave consistently if I change things like textual style, verb tense, or pronoun gender?
The tool is available in the LIT GitHub repo. A paper describing it is also available on arXiv.
What you will do
You will use Google Cloud Shell to pull, tag, push, and deploy the container image.
You will use Google Artifact Registry, which lets you centrally store artifacts and build dependencies as part of an integrated Google Cloud experience. You will upload the Docker image to Artifact Registry. You can learn more about it in the Google Cloud Artifact Registry documentation.
You will use Google Cloud Run, a managed Knative service, to deploy the Docker image. It is a managed compute platform that lets you run containers directly on top of Google's scalable infrastructure.
Dataset
The demo uses the Stanford Sentiment Treebank dataset.
"The Stanford Sentiment Treebank is the first corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language. The corpus is based on the dataset introduced by Pang and Lee (2005) and consists of 11,855 single sentences extracted from movie reviews. It was parsed with the Stanford parser (Klein and Manning, 2003) and includes a total of 215,154 unique phrases from those parse trees, each annotated by 3 human judges." Reference paper
Before you begin
For this reference guide, you need a Google Cloud project. You can create a new one, or select a project you already created.
2. Launch Google Cloud Console and a Cloud Shell
You will launch a Google Cloud Console and use the Google Cloud Shell in this step.
2-a: Launch a Google Cloud Console
Launch a browser and go to Google Cloud Console.
The Google Cloud Console is a powerful, secure web admin interface that lets you manage your Google Cloud resources quickly. It's a DevOps tool on the go.
2-b: Launch a Google Cloud Shell
Launch a Google Cloud Shell. See the picture below for reference.
You should see something like this:
You will be using the command prompt in the next steps.
Cloud Shell is an online development and operations environment accessible anywhere with your browser. You can manage your resources with its online terminal preloaded with utilities such as the gcloud command-line tool, kubectl, and more. You can also develop, build, debug, and deploy your cloud-based apps using the online Cloud Shell Editor. Cloud Shell provides a developer-ready online environment with your favorite tools preinstalled and 5 GB of persistent storage.
2-c: Set Google Cloud Project
Set the Google Cloud project and the location where you will create the Google Cloud services. You will use the information to create a Google Cloud Workbench and an Artifact Registry. You will use the former to build and push a container. You will use the latter to store the container image.
PROJECT_ID is the only variable you must set. You can modify the other variables, but their default values are sufficient to run the lab. Make sure PROJECT_ID names the correct Google Cloud project, because the gcloud commands in the following steps use it.
# Set your GCP Project ID.
export PROJECT_ID=[Your project ID]
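Exporting PROJECT_ID only sets a shell variable; the gcloud tool keeps its active project in its own configuration. The sketch below shows one way to connect the two. The helper names `check_project_id` and `set_lit_project` are hypothetical and not part of the lab; `gcloud config set project` is the standard gcloud command for choosing the active project.

```shell
# Hypothetical helper (not part of the lab): fail early if the project ID
# is empty or still contains the bracketed placeholder text.
check_project_id() {
  case "$1" in
    "" | "["*) echo "Set PROJECT_ID to a real project ID first." >&2; return 1 ;;
    *) return 0 ;;
  esac
}

# Make the project the default for later gcloud commands.
# Defined as a function so that nothing runs until you call it.
set_lit_project() {
  check_project_id "$PROJECT_ID" && gcloud config set project "$PROJECT_ID"
}

# Usage, after exporting PROJECT_ID:
#   set_lit_project
```

Checking the value first is a convenience: a leftover placeholder would otherwise only surface as an error in a later step.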
3. Deploy the Docker Images to Google Cloud Artifact Registry
3-a: Create a Google Cloud Artifact Registry
First, create an Artifact Registry repository to store Docker images.
# Set Google Cloud Location.
export GCP_LOCATION=us-central1
# Set image container artifact repo name.
export ARTIFACT_REPO=lit-demo
# Set lit demo name.
export DEMO_NAME=demo1
# Use the command below to list all Artifact Registry locations:
# gcloud artifacts locations list
# Create a repo to upload the docker container images.
gcloud artifacts repositories create $ARTIFACT_REPO \
--repository-format=docker \
--location=$GCP_LOCATION \
--description="LIT Demos"
# Validate the repo creation.
gcloud artifacts repositories describe $ARTIFACT_REPO \
--location=$GCP_LOCATION
3-b: Pull Docker Image
Second, list all the LIT Docker images in the public repository using the command below.
# List all the public LIT docker images.
gcloud container images list-tags us-east4-docker.pkg.dev/lit-demos/lit-app/lit-app
The output lists the available images and their tags. A tag identifies the image version, so choose the tag of the version you want to deploy.
Pull your chosen Docker image using the command below.
# Set your chosen tag.
export TAG=[Your Chosen Tag]
# Pull the chosen docker image.
docker pull us-east4-docker.pkg.dev/lit-demos/lit-app/lit-app:$TAG
3-c: Tag Docker Image
Third, tag the image you just pulled for the target repository.
# Tag the pulled docker image for the target repository.
docker tag us-east4-docker.pkg.dev/lit-demos/lit-app/lit-app:$TAG $GCP_LOCATION-docker.pkg.dev/$PROJECT_ID/$ARTIFACT_REPO/$DEMO_NAME:$TAG
3-d: Push Docker Image
Next, push the Docker image to the target repository.
# Push the pulled docker image to the target repository.
docker push $GCP_LOCATION-docker.pkg.dev/$PROJECT_ID/$ARTIFACT_REPO/$DEMO_NAME:$TAG
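The tag and push commands both spell out the full target path, which follows the pattern LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG. The sketch below composes that path once and reuses it. The helper names `image_uri` and `tag_and_push` are hypothetical. The `gcloud auth configure-docker` call registers gcloud as a Docker credential helper for the registry host; it is included on the assumption that your shell is not already configured for that host, and it may be unnecessary in your environment.

```shell
# Hypothetical helper (not part of the lab): build the Artifact Registry
# image path from its parts.
# $1=location  $2=project ID  $3=repository  $4=image name  $5=tag
image_uri() {
  echo "$1-docker.pkg.dev/$2/$3/$4:$5"
}

# Tag and push using the composed path. Defined as a function so that
# nothing runs until you call it.
tag_and_push() {
  target="$(image_uri "$GCP_LOCATION" "$PROJECT_ID" "$ARTIFACT_REPO" "$DEMO_NAME" "$TAG")"
  # Assumption: Docker may not yet be authorized for this registry host.
  gcloud auth configure-docker "$GCP_LOCATION-docker.pkg.dev" --quiet
  docker tag "us-east4-docker.pkg.dev/lit-demos/lit-app/lit-app:$TAG" "$target"
  docker push "$target"
}

# Usage, after pulling the image in 3-b:
#   tag_and_push
```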
3-e: Deploy the Docker Image to Cloud Run
Deploy the Docker image from the target repository to Cloud Run.
Under the deploy options, select "Deploy one revision from an existing container image".
In Config, give your service the same name as $DEMO_NAME, and select the same region as $GCP_LOCATION.
For Authentication, you can choose either "Allow unauthenticated invocations" or "Require authentication." If you select "Require authentication," you may need to complete additional steps to access the demo. Therefore, it is recommended to select "Allow unauthenticated invocations" if you just want to get familiar with the demo.
In the Containers config, set the Container port to 5432, Memory to 32 GiB, and CPU to 8.
After setting the config, create the service.
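The console settings above can also be expressed on the command line. The following is a sketch of an equivalent `gcloud run deploy` call, not a step the lab requires. The helper names `deploy_flags` and `deploy_lit_demo` are hypothetical, and `--allow-unauthenticated` assumes you chose "Allow unauthenticated invocations"; drop that flag if you require authentication.

```shell
# Flags mirroring the console settings in this step: container port 5432,
# 32 GiB of memory, 8 CPUs, and public (unauthenticated) access.
deploy_flags() {
  echo "--port=5432 --memory=32Gi --cpu=8 --allow-unauthenticated"
}

# Deploy the pushed image as a Cloud Run service named $DEMO_NAME.
# Defined as a function so that nothing runs until you call it.
deploy_lit_demo() {
  # $(deploy_flags) is left unquoted on purpose so it splits into separate flags.
  gcloud run deploy "$DEMO_NAME" \
    --image="$GCP_LOCATION-docker.pkg.dev/$PROJECT_ID/$ARTIFACT_REPO/$DEMO_NAME:$TAG" \
    --region="$GCP_LOCATION" \
    $(deploy_flags)
}

# Usage, after the push in 3-d has finished:
#   deploy_lit_demo
```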
4. View LIT Service on GCP
After creating the service, you can watch the logs in the Google Cloud Console.
Navigation: search bar at the top of the Google Cloud Console → Cloud Run → select the demo1 service → LOGS. You can also check METRICS and the other tabs.
In the Google Cloud Console, use Search and type 'Cloud Run'. See the picture below for reference.
Select the 'demo1' service that you just created. See the picture below for your reference.
You can check the LOGS section. The URL of the service is shown on the same page. See the picture below for your reference.
You can check the METRICS section. See the picture below for your reference.
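The same information is reachable from Cloud Shell. The sketch below prints the service URL and reads recent log entries. The helper names `run_log_filter` and `show_service_info` are hypothetical, and the limit of 20 entries is an arbitrary choice for illustration.

```shell
# Build a Cloud Logging filter that selects the logs of one Cloud Run service.
# $1 = service name
run_log_filter() {
  echo "resource.type=\"cloud_run_revision\" AND resource.labels.service_name=\"$1\""
}

# Print the service URL, then the most recent log entries.
# Defined as a function so that nothing runs until you call it.
show_service_info() {
  gcloud run services describe "$DEMO_NAME" \
    --region="$GCP_LOCATION" \
    --format='value(status.url)'
  gcloud logging read "$(run_log_filter "$DEMO_NAME")" \
    --project="$PROJECT_ID" \
    --limit=20
}

# Usage:
#   show_service_info
```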
5. Browse the LIT Demo URL
If you cannot access the URL because of a Forbidden error, change the service to allow unauthenticated invocations.
Alternatively, you can proxy the service to localhost using the command below.
# Proxy the service to localhost.
gcloud run services proxy $DEMO_NAME --project $PROJECT_ID --region $GCP_LOCATION
Make sure the region is the same as GCP_LOCATION. You can then browse the localhost URL.
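For the first option (allowing unauthenticated invocations), the change can also be made from Cloud Shell by granting the Cloud Run invoker role to all users. The sketch below checks the HTTP status first and only changes the policy when the service answers 403. The helper names `is_forbidden` and `open_if_forbidden` are hypothetical, and the service URL is passed in as an argument because it is specific to your deployment.

```shell
# Hypothetical helper: true when an HTTP status code is 403 (Forbidden).
is_forbidden() {
  [ "$1" = "403" ]
}

# Request the service URL and, if it answers 403, allow public invocations.
# $1 = service URL. Defined as a function so that nothing runs until you call it.
open_if_forbidden() {
  http_code="$(curl -s -o /dev/null -w '%{http_code}' "$1")"
  if is_forbidden "$http_code"; then
    gcloud run services add-iam-policy-binding "$DEMO_NAME" \
      --region="$GCP_LOCATION" \
      --member="allUsers" \
      --role="roles/run.invoker"
  fi
}

# Usage, with the URL shown for your service in the Cloud Run console:
#   open_if_forbidden "$SERVICE_URL"
```

Granting access to allUsers makes the demo public, so this suits a short-lived lab service rather than anything holding private data.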
The LIT demo looks like the screenshot below:
You will examine sentiment analysis on the Stanford Sentiment Treebank dataset. Follow the steps below:
- Use the search function in LIT's Data Table to find the 56 data points containing the word 'not'.
- Check the BERT model accuracy in the Metrics Table. The BERT model's accuracy is high.
- Select individual data points and look for explanations. Search for the word 'depression'.
- Select "It's not the ultimate depression-era gangster movie." Check the Salience Map. Salience maps suggest that "not" and "ultimate" are important to the prediction.
LIT has many more features that you can try. The short YouTube video and the LIT arXiv paper explain them.
6. Congratulations
Well done on completing the codelab! Time to chill!
Clean up
To clean up the lab, delete all the Google Cloud Services created for the lab. Use Google Cloud Shell to run the following commands.
If the Cloud Shell connection is lost because of inactivity, reset the variables. Follow 2-c and 3-a to set the Google Cloud project and the shell variables.
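A minimal sketch of restoring the variables in a fresh shell follows. The default values are the ones used in step 3-a; if you changed them during the lab, export your own values instead. PROJECT_ID has no default and must be exported again as in step 2-c.

```shell
# Restore the lab variables, keeping any value that is already set.
export GCP_LOCATION="${GCP_LOCATION:-us-central1}"
export ARTIFACT_REPO="${ARTIFACT_REPO:-lit-demo}"
export DEMO_NAME="${DEMO_NAME:-demo1}"

# PROJECT_ID cannot be guessed, so only warn when it is missing.
if [ -z "${PROJECT_ID:-}" ]; then
  echo "PROJECT_ID is empty: export it as in step 2-c before cleaning up." >&2
fi

echo "Cleaning up service '$DEMO_NAME' and repository '$ARTIFACT_REPO' in $GCP_LOCATION"
```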
# Delete the Cloud Run Service.
gcloud run services delete $DEMO_NAME \
--region=$GCP_LOCATION
# Delete the Artifact Registry.
gcloud artifacts repositories delete $ARTIFACT_REPO \
--location=$GCP_LOCATION
### **Further reading**
Continue learning the LIT tool features with the below materials:
* LIT open source code base: [Git repo](https://github.com/PAIR-code/lit)
* LIT paper: [arXiv](https://arxiv.org/pdf/2008.05122.pdf)
* LIT feature video demo: [YouTube](https://www.youtube.com/watch?v=CuRI_VK83dU)
### **License**
This work is licensed under a Creative Commons Attribution 2.0 Generic License.