Kubeflow is a machine learning toolkit for Kubernetes. The project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable, and scalable. The goal is to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures.

What does a Kubeflow deployment look like?

A Kubeflow deployment is a means of organizing loosely-coupled microservices as a single unit and deploying them to a variety of locations, whether that's a laptop or the cloud. This codelab will walk you through creating your own Kubeflow deployment.

What you'll build

In this codelab, you're going to build a web app that summarizes GitHub issues using a trained model. It is based on the walkthrough provided in the Kubeflow Examples repo. Upon completion, your infrastructure will contain:

  1. A Kubernetes Engine cluster with a default Kubeflow installation
  2. A training job that stores a trained model in Cloud Storage
  3. A serving deployment that hosts the trained model with Seldon
  4. A UI that uses the model to generate summaries for GitHub issues

What you'll learn

What you'll need

This is an advanced codelab focused on Kubeflow. For more background and an introduction to the platform, see the Introduction to Kubeflow on Kubernetes codelab. Non-relevant concepts and code blocks are glossed over and provided for you to simply copy and paste.

Choose one of the following environments for running this codelab: Cloud Shell, or a local Linux or macOS machine.

Cloud Shell

This link clones the Kubeflow Examples repo and places it in the ~/examples directory.

Download in Google Cloud Shell

Once you have the project files, check out the v0.4.0-rc.2 tag, which contains the resources you will need:

cd ${HOME}/examples/github_issue_summarization
export KUBEFLOW_TAG=0.4.0-rc.2
git checkout v${KUBEFLOW_TAG}

Enable Boost Mode

In the Cloud Shell window, click on the Settings dropdown at the far right. Select Enable Boost Mode. This will provision a larger instance for your Cloud Shell session, resulting in speedier Docker builds. If you can't find this menu, ensure the main Navigation Menu is hidden by clicking the three lines at the top left of the screen, next to the Google Cloud Platform logo.

Local Linux or macOS

This link downloads an archive of the Kubeflow examples repo. Unpacking the downloaded zip file will produce a root folder (examples-0.4.0-rc.2) containing all of the official Kubeflow examples.

Download locally

Unzip and move the folder for consistency with the absolute paths in this codelab:

export KUBEFLOW_TAG=0.4.0-rc.2
unzip v${KUBEFLOW_TAG}.zip
mv examples-${KUBEFLOW_TAG} ${HOME}/examples

Set your GitHub token

This codelab involves the use of many different files obtained from public repos on GitHub. To prevent rate-limiting, especially at events where a large number of anonymous requests are sent to the GitHub APIs, set up an access token with no permissions. This simply authorizes you as an individual rather than an anonymous user.

  1. Navigate to https://github.com/settings/tokens and generate a new token with no permissions.
  2. Save it somewhere safe. If you lose it, you will need to delete it and create a new one.
  3. Set the GITHUB_TOKEN environment variable:
export GITHUB_TOKEN=<token>
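
To confirm that the token is being picked up, you can query GitHub's rate-limit endpoint; an authenticated request reports a core limit of 5,000 requests per hour rather than the anonymous limit of 60:

curl -s -H "Authorization: token ${GITHUB_TOKEN}" https://api.github.com/rate_limit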

Installing pyyaml

Ensure that pyyaml is installed by running:

pip install --user pyyaml
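
To confirm that the module is importable, you can print its version:

python -c "import yaml; print(yaml.__version__)"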

Installing ksonnet

Set the correct version

To install on Cloud Shell or a local Linux machine, set these environment variables:

export KS_VER=0.13.1
export KS_BIN=ks_${KS_VER}_linux_amd64

To install on a Mac, set these environment variables:

export KS_VER=0.13.1
export KS_BIN=ks_${KS_VER}_darwin_amd64

Install ksonnet

Download and unpack the appropriate binary, then add it to your $PATH:

wget -O /tmp/$KS_BIN.tar.gz https://github.com/ksonnet/ksonnet/releases/download/v${KS_VER}/${KS_BIN}.tar.gz

mkdir -p ${HOME}/bin
tar -xvf /tmp/${KS_BIN}.tar.gz -C ${HOME}/bin

export PATH=$PATH:${HOME}/bin/${KS_BIN}
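
As a quick sanity check, verify that the ks binary is on your PATH and reports the expected version:

ks version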


Install kfctl

Download and unpack kfctl, then add it to your $PATH:

wget -P /tmp https://github.com/kubeflow/kubeflow/archive/v${KUBEFLOW_TAG}.tar.gz
mkdir -p ${HOME}/src
tar -xvf /tmp/v${KUBEFLOW_TAG}.tar.gz -C ${HOME}/src
cd ${HOME}/src/kubeflow-${KUBEFLOW_TAG}/scripts
ln -s kfctl.sh kfctl
export PATH=$PATH:${HOME}/src/kubeflow-${KUBEFLOW_TAG}/scripts
cd ${HOME}
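
To confirm the wrapper script is now reachable, you can check that your shell resolves it:

command -v kfctl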

Set your GCP project ID

Store your GCP project ID and compute zone in the gcloud configuration, and enable the latest scopes behavior for Kubernetes Engine:

export PROJECT_ID=<gcp_project_id>
export ZONE=us-central1-a
gcloud config set project ${PROJECT_ID}
gcloud config set compute/zone ${ZONE}
gcloud config set container/new_scopes_behavior true

Authorize Docker

Allow Docker access to your project's Container Registry:

gcloud auth configure-docker

Create a service account

Create a service account with read/write access to storage buckets:

export SERVICE_ACCOUNT=user-gcp-sa
export SERVICE_ACCOUNT_EMAIL=${SERVICE_ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com
gcloud iam service-accounts create ${SERVICE_ACCOUNT} \
  --display-name "GCP Service Account for use with kubeflow examples"

gcloud projects add-iam-policy-binding ${PROJECT_ID} --member \
  serviceAccount:${SERVICE_ACCOUNT_EMAIL} \
  --role=roles/storage.admin

Generate a credentials file for upload to the cluster:

mkdir -p ${HOME}/secrets
export KEY_FILE=${HOME}/secrets/${SERVICE_ACCOUNT_EMAIL}.json
gcloud iam service-accounts keys create ${KEY_FILE} \
  --iam-account ${SERVICE_ACCOUNT_EMAIL}

Create a storage bucket

Create a Cloud Storage bucket for storing your trained model, using the "mb" (make bucket) command. Bucket names must be globally unique; the name below builds in your project ID to help ensure uniqueness:

export BUCKET_NAME=kubeflow-${PROJECT_ID}
gsutil mb -c regional -l us-central1 gs://${BUCKET_NAME}
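
To confirm that the bucket was created, list it with gsutil (the -b flag shows the bucket entry itself rather than its contents):

gsutil ls -b gs://${BUCKET_NAME}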

Create a cluster

Create a managed Kubernetes cluster on Kubernetes Engine:

kfctl init kubeflow-codelab --platform gcp --project ${PROJECT_ID}
cd kubeflow-codelab
kfctl generate platform
kfctl apply platform

To verify the connection, run the following command:

kubectl cluster-info

Verify that the Kubernetes master IP reported here matches the Endpoint IP shown for your cluster in the Google Cloud Platform Console.
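
If you'd like to pull the endpoint address directly from gcloud for comparison, the following should work, assuming kfctl named the cluster kubeflow-codelab after your deployment:

gcloud container clusters describe kubeflow-codelab \
  --zone ${ZONE} --format='value(endpoint)'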

Upload service account credentials

Store the key you generated earlier in a Kubernetes secret named user-gcp-sa:

kubectl create secret generic user-gcp-sa \
  --from-file=user-gcp-sa.json="${KEY_FILE}"

Install Kubeflow with Seldon

Generate and apply the Kubeflow manifests for the Kubernetes layer:

kfctl generate k8s
kfctl apply k8s

ksonnet is a templating framework that allows you to utilize common object definitions and customize them for your environment. You begin by referencing Kubeflow templates and applying environment-specific parameters. Once manifests have been generated specifically for your cluster, they can be applied like any other Kubernetes objects using kubectl.
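
For example, you can list the components that kfctl registered in the app and preview the manifest that ksonnet would render for one of them (ambassador is used here as an illustrative component name; substitute any name from the list):

cd ${HOME}/kubeflow-codelab/ks_app
ks component list
ks show default -c ambassador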

Add Seldon to the default installation

cd ${HOME}/kubeflow-codelab/ks_app
ks generate seldon seldon
ks apply default -c seldon

Congratulations! Your cluster now contains a Kubeflow installation with Seldon. You can view the components by running:

kubectl get pods

You should see a list of pods for each of the Kubeflow components, most of them in Running status.

In this section, you will create a component that trains a model.

Set the component parameters

cd ${HOME}/kubeflow-codelab/ks_app
ks generate tf-job-simple-v1beta1 tfjob --name tfjob-issue-summarization
cp ${HOME}/examples/github_issue_summarization/ks_app/components/tfjob.jsonnet components/
ks param set tfjob gcpSecretName "user-gcp-sa"
ks param set tfjob gcpSecretFile "user-gcp-sa.json"
ks param set tfjob image "gcr.io/kubeflow-examples/tf-job-issue-summarization:v20180629-v0.1-2-g98ed4b4-dirty-182929"
ks param set tfjob input_data "gs://kubeflow-examples/github-issue-summarization-data/github_issues_sample.csv"
ks param set tfjob input_data_gcs_bucket "kubeflow-examples"
ks param set tfjob input_data_gcs_path "github-issue-summarization-data/github-issues.zip"
ks param set tfjob num_epochs "7"
ks param set tfjob output_model "/tmp/model.h5"
ks param set tfjob output_model_gcs_bucket "${BUCKET_NAME}"
ks param set tfjob output_model_gcs_path "github-issue-summarization-data"
ks param set tfjob sample_size "100000"

The training component tfjob is now configured to use a pre-built container image.
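
To double-check the values before launching, list the component's parameters:

ks param list tfjob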

Launch training

Apply the component manifests to the cluster:

ks apply default -c tfjob

View the running job

View the resulting pods:

kubectl get pod -l tf_job_name=tfjob-issue-summarization

It can take a few minutes to pull the image and start the container; the pod's status will move from ContainerCreating to Running. Once the tfjob-issue-summarization-master pod is running, tail the logs:

kubectl logs -f tfjob-issue-summarization-master-0

In the logs, you will see the source data (github-issues.zip) being downloaded before training begins. Continue tailing the logs until the pod exits on its own and you find yourself back at the command prompt.
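
If you prefer to watch the pod's status rather than its logs, you can use kubectl's watch flag; press Ctrl-C once the pod reaches the Completed status:

kubectl get pods -l tf_job_name=tfjob-issue-summarization -w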

To verify that training completed successfully, check to make sure all three model files were uploaded to your Cloud Storage bucket:

gsutil ls gs://${BUCKET_NAME}/github-issue-summarization-data

In this section, you will create a component that serves a trained model.

Set serving image path

export SERVING_IMAGE=gcr.io/kubeflow-examples/issue-summarization-model:v20180718-g98ed4b4-codelab

Create the serving component

The serving component is configured to run a pre-built image. Using a Seldon ksonnet template, generate the serving component. Navigate back to the ksonnet app directory for Kubeflow, and issue the following commands:

cd ${HOME}/kubeflow-codelab/ks_app
ks generate seldon-serve-simple-v1alpha2 issue-summarization-model \
  --name=issue-summarization \
  --image=${SERVING_IMAGE} \
  --replicas=2

Launch serving

Apply the component manifests to the cluster:

ks apply default -c issue-summarization-model

View the running pods

You will see several new pods appear:

kubectl get pods -l seldon-deployment-id=issue-summarization

Once the pod is running, tail the logs for one of the serving containers to verify that it is running on port 9000:

kubectl logs \
  $(kubectl get pods \
    -lseldon-deployment-id=issue-summarization \
    -o=jsonpath='{.items[0].metadata.name}') \
  issue-summarization

In this section, you will create a component that provides browser access to the serving component.

Set parameter values

cd ${HOME}/kubeflow-codelab/ks_app
ks generate deployed-service ui \
  --name issue-summarization-ui \
  --image gcr.io/kubeflow-examples/issue-summarization-ui:v20180629-v0.1-2-g98ed4b4-dirty-182929

cp ${HOME}/examples/github_issue_summarization/ks_app/components/ui.jsonnet components/

ks param set ui githubToken ${GITHUB_TOKEN}
ks param set ui modelUrl "http://issue-summarization.kubeflow.svc.cluster.local:8000/api/v0.1/predictions"

The UI component is now configured to use a pre-built container image which is made available in Container Registry (gcr.io).
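
If you'd like to exercise the model endpoint directly before launching the UI, you can port-forward the service referenced by modelUrl and POST a request to it. The JSON payload shape below follows the Seldon prediction API convention, and the issue text is an arbitrary example:

# forward local port 8000 to the serving service in the background
kubectl port-forward svc/issue-summarization 8000:8000 &
# send a sample prediction request
curl -X POST -H "Content-Type: application/json" \
  -d '{"data":{"ndarray":[["fix crash when config file is missing"]]}}' \
  http://localhost:8000/api/v0.1/predictions
# stop the background port-forward
kill %1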

(Optional) Create the UI image

If you would prefer to build your own image instead of using the pre-built one, continue with this step; otherwise, skip ahead to the "Launch the UI" step.

Switch to the docker directory and build the image for the UI:

cd ${HOME}/examples/github_issue_summarization/docker
docker build -t gcr.io/${PROJECT_ID}/issue-summarization-ui:latest .

After the image has been successfully built, store it in Container Registry:

docker push gcr.io/${PROJECT_ID}/issue-summarization-ui:latest

Update the component parameter to point to the custom image:

cd ${HOME}/kubeflow-codelab/ks_app
ks param set ui image gcr.io/${PROJECT_ID}/issue-summarization-ui:latest

Launch the UI

Apply the component manifests to the cluster:

ks apply default -c ui

You should see an additional pod with the status ContainerCreating:

kubectl get pods -l app=issue-summarization-ui

View the UI

To view the UI, open a port to the ambassador service:

kubectl port-forward svc/ambassador 8080:80

In Cloud Shell, click on the Web Preview button and select "Preview on port 8080."

This will open a new browser tab that shows the Kubeflow Central Dashboard. Add the text "issue-summarization/" to the end of the URL and press Enter (don't forget the trailing slash).

You should see the issue summarization UI, with a large text box for an issue body and buttons for populating and summarizing it.

Click the Populate Random Issue button to fill in the large text box with a random issue summary. Then click the Generate Title button to view the machine-generated title produced by your trained model.

View serving container logs

Tail the logs of one of the serving containers to verify that it is receiving a request from the UI and providing a prediction in response:

kubectl logs \
  $(kubectl get pods \
    -lseldon-deployment-id=issue-summarization \
    -o=jsonpath='{.items[0].metadata.name}') \
  issue-summarization

Press the Generate Title button in the UI a few times to view the POST request. Since there are two serving containers, you might need to try a few times before you see the log entry.

Press Ctrl-C to return to the command prompt.

Destroy the cluster

Delete the Deployment Manager deployment, which removes the cluster and its associated resources:

gcloud deployment-manager deployments delete kubeflow-codelab

Destroy images

These snippets will remove all versions of the training, serving, and UI images that were stored in your project:

export IMAGE=gcr.io/${PROJECT_ID}/tf-job-issue-summarization
for digest in $(gcloud container images list-tags \
  ${IMAGE} --limit=999999 \
  --format='get(digest)'); do
    gcloud container images delete -q --force-delete-tags "${IMAGE}@${digest}"
done

export IMAGE=gcr.io/${PROJECT_ID}/issue-summarization-model
for digest in $(gcloud container images list-tags \
  ${IMAGE} --limit=999999 \
  --format='get(digest)'); do
    gcloud container images delete -q --force-delete-tags "${IMAGE}@${digest}"
done

export IMAGE=gcr.io/${PROJECT_ID}/issue-summarization-ui
for digest in $(gcloud container images list-tags \
  ${IMAGE} --limit=999999 \
  --format='get(digest)'); do
    gcloud container images delete -q --force-delete-tags "${IMAGE}@${digest}"
done

Destroy the storage bucket

gsutil rm -r gs://${BUCKET_NAME}

Destroy the service account

Remove the IAM policy binding, then delete the service account:

gcloud projects remove-iam-policy-binding ${PROJECT_ID} --member \
  serviceAccount:${SERVICE_ACCOUNT_EMAIL} \
  --role=roles/storage.admin

gcloud iam service-accounts delete ${SERVICE_ACCOUNT_EMAIL}

rm ${HOME}/secrets/${SERVICE_ACCOUNT_EMAIL}.json

Remove ksonnet

rm /tmp/${KS_BIN}.tar.gz
rm -rf ${HOME}/bin/${KS_BIN}

Remove sample code

rm -rf ${HOME}/examples

Remove GitHub token

Navigate to https://github.com/settings/tokens and remove the generated token.