AutoML Vision helps developers with limited ML expertise train high-quality image recognition models. Once you upload images to the AutoML UI, you can train a model that will be immediately available on GCP for generating predictions via an easy-to-use REST API.

In this lab, we will upload images to Cloud Storage and use them to train a custom model to recognize three types of clouds: cirrus, cumulus, and cumulonimbus.

What you'll learn

What you'll need

Self-paced environment setup

If you don't already have a Google Account (Gmail or Google Apps), you must create one. Sign in to the Google Cloud Platform console (console.cloud.google.com) and create a new project:

Remember the project ID, a unique name across all Google Cloud projects (the name above has already been taken and will not work for you, sorry!). It will be referred to later in this codelab as PROJECT_ID.

Next, you'll need to enable billing in the Cloud Console in order to use Google Cloud resources.

Running through this codelab shouldn't cost you more than a few dollars, but it could be more if you decide to use more resources or if you leave them running (see "cleanup" section at the end of this document).

New users of Google Cloud Platform are eligible for a $300 free trial.

Google Cloud Shell is a command-line environment running in the cloud. This Debian-based virtual machine is loaded with all the development tools you'll need (gcloud, bq, git and others) and offers a persistent 5 GB home directory. We'll use Cloud Shell to upload our training images to Google Cloud Storage.

To get started with Cloud Shell, click the "Activate Google Cloud Shell" icon in the top right-hand corner of the header bar.

A Cloud Shell session opens inside a new frame at the bottom of the console and displays a command-line prompt. Wait until the user@project:~$ prompt appears.

AutoML Vision provides an interface for all the steps in training an image classification model and generating predictions on it. Let's start by navigating to the AutoML UI. Once you're there, click on Set Up Now to enable the necessary APIs for AutoML Vision. It may take a couple of minutes for this to complete.

If asked for the project ID, you can find it by navigating to the Home tab in your console and looking at Project info:

To train a model to classify images of clouds, we need to provide it with labeled training data so it can develop an understanding of the image features associated with different types of clouds. In this example our model will learn to classify three different types of clouds: cirrus, cumulus, and cumulonimbus. To use AutoML Vision we need to put our training images in Google Cloud Storage.

In the Cloud console, navigate to the Storage browser for your project:

Once you're there, you should see a bucket created for you starting with your project ID and ending in `-vcm`:

We've made the training images publicly available in another Cloud Storage bucket. Let's copy them to your bucket so you can take a look. First, create an environment variable with the name of your bucket by running the following command in Cloud Shell, making sure to replace YOUR_BUCKET_NAME in the command below with the name of your bucket:

export BUCKET=YOUR_BUCKET_NAME
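A quick sanity check can catch a bucket name that was left as the placeholder. The sketch below is not part of the lab itself: `check_bucket_name` is a hypothetical helper, the project ID in the example is made up, and the check relies only on the convention described above that the bucket name ends in `-vcm`.

```shell
# Hypothetical helper: checks that a bucket name looks like the
# auto-created AutoML bucket (assumed to end in "-vcm").
check_bucket_name() {
  case "$1" in
    ""|YOUR_BUCKET_NAME)
      echo "not set" ;;
    *-vcm)
      echo "ok" ;;
    *)
      echo "unexpected name" ;;
  esac
}

# Example with a made-up project ID; in Cloud Shell, pass "${BUCKET}" instead.
check_bucket_name "my-project-123-vcm"
```

If the helper reports anything other than `ok`, re-run the `export` command with the bucket name shown in the Storage browser.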

Next, using the `gsutil` command line utility for Cloud Storage, copy the training images into your bucket:

gsutil -m cp -r gs://automl-codelab-clouds/* gs://${BUCKET}

When the images finish copying, hit refresh on your storage bucket viewer and you should see a folder of photos for each of the three cloud types we'll be classifying:

If you click on the individual image files in each folder you can see the photos we'll be using to train our model for each type of cloud.

Now that we've got our training data in Cloud Storage, we need a way for AutoML Vision to find it. To do this we'll create a CSV where each row contains a URL to a training image and the associated label for that image. We've created this CSV file for you; you just need to update it with your bucket name.
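To make the row format concrete, here is a minimal sketch of how such rows could be generated. The bucket name and file names (`1.jpg` and so on) are made up for illustration, the idea that each folder name doubles as the label is an assumption, and `make_row` is a hypothetical helper. The provided CSV remains the file to use in this lab.

```shell
# Hypothetical helper: prints one CSV row, "<image URL>,<label>".
# Assumes each image lives in a folder named after its label.
make_row() {
  bucket="$1"; label="$2"; file="$3"
  printf 'gs://%s/%s/%s,%s\n' "$bucket" "$label" "$file" "$label"
}

# Made-up bucket and file names, for illustration only.
make_row "my-project-123-vcm" "cirrus" "1.jpg"
make_row "my-project-123-vcm" "cumulus" "1.jpg"
make_row "my-project-123-vcm" "cumulonimbus" "1.jpg"
```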

First, copy this file to your Cloud Shell instance:

gsutil cp gs://automl-codelab-metadata/data.csv .

Then run the following command to update the CSV with the files in your project:

sed -i -e "s/placeholder/${BUCKET}/g" ./data.csv
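If you'd like to see what this substitution does, the sketch below applies the same `sed` expression to a small sample file, writing to standard output instead of editing in place, and then counts any rows that still contain `placeholder`. The sample rows and bucket name are made up, and `replace_placeholder` is a hypothetical helper.

```shell
# Hypothetical helper: prints the CSV with "placeholder" swapped for a bucket name.
# Arguments: <bucket name> <csv file>
replace_placeholder() {
  sed -e "s/placeholder/$1/g" "$2"
}

# Build a tiny sample CSV in a temporary directory (rows are illustrative).
tmpdir="$(mktemp -d)"
printf '%s\n' \
  'gs://placeholder/cirrus/1.jpg,cirrus' \
  'gs://placeholder/cumulus/1.jpg,cumulus' > "${tmpdir}/sample.csv"

# Show the updated rows.
replace_placeholder "my-project-123-vcm" "${tmpdir}/sample.csv"

# Count rows that still contain the placeholder; 0 means every row was updated.
replace_placeholder "my-project-123-vcm" "${tmpdir}/sample.csv" | grep -c 'placeholder' || true
```

Running `grep -c 'placeholder' ./data.csv` on the real file after the `sed` command gives the same kind of check.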

Now you're ready to upload this file to your Cloud Storage bucket:

gsutil cp ./data.csv gs://${BUCKET}

Confirm that you see the CSV file in your storage browser and then navigate back to the AutoML Vision UI. Click on "Get started with AutoML".

With the second option selected (Select a CSV file on Cloud Storage), enter the URL of the file you just uploaded (for example, gs://your-project-name-vcm/data.csv).

For this example, leave "enable multi-label classification" unchecked. In the future, you may want to check this box if you're doing multi-label classification, where a single image can have more than one label.

Select Create Dataset.

It will take around 2 minutes for your images to finish importing. Once the import has completed, you'll be brought automatically to the Images tab, which shows all the images in your dataset.

Try filtering by different labels (for example, click cumulus) to review the training images:

If any images are labeled incorrectly you can click on them to switch the label or delete the image from your training set:

To see a summary of how many images you have for each label, click on Label stats. You should see the following show up on the left side of your browser.
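You can compute a similar per-label summary yourself from the CSV. The sketch below is a hypothetical helper, `label_counts`, run on a made-up three-row sample. It assumes the two-column layout described earlier (image URL, then label).

```shell
# Hypothetical helper: prints "<label> <count>" for each label in a
# two-column CSV ("<image URL>,<label>"), sorted by label.
label_counts() {
  awk -F, '{ count[$2]++ } END { for (l in count) print l, count[l] }' "$1" \
    | LC_ALL=C sort
}

# Made-up sample rows, for illustration only.
tmpdir="$(mktemp -d)"
printf '%s\n' \
  'gs://my-project-123-vcm/cirrus/1.jpg,cirrus' \
  'gs://my-project-123-vcm/cirrus/2.jpg,cirrus' \
  'gs://my-project-123-vcm/cumulus/1.jpg,cumulus' > "${tmpdir}/sample.csv"

label_counts "${tmpdir}/sample.csv"
```

In Cloud Shell, `label_counts ./data.csv` would summarize the real file, which is a handy cross-check against what the UI reports.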

We're ready to start training our model! AutoML Vision handles this for us automatically, without requiring us to write any of the model code. To train your clouds model, go to the Train tab and click start training.

Enter a name for your model, or use the default auto-generated name, and click start training.

Since this is a small dataset, training will take only around 15 minutes to complete. When training completes, you'll be brought to the Evaluate tab.

In the Evaluate tab, you'll see information about AUC, precision and recall of the model.

You can also play around with the score threshold.
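To see why moving the threshold trades precision against recall, here is a small sketch for a single label. Everything in it is illustrative: the scores are invented, `precision_recall` is a hypothetical helper, and it assumes a prediction counts as positive when its score is at or above the threshold. It is not how AutoML computes its metrics internally.

```shell
# Hypothetical helper: given a threshold and a file of "<score>,<truth>" rows
# for one label (truth is 1 if the image really has the label, else 0),
# prints precision and recall when scores >= threshold count as predictions.
precision_recall() {
  LC_ALL=C awk -F, -v t="$1" '
    $1 + 0 >= t + 0 { if ($2 == 1) tp++; else fp++ }
    $1 + 0 <  t + 0 { if ($2 == 1) fn++ }
    END {
      p = (tp + fp > 0) ? tp / (tp + fp) : 0
      r = (tp + fn > 0) ? tp / (tp + fn) : 0
      printf "precision=%.2f recall=%.2f\n", p, r
    }' "$2"
}

# Invented scores for five images, three of which truly have the label.
tmpdir="$(mktemp -d)"
printf '%s\n' '0.95,1' '0.80,1' '0.60,0' '0.40,1' '0.20,0' > "${tmpdir}/scores.csv"

precision_recall 0.3 "${tmpdir}/scores.csv"
precision_recall 0.5 "${tmpdir}/scores.csv"
precision_recall 0.9 "${tmpdir}/scores.csv"
```

With this sample, raising the threshold from 0.3 to 0.9 moves precision up and recall down, which is the same trade-off the slider in the Evaluate tab lets you explore.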

Finally, you can take a look at the confusion matrix.
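A confusion matrix is just a tally of (true label, predicted label) pairs. The sketch below builds such a tally from a made-up list of predictions. `confusion_counts` is a hypothetical helper and the rows are invented, so the numbers say nothing about your actual model.

```shell
# Hypothetical helper: reads rows of "<true label>,<predicted label>" and
# prints "<true label> -> <predicted label>: <count>" for each pair seen.
confusion_counts() {
  awk -F, '{ pair[$1 " -> " $2]++ }
           END { for (p in pair) print p ": " pair[p] }' "$1" | LC_ALL=C sort
}

# Invented predictions: one cumulus image is mistaken for cumulonimbus.
tmpdir="$(mktemp -d)"
printf '%s\n' \
  'cirrus,cirrus' \
  'cirrus,cirrus' \
  'cumulus,cumulus' \
  'cumulus,cumulonimbus' \
  'cumulonimbus,cumulonimbus' > "${tmpdir}/predictions.csv"

confusion_counts "${tmpdir}/predictions.csv"
```

Pairs where the two labels differ are the off-diagonal cells of the matrix, and they point to the classes the model confuses.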

All of this provides some common machine learning metrics to evaluate your model accuracy and see where you can improve your training data. Since our focus here was not on accuracy, we'll skip to the prediction section but feel free to browse the accuracy metrics on your own.

Now it's time for the most important part: generating predictions on our trained model using data it hasn't seen before. Navigate to the Predict tab in the AutoML UI:

There are a few ways to generate predictions. In this lab we'll use the UI to upload images. We'll see how our model does at classifying these two images (the first is a cirrus cloud, the second is a cumulonimbus):

Download these images by right-clicking. Return to the UI, select upload images and upload them to the online prediction UI. When the prediction request completes you should see something like the following:

Pretty cool - our model classified each type of cloud correctly!
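The UI is only one way to get predictions; the REST API mentioned at the start is another. The sketch below shows roughly what a request could look like. Treat the details as assumptions to check against the current API reference: the JSON shape, the `v1beta1` endpoint path and the `us-central1` location are recalled from the API at the time of this lab, `PROJECT_ID` and `MODEL_ID` are placeholders you would fill in, and both helper functions are hypothetical. Only the request-building step is actually run here, on a stand-in file.

```shell
# Hypothetical helper: wraps a local image in the JSON body assumed for an
# AutoML Vision online prediction request (image bytes are base64-encoded).
build_predict_request() {
  encoded="$(base64 < "$1" | tr -d '\n')"
  printf '{"payload": {"image": {"imageBytes": "%s"}}}\n' "$encoded"
}

# Hypothetical helper (defined but not called here): sends a request file.
# PROJECT_ID and MODEL_ID must be set by you; the endpoint is an assumption.
send_predict_request() {
  curl -s -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/models/${MODEL_ID}:predict" \
    -d @"$1"
}

# Local demonstration with a stand-in file instead of a real photo.
tmpdir="$(mktemp -d)"
printf 'abc' > "${tmpdir}/fake-image.jpg"
build_predict_request "${tmpdir}/fake-image.jpg"
```

To try it for real, you would write the output of `build_predict_request` for one of the downloaded cloud photos to a file and pass that file to `send_predict_request`.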

You've learned how to train your own custom machine learning model and generate predictions on it through the web UI. Now you've got what it takes to train a model on your own image dataset.

What we've covered

Next Steps