In this lab, you go from exploring a taxicab dataset to training and deploying a high-accuracy distributed model with Cloud ML Engine.

What you need

To complete this lab, you need:

What you learn

In this series of labs, you go from exploring a taxicab dataset to training and deploying a high-accuracy distributed model with Cloud ML Engine.

Activate Google Cloud Shell

From the GCP Console, click the icon (as depicted below) on the top-right toolbar:

Then click "Start Cloud Shell" as shown here:

It should only take a few moments to provision and connect to the environment:

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs on Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this lab can be done with just a browser or a Google Chromebook.

Once connected to Cloud Shell, you should see that you are already authenticated and that the project is already set to your PROJECT_ID:

gcloud auth list

Command output

Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)
gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If it is not, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

To launch Cloud Datalab:

Step 1

In Cloud Shell, type:

gcloud compute zones list

Pick a zone in a region geographically close to you.
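If the full list is long, you can optionally narrow it to a single region; the region name below is just an example, so substitute one near you:

```shell
# List only the zones in one region (us-central1 is an arbitrary example).
gcloud compute zones list --filter="region:us-central1"
```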

Step 2

In Cloud Shell, type:

datalab create dataengvm --zone <ZONE>

Datalab will take about 5 minutes to start.

Note: follow the prompts during this process.

If you are not yet familiar with Datalab, what follows is a graphical cheat sheet for the main Datalab functionality:

Move on to the next step.

Step 1

If necessary, wait for Datalab to finish launching. Datalab is ready when you see a message prompting you to do a "Web Preview".

Step 2

Click on the Web Preview icon on the top-left corner of the Cloud Shell ribbon. Switch to port 8081.

Note: The connection to your Datalab instance remains open for as long as the datalab command is active. If the Cloud Shell session running the datalab command is closed or interrupted, the connection to your Cloud Datalab VM will terminate.
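If the connection is interrupted, you should be able to re-attach to the same instance from a new Cloud Shell session without recreating it; the VM name below matches the create command used earlier, and the zone placeholder is whatever zone you chose:

```shell
# Reconnect to the existing Datalab VM (does not create a new one).
datalab connect dataengvm --zone <ZONE>
```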

Step 3

In Datalab, click on the icon for "Open ungit" in the top-right ribbon.

Step 4

In the Ungit window, select the text that reads /content/datalab/notebooks, remove the trailing notebooks so that it reads /content/datalab, and then press Enter.

In the panel that comes up, type the following as the GitHub repository to Clone from:

https://github.com/GoogleCloudPlatform/training-data-analyst

Then, click on Clone repository.
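If you prefer the command line to Ungit, cloning the same repository from a terminal on the Datalab VM should give an equivalent result; the target directory below is the same /content/datalab path used in the Ungit step:

```shell
# Clone the course repository into the Datalab content tree.
cd /content/datalab
git clone https://github.com/GoogleCloudPlatform/training-data-analyst
```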

In this lab, you will:

Step 1

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/datasets/ and open create_datasets.ipynb.

Step 2

In Datalab, click on Clear | All Cells (click on Clear, then in the drop-down menu, select All Cells). Now, read the narrative and execute each cell in turn.

In this lab, you will learn how the TensorFlow Python API works:

Step 1

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open a_tfstart.ipynb.

Step 2

In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.

In this lab, you will implement a simple machine learning model using tf.learn:

Step 1

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open b_tflearn.ipynb.

Step 2

In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.

In this lab, you will learn how to:

Step 1

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open c_batched.ipynb.

Step 2

In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.

In this lab, you will learn how to:

Step 1

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open d_experiment.ipynb.

Step 2

In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.

In this lab, you will learn how to:

Step 1

If you don't already have a bucket on Cloud Storage, create one from the Storage section of the GCP console. Bucket names have to be globally unique.
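As an alternative to the console, you can create the bucket from Cloud Shell with gsutil; the region and the bucket-name suffix below are placeholders to adapt (reusing your project ID is a common way to get a globally unique name):

```shell
# Look up the current project ID, then create a regional bucket.
# Bucket names must be globally unique, so the project ID is a
# convenient, collision-resistant prefix.
PROJECT=$(gcloud config get-value project)
gsutil mb -l us-central1 gs://${PROJECT}-ml
```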

Step 2

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/cloudmle and open cloudmle.ipynb.

Step 3

In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.

In this lab, you will improve the ML model using feature engineering. In the process, you will learn how to:

Step 1

In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/feateng and open feateng.ipynb.

Step 2

In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.

Your instructor will demo notebooks that perform hyperparameter tuning and training on 500 million rows of data. The changes to the model are minor (essentially just command-line parameters), but the impact on model accuracy is huge:

©Google, Inc. or its affiliates. All rights reserved. Do not distribute.