In this lab, you load data into BigQuery in multiple ways. You also transform the data you load, and you query the data.

What you need

To complete this lab, you need:

Access to a supported Internet browser:

A Google Cloud Platform project

What you learn

In this lab, you:

Task: In this section of the lab, you upload a CSV file to BigQuery using the BigQuery web UI.

BigQuery supports the following data formats when loading data into tables: CSV, JSON, AVRO, or Cloud Datastore backups. This example focuses on loading a CSV file into BigQuery.

Step 1

Open the Google Cloud Console (in the incognito window) and using the menu, navigate into BigQuery web UI.

Step 2

Click the blue arrow to the right of your project name and choose Create new dataset.

Step 3

In the ‘Create Dataset' dialog, for Dataset ID, type cpb101_flight_data and then click OK.

Step 4

Download the following file to your local machine. This file contains the data that will populate the first table.

Download airports.csv

Step 5

Create a new table in the cpb101_flight_data dataset to store the data from the CSV file. Click the create table icon (the plus sign) to the right of the cpb101_flight_data dataset.

Step 6

On the Create Table page, in the Source Data section:

Note: When you have created a table previously, the Create from Previous Job option allows you to quickly use your settings to create similar tables.

Step 7

In the Destination Table section:

Step 8

In the Schema section:

Step 9

In the Options section:

Step 10

Once the load job is complete, click cpb101_flight_data > AIRPORTS.

Step 11

On the Table Details page, click Details to view the table properties and then click Preview to view the table data.

Task: In this section of the lab, you upload multiple JSON files and an associated schema file to BigQuery using the CLI.

Step 1

Navigate to the Google Cloud Platform Console and to the right of your project name, click Activate Google Cloud Shell.

Step 2

Type the following command to download schema_flight_performance.json (the schema file for the table in this example) to your working directory.

curl https://storage.googleapis.com/cloud-training/CPB200/BQ/lab4/schema_flight_performance.json -o schema_flight_performance.json

Step 3

The JSON files containing the data for your table are stored in a Google Cloud Storage bucket. They have URIs like the following:

gs://cloud-training/CPB200/BQ/lab4/domestic_2014_flights_*.json

Type the following command to create a table named flights_2014 in the cpb101_flight_data dataset, using data from files in Google Cloud Storage and the schema file stored on your virtual machine.

Note that your Project ID is stored as a variable in Cloud Shell ($DEVSHELL_PROJECT_ID) so there's no need for you to remember it. If you require it, you can view your Project ID in the command line to the right of your username (after the @ symbol).

bq load --source_format=NEWLINE_DELIMITED_JSON $DEVSHELL_PROJECT_ID:cpb101_flight_data.flights_2014 gs://cloud-training/CPB200/BQ/lab4/domestic_2014_flights_*.json ./schema_flight_performance.json

If you are prompted to select a project to be set as default, choose the Project ID that was setup when you started this qwiklab.

Step 4

Once the table is created, type the following command to verify table flights_2014 exists in dataset cpb101_flight_data.

bq ls $DEVSHELL_PROJECT_ID:cpb101_flight_data

The output should look like the following:

tableId Type

-------------- -------

AIRPORTS TABLE

flights_2014 TABLE

Task: In this section of the lab, you export a BigQuery table using the web UI.

Step 1

If you don't already have a bucket on Cloud Storage, create one from the Storage section of the GCP console. Bucket names have to be globally unique.

Step 2

Back to the Google Cloud Console (in the incognito window) and using the menu, navigate into BigQuery web UI

Step 3

Select the AIRPORTS table that you created recently, and using the "down" button to its right, select the option for Export Table.

Step 4

In the dialog, specify gs://<your-bucket-name>/bq/airports.csv and click OK.

Step 5

Use the CLI to export the table:

bq extract cpb101_flight_data.AIRPORTS gs://<your-bucket-name>/bq/airports2.csv

Remember to change <your-bucket-name> with the bucket you created earlier.

Step 6

Browse to your bucket and ensure that both .csv files have been created.

In this lab, you:

©Google, Inc. or its affiliates. All rights reserved. Do not distribute.