BigQuery is a fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics. BigQuery is serverless: there are no clusters to set up or manage.

A BigQuery dataset resides in a project. A dataset contains one or more tables with data.

In this codelab, you will use the BigQuery web UI to query public datasets, load your own data, and export data to a Cloud Storage bucket.

For more information about BigQuery, see the BigQuery documentation.

What you'll learn

How to query public datasets in the BigQuery web UI
How to load your own data into a BigQuery table
How to export table data to a Cloud Storage bucket

What you'll need

To complete this lab, you need a Google Cloud Platform account and a web browser.

Create a project

Open the Cloud Platform Console. Click Select a project, and then click the + icon to create a project.

In the New Project dialog, for Project name, type BQProject. Click Create.

Browse the Stack Overflow dataset

Open the Stack Overflow dataset. The BigQuery web UI opens in a new browser tab and displays the tables in the Stack Overflow dataset. The navigation pane lists other public datasets; each dataset comprises one or more tables.

For more information about all the public datasets available in BigQuery, see Google BigQuery Public Datasets.

Query the Stack Overflow dataset

Click Compose Query. In the New Query text area, copy and paste the following SQL query. The #standardSQL prefix tells BigQuery to treat the query as standard SQL. The UI validates the query and displays a green check mark below the text area when the syntax is valid.

#standardSQL
SELECT badge_name AS First_Gold_Badge, 
       COUNT(1) AS Num_Users,
       ROUND(AVG(tenure_in_days)) AS Avg_Num_Days
FROM
(
  SELECT 
    badges.user_id AS user_id,
    badges.name AS badge_name,
    TIMESTAMP_DIFF(badges.date, users.creation_date, DAY) AS tenure_in_days,
    ROW_NUMBER() OVER (PARTITION BY badges.user_id
                       ORDER BY badges.date) AS row_number
  FROM 
    `bigquery-public-data.stackoverflow.badges` badges
  JOIN
    `bigquery-public-data.stackoverflow.users` users
  ON badges.user_id = users.id
  WHERE badges.class = 1  -- class 1 = gold badges
) 
WHERE row_number = 1  -- keep only each user's first gold badge
GROUP BY First_Gold_Badge
ORDER BY Num_Users DESC
LIMIT 10

Click Run Query. The query returns the top 10 gold badges, ranked by how many users earned them as their first gold badge, along with the average number of days users took to earn each badge after creating their accounts.

Row  First_Gold_Badge  Num_Users  Avg_Num_Days
1    Famous Question   176982     1169.0
2    Fanatic           15085      618.0
3    Unsung Hero       12875      595.0
4    Great Answer      10641      1363.0
5    Electorate        5769       829.0
6    Populist          5491       1227.0
7    Steward           1005       993.0
8    Great Question    582        717.0
9    Copy Editor       253        595.0

You can also query sets of tables using wildcard table names. For more information, see Querying sets of tables using wildcard tables.
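For example, the following query is a minimal sketch of a wildcard query; it assumes the bigquery-public-data.noaa_gsod public dataset, whose daily weather tables are sharded by year as gsod1929, gsod1930, and so on. It counts the rows in every table whose name matches gsod19*:

#standardSQL
SELECT
  CONCAT('19', _TABLE_SUFFIX) AS year,  -- _TABLE_SUFFIX is the part of the table name matched by *
  COUNT(*) AS station_days
FROM
  `bigquery-public-data.noaa_gsod.gsod19*`
GROUP BY year
ORDER BY year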

Load table data

Download the baby names zip file. The zip file, provided by the US Social Security Administration, contains approximately 7 MB of data about popular baby names.

Unzip the file in a local folder and note its location.

In the BigQuery web UI's navigation pane, next to the project name BQProject, click the down arrow icon, and click Create new dataset.

In the Create Dataset dialog, for Dataset ID, type babynames. Leave all of the other default settings in place and click OK.
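If you would rather script this step than use the dialog, newer versions of BigQuery also accept a DDL statement for dataset creation. The following is a minimal sketch, assuming the query runs in your BQProject project:

#standardSQL
CREATE SCHEMA IF NOT EXISTS babynames;  -- in BigQuery DDL, a schema is a dataset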

In the navigation pane, hover over the babynames dataset ID that you just created. Click the down arrow icon next to the ID and click Create new table.

In the Source Data section, click the Choose file button. Navigate to the data you unzipped earlier, and select the yob2014.txt file.

In the Destination Table section, for the destination table name, type names_2014.

In the Schema section, click Edit as Text.

Replace the contents of the Schema input area with the following schema:

name:string,gender:string,count:integer

Click Create Table.
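After the load job finishes, you can confirm that the data arrived by querying the new table. This is a quick check based on the schema defined above; it returns the five most common girls' names in the 2014 data:

#standardSQL
SELECT
  name,
  count
FROM
  `babynames.names_2014`
WHERE gender = 'F'  -- the SSA file codes gender as 'F' or 'M'
ORDER BY count DESC
LIMIT 5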

Export table data

Switch to the Cloud Platform Console browser tab. In the Cloud Platform Console's Products & Services menu, click Storage, and then click Create bucket.

In the Create a bucket page, specify a unique bucket name, select the Regional storage class, and choose us-central1 as the location. Click Create. Note the bucket name; you will use this bucket to store the data exported from BigQuery.

Switch to the BigQuery web UI browser tab. In the navigation pane, under the babynames dataset, hover over the names_2014 table. Click the down arrow icon, and then click Export Table.

In the Export to Google Cloud Storage dialog, select CSV as the export format.

For Google Cloud Storage URI, type <export bucket name>/babynames2014.csv, replacing <export bucket name> (including the angle brackets) with the name of the bucket you created earlier.

Click OK.

After the export completes, switch to the Cloud Platform Console browser tab. Refresh the page to view the complete contents of the Cloud Storage bucket. The babynames2014.csv object in the bucket contains the exported data.
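If you prefer to script exports, newer versions of BigQuery also support an EXPORT DATA statement. The sketch below assumes a bucket named my-export-bucket (a placeholder; substitute your own bucket name) and writes one or more CSV shards:

#standardSQL
EXPORT DATA OPTIONS(
  uri='gs://my-export-bucket/babynames2014-*.csv',  -- the URI must contain a * wildcard
  format='CSV',
  header=true,
  overwrite=true) AS
SELECT *
FROM `babynames.names_2014`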

Delete the resources

In the BigQuery web UI's navigation pane, hover over the babynames dataset ID. Click the down arrow icon next to the ID and click Delete dataset.

In the Delete Dataset dialog, for Dataset ID, type babynames, and click OK.
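Alternatively, a single DDL statement removes the dataset and everything it contains. This is a sketch; the CASCADE clause is what deletes the tables still in the dataset:

#standardSQL
DROP SCHEMA IF EXISTS babynames CASCADE;  -- CASCADE also deletes the dataset's tables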

Switch to the Cloud Platform Console browser tab. In the navigation pane, under Storage, click Browser.

Select the bucket that contains your BigQuery export data, and click Delete.

Delete the project

In the Products & Services menu, click IAM & Admin.

Click Settings, and then click Delete.

In the Shut down project dialog, read the notes, and then type the project ID.

Click Shut down.

Close the Cloud Platform Console.

Congratulations! You have now learned how to use the BigQuery web UI to query public datasets, load your own data, and export data. Note that you did not have to set up or manage any clusters to work with these datasets.