BigQuery is a serverless, highly scalable, and cost-effective data warehouse. Simply move your data into BigQuery and let us handle the hard work so you can focus on what truly matters, running your business. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data.
In this lab, you will discover the analytical possibilities of the BigQuery. You'll learn how to import a dataset from a Google Cloud Storage bucket and get a grasp of the BigQuery UI by working with a Retail banking dataset. Additionally, this lab will teach you how to uncover key features in BigQuery that make your day to day analytics much easier such as exporting query results in a spreadsheet, viewing and running queries from your query history, viewing query performance, and creating table views for to be used by other teams and departments.
In this lab, you learn how to perform the following tasks:
In this section you will learn how to navigate the BigQuery UI, view available datasets and run a simple query.
[Competitive Talking Point]: Integration with the Google Data Catalog means that BigQuery metadata can be managed along with other data sources, such as data lakes or operational data sources. This is one example that shows that Google Cloud is not just a relational data warehouse, it is an entire Analytical Data platform.
SELECT distinct (merchant) FROM bq_demo.card_transactions LIMIT 1000
Sharing data and governance is crucial, this can be done intuitively in the BQ UI. In this section you will learn how to create a new dataset, populate it with a view, and share that dataset.
[Competitive Talking Point]: BigQuery is fully ANSI SQL compliant and supports both simple and complex multi-table joins and rich analytical functions. We have continuously released enhanced support for common SQL data types and functions used in traditional data warehouses to ease the migration process.
WITH revenue_by_month AS ( SELECT card.type AS card_type, FORMAT_DATE('%Y-%m', trans_date) as revenue_date, SUM(amount) as revenue FROM bq_demo.card_transactions JOIN bq_demo.card ON card_transactions.cc_number = card.card_number WHERE trans_date DATE_ADD(CURRENT_DATE, INTERVAL -1 YEAR) GROUP BY card_type, revenue_date ) SELECT card_type, revenue_date, revenue as monthly_rev, revenue - LAG(revenue) OVER (ORDER BY card_type, revenue_date ASC) as rev_change FROM revenue_by_month ORDER BY card_type, revenue_date ASC;
[Competitive Talking Point]: Another benefit of BigQuery compared to its competitors is the BI Engine. BI Engine can be used to make BI type summary queries return in less than a second through in-memory caching engine. This is currently supported by Google Data Studio but will soon be available to accelerate all queries in BigQuery.
Snowflake relies on 3rd party BI tools for dashboards and data visualization while GCP offers a range of integrated BI tools, including Connected Sheets, Data Studio, and Looker.
In this section you will learn how to create a new table and perform a JOINS on one of the many public datasets that Google Cloud has available.
[Competitive Talking Point]:
BigQuery has supported shared data sets for years. Customers in any project can query both public data sets and data sets in other projects that have been shared with them.
BigQuery can support data lakes in GCS through the use of external tables. In addition to bulk loading, BigQuery supports the ability to stream data into the database at rates upwards of hundreds of MB per second. Snowflake has no support for streaming data.
SELECT CAST(geo_id as STRING) AS zip_code, total_pop, median_age, households, income_per_capita, housing_units, vacant_housing_units_for_sale, ROUND(SAFE_DIVIDE(employed_pop, pop_16_over),4) AS rate_employment, ROUND(SAFE_DIVIDE(bachelors_degree_or_higher_25_64, pop_25_64),4) AS rate_bachelors_degree_or_higher_25_64 FROM `bigquery-public-data.census_bureau_acs.zip_codes_2017_5yr`;
WITH customer_counts AS ( select regexp_extract(address, "[0-9][0-9][0-9][0-9][0-9]") as zip_code, count(*) as num_clients FROM bq_demo.client GROUP BY zip_code ) SELECT CAST(geo_id as STRING) AS zip_code, total_pop, median_age, households, income_per_capita, ROUND(SAFE_DIVIDE(employed_pop, pop_16_over),4) AS rate_employment, num_clients FROM `bigquery-public-data.census_bureau_acs.zip_codes_2017_5yr` JOIN customer_counts on zip_code = geo_id ORDER BY num_clients DESC
BQ offers multiple pricing models to meet your needs. Most large customers primarily leverage flat rate for predictable pricing with reserved capacity. For bursting beyond that baseline capacity, BQ offers flex slots which allow you to grow into additional capacity on the fly and then automatically shrink back with no impact on running queries. BQ also has a byte scan model which allows you to only pay for the queries you run.
[Competitive Talking Point: Some competitors work exclusively on a fixed capacity model where customers have to allocate a virtual warehouse for each workload in their organization. In addition to a low-cost per-query model that makes it easy to get started with BigQuery, we support a flat rate capacity pricing model where idle capacity can be shared among a set of workloads.]