Maps can be a very powerful tool when visualizing the patterns in a dataset that is related to location in some way. This relation could be the name of a place, a specific latitude and longitude value, or the name of an area that has a specific boundary like a census tract or a postal code.

When these datasets get very large, they can be hard to query and visualize using conventional tools. By using Google BigQuery to query the data and the Google Maps APIs to construct the query and visualize the output, you can quickly explore geographic patterns in your data with very little setup or coding, and without having to manage a system to store very large datasets.

What you'll build

In this codelab, you'll write and run some queries that demonstrate how to provide location based insights into very large public datasets using BigQuery. You'll also build a web page that loads a map using the Google Maps JavaScript API, and that runs and visualizes spatial queries against the same very large public datasets using the Google APIs Client Library for Javascript, and the BigQuery API.

What you'll learn

What you'll need

BigQuery

BigQuery is Google's data analytics service for very large datasets. It has a RESTful API and supports queries written in SQL. If you have data with latitude and longitude values they can be used to query your data by location. The advantage is you can visually explore very large datasets to look at the patterns without having to manage any server or database infrastructure. You can get answers to your questions in a few seconds no matter how large your tables grow using BigQuery's massive scalability and managed infrastructure.

Google Maps API

The Google Maps APIs provide programmatic access to the data in Google Maps. Over 2 million websites and apps currently use it to provide embedded maps and location based queries to their users.

The Google Maps Javascript API Drawing Layer allows you to draw shapes on the map. These can be converted into input to run queries against BigQuery tables that have latitude and longitude values stored in columns.

To get started you need a Google Developers Console project with BigQuery and Maps APIs enabled.

Google Account

If you don't already have a Google Account (Gmail or Google Apps), you must create one.

Create a Project

Sign in to Google Cloud Platform console (console.cloud.google.com) and create a new project. At the top of your screen, there is a Project drop down menu:

Once you click on this project drop down menu, you will get a menu item that allows you to create a new project:

In the box that says "Enter a new name for your project", enter a name for your new project, for example "BigQuery Codelab":

A Project ID will be generated for you. The Project ID is a unique name across all Google Cloud projects. Remember your Project ID, as you'll use it later. The name above has already been taken and will not work for you. Insert your own Project ID wherever you see YOUR_PROJECT_ID in this codelab.

Enable Billing

To sign up for BigQuery, use the project selected or created in the previous step. Billing must be enabled on this project. Once billing is enabled, you can enable the BigQuery API.

How you enable billing depends on whether you're creating a new project or you're re-enabling billing for an existing project.

Google offers a 12 month free trial for up to $300 worth of Google Cloud Platform usage which you may be able to use for this Codelab, find out more details at https://cloud.google.com/free/.

New projects

When you create a new project, you're prompted to choose which of your billing accounts you want to link to the project. If you have only one billing account, that account is automatically linked to your project.

If you don't have a billing account, you must create one and enable billing for your project before you can use many Google Cloud Platform features. To create a new billing account and enable billing for your project, follow the instructions in Create a new billing account.

Existing projects

If you have a project that you temporarily disabled billing for, you can re-enable billing:

  1. Go to the Cloud Platform Console.
  2. From the projects list, select the project to re-enable billing for.
  3. Open the console left side menu and select Billing . You're prompted to select a billing account.
  4. Click Set account.

Create a new Billing account

To create a new billing account:

  1. Go to the Cloud Platform Console and sign in or, if you don't already have an account, sign up.
  2. Open the console left side menu and select Billing
  3. Click the New billing account button. (Note that if this is not your first billing account, first you need to open the billing account list by clicking the name of your existing billing account near the top of the page, and then clicking Manage billing accounts.)
  4. Enter the name of the billing account and enter your billing information. The options you see depend on the country of your billing address. Note that for United States accounts, you cannot change tax status after the account is created.
  5. Click Submit and enable billing.

By default, the person who creates the billing account is a billing administrator for the account.

For information about verifying bank accounts and adding backup methods of payment, see Add, remove, or update a payment method.

Enable the BigQuery API

Open the Google Cloud Console and click on the API Manager item in the left menu. You may need to open the "Product and Services" menu by clicking the three-line icon at the top left.

If you're on the Dashboard page it may also be shown as a card in the main page.

From the API Manager screen, click "Enable API" at the top of the main screen.

A list of APIs will be shown.

Find and click "Google BigQuery API".

On the BigQuery page, if the API is not already enabled, click "Enable".

There are three ways to query location data stored as latitude, longitude values in BigQuery.

To get started, use the Web UI to run the following queries against the NYC taxi data.

Rectangle Queries

Rectangle queries are quite straightforward to construct in BigQuery. You just need to add a WHERE clause that limits the results returned to those with locations between the minimum and maximum values for latitude and longitude.

Try the example below in the BigQuery console. This queries for some average trip statistics for rides that started in a rectangular area that contains midtown and lower Manhattan. There are two different locations you can try, uncomment the second WHERE clause to run the query on rides that started at JFK airport.

SELECT 
ROUND(AVG(tip_amount),2) as avg_tip, 
ROUND(AVG(fare_amount),2) as avg_fare, 
ROUND(AVG(trip_distance),2) as avg_distance, 
ROUND(AVG(tip_proportion),2) as avg_tip_pc, 
ROUND(AVG(fare_per_mile),2) as avg_fare_mile FROM

(SELECT 

pickup_latitude, pickup_longitude, tip_amount, fare_amount, trip_distance, (tip_amount / fare_amount)*100.0 as tip_proportion, fare_amount / trip_distance as fare_per_mile

FROM [bigquery-public-data:new_york.tlc_yellow_trips_2015] 

WHERE trip_distance > 0.01 AND fare_amount <100 AND payment_type = "1"
)

--Manhattan
WHERE pickup_latitude < 40.7679 AND pickup_latitude > 40.7000 AND pickup_longitude < -73.97 and pickup_longitude > -74.01

--JFK
--WHERE pickup_latitude < 40.654626 AND pickup_latitude > 40.639547 AND pickup_longitude < -73.771497 and pickup_longitude > -73.793755

The results for the two queries show that there are big differences in average trip distance, fare, and tip for pick ups in the two locations.

Manhattan

avg_tip

avg_fare

avg_distance

avg_tip_pc

avg_fare_mile

2.52

12.03

9.97

22.39

5.97

JFK

avg_tip

avg_fare

avg_distance

avg_tip_pc

avg_fare_mile

9.22

48.48

41.19

22.48

4.36

Radius Queries

Radius queries are also easy to construct in SQL if you know a bit of math. Using BigQuery's Legacy SQL Math functions you can construct an SQL query using the Haversine Formula which approximates a circular area or spherical cap on the earth's surface.

Here's an example BigQuery SQL statement for a circle query centred at 40.73943, -73.99585 with a radius of 0.1km.

It uses a constant value of 111.045 kilometers to approximate the distance represented by one degree.

This is based on an example found at http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/:

SELECT pickup_latitude, pickup_longitude, 
    (111.045 * DEGREES( 
      ACOS( 
        COS( RADIANS(40.73943) ) * 
        COS( RADIANS( pickup_latitude ) ) * 
        COS( 
          RADIANS( -73.99585 ) - 
          RADIANS( pickup_longitude ) 
        ) + 
        SIN( RADIANS(40.73943) ) * 
        SIN( RADIANS( pickup_latitude ) ) 
      ) 
     ) 
    ) AS distance FROM [project:dataset.tableName] 
    HAVING distance < 0.1 

BigQuery supports two versions of SQL: Legacy SQL and Standard SQL. The latter is the 2011 ANSI standard.

Standard SQL

In BigQuery Legacy SQL, this query will run as is. However Standard SQL doesn't support the DEGREES and RADIANS functions, so to use Standard SQL the query will need to be changed. DEGREES and RADIANS can be easily calculated in Standard SQL using User Defined Functions.

Also, note that the syntax for referencing the table name is different in Standard SQL: `project.dataset.tableName` instead of [project:dataset.tableName].

The same query looks like this in Standard SQL. For clarity the distance calculation has been written as a User Defined Function too.

#standardSQL
CREATE TEMPORARY FUNCTION Degrees(radians FLOAT64) RETURNS FLOAT64 AS
(
  (radians*180)/(22/7)
);

CREATE TEMPORARY FUNCTION Radians(degrees FLOAT64) AS (
  (degrees*(22/7))/180
);

CREATE TEMPORARY FUNCTION DistanceKm(lat FLOAT64, lon FLOAT64, lat1 FLOAT64, lon1 FLOAT64) AS (
     Degrees( 
      ACOS( 
        COS( Radians(lat1) ) * 
        COS( Radians(lat) ) *  
        COS( Radians(lon1 ) -  
        Radians( lon ) ) +  
        SIN( Radians(lat1) ) *  
        SIN( Radians( lat ) ) 
        ) 
    ) * 111.045
);

SELECT pickup_latitude, pickup_longitude, DistanceKm(pickup_latitude, pickup_longitude, 40.73943, -73.99585) 

FROM `project.dataset.tableName` 

WHERE 
DistanceKm(pickup_latitude, pickup_longitude, 40.73943, -73.99585) < 0.1

The SQL for the Haversine Formula looks complicated but all you need to do is plug in your circle centre coordinate, the radius and the project, dataset and table names for BigQuery.

Here's an example query that computes some average trip statistics for pickups within 100m of the Empire State Building. Copy and paste this into the BigQuery web console to see the results. Change the latitude and longitude to compare with other areas such as the location in the Bronx.

Legacy SQL

SELECT 

ROUND(AVG(tip_amount),2) as avg_tip,
ROUND(AVG(fare_amount),2) as avg_fare,
ROUND(AVG(trip_distance),2) as avg_distance,
ROUND(AVG(tip_proportion), 2) as avg_tip_pc,
ROUND(AVG(fare_per_mile),2) as avg_fare_mile

FROM

-- EMPIRE STATE BLDG 40.748459, -73.985731
-- BRONX 40.895597, -73.856085
(SELECT pickup_latitude, pickup_longitude, tip_amount, fare_amount, trip_distance, tip_amount/fare_amount*100 as tip_proportion, fare_amount / trip_distance as fare_per_mile, 
    (111.045 * DEGREES( 
      ACOS( 
        COS( RADIANS(40.748459) ) * 
        COS( RADIANS( pickup_latitude ) ) * 
        COS( 
          RADIANS( -73.985731 ) - 
          RADIANS( pickup_longitude ) 
        ) + 
        SIN( RADIANS(40.748459) ) * 
        SIN( RADIANS( pickup_latitude ) ) 
      ) 
     ) 
    ) AS distance FROM [bigquery-public-data:new_york.tlc_yellow_trips_2015] 
    WHERE fare_amount > 0 and trip_distance > 0
    HAVING distance < 0.1 )
WHERE fare_amount < 100

Standard SQL

#standardSQL
CREATE TEMPORARY FUNCTION Degrees(radians FLOAT64) RETURNS FLOAT64 AS
(
  (radians*180)/(22/7)
);

CREATE TEMPORARY FUNCTION Radians(degrees FLOAT64) AS (
  (degrees*(22/7))/180
);

CREATE TEMPORARY FUNCTION DistanceKm(lat FLOAT64, lon FLOAT64, lat1 FLOAT64, lon1 FLOAT64) AS (
     Degrees( 
      ACOS( 
        COS( Radians(lat1) ) * 
        COS( Radians(lat) ) *  
        COS( Radians(lon1 ) -  
        Radians( lon ) ) +  
        SIN( Radians(lat1) ) *  
        SIN( Radians( lat ) ) 
        ) 
    ) * 111.045
);

SELECT 

ROUND(AVG(tip_amount),2) as avg_tip,
ROUND(AVG(fare_amount),2) as avg_fare,
ROUND(AVG(trip_distance),2) as avg_distance,
ROUND(AVG(tip_proportion), 2) as avg_tip_pc,
ROUND(AVG(fare_per_mile),2) as avg_fare_mile

FROM

-- EMPIRE STATE BLDG 40.748459, -73.985731
-- BRONX 40.895597, -73.856085

(SELECT pickup_latitude, pickup_longitude, tip_amount, fare_amount, trip_distance, tip_amount/fare_amount*100 as tip_proportion, fare_amount / trip_distance as fare_per_mile, DistanceKm(pickup_latitude, pickup_longitude, 40.748459, -73.985731)


FROM `bigquery-public-data.new_york.tlc_yellow_trips_2015` 

WHERE 
  DistanceKm(pickup_latitude, pickup_longitude, 40.748459, -73.985731) < 0.1
  AND fare_amount > 0 and trip_distance > 0
  )
WHERE fare_amount < 100

The results of the query are below. You can see that there are big differences in the average tip, fare, trip distance, the proportionate size of the tip to the fare, and the average fare per mile driven.

Empire State Building:

avg_tip

avg_fare

avg_distance

avg_tip_pc

avg_fare_mile

1.17

11.08

45.28

10.53

6.42

The Bronx

avg_tip

avg_fare

avg_distance

avg_tip_pc

avg_fare_mile

0.52

17.63

4.75

4.74

10.9

Polygon Queries

SQL doesn't support querying using arbitrary shapes other than rectangles and circles. BigQuery doesn't have any native geometry data type or spatial index, so to run queries using polygon shapes you need a different approach to straightforward SQL queries. One approach is to define a geometry function in JavaScript and execute it as a User Defined Function (UDF) in BigQuery.

Many geometry operations can be written in Javascript so it's easy to take one and execute it against a BigQuery table that contains latitude and longitude values. You need to pass the custom polygon in via a UDF and perform a test against each row, returning only rows where the latitude and longitude fall inside the polygon. Find out more about UDFs in the BigQuery reference.

Point In Polygon algorithm

There are many ways to compute whether a point falls inside a polygon in Javascript. Here's one that is a port from C of a well known implementation that uses a ray-tracing algorithm to determine if a point is inside or outside a polygon by counting the number of times an infinitely long line crosses the boundary of the shape. It only takes up a few lines of code:

function pointInPoly(nvert, vertx, verty, testx, testy){
  var i, j, c = 0;
  for (i = 0, j = nvert-1; i < nvert; j = i++) {
    if ( ((verty[i]>testy) != (verty[j]>testy)) &&
                (testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) )
      c = !c;
  }
  return c;
}

Porting to JavaScript

The JavaScript version of this algorithm is in third_party/point-in-polygon.js

/* This function includes a port of C code to calculate point in polygon
* see http://www.ecse.rpi.edu/Homepages/wrf/Research/Short_Notes/pnpoly.html for license
*/

function pointInPoly(polygon, point){
    // Convert a JSON poly into two arrays and a vertex count.
    var vertx = [];
    var verty = [];
    var nvert = 0;
    var testx = point[0];
    var testy = point[1];
    for (coord in polygon){
      vertx[nvert] = polygon[coord][0];
      verty[nvert] = polygon[coord][1];
      nvert ++;
    }

        
    // The rest of this function is the ported implementation.
    var i, j, c = 0;
    for (i = 0, j = nvert-1; i < nvert; j = i++) {
      if ( ((verty[i]>testy) != (verty[j]>testy)) &&
         (testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) )
        c = !c;
    }
    return c;
}

Legacy SQL vs Standard SQL

The UDF approach is quite different in Standard SQL to Legacy SQL.

In Legacy SQL you must specify a JavaScript function using the built in bigquery.defineFunction() method. You then reference the defined JavaScript function by name in the SQL statement. The BigQuery web console gives you two separate windows for these two separate parts of the query.

Legacy SQL example


To try this out in Legacy SQL, copy & paste the following JavaScript function into the UDF editor window of a new query.

//JavaScript UDF for point in polygon calculation using Legacy SQL approach
bigquery.defineFunction("pointInPolygon",
  ["pickup_latitude", "pickup_longitude", "pickup_datetime"],
  [
    {name: "latitude", type: "float"},
    {name: "longitude", type: "float"},
    {name:"pickup_datetime", type:"string"}
  ],
  inPoly
);

function pointInPoly(polygon, point){
  var vertx = [];
  var verty = [];
  var nvert = 0;
  var testx = point[0];
  var testy = point[1];
  for(coord in polygon){
    vertx[nvert] = polygon[coord][0];
    verty[nvert] = polygon[coord][1];
    nvert ++;
  }
  var i, j, c = 0;
  for (i = 0, j = nvert-1; i < nvert; j = i++) {
    if ( ((verty[i]>testy) != (verty[j]>testy)) &&(testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) )
    {c = !c;}
  }
  return c;
}

function inPoly(row, emit){
  var poly=[
    [-73.98925602436066,40.743249676056955],
    [-73.98836016654968,40.74280666503313],
    [-73.98915946483612,40.741676770346295],
    [-73.98967981338501,40.74191656974406]
  ];
  var pt = [row.pickup_longitude,row.pickup_latitude];
  var result = pointInPoly(poly, pt);
  if(result) {
    emit({latitude: pt[1], longitude: pt[0]});
  }
}

Next, copy and paste the following SQL statement. Notice how it references pointInPolygon(), the JS function defined above.

SELECT latitude,longitude FROM pointInPolygon(
  SELECT pickup_longitude, pickup_latitude, pickup_datetime
  FROM [bigquery-public-data:new_york.tlc_yellow_trips_2016]
  WHERE pickup_datetime BETWEEN TIMESTAMP("2016-01-01 00:00:01") AND TIMESTAMP("2016-01-29 23:59:59")
)
LIMIT 1000

Click "Run Query" to try this out.

Standard SQL example

When using Standard SQL in BigQuery, the UDF approach requires just a single statement, but the UDF must be defined as a temporary function in the statement. Here's an example. To use it, click "View Options" next to the Run Query button, and uncheck "Use Legacy SQL". The UDF Editor window will not be available. Paste the SQL statement below into the Query Editor window.

CREATE TEMPORARY FUNCTION pointInPolygon(latitude FLOAT64, longitude FLOAT64)
RETURNS BOOL LANGUAGE js AS """
  var polygon=[[-73.98925602436066,40.743249676056955],[-73.98836016654968,40.74280666503313],[-73.98915946483612,40.741676770346295],[-73.98967981338501,40.74191656974406]];

  var vertx = [];
  var verty = [];
  var nvert = 0;
  var testx = longitude;
  var testy = latitude;

  for(coord in polygon){
    vertx[nvert] = polygon[coord][0];
    verty[nvert] = polygon[coord][1];
    nvert ++;
  }
  var i, j, c = 0;
  for (i = 0, j = nvert-1; i < nvert; j = i++) {
    if ( ((verty[i]>testy) != (verty[j]>testy)) &&(testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) ){
      c = !c;
    }
  }
  return c;
""";

SELECT pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude, pickup_datetime
FROM `bigquery-public-data.new_york.tlc_yellow_trips_2016`
WHERE pointInPolygon(pickup_latitude, pickup_longitude) = TRUE
AND (pickup_datetime BETWEEN TIMESTAMP("2016-01-01 00:00:01") AND TIMESTAMP("2016-02-28 23:59:59"))
LIMIT 1000

Congratulations!

You have now run three types of spatial query using BigQuery. As you have seen, location makes a big difference to the result data for the queries against this dataset. But, unless you guess where to run your queries, it is hard to discover spatial patterns ad-hoc using just SQL queries. If only we could visualise the data on a map, and explore the data by defining arbitrary areas of interest! Well, using the Google Maps APIs you can do just that. First, you need to enable the Maps API, set up a simple web page running on your local machine, and start using the BigQuery API to send queries from your web page.

Having run some simple spatial queries, the next step is to visualize the output to see the patterns. To do this you will enable the Google Maps API and build a web page that sends queries from a map to BigQuery, and draws the results on the map.

Enable the Google Maps API

To sign up for the Google Maps APIs, visit the developer site and click 'Get a Key'.

You'll see a dialog pop up asking you to select or create a project.

To use the project you just created, or an existing project, select it from the list and click "Enable API".

You should see a dialog with your new API key for the Google Maps Javascript API. Click Finish.

Click the following button to download all the code for this codelab:

Download the code

Unpack the downloaded zip file. This will unpack a root folder (bigquery), which contains one folder for each step of this codelab, along with all of the resources you will need.

The stepN folders contain the desired end state of each step of this codelab. They are there for reference. We'll be doing all our coding work in a directory called work.

Set up a local web server

While you're free to use your own web server, this codelab is designed to work well with the Chrome Web Server. If you don't have that app installed yet, you can install it from the Chrome Web Store.

Install Web Server for Chrome

Once installed open the app. In Chrome you can do this as follows:

Open Chrome.

In the address bar at the top, type chrome://apps.

Press Enter.

Click the app you want to open.

Right-click an app to open it in a regular or pinned tab, full screen, or new window.

In the window that opens, click on the Web Server icon:

You'll see this dialog next, which allows you to configure your local web server:

Click "CHOOSE FOLDER" and select the location that you downloaded the codelab sample files to.

Under Options, check the box next to "Automatically show index.html", as shown below:

Then stop and restart the server by sliding the toggle labeled "Web Server: STARTED" to the left and then back to the right.

Use a text editor or Integrated Development Environment of your choice

This codelab can be completed using any text editor.

Create a basic map page

Start with a simple HTML page that loads a Google Map using the Google Maps API and a few lines of Javascript. The code from the Google Maps Javascript API Simple Map Sample is a great place to start. It's reproduced here for you to copy and paste into a text editor or IDE of your choice, or you can find it by opening index.html from the repo you downloaded.

Copy index.html to the work folder in your local copy of the repo.

Open index.html in a text editor of your choice. Change the API key to the one you created earlier.

<script src="https://maps.googleapis.com/maps/api/js?key=YOUR_API_KEY&callback=initMap"
    async defer></script>

Change the default location and zoom level

This tutorial works with with BigQuery taxi trip data for New York, so change the map initialization code to center on a location in New York City at an appropriate zoom level. 11 or 12 should work well. NB latitude/longitude data is not in data from 2016 and later.

The code that sets the location and zoom level is on lines 27 & 28 of index.html:

<script>
      var map;
      function initMap() {
        map = new google.maps.Map(document.getElementById('map'), {
          center: {lat: -34.397, lng: 150.644},
          zoom: 8
        });
      }
</script>

When you've changed the code to center the map, load index.html in a web browser to check that it's working OK. The exact URL will vary depending on which web server you have set up earlier in the codelab. If you are using Web Server for Chrome, the URL will be http://127.0.0.1:8887/work/index.html.

Load the drawing and visualization libraries

To add drawing capabilities to your map, load the Google Maps JavaScript API with the optional drawing library enabled. First, make sure you're referencing the drawing library when you load the Javascript API. This codelab uses the HeatmapLayer, so you also need the visualization library. To do this, change your script tag to add the "libraries" parameter, specifying the visualization and drawing libraries as values, e.g. libraries=visualization,drawing

It should look like this:

<script src='http://maps.googleapis.com/maps/api/js?libraries=visualization,drawing&callback=initMap&key=YOUR_API_KEY' async defer></script>

Add the DrawingManager

To use user-drawn shapes as input to a query, add the DrawingManager to your map, with the Circle, Rectangle and Polygon tools enabled.

You can create a DrawingManager object as follows:

var drawingManager = new google.maps.drawing.DrawingManager();
drawingManager.setMap(map);

It is a good idea to put all of the DrawingManager set up code into a new function.

In your copy of index.html, add a function called setUpDrawingTools(). Add a line of code to the initMap() function to call this new function after the map object is created.

function initMap() {
  map = new google.maps.Map(document.getElementById('map'), {
    center: {lat: 40.744593, lng: -73.990370}, // Manhattan, New York.
    zoom: 12
  });

  setUpDrawingTools();
}

function setUpDrawingTools(){
    //code will go here
}

Options

Add code inside the setUpDrawingTools() function to create the DrawingManager and set its map property to reference the map object in the page.

Provide DrawingManager Options to set the default shape drawing type, and display options for drawn shapes. For selecting areas of map to send as queries, the shapes should have an opacity of zero. You can do all of this by adding the following code to the setUpDrawingTools() function.

// Initialize drawing manager
drawingManager = new google.maps.drawing.DrawingManager({
  drawingMode: google.maps.drawing.OverlayType.CIRCLE,
  drawingControl: true,
  drawingControlOptions: {
    position: google.maps.ControlPosition.TOP_LEFT,
    drawingModes: [
      google.maps.drawing.OverlayType.CIRCLE,
      google.maps.drawing.OverlayType.POLYGON,
      google.maps.drawing.OverlayType.RECTANGLE
    ]
  },
  circleOptions: {
    fillOpacity: 0
  },
  polygonOptions: {
    fillOpacity: 0
  },
  rectangleOptions: {
    fillOpacity: 0
  }
});
drawingManager.setMap(map);

Events

You need some code to handle the events that are fired when a user finishes drawing a shape, as you need the coordinates of the drawn shapes to construct SQL queries. We will add code this in a later step, but for now add three empty event handlers to handle separate rectanglecomplete, circlecomplete and polygoncomplete events. The handlers do not have to run any code at this stage. Here's an example for the rectanglecomplete event handler.

drawingManager.addListener('rectanglecomplete', function (rectangle) {
    // We will add code here in a later step.
});
drawingManager.addListener('circlecomplete', function (circle) {
  // We will add code here in a later step.
});

drawingManager.addListener('polygoncomplete', function (polygon) {
  // We will add code here in a later step.
});

You can find a working example of this code in your local copy of the repo, in the step2 folder: step2/map.html.

Open your copy of index.html in your web server to check that you have the drawing tools visible and that you can draw circles, rectangles and polygon shapes (you can click and drag to draw circles and rectangles but polygons need to be drawn by clicking for each vertex, and double-clicking to finish the shape).

The Google BigQuery Client API will help you avoid writing lots of boilerplate code to build the requests, parse responses, and handle authentication. This codelab uses the BigQuery API via Google APIs Client Library for Javascript as we will be developing a browser based application.

You will add code to load this API in a web page and use it to interact with BigQuery.

Add the Google Client API for JavaScript

You will use the Google Client API for Javascript to run queries against BigQuery. In your copy of index.html (in your work folder), load the API using a <script> tag like this. Put the tag immediately below the <script> tag that loads the Maps API:

<script src='https://apis.google.com/js/client.js'></script>

After loading the Google Client API, authorize the user to access the data in BigQuery. To do this you can use OAuth 2.0. First, you need to set up some credentials in your Google Cloud Console Project.

Create OAuth 2.0 Credentials

In your Google Cloud Console project, open the API Manager. You may need to use the "Products and Services" menu from the three-line icon at the top left.

In the API Manager page, click "Credentials" from the menu on the left.

In the main page you will see a "Create Credentials" button. Click this.

From the drop down list that appears, select "OAuth Client ID".

Before you can set up your credentials, you need to add some configuration for the Authorization screen that an and user of your application will see when they authorize your app to access BigQuery data on their behalf.

Click the "Configure Consent Screen" button.

In the "Product name shown to users" box, type a name for your project. For example "BigQuery and Maps API Codelab".

Click "Save".

Now choose "Web application" from the Application Type options.

A form will appear. Enter a name for your client and click "Create". A pop up will show you the client ID and client secret.

You need the client ID to perform authentication against BigQuery. Copy it and paste it into work/index.html as a new global JavaScript variable called clientId. You also need to add a variable for the scopes parameter that OAuth 2.0 needs. In this case we just need to use the scope 'https://www.googleapis.com/auth/bigquery'

var clientId = 'YOUR_CLIENT_ID';
var scopes = 'https://www.googleapis.com/auth/bigquery';

Your web page will need to authorize the user to access BigQuery before initializing the map. In this example we use OAuth 2.0 as described in the authorization section of the JavaScript Client API documentation. You need to use the OAuth client ID and your Project ID to send queries.

When the Google Client API is loaded in the web page you need to perform the following steps:

See step3/map.html for an example of how the finished HTML page would look.

Authorize the user

The end user of the application needs to authorize the application to access data in BigQuery on her or his behalf. The Google Client API for JavaScript handles the OAuth logic to do this.

In a real-world application you have many choices about how to integrate the authorization step.

For example, you could call authorize() from a UI element like a button, or do it when the page has loaded. Here we have chosen to authorize the user after the Google Client API for JavaScript has been loaded, by using a callback function in the gapi.load()method.

Write some code immediately after the <script> tag that loads the Google Client API for Javascript to load both the client library and the auth module so that we can authenticate the user straight away.

<script src='https://apis.google.com/js/client.js'></script>
<script type='text/javascript'>
  gapi.load('client:auth', authorize);
</script>

If authorized, load the BigQuery API

After the user has been authorized, load the BigQuery API.

Firstly, call gapi.auth.authorize() with the clientId and scopes variables you added in the previous step. Handle the response in a callback function called handleAuthResult.

The immediate parameter controls whether a pop up is shown to the user. Set it to true to suppress the authorization pop up if the user is already authorized.

Add a function to your page called handleAuthResult(). The function needs to take an authresult parameter which will let you control the flow of logic depending on whether the user was successfully authorized or not.

Also add a function called loadApi to load the BigQuery API if the user is successfully authorized.

Add logic in the handleAuthResult() function to call loadApi() if there is an authResult object passed to the function, and if the object's error property has a value of false.

Add code to the loadApi() function to load the BigQuery API v2 using the gapi.client.load() method.

var clientId = 'your-client-id-here';
var scopes = 'https://www.googleapis.com/auth/bigquery';

// Check if the user is authorized.
function authorize(event) {
  gapi.auth.authorize({client_id: clientId, scope: scopes, immediate: false}, handleAuthResult);
  return false;
}

// If authorized, load BigQuery API
function handleAuthResult(authResult) {
  if (authResult && !authResult.error) {
    loadApi();
  } else {
    console.error('Not authorized.')
  }
}

// Load BigQuery client API
function loadApi(){
  gapi.client.load('bigquery', 'v2');
}

Load the map

The final step is to initialize the map. You need to change the order of the logic slightly to do this. Currently it initializes when the Maps API JavaScript has loaded.

You can do this by calling the initMap() function from the then() method after the load() method on the gapi.client object.

// Load BigQuery client API
function loadApi(){
  gapi.client.load('bigquery', 'v2').then(
    function() {
      initMap();
    }
  );
}

BigQuery API calls usually execute in seconds but may not return a response immediately. You need some logic to poll BigQuery to find out the status of long-running jobs, and only fetch the results when the job is complete.

The complete code for this step is in step4/map.html.

Sending a request

Add a Javascript function to work/index.html to send a query using the API, and some variables to store the values of the BigQuery dataset and project containing the table to query, and the project ID that will be billed for any charges.

var datasetId = 'your_dataset_id';
var billingProjectId = 'your_project_id';
var publicProjectId = 'bigquery-public-data';

function sendQuery(queryString){
  var request = gapi.client.bigquery.jobs.query({
      'query': queryString,
      'timeoutMs': 30000,
      'datasetId': datasetId,
      'projectId': billingProjectId,
      'useLegacySql':false
  });
  request.execute(function(response) {
      //code to handle the query response goes here.
  });
}

Check the status of a job

The checkJobStatus function below shows how to check the status of a job periodically, using the get API method and the jobId returned by the original query request. Here's an example that runs every 500 milliseconds until the job is complete.

var jobCheckTimer;

function checkJobStatus(jobId){
  var request = gapi.client.bigquery.jobs.get({
    'projectId': publicProjectId,
    'jobId': jobId
  });
  request.execute(function(response){
    if (response.status.errorResult){
      // Handle any errors.
      console.log(response.status.error);
    } else {
      if (response.status.state == 'DONE'){
        // Get the results.
        clearTimeout(jobCheckTimer);
        getQueryResults(jobId);
      } else {
        // Not finished, check again in a moment.
        jobCheckTimer = setTimeout(checkJobStatus, 500, [jobId]);
      }
    }
  });
}

Amend the sendQuery method to call the checkJobStatus() method as a callback in the request.execute() call. Pass the job ID to checkJobStatus. This is exposed by the response object as jobReference.jobId.

function sendQuery(queryString){
  var request = gapi.client.bigquery.jobs.query({
      'query': queryString,
      'timeoutMs': 30000,
      'datasetId': datasetId,
      'projectId': billingProjectId,
      'useLegacySql':false
  });
  request.execute(function(response) {
      checkJobStatus(response.jobReference.jobId);
  });
}

Getting the results of a query

To get the results of a query when it has finished running, use the jobs.getQueryResults API call. Add a function to your page called getQueryResults(), that accepts a parameter of jobId:

function getQueryResults(jobId){
  var request = gapi.client.bigquery.jobs.getQueryResults({
    'projectId': billingProjectId,
    'jobId': jobId
  });
  request.execute(function(response){
    // Do something with the results.
  })
}

There are three ways to use SQL to run spatial queries against data in BigQuery:

There are examples of bounding box and radius queries in the Mathematical Functions section of the BigQuery legacy SQL reference, under 'Advanced Examples'.

For bounding box and radius queries, you can call the BigQuery API query method. Construct the SQL for each query and pass it to the sendQuery function you created in the previous step.

A working example of the code for this step is in step4/map.html.

Rectangle queries

The simplest way to display BigQuery data on a map is to request all rows where the latitude and longitude fall within a rectangle, using a less than and greater than comparison. This could be the current map view or a shape drawn on the map.

To use a shape drawn by the user, change the code in index.html to handle the drawing event fired when a rectangle is completed. In this example the code uses getBounds() on the rectangle object to get an object representing the extent of the rectangle in map coordinates, and passes it to a function called rectangleQuery:

drawingManager.addListener('rectanglecomplete', function (rectangle) {
  rectangleQuery(rectangle.getBounds());
});

The rectangleQuery function just needs to use the top right (north east) and lower left (south west) coordinates to construct a less than/greater than comparison against each row in your BigQuery table. Here's an example that queries a table that has columns called 'pickup_latitude' and 'pickup_longitude' which store the location values.

Specifying the BigQuery table

To query a table using the BigQuery API you need to supply the name of the table in fully qualified form in your SQL query. The format in Standard SQL is project.dataset.tablename. In Legacy SQL it's project:dataset.tablename.

There are many tables of NYC Taxi trips available. To see them, go to the BigQuery web console and expand the "public datasets" menu item. Find the dataset called new_york and expand it to see the tables. Choose the most recent Yellow Taxi trips table (at the time of writing this codelab, it's bigquery-public-data:new_york.tlc_yellow_trips_2016).

Specifying the Project ID

In the API call, you need to specify the name of your Google Cloud Platform project for billing purposes. In this codelab, this is not the same project as the one containing the table. If you were working with a table that you had created in your own project by uploading data, then this Project ID would be the same as the one in your SQL statement.

Add JavaScript variables to your code to hold references to the Public Datasets project that contains the table you are querying, plus the table name and dataset name. You also need a separate variable to refer to your own billing Project ID.

Add global Javascript variables called billingProjectId, publicProjectId, datasetId and tableName to your copy of index.html.

Initialize the variables 'publicProjectId', 'datasetId' and 'tableName' with the details from the BigQuery Public Datasets project. Initialize billingProjectId with your own Project ID (the one you created in "Getting Set Up" earlier in this codelab).

var billingProjectId = 'YOUR_PROJECT_ID';
var publicProjectId = 'bigquery-public-data';
var datasetId = 'new_york';
var tableName = 'tlc_yellow_trips_2016';

Now add two functions to your code to generate the SQL and to send the query to BigQuery using the sendQuery function you created in the previous step.

The first function should be called rectangleSQL() and needs to accept two arguments, a pair of google.Maps.LatLng objects representing the corners of the rectangle in map coordinates.

The second function should be called rectangleQuery(). This passes the query text to the sendQuery function.

var billingProjectId = 'YOUR_PROJECT_ID';
var publicProjectId = 'bigquery-public-data';
var datasetId = 'new_york';
var tableName = 'tlc_yellow_trips_2016';

function rectangleQuery(latLngBounds){
  var queryString = rectangleSQL(latLngBounds.getNorthEast(), latLngBounds.getSouthWest());
  sendQuery(queryString);
}

function rectangleSQL(ne, sw){
  var queryString = 'SELECT pickup_latitude, pickup_longitude '
  queryString +=  'FROM `' + publicProjectId +'.' + datasetId + '.' + tableName + '`'
  queryString += ' WHERE pickup_latitude > ' + sw.lat();
  queryString += ' AND pickup_latitude < ' + ne.lat();
  queryString += ' AND pickup_longitude > ' + sw.lng();
  queryString += ' AND pickup_longitude < ' + ne.lng();
  return queryString;
}

At this point, you have enough code to send a query to BigQuery for all rows contained by a rectangle drawn by the user. Before we add other query methods for circles and freehand shapes, let's have a look at how to handle the data that comes back from a query.

BigQuery tables can be very large—Petabytes of data—and can grow by hundreds of thousands of rows per second. So it's important to try and limit the amount of data returned so that it can be drawn on the map. Drawing the location of every row in a very large result set (tens of thousands of rows or greater) will result in an unreadable map. There are many techniques for aggregating the locations both in the SQL query and on the map, and you can limit the results a query will return.

Full code for this step is available in step5/map.html.

To keep the amount of data transferred to your web page down to a reasonable size for this codelab, modify the rectangleSQL() function to add a statement that limits the response to 10000 rows. In the example below this is specified in a global variable called recordLimit, so that all query functions can use the same value.

var recordLimit = 10000;
function rectangleSQL(ne, sw){
  var queryString = 'SELECT pickup_latitude, pickup_longitude '
  queryString +=  'FROM `' + publicProjectId +'.' + datasetId + '.' + tableName + '`'
  queryString += ' WHERE pickup_latitude > ' + sw.lat();
  queryString += ' AND pickup_latitude < ' + ne.lat();
  queryString += ' AND pickup_longitude > ' + sw.lng();
  queryString += ' AND pickup_longitude < ' + ne.lng();
  queryString += ' LIMIT ' + recordLimit;
  return queryString;
}

To visualize the density of locations you can use a heatmap. The Maps Javascript API has a HeatmapLayer class for this purpose. The HeatmapLayer takes an array of latitude, longitude coordinates so it is quite easy to convert the rows returned from the query into a heatmap.

In the getQueryResults function, pass the response.result.rows array to a new Javascript function called doHeatMap() that will create a heatmap.

Each row will have a property called f which is an array of columns. Each column will have a v property containing the value.

Your code needs to loop through the columns in each row and extract the values.

In the SQL query, you have only asked for the Latitude and Longitude values of the taxi pickups so there will only be two columns in the response.


Don't forget to call setMap() on the heatmap layer when you have assigned the array of positions to it. This will make it visible on the map.

Here's an example:

function getQueryResults(jobId){
  var request = gapi.client.bigquery.jobs.getQueryResults({
    'projectId': billingProjectId,
    'jobId': jobId
  });
  request.execute(function(response){
    doHeatMap(response.result.rows);
  })
}

var heatmap;

function doHeatMap(rows){
  var heatmapData = [];
  if (heatmap!=null){
    heatmap.setMap(null);
  }
  for (var i = 0; i < rows.length; i++) {
      var f = rows[i].f;
      var coords = { lat: parseFloat(f[0].v), lng: parseFloat(f[1].v) };
      var latLng = new google.maps.LatLng(coords);
      heatmapData.push(latLng);
  }
  heatmap = new google.maps.visualization.HeatmapLayer({
      data: heatmapData
  });
  heatmap.setMap(map);
}

At this point, you should be able to:

Here is an example of the result from a rectangle query against the 2016 NYC Yellow Taxi data, drawn as a heatmap. This shows the distribution of pickups around the Empire State Building on a Saturday in July:

Radius queries are very similar. Using BigQuery's Legacy SQL Math functions you can construct an SQL query using the Haversine Formula which approximates a circular area on the earth's surface.

Using the same technique for rectangles, you can handle an OverlayComplete event to get the center and radius of a user-drawn circle, and build up the SQL for the query in the same way.

A working example of the code for this step is included in the code repository as step6/map.html.

drawingManager.addListener('circlecomplete', function (circle) {
  circleQuery(circle);
});

In your copy of index.html, add two new empty functions: circleQuery() and haversineSQL().

Then, add a circlecomplete event handler that passes the centre and radius to a new function called circleQuery().

The circleQuery() function will call haversineSQL() to construct the SQL for the query and then send the query by calling the sendQuery() function as per the following example code.

function circleQuery(circle){
  var queryString = haversineSQL(circle.getCenter(), circle.radius);
  sendQuery(queryString);
}

// Calculate a circular area on the surface of a sphere based on a center and radius.
function haversineSQL(center, radius){
  var queryString;
  var centerLat = center.lat();
  var centerLng = center.lng();
  var kmPerDegree = 111.045;

  queryString = 'CREATE TEMPORARY FUNCTION Degrees(radians FLOAT64) RETURNS FLOAT64 LANGUAGE js AS ';
  queryString += '""" ';
  queryString += 'return (radians*180)/(22/7);';
  queryString += '"""; ';

  queryString += 'CREATE TEMPORARY FUNCTION Radians(degrees FLOAT64) RETURNS FLOAT64 LANGUAGE js AS';
  queryString += '""" ';
  queryString += 'return (degrees*(22/7))/180;';
  queryString += '"""; ';

  queryString += 'SELECT pickup_latitude, pickup_longitude '
  queryString += 'FROM `' + projectId +'.' + datasetId + '.' + tableName + '` ';
  queryString += 'WHERE '
  queryString += '(' + kmPerDegree + ' * DEGREES( ACOS( COS( RADIANS('
  queryString += centerLat;
  queryString += ') ) * COS( RADIANS( pickup_latitude ) ) * COS( RADIANS( ' + centerLng + ' ) - RADIANS('
  queryString += ' pickup_longitude ';
  queryString += ') ) + SIN( RADIANS('
  queryString += centerLat;
  queryString += ') ) * SIN( RADIANS( pickup_latitude ) ) ) ) ) ';

  queryString += ' < ' + radius/1000;
  queryString += ' LIMIT ' + recordLimit;
  return queryString;
}

If you want to use Legacy SQL to take advantage of the DEGREES and RADIANS functions, the haversineSql() function would look like the following, and the sendQuery() function needs to set the parameter useLegacySql to true.

There is an example of running this query with Legacy SQL in the code repository at step6/map_legacySql.html.

function haversineSQL(center, radius){
  var queryString;
  var centerLat = center.lat();
  var centerLng = center.lng();
  var kmPerDegree = 111.045;

  queryString = 'SELECT pickup_latitude, pickup_longitude, '
  queryString += '(' + kmPerDegree + ' * DEGREES( ACOS( COS( RADIANS('
  queryString += centerLat;
  queryString += ') ) * COS( RADIANS( pickup_latitude ) ) * COS( RADIANS( ' + center_lng + ' ) - RADIANS('
  queryString += ' pickup_longitude ';
  queryString += ') ) + SIN( RADIANS('
  queryString += centerLat;
  queryString += ') ) * SIN( RADIANS( pickup_latitude ) ) ) ) ) AS distance ';
  queryString += 'FROM [' + publicProjectId +':' + datasetId + '.' + tableName + '] ';
  queryString += 'HAVING distance < ' + radius/1000;
  queryString += ' LIMIT ' + recordLimit;
  return queryString;
}

Try it!

Add the code above and try the 'Circle' tool to select an area of map. The result should look something like this:

Recap: SQL doesn't support querying using arbitrary shapes other than rectangles and circles. BigQuery doesn't have any native geometry data type, so to run queries using polygon shapes you need a different approach to straightforward SQL queries.

One very powerful BigQuery feature that can be used for this is User Defined Functions (UDF). UDFs execute Javascript code inside an SQL query.

Working code for this step is in step7/map.html.

UDFs in the BigQuery API

The BigQuery API approach for UDFs is slightly different to the web console: you'll need to call the jobs.insert method.

The exact code you write depends on the version of SQL you wish to use - BigQuery supports two versions of SQL: Standard SQL and Legacy SQL. Both options are described below.

Specifying an SQL version

In the Google Client API, you can specify whether or not to use Standard SQL in the bigquery.configuration object that is used to send a new job to BigQuery. There is a property called useLegacySql, set this to true or false as required.

Standard SQL

For Standard SQL queries via the API, just a single SQL statement is needed to use a User Defined Function. The value of useLegacySql must be set to false. The JavaScript example below shows a function that creates and sends a request object to insert a new job, in this case a query with a User Defined Function.

A working example of this approach is in step7/map.html.

function polygonSql(polygon) {
  var request = gapi.client.bigquery.jobs.insert({
    'projectId' : billingProjectId,
      'resource' : {
        'configuration':
          {
            'query':
            {
              'query': polygonSql(polygon),
              'useLegacySql': false
            }
          }
      }
  });
  request.execute(function(response) {
    checkJobStatus(response.jobReference.jobId);
  });
}

The SQL query is constructed as follows:

function polygonSql(poly){
  var queryString = 'CREATE TEMPORARY FUNCTION pointInPolygon(latitude FLOAT64, longitude FLOAT64) ';
  queryString += 'RETURNS BOOL LANGUAGE js AS """ ';
  queryString += 'var polygon=' + JSON.stringify(poly) + ';';
  queryString += 'var vertx = [];';
  queryString += 'var verty = [];';
  queryString += 'var nvert = 0;';
  queryString += 'var testx = longitude;';
  queryString += 'var testy = latitude;';
  queryString += 'for(coord in polygon){';
  queryString += '  vertx[nvert] = polygon[coord][0];';
  queryString += '  verty[nvert] = polygon[coord][1];';
  queryString += '  nvert ++;';
  queryString += '}';
  queryString += 'var i, j, c = 0;';
  queryString += 'for (i = 0, j = nvert-1; i < nvert; j = i++) {';
  queryString += '  if ( ((verty[i]>testy) != (verty[j]>testy)) &&(testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) ){';
  queryString += '    c = !c;';
  queryString += '  }';
  queryString += '}';
  queryString += 'return c;';
  queryString += '"""; ';
  queryString += 'SELECT pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude, pickup_datetime ';
  queryString += 'FROM `' + projectId + '.' + datasetId + '.' + tableName + '` ';
  queryString += 'WHERE pointInPolygon(pickup_latitude, pickup_longitude) = TRUE ';
  queryString += 'LIMIT ' + recordLimit;
  return queryString;
}

There are two things going on here. Firstly the code is creating the CREATE TEMPORARY FUNCTION statement that encapsulates the JavaScript code to work out if a given point is inside a polygon. The polygon coordinates are inserted using the JSON.stringify(poly) method call to convert a JavaScript array of x,y coordinate pairs into a string. The polygon object is passed as an argument to the function that builds the SQL.

Secondly the code builds the main SQL SELECT statement. The UDF is called in the WHERE expression in this example.

Legacy SQL approach

BigQuery also supports a legacy version of SQL which implements UDFs slightly differently. For Legacy SQL queries via the API, the BigQuery UDF must be generated as a separate UDF resource which is attached to the API request as a parameter called inlineCode.


A full example of this approach is at step7/map_legacySql.html

// These lines truncated here for readability.
var bigqueryDefine = 'bigquery.defineFunction("pointInPolygon", []...';
var inlineJS = 'function pointInPolygon(....';

function polygonQuery(polygon) {
 var request = gapi.client.bigquery.jobs.insert({
    'projectId' : billingProjectId,
    'resource' : {
      'configuration':
      {
        'query':
        {
          'userDefinedFunctionResources':
          [
            {
              'inlineCode': buildUDF(polygon)
            }
          ],
          'query': polygonSQL(),
          'useLegacySql': true
        }
      }
    }
  });
  request.execute(function(response) {
      checkJobStatus(response.jobReference.jobId);
  });
}

function buildUDF(poly){
  var UDFjs = bigqueryDefine + inLineJs;
  UDFjs += 'function inPoly(row, emit){var poly=' + JSON.stringify(poly);
  UDFjs += ';var pt = [row.pickup_longitude,row.pickup_latitude];';
  UDFjs += 'var result = pointInPoly(poly, pt);if (result) emit({latitude: pt[1], longitude: pt[0]});}'
  return UDFjs;
}

There are two JavaScript methods called which create the UDF and the SQL separately.

The Legacy SQL query is constructed like this:

function polygonSQL(){
  var queryString = 'SELECT latitude,longitude ';
  queryString += 'FROM pointInPolygon('; 
  queryString += '  SELECT pickup_longitude, pickup_latitude, pickup_datetime ';
  queryString += '  FROM [' + publicProjectId + ':' + datasetId + '.' + table_name + '] ) ';
  return queryString;
}

The JavaScript for the inline UDF code is omitted from the snippet above for readability. There's a copy in the repository in the UDF folder, called bigquery_udf_legacySql.js. It looks like this:

// The BigQuery User Defined Function definition.
var bigqueryDefine = 'bigquery.defineFunction(' +
  '"pointInPolygon", ' +
  '["pickup_latitude", "pickup_longitude", "pickup_datetime"],' +
  '[' +
    '{name: "latitude", type: "float"},' +
    '{name: "longitude", type: "float"},' +
    '{name:"pickup_datetime", type:"string"}' +
  '], inPoly); 
';

// Point in polygon code.
var inLineJs = 'function pointInPoly(polygon, point){' +
  'var vertx = [];' +
  'var verty = [];' +
  'var nvert = 0;' +
  'var testx = point[0];' +
  'var testy = point[1];' +
  'for(coord in polygon){' +
    'vertx[nvert] = polygon[coord][0];' +
    'verty[nvert] = polygon[coord][1];' +
    'nvert ++;' +
  '}' +
  'var i, j, c = 0;' +
  'for (i = 0, j = nvert-1; i < nvert; j = i++) {' +
    'if ( ((verty[i]>testy) != (verty[j]>testy)) &&(testx < (vertx[j]-vertx[i]) * (testy-verty[i]) / (verty[j]-verty[i]) + vertx[i]) ){' +
      'c = !c;' +
    '}' +
  '}' +
  'return c;' +
'}';

Integrating with the Maps API

To use this with the Maps API drawing library, we need to save the polygon drawn by the user and pass this into the UDF part of the SQL query.

First, we need to handle the polygoncomplete drawing event, to get the coordinates of the shape as an array of longitude and latitude pairs:

drawingManager.addListener('polygoncomplete', function (polygon) {
  var path = polygon.getPaths().getAt(0);
  var queryPolygon = path.map(function(element) {
    return [element.lng(), element.lat()];
  });
  polygonQuery(queryPolygon);
});

The polygonQuery function can then construct the UDF Javascript functions as a string, as well as the SQL statement which will call the UDF function.


See step7/map.html for a working example of this.

Example output

Here's an example result of querying pickups from The 2016 NYC TLC Yellow Taxi data in BigQuery using a freehand polygon, with the selected data drawn as a heatmap.

Here are some suggestions for ways to extend this codelab to look at other aspect of the data. You can find a working example of these ideas at step8/map.html in the code repository.

Mapping drop offs

So far we've only mapped pick up locations. By requesting the dropoff_latitude and dropoff_longitude columns and modifying the heatmap code to plot these instead, you can see the destinations of taxi journeys that started at a specific location. For example, let's see where taxis tend to drop people off when they request a pick up around the Empire State Building.

Change the code for the SQL statement to request these columns in addition to the pickup location.

function polygonSql(){
      var queryString = 'SELECT pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude, pickup_datetime ';
      queryString += 'FROM `' + publicProjectId + '.' + datasetId + '.' + tableName + '` ';
      queryString += 'WHERE pointInPolygon(pickup_latitude, pickup_longitude) = TRUE ';
      queryString += 'LIMIT ' + recordLimit;
      return queryString;
}

The doHeatMap function can then use the dropoff values instead. The result object has a schema that can be inspected to find the location of these columns in the array. In this case they would be at index positions 2 and 3. These indices can be read from a variable to make the code more manageable. NB the maxIntensity of the heatmap is set to show density of 20 drop offs per pixel as the maximum.

Add some variables to allow you to change which columns you use for the heatmap data.

// Show query results as a Heatmap.
function doHeatMap(rows){
  var latCol = 2;
  var lngCol = 3;
  var heatmapData = [];
  if (heatmap!=null){
    heatmap.setMap(null);
  }
  for (var i = 0; i < rows.length; i++) {
      var f = rows[i].f;
      var coords = { lat: parseFloat(f[latCol].v), lng: parseFloat(f[lngCol].v) };
      var latLng = new google.maps.LatLng(coords);
      heatmapData.push(latLng);
  }
  heatmap = new google.maps.visualization.HeatmapLayer({
      data: heatmapData,
      maxIntensity: 20
  });
  heatmap.setMap(map);
}

Here is a heatmap showing the distribution of drop offs from all pickups immediately around the Empire State Building in 2016. You can see large concentrations (the red blobs) of midtown destinations especially around Times Square, as well as along 5th Avenue between 23rd St and 14th St. Other high density locations not shown at this zoom level include La Guardia and JFK airports, the World Trace Center and Battery Park.

Styling the basemap

When you create a Google Map using the JavaScript API, you can set the map style using a JSON object. For data visualizations it can be useful to mute the colors in the map. You can create and try out map styles using the Google Maps API Styling Wizard at mapstyle.withgoogle.com.

You can set a map style when you initialize a map object, or at any subsequent time afterwards. Here's how you'd add a custom style in the initMap() function:

function initMap() {
  map = new google.maps.Map(document.getElementById('map'), {
        center: {lat: 40.744593, lng: -73.990370}, // Manhattan, New York.
  zoom: 15,
  styles: [
    {
        "elementType": "geometry",
          "stylers": [
            {
              "color": "#f5f5f5"
            }
          ]
        },
        {
          "elementType": "labels.icon",
            "stylers": [
              {
                "visibility": "on"
              }
            ]
        },
        {
          "featureType": "water",
            "elementType": "labels.text.fill",
              "stylers": [
                {
                  "color": "#9e9e9e"
                }
              ]
        }
      ]
    });
  setUpDrawingTools();
}

The sample style below shows a greyscale map with points of interest labels.

[
  {
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#f5f5f5"
      }
    ]
  },
  {
    "elementType": "labels.icon",
    "stylers": [
      {
        "visibility": "on"
      }
    ]
  },
  {
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#616161"
      }
    ]
  },
  {
    "elementType": "labels.text.stroke",
    "stylers": [
      {
        "color": "#f5f5f5"
      }
    ]
  },
  {
    "featureType": "administrative.land_parcel",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#bdbdbd"
      }
    ]
  },
  {
    "featureType": "poi",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#eeeeee"
      }
    ]
  },
  {
    "featureType": "poi",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#757575"
      }
    ]
  },
  {
    "featureType": "poi.park",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#e5e5e5"
      }
    ]
  },
  {
    "featureType": "poi.park",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#9e9e9e"
      }
    ]
  },
  {
    "featureType": "road",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#ffffff"
      }
    ]
  },
  {
    "featureType": "road.arterial",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#757575"
      }
    ]
  },
  {
    "featureType": "road.highway",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#dadada"
      }
    ]
  },
  {
    "featureType": "road.highway",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#616161"
      }
    ]
  },
  {
    "featureType": "road.local",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#9e9e9e"
      }
    ]
  },
  {
    "featureType": "transit.line",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#e5e5e5"
      }
    ]
  },
  {
    "featureType": "transit.station",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#eeeeee"
      }
    ]
  },
  {
    "featureType": "water",
    "elementType": "geometry",
    "stylers": [
      {
        "color": "#c9c9c9"
      }
    ]
  },
  {
    "featureType": "water",
    "elementType": "labels.text.fill",
    "stylers": [
      {
        "color": "#9e9e9e"
      }
    ]
  }
]

Giving the user feedback

Even though BigQuery will usually give a response in seconds, it is sometimes useful to show the user that something is happening while the query is running.

Add some UI to your web page that shows the response of the checkJobStatus() function, and an animated graphic to indicate that something's happening.

Information you can display includes query duration, amount of data returned, and amount of data processed.

Add some HTML after the map <div> to create a panel to the page that will show the number of rows returned by a query, the time the query took, and the amount of data processed.

<div id="menu">
    <div id="stats">
        <h3>Statistics:</h3>
        <table>
            <tr>
                <td>Total Locations:</td><td id="rowCount"> - </td>
            </tr>
            <tr>
                <td>Query Execution:</td><td id="duration"> - </td>
            </tr>
            <tr>
                <td>Data Processed:</td><td id="bytes"> - </td>
            </tr>
        </table>
    </div>
</div>

The appearance and position of this panel is controlled by CSS. Add style settings to position the panel in the top left corner of the page below the map type buttons and the drawing toolbar as in the snippet below.

#menu {
  position: absolute; 
  background: rgba(255, 255, 255, 0.8); 
  z-index: 1000; 
  top: 50px; 
  left: 10px; 
  padding: 15px;
}
#menu h1 {
  margin: 0 0 10px 0;
  font-size: 1.75em;
}
#menu div {
  margin: 5px 0px;
}

The animated graphic can be added to the page but hidden until required, and some JavaScript and CSS code used to show it when a BigQuery job is running.

Add some HTML to show an animated graphic. There is an image file called loader.gif in the img folder in the code repository.

<img id="spinner" src="img/loader.gif">

Add some CSS to position the image and hide it by default until it's needed.

#spinner {
  position: absolute; 
  top: 50%; 
  left: 50%; 
  margin-left: -32px; 
  margin-top: -32px; 
  opacity: 0; 
  z-index: -1000;
}

Finally add some JavaScript to update the status panel and show or hide the graphic when a query is running. You can use the response object to update the panel depending on what information is available. When checking a current job, there is a response.statistics property you can use. When the job is complete you can access the response.totalRows and response.totalBytesProcessed properties. It is helpful to the user to convert milliseconds to seconds and bytes to gigabytes for display as shown in code sample below.

function updateStatus(response){
  if(response.statistics){
    var durationMs = response.statistics.endTime - response.statistics.startTime;
    var durationS = durationMs/1000;
    var suffix = (durationS ==1) ? '':'s';
    var durationTd = document.getElementById("duration");
    durationTd.innerHTML = durationS + ' second' + suffix;
  }
  if(response.totalRows){
    var rowsTd = document.getElementById("rowCount");
    rowsTd.innerHTML = response.totalRows;
  }
  if(response.totalBytesProcessed){
    var bytesTd = document.getElementById("bytes");
    bytesTd.innerHTML = (response.totalBytesProcessed/1073741824) + ' GB';
  }
}

Call this method when there is a response to a checkJobStatus() call and when the query results are fetched. For example:

// Poll a job to see if it has finished executing.
function checkJobStatus(jobId){
  var request = gapi.client.bigquery.jobs.get({
    'projectId': billingProjectId,
    'jobId': jobId
  });
  request.execute(function(response){
    //Show progress to the user
    updateStatus(response);

    if (response.status.errorResult){
      // Handle any errors.
      console.log(response.status.error);
    } else {

      if (response.status.state == 'DONE'){
        // Get the results.
        clearTimeout(jobCheckTimer);
        getQueryResults(jobId);
      } else {
        // Not finished, check again in a moment.
        jobCheckTimer = setTimeout(checkJobStatus, 500, [jobId]);
      }
    }
  });
}
// When a BigQuery job has completed, fetch the results.
function getQueryResults(jobId){
  var request = gapi.client.bigquery.jobs.getQueryResults({
    'projectId': billingProjectId,
    'jobId': jobId
  });
  request.execute(function(response){
    doHeatMap(response.result.rows);
    updateStatus(response);
  })
}

To toggle the animated graphic, add a function to control its visibility. This function will toggle the opacity of any HTML DOM Element passed to it.

function fadeToggle(obj){
    if(obj.style.opacity==1){
        obj.style.opacity = 0;
        setTimeout(function(){obj.style.zIndex = -1000;}, 1000);
    } else {
        obj.style.zIndex = 1000;
        obj.style.opacity = 1;
    }
}

Finally, call this method before processing a query, and after the query result has come back from BigQuery.

This code calls the fadeToggle function when the user has finished drawing a rectangle.

drawingManager.addListener('rectanglecomplete', function (rectangle) {
  //show an animation to indicate that something is happening.
  fadeToggle(document.getElementById('spinner'));
  rectangleQuery(rectangle.getBounds());
});

When the query response has been received, call fadeToggle() again to hide the animated graphic.

// When a BigQuery job has completed, fetch the results.
function getQueryResults(jobId){
  var request = gapi.client.bigquery.jobs.getQueryResults({
    'projectId': billingProjectId,
    'jobId': jobId
  });
  request.execute(function(response){
    console.log(response)
    doHeatMap(response.result.rows);
    //hide the animation.
    fadeToggle(document.getElementById('spinner'));
    updateStatus(response);
  })
}

The page should look something like this.

Have a look at the complete example in step8/map.html.

Too Many Markers

If you're working with very large tables, your query may return too many rows to efficiently display on a map. Limit the results by adding a WHERE clause or a LIMIT statement.

Drawing many markers can make the map unreadable. Consider using a HeatmapLayer to show the density, or cluster markers to indicate where many data points lie using a single symbol per cluster. There are more details in our Marker Clustering tutorial.

Optimizing Queries

BigQuery will scan the entire table with every query. To optimize your BigQuery quota usage, only select the columns you need in your query.

Queries will be faster if you store latitude and longitude as floats rather than strings.

Export Interesting Results

The examples here require the end user to be authenticated against the BigQuery table, which won't suit every use case. When you have discovered some interesting patterns, it may be easier to share these with a wider audience by exporting the results from BigQuery and creating a static dataset using the Google Maps Data Layer.

The Boring Legal Bit

Bear in mind the Google Maps API Terms of Service. Free and public applications do not require a license. A Premium Plan Maps for Work license is required for an application that restricts access, for example behind a firewall or on an intranet. For more details on Maps API pricing and plans see the online documentation.

Play With More Data!

There are a number of public datasets in BigQuery that have latitude and longitude columns, for example the NYC Taxi datasets from 2009-2016, Uber and Lyft NYC trip data, and the GDELT dataset.

We hope this helps you get up and running quickly with some geo queries against BigQuery tables so you can discover patterns and visualize them on a Google Map. Happy mapping!

What's next?

If you'd like to learn more about the Google Maps API or BigQuery, have a look at the following suggestions.

See What is BigQuery to learn more about Google's serverless, petabyte-scale data warehouse service.

Have a look at the how-to guide to create a simple application using the BigQuery API.

See the developer guide for the drawing library for more details on enabling user interaction to draw shapes on a Google Map.

Have a look at other ways to visualize data on a Google Map.

See the Getting Started guide for the Javascript Client API to understand the basic concepts of using the Client API to access other Google APIs.