Note: This lab provides a ready-to-use dummy dataset in Google Cloud Storage, containing simulated data for Xonotic games modeled on the statistics displayed at stats.xonotic.org
When you have a large fleet of game servers running and users playing your game, you will need a scalable solution for collecting, storing, and analyzing all the data your game generates.
There are two main ways to collect this data: streaming or batch.
In this lab we will look at a serverless batch architecture that sends event logs to Google Cloud Storage, ready for you to ingest into BigQuery for analysis.
You can then easily create visualisations and dashboards on top of this data using Google Data Studio, our free dashboarding tool.
In this section of the lab you will create a BigQuery dataset and an empty table.
Navigate to the BigQuery Web UI in the Google Cloud Console.
Click on your project name in the left navigation and then click on the create dataset button.
Set the Dataset ID as xonotic_analytics.
Leave the data location as default and leave the Default table expiration as never.
Then click on the create dataset button.
Use a BigQuery DDL CREATE TABLE statement to create a new table called events in the xonotic_analytics dataset. Type the following query into the Query editor and press run.
#standardSQL
CREATE TABLE xonotic_analytics.events (
  game_id STRING,
  game_server STRING,
  game_type STRING,
  game_map STRING,
  event_datetime TIMESTAMP,
  player STRING,
  killed STRING,
  weapon STRING,
  x_cord INT64,
  y_cord INT64
)
OPTIONS (
  description="Xonotic Analytics table for events"
)
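The CSV rows you load later must line up with this column order and these types. As a quick sanity check, here is a small Python sketch that parses one such line into the schema's types; the sample line is hypothetical, invented for illustration, and is not real lab data:

```python
import csv
import io
from datetime import datetime

# Column order must match the events table schema defined above.
COLUMNS = ["game_id", "game_server", "game_type", "game_map",
           "event_datetime", "player", "killed", "weapon", "x_cord", "y_cord"]

def parse_event_row(line):
    """Parse one CSV line into a dict typed like the BigQuery schema."""
    values = next(csv.reader(io.StringIO(line)))
    row = dict(zip(COLUMNS, values))
    row["event_datetime"] = datetime.strptime(row["event_datetime"],
                                              "%Y-%m-%d %H:%M:%S")  # TIMESTAMP
    row["x_cord"] = int(row["x_cord"])  # INT64 columns
    row["y_cord"] = int(row["y_cord"])
    return row

# Hypothetical sample line (not real lab data).
sample = "g1,server-eu-1,dm,stormkeep,2019-04-01 13:05:22,alice,bob,vortex,120,-45"
event = parse_event_row(sample)
print(event["game_map"], event["x_cord"])
```

A row that fails to parse here would also be rejected (or counted as an error) by the load job.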
In this section of the lab you will set up the BigQuery Data Transfer Service, which will schedule a daily import of files from Google Cloud Storage into BigQuery.
As we are using historical data, the service will backfill it; if new data were added to Google Cloud Storage, it would be imported into BigQuery daily.
You will be importing data from a GCS bucket called gaming-taw-analytics-qwiklab that has been shared with you. The structure of the bucket and files is bucket_name/folder_name/filename_YYYYMMDD.csv
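The transfer wildcard you set below relies on that date-suffixed naming convention. As an illustration (file names here are invented for the example), a short Python sketch extracts the YYYYMMDD suffix from a filename:

```python
import re
from datetime import date

# Matches the bucket's file-naming convention: events_YYYYMMDD.csv
PATTERN = re.compile(r"^events_(\d{4})(\d{2})(\d{2})\.csv$")

def file_date(filename):
    """Return the date encoded in a filename, or None if it doesn't match."""
    m = PATTERN.match(filename)
    if not m:
        return None
    year, month, day = (int(g) for g in m.groups())
    return date(year, month, day)

print(file_date("events_20190401.csv"))   # 2019-04-01
print(file_date("players_20190401.csv"))  # None: wrong prefix, not matched
```

Files that do not match the `events_*.csv` pattern are simply ignored by the transfer.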
Navigate back to the BigQuery web UI and click on Transfers to create a transfer schedule. As this is the first time you will be using this service, you will need to enable the API.
Set the following options for the transfer and press save for the transfer to start. If this is your first time using the service, you will also be asked to authenticate it.
Source: Google Cloud Storage
Transfer config name: Xonotic Events GCS to BQ
Schedule Options: Start Now
Destination dataset: xonotic_analytics
Destination table: events
Cloud Storage URI: gaming-taw-analytics-qwiklab/postgres-csv-export/events_*.csv
Delete source files after transfer: No (Leave unticked)
File format: CSV
JSON, CSV - Ignore unknown values: No (Leave unticked)
Number of errors allowed: 0 (default)
CSV Field Delimiter: , (default)
Header rows to skip: 1
Allow quoted newlines: No (Leave unticked)
Allow jagged rows: No (Leave unticked)
When done, press save.
You can see the logs for the transfer, which will tell you whether the transfer was a success. Refresh the page every 10 seconds to see the latest status of the transfer. Once the data has been transferred successfully you will see the logs shown below:
Navigate back to the query editor and click on the dataset and table in the left navigation.
Click on schema, information and preview tabs to check the data has successfully been imported.
If you do not want to wait for the data to be loaded into your dataset, you can make use of the public dataset (gaming-taw-analytics-qwiklab:xonotic_analytics.events) we have made available. Click on the add dataset button to pin the dataset to your project.
In this section of the lab you will write some queries to analyse the events we have collected from the millions of Xonotic games played.
#standardSQL
SELECT
  game_map,
  COUNT(DISTINCT game_id) AS games
FROM `xonotic_analytics.events`
GROUP BY game_map
ORDER BY games DESC
#standardSQL
SELECT
  player,
  COUNT(killed) AS kills,
  COUNT(DISTINCT game_id) AS games
FROM `xonotic_analytics.events`
GROUP BY player
ORDER BY kills DESC
LIMIT 10
#standardSQL
SELECT
  EXTRACT(HOUR FROM event_datetime) AS hour_of_day,
  COUNT(DISTINCT game_id) AS games
FROM `xonotic_analytics.events`
WHERE DATE(event_datetime) = '2019-04-01'
GROUP BY hour_of_day
ORDER BY games DESC
You will notice that even when we filter to a single date using WHERE DATE(event_datetime) = '2019-04-01', BigQuery still scans the data for all dates, because the table is not partitioned.
Also, to get a unique count of games we have to use COUNT(DISTINCT game_id), which could run slowly on very large datasets.
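The reason COUNT(DISTINCT game_id) is expensive is that exact distinct counting has to remember every id it has seen. A minimal Python sketch with a handful of invented rows shows why the distinct count differs from the raw row count in a flat events table:

```python
# Exact distinct counting has to track every id seen, which is why
# COUNT(DISTINCT ...) can be slow on very large tables.
# Flat event rows repeat game_id once per event (rows invented for illustration).
events = [
    {"game_id": "g1", "player": "alice"},
    {"game_id": "g1", "player": "bob"},
    {"game_id": "g2", "player": "alice"},
    {"game_id": "g2", "player": "bob"},
    {"game_id": "g2", "player": "carol"},
]

total_rows = len(events)                              # 5 event rows
distinct_games = len({e["game_id"] for e in events})  # only 2 distinct games
print(total_rows, distinct_games)  # 5 2
```

When an approximate answer is acceptable, BigQuery also offers APPROX_COUNT_DISTINCT, which is much cheaper than an exact distinct count.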
In this section of the lab you will look at some features we can use to create a new BigQuery table that is optimised for performance.
You will notice that we have a flat table in which the game information is repeated for every event.
We can make use of the following features to get faster query results and also reduce the cost of each query, as less data will be processed.
Write the following query that will create a new table with our performance optimisations:
#standardSQL
CREATE TABLE xonotic_analytics.games_partitioned
PARTITION BY DATE(game_start_datetime)
CLUSTER BY game_type, game_map
OPTIONS (
  description="Xonotic games partitioned by date and nested per game"
) AS
SELECT
  MIN(event_datetime) AS game_start_datetime,
  game_id,
  game_server,
  game_type,
  game_map,
  ARRAY_AGG(STRUCT(event_datetime AS datetime, player, killed, weapon, x_cord, y_cord)) AS event
FROM `xonotic_analytics.events`
GROUP BY game_id, game_server, game_type, game_map
For the new schema we have used a nested approach. We treat each game as one session and all events that occurred in that game as repeated rows.
We store the data in this format because it can help queries run a lot faster when we only need game-level data.
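What GROUP BY plus ARRAY_AGG(STRUCT(...)) does can be pictured in plain Python: group the flat event rows by their game columns and collect the events into a repeated field. The rows below are invented for illustration:

```python
def nest_by_game(rows):
    """Group flat event rows into one record per game, with the events
    collected into a repeated field -- a plain-Python analogue of
    GROUP BY game columns + ARRAY_AGG(STRUCT(...))."""
    games = {}
    for row in rows:
        key = row["game_id"]
        if key not in games:
            games[key] = {"game_id": row["game_id"],
                          "game_map": row["game_map"],
                          "event": []}
        games[key]["event"].append({"player": row["player"],
                                    "killed": row["killed"]})
    return list(games.values())

# Flat rows repeat the game columns for every event (invented for illustration).
flat_events = [
    {"game_id": "g1", "game_map": "stormkeep", "player": "alice", "killed": "bob"},
    {"game_id": "g1", "game_map": "stormkeep", "player": "bob", "killed": "alice"},
    {"game_id": "g2", "game_map": "xoylent", "player": "carol", "killed": "alice"},
]

nested = nest_by_game(flat_events)
print(len(nested))              # 2 game records
print(len(nested[0]["event"]))  # 2 events nested under game g1
```

Three flat rows collapse into two game records, so the repeated game columns are stored only once per game.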
In this section of the lab you will re-write the queries from before to analyse the games_partitioned table we have created.
For all the queries below you will notice that a lot less data is now analysed due to the optimizations we made, which results in lower costs and faster query results.
You will notice you no longer need to use COUNT(DISTINCT game_id) to get the number of games as the game_id is no longer repeated per event.
#standardSQL
SELECT
  game_map,
  COUNT(game_id) AS games
FROM `xonotic_analytics.games_partitioned`
GROUP BY game_map
ORDER BY games DESC
As the data is now nested you will need to UNNEST the data in your query.
#standardSQL
SELECT
  e.player,
  COUNT(e.killed) AS kills,
  COUNT(DISTINCT game_id) AS games
FROM `xonotic_analytics.games_partitioned`,
  UNNEST(event) AS e
GROUP BY player
ORDER BY kills DESC
LIMIT 10
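UNNEST is effectively the inverse of the aggregation used to build the table: each nested event is joined back to its parent game columns, so event-level rows repeat the game columns again. A rough Python analogue, with sample records invented for illustration:

```python
# Nested records: one row per game, events as a repeated field
# (records invented for illustration).
games = [
    {"game_id": "g1", "event": [{"player": "alice", "killed": "bob"},
                                {"player": "bob", "killed": "alice"}]},
    {"game_id": "g2", "event": [{"player": "carol", "killed": "alice"}]},
]

def unnest(rows, field):
    """Join each parent row with every element of its repeated field,
    like FROM table, UNNEST(event) AS e."""
    for row in rows:
        for item in row[field]:
            flat = {k: v for k, v in row.items() if k != field}
            flat.update(item)
            yield flat

flat_rows = list(unnest(games, "event"))
print(len(flat_rows))  # 3 event-level rows from 2 game records
```

This is also why the query above still needs COUNT(DISTINCT game_id): after UNNEST, game_id is repeated once per event again.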
The WHERE statement in this query now makes use of the partitioning we set-up and only needs to retrieve the data for that single date.
#standardSQL
SELECT
  EXTRACT(HOUR FROM game_start_datetime) AS hour_of_day,
  COUNT(DISTINCT game_id) AS games
FROM `xonotic_analytics.games_partitioned`
WHERE DATE(game_start_datetime) = '2019-04-01'
GROUP BY hour_of_day
ORDER BY games DESC
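Partition pruning can be pictured as storing one bucket of rows per date and opening only the bucket the filter asks for. A minimal Python sketch, with data invented for illustration:

```python
from datetime import date

# One bucket of rows per date, like a date-partitioned table
# (data invented for illustration).
partitions = {
    date(2019, 4, 1): [{"game_id": "g1"}, {"game_id": "g2"}],
    date(2019, 4, 2): [{"game_id": "g3"}],
}

def scan(partitions, day):
    """Read only the partition for `day`, skipping all other dates --
    the effect of filtering on the partition column."""
    return partitions.get(day, [])

rows = scan(partitions, date(2019, 4, 1))
print(len(rows))  # 2 rows read; the 2019-04-02 partition is never touched
```

Because only the matching partition is read, both the bytes billed and the query time drop compared with scanning the whole table.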
In this section of the lab we will be using Google Data Studio to create a dashboard to visualize some of the Xonotic data.
Navigate to the Data Studio datasources UI and click on the + button in the bottom right to create a new datasource.
Name the data source Xonotic Analytics - Games Partitioned and then select BigQuery as your data source.
Select your project, dataset and table. Do not tick the option to use game_start_datetime as the partitioning date. Click connect to go to the edit fields page.
Once connected to the data source create some new fields.
Click on the game_id menu and select count. Rename the field to games.
Click on game_start_datetime menu and select duplicate.
Rename the duplicate field to game_start_hour, then click on the type dropdown and select the date & time > Hour (HH) option. You will need to scroll in the dropdown to see the Hour (HH) option.
Click on the create report button in the top right and this will take you to a new page to start adding your visualisations.
Name the dashboard Xonotic - Games and add a smoothed time-series chart.
Make sure the time dimension is set to game_start_datetime and change the metric to games.
Click anywhere on the canvas away from the chart and you will see the Layout and Theme sidebar appear. Click on THEME and select the dark theme.
Click on the date button and this will add a date filter dropdown to the dashboard. Move this to the top right of your dashboard.
Add 3 additional charts to your dashboard and for each chart enable the apply filter option.
We recommend adding the following charts.
Chart type - Time Series with bars (change to bars using style tab)
Dimensions - game_start_hour
Metrics - games
Chart type - Horizontal bar chart
Dimensions - game_map
Metrics - games
Chart type - Horizontal bar chart
Dimensions - game_type
Metrics - games
Now the dashboard is ready click on the view button in the top right and you can start to filter the report by clicking on the date dropdown or by clicking on any of the bars in the charts where you enabled filtering.
Return to the BigQuery Web UI and click on BI Engine.
Select US as the location and set the GB of Capacity slider to 10 GB. Click next and then press create to create a reservation.
Test your BI Engine optimized dashboard
In View mode, right click on any chart and select Diagnose Query Performance to see if it is now being optimized using BigQuery BI Engine.
If it has a green tick and states Accelerated by BigQuery BI Engine then it has worked correctly.
Now apply the same filters as before and you should notice your dashboard filters a lot faster.
You have now set up a full end-to-end serverless analytics pipeline to analyse and visualize Xonotic game data.
Here are some follow-up steps:
©2019 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.