Bot Management with Google Cloud Armor + reCAPTCHA

1. Introduction

Google Cloud HTTP(S) load balancing is deployed at the edge of Google's network in Google points of presence (POP) around the world. User traffic directed to an HTTP(S) load balancer enters the POP closest to the user and is then load balanced over Google's global network to the closest backend that has sufficient capacity available.

Cloud Armor is Google's distributed denial-of-service (DDoS) protection and web application firewall (WAF) service. Cloud Armor is tightly coupled with the Google Cloud HTTP Load Balancer and safeguards the applications of Google Cloud customers against attacks from the internet. reCAPTCHA Enterprise is a service that protects your site from spam and abuse. It builds on the existing reCAPTCHA API, which uses advanced risk analysis techniques to tell humans and bots apart. Cloud Armor Bot Management provides an end-to-end solution that integrates reCAPTCHA Enterprise bot detection and scoring with enforcement by Cloud Armor at the edge of the network to protect downstream applications.

In this lab, you configure an HTTP Load Balancer with a backend, as shown in the diagram below. You then set up a reCAPTCHA session token site key and embed it in your website. You also set up redirection to the reCAPTCHA Enterprise manual challenge. Finally, you configure a Cloud Armor bot management policy to showcase how bot detection protects your application from malicious bot traffic.

8b46e6728996bc0c.png

What you'll learn

  • How to set up an HTTP Load Balancer with appropriate health checks.
  • How to create a reCAPTCHA WAF challenge-page site key and associate it with a Cloud Armor security policy.
  • How to create a reCAPTCHA session token site key and install it on your web pages.
  • How to create a Cloud Armor bot management policy.
  • How to validate that the bot management policy is handling traffic based on the rules configured.

What you'll need

  • Basic Networking and knowledge of HTTP
  • Basic Unix/Linux command line knowledge

2. Setup and Requirements

Self-paced environment setup

  1. Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

b35bf95b8bf3d5d8.png

a99b7ace416376c4.png

bd84a6d3004737c5.png

  • The Project name is the display name for this project's participants. It is a character string not used by Google APIs, and you can update it at any time.
  • The Project ID must be unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference the Project ID (and it is typically identified as PROJECT_ID), so if you don't like it, generate another random one, or, you can try your own and see if it's available. Then it's "frozen" after the project is created.
  • There is a third value, a Project Number which some APIs use. Learn more about all three of these values in the documentation.
  1. Next, you'll need to enable billing in the Cloud Console in order to use Cloud resources/APIs. Running through this codelab shouldn't cost much, if anything at all. To shut down resources so you don't incur billing beyond this tutorial, follow any "clean-up" instructions found at the end of the codelab. New users of Google Cloud are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the GCP Console click the Cloud Shell icon on the top right toolbar:

55efc1aaa7a4d3ad.png

It should only take a few moments to provision and connect to the environment. When it is finished, you should see something like this:

7ffe5cbb04455448.png

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory, and runs on Google Cloud, greatly enhancing network performance and authentication. All of your work in this lab can be done with simply a browser.

Before you begin

Inside Cloud Shell, make sure that your project ID is set:

gcloud config list project
gcloud config set project [YOUR-PROJECT-ID]
PROJECT_ID=[YOUR-PROJECT-ID]
echo $PROJECT_ID

Enable APIs

Enable all necessary services

gcloud services enable compute.googleapis.com
gcloud services enable logging.googleapis.com
gcloud services enable monitoring.googleapis.com
gcloud services enable recaptchaenterprise.googleapis.com

3. Configure firewall rules to allow HTTP and SSH traffic to backends

Configure firewall rules to allow HTTP traffic to the backends from the Google Cloud health checks and the Load Balancer. Also, configure a firewall rule to allow SSH into the instances.

We will be using the default VPC network created in your project. Create a firewall rule to allow HTTP traffic to the backends. Health checks determine which instances of a load balancer can receive new connections. For HTTP load balancing, the health check probes to your load-balanced instances come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16. Your VPC firewall rules must allow these connections. The load balancer also connects to the backends from these same ranges.

  1. In the Cloud Console, navigate to Navigation menu > VPC network > Firewall.

131fb495c9242335.png

  1. Notice the existing ICMP, internal, RDP, and SSH firewall rules. Each Google Cloud project starts with the default network and these firewall rules.
  2. Click Create Firewall Rule.
  3. Set the following values, leave all other values at their defaults:

Property             Value (type value or select option as specified)
Name                 default-allow-health-check
Network              default
Targets              Specified target tags
Target tags          allow-health-check
Source filter        IP Ranges
Source IP ranges     130.211.0.0/22, 35.191.0.0/16
Protocols and ports  Specified protocols and ports, and then check tcp. Type 80 for port number

  1. Click Create.

Alternatively, if you are using the gcloud command line, run the following command -

gcloud compute firewall-rules create default-allow-health-check --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:80 --source-ranges=130.211.0.0/22,35.191.0.0/16 --target-tags=allow-health-check
  1. Similarly, create a Firewall rule to allow SSH-ing into the instances -
gcloud compute firewall-rules create allow-ssh --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:22 --source-ranges=0.0.0.0/0 --target-tags=allow-health-check

4. Configure instance templates and create managed instance groups

A managed instance group uses an instance template to create a group of identical instances. Use these to create the backend of the HTTP Load Balancer.

Configure the instance templates

An instance template is a resource that you use to create VM instances and managed instance groups. Instance templates define the machine type, boot disk image, subnet, labels, and other instance properties. Create an instance template as indicated below.

  1. In the Cloud Console, navigate to Navigation menu > Compute Engine > Instance templates, and then click Create instance template.
  2. For Name, type lb-backend-template.
  3. For Series, select N1.
  4. Click Networking, Disks, Security, Management, Sole-Tenancy.

1d0b7122f4bb410d.png

  1. Go to the Management section and insert the following script into the Startup script field.
#! /bin/bash
sudo apt-get update
sudo apt-get install apache2 -y
sudo a2ensite default-ssl
sudo a2enmod ssl
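# Read this VM's name from the metadata server. The assignment must not be
# prefixed with sudo, or the variable is never set in this shell.
vm_hostname="$(curl -s -H "Metadata-Flavor:Google" \
http://169.254.169.254/computeMetadata/v1/instance/name)"
echo "Page served from: $vm_hostname" | \
tee /var/www/html/index.html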
  1. Click on the Networking tab, add the network tags: allow-health-check
  2. Set the following values and leave all other values at their defaults -

Property                            Value (type value or select option as specified)
Network (Under Network Interfaces)  default
Subnet (Under Network Interfaces)   default (us-east1)
Network tags                        allow-health-check

  1. Click Create.
  2. Wait for the instance template to be created.
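The console steps above can also be scripted. The following is a minimal sketch, not part of the original lab: it assumes the startup script has been saved locally as startup.sh (a hypothetical filename, overridable through STARTUP_SCRIPT) and that n1-standard-1 is an acceptable N1 machine type. The run helper is also hypothetical; by default it only prints the command, and setting DRY_RUN=0 executes it.

```shell
#!/bin/sh
# Sketch: create the instance template with gcloud instead of the console.
# Assumptions (not from the lab): startup.sh holds the startup script and
# n1-standard-1 is the chosen N1 machine type.

# Dry-run helper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_instance_template() {
  run gcloud compute instance-templates create lb-backend-template \
    --machine-type=n1-standard-1 \
    --network=default \
    --subnet=default \
    --region=us-east1 \
    --tags=allow-health-check \
    --metadata-from-file=startup-script="${STARTUP_SCRIPT:-startup.sh}"
}

create_instance_template
```

Run it once as is to review the printed command, then again with DRY_RUN=0 to create the template.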

Create the managed instance group

  1. Still on the Compute Engine page, click Instance groups in the left menu.

ed419061ad2b982c.png

  1. Click Create instance group. Select New managed instance group (stateless).
  2. Set the following values, leave all other values at their defaults:

Property             Value (type value or select option as specified)
Name                 lb-backend-example
Location             Single zone
Region               us-east1
Zone                 us-east1-b
Instance template    lb-backend-template
Autoscaling          Don't autoscale
Number of instances  1

  1. Click Create.
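For readers who prefer the command line, the managed instance group described above could be created roughly as follows. This is a sketch under the same settings as the table (one instance, zone us-east1-b, no autoscaler attached). The run helper is a hypothetical dry-run wrapper that prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create the single-zone managed instance group with gcloud.

# Dry-run helper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_instance_group() {
  # A group created this way has no autoscaler unless one is added later.
  run gcloud compute instance-groups managed create lb-backend-example \
    --template=lb-backend-template \
    --size=1 \
    --zone=us-east1-b
}

create_instance_group
```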

Add a named port to the instance group

For your instance group, define an HTTP service and map a port name to the relevant port. The load balancing service forwards traffic to the named port.

gcloud compute instance-groups set-named-ports lb-backend-example \
    --named-ports http:80 \
    --zone us-east1-b

5. Configure the HTTP Load Balancer

Configure the HTTP Load Balancer to send traffic to your backend lb-backend-example:

Start the configuration

  1. In the Cloud Console, click Navigation menu > click Network Services > Load balancing, and then click Create load balancer.
  2. Under HTTP(S) Load Balancing, click on Start configuration.

4f8b8cb10347ecec.png

  1. Select From Internet to my VMs, Classic HTTP(S) Load Balancer and click Continue.
  2. Set the Name to http-lb.

Configure the backend

Backend services direct incoming traffic to one or more attached backends. Each backend is composed of an instance group and additional serving capacity metadata.

  1. Click on Backend configuration.
  2. For Backend services & backend buckets, click Create a backend service.
  3. Set the following values, leave all other values at their defaults:

Property        Value (select option as specified)
Name            http-backend
Protocol        HTTP
Named Port      http
Instance group  lb-backend-example
Port numbers    80

  1. Click Done.
  2. Click Add backend.
  3. For Health Check, select Create a health check.

168a9ba1062b1f45.png

  1. Set the following values, leave all other values at their defaults:

Property  Value (select option as specified)
Name      http-health-check
Protocol  TCP
Port      80

dc45bc726bb4dfad.png

  1. Click Save.
  2. Check the Enable Logging box.
  3. Set the Sample Rate to 1:

c8f884fa4a8cd50.png

  1. Click Create to create the backend service.

1fd2ad21b1d32a95.png

Configure the frontend

The host and path rules determine how your traffic will be directed. For example, you could direct video traffic to one backend and static traffic to another backend. However, you are not configuring the Host and path rules in this lab.

  1. Click on Frontend configuration.
  2. Specify the following, leaving all other values at their defaults:

Property    Value (type value or select option as specified)
Protocol    HTTP
IP version  IPv4
IP address  Ephemeral
Port        80

  1. Click Done.

Review and create the HTTP Load Balancer

  1. Click on Review and finalize.

478e5e51057af3a3.png

  1. Review the Backend services and Frontend.
  2. Click on Create.
  3. Wait for the load balancer to be created.
  4. Click on the name of the load balancer (http-lb).
  5. Note the IPv4 address of the load balancer for the next task. We will refer to it as [LB_IP_v4].
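The load balancer built in the console above is made of several separate resources: a health check, a backend service, a URL map, a target HTTP proxy, and a global forwarding rule. The sketch below shows one way the same pieces might be created with gcloud. The names http-lb-proxy and http-lb-rule are assumptions chosen for illustration; the console generates its own names for the proxy and forwarding rule. The run helper is a hypothetical dry-run wrapper that prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: build the HTTP load balancer piece by piece with gcloud.
# Assumption: http-lb-proxy and http-lb-rule are illustrative names only.

# Dry-run helper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_load_balancer() {
  # TCP health check on port 80, matching the console settings.
  run gcloud compute health-checks create tcp http-health-check --port=80

  # Global backend service using the named port "http", with logging
  # enabled at a sample rate of 1.
  run gcloud compute backend-services create http-backend \
    --protocol=HTTP --port-name=http \
    --health-checks=http-health-check \
    --enable-logging --logging-sample-rate=1.0 \
    --global

  # Attach the managed instance group as the backend.
  run gcloud compute backend-services add-backend http-backend \
    --instance-group=lb-backend-example \
    --instance-group-zone=us-east1-b \
    --global

  # URL map, proxy, and forwarding rule make up the frontend.
  run gcloud compute url-maps create http-lb --default-service=http-backend
  run gcloud compute target-http-proxies create http-lb-proxy --url-map=http-lb
  run gcloud compute forwarding-rules create http-lb-rule \
    --global --target-http-proxy=http-lb-proxy --ports=80

  # Print the ephemeral IPv4 address, referred to as [LB_IP_v4] in this lab.
  run gcloud compute forwarding-rules describe http-lb-rule \
    --global --format='value(IPAddress)'
}

create_load_balancer
```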

6. Test the HTTP Load Balancer

Now that you created the HTTP Load Balancer for your backends, verify that traffic is forwarded to the backend service. To test IPv4 access to the HTTP Load Balancer, open a new tab in your browser and navigate to http://[LB_IP_v4]. Make sure to replace [LB_IP_v4] with the IPv4 address of the load balancer.
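The same check can be made from Cloud Shell. The sketch below assumes you export the load balancer address as LB_IP (a hypothetical variable name) and retries for a few minutes, since a newly created load balancer may not serve traffic immediately. served_by_backend is a hypothetical helper that looks for the "Page served from:" text written by the startup script.

```shell
#!/bin/sh
# Sketch: poll the load balancer until the backend page is returned.
# Assumption: LB_IP holds the IPv4 address noted in the previous section.

# Succeeds if the HTML on stdin looks like the page the startup script wrote.
served_by_backend() {
  grep -q "Page served from:"
}

# Only contact the load balancer when LB_IP is set.
if [ -n "${LB_IP:-}" ]; then
  attempt=0
  while [ "$attempt" -lt 30 ]; do
    if curl -s --max-time 5 "http://${LB_IP}/" | served_by_backend; then
      echo "Load balancer is serving the backend page."
      break
    fi
    attempt=$((attempt + 1))
    sleep 10
  done
fi
```

For example: LB_IP=203.0.113.10 sh check-lb.sh, where the address and the script name are placeholders.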

7. Create and deploy reCAPTCHA session token and challenge-page site key

reCAPTCHA Enterprise for WAF and Google Cloud Armor integration offers the following features: reCAPTCHA challenge page, reCAPTCHA action-tokens, and reCAPTCHA session-tokens. In this codelab, we will be implementing the reCAPTCHA session token site key and the reCAPTCHA WAF challenge-page site key.

Create reCAPTCHA session token and WAF challenge-page site key

Before creating the session token site key and challenge-page site key, double check that you have enabled the reCAPTCHA Enterprise API as indicated in the "Enable APIs" section at the beginning.

The reCAPTCHA JavaScript sets a reCAPTCHA session-token as a cookie on the end-user's browser after the assessment. The end-user's browser attaches the cookie and refreshes the cookie as long as the reCAPTCHA JavaScript remains active.

  1. Create the reCAPTCHA session token site key and enable the WAF feature for the key. We will also be setting the WAF service to Cloud Armor to enable the Cloud Armor integration.
gcloud recaptcha keys create --display-name=test-key-name \
   --web --allow-all-domains --integration-type=score --testing-score=0.5 \
   --waf-feature=session-token --waf-service=ca
  1. The output of the above command includes the key that was created. Make a note of it, as we will add it to your website in a later step.
  2. Create the reCAPTCHA WAF challenge-page site key and enable the WAF feature for the key. You can use the reCAPTCHA challenge page feature to redirect incoming requests to reCAPTCHA Enterprise to determine whether each request is potentially fraudulent or legitimate. We will later associate this key with the Cloud Armor security policy to enable the manual challenge. We will refer to this key as CHALLENGE-PAGE-KEY in the later steps.
gcloud recaptcha keys create --display-name=challenge-page-key \
   --web --allow-all-domains --integration-type=INVISIBLE \
   --waf-feature=challenge-page --waf-service=ca
  1. Navigate to Navigation menu > Security > reCAPTCHA Enterprise. You should see the keys you created under Enterprise Keys -

4e2567aae0eb92d7.png

Implement reCAPTCHA session token site key

  1. Navigate to Navigation menu > Compute Engine > VM Instances. Locate the VM in your instance group and SSH to it.

6d7b0fd12a667b5f.png

  1. Go to the webserver root directory and change user to root -
@lb-backend-example-4wmn:~$ cd /var/www/html/
@lb-backend-example-4wmn:/var/www/html$ sudo su
  1. Update the landing index.html page and embed the reCAPTCHA session token site key. The session token site key is set in the head section of your landing page as below -

<script src="https://www.google.com/recaptcha/enterprise.js?render=<REPLACE_TOKEN_HERE>&waf=session" async defer></script>

Remember to replace <REPLACE_TOKEN_HERE> with your session token site key before updating the index.html file as indicated below -

root@lb-backend-example-4wmn:/var/www/html# echo '<!doctype html><html><head><title>ReCAPTCHA Session Token</title><script src="https://www.google.com/recaptcha/enterprise.js?render=<REPLACE_TOKEN_HERE>&waf=session" async defer></script></head><body><h1>Main Page</h1><p><a href="/good-score.html">Visit allowed link</a></p><p><a href="/bad-score.html">Visit blocked link</a></p><p><a href="/median-score.html">Visit redirect link</a></p></body></html>' > index.html
  1. Create three other sample pages to test out the bot management policies -
  • good-score.html
root@lb-backend-example-4wmn:/var/www/html# echo '<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body><h1>Congrats! You have a good score!!</h1></body></html>' > good-score.html
  • bad-score.html
root@lb-backend-example-4wmn:/var/www/html# echo '<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body><h1>Sorry, You have a bad score!</h1></body></html>' > bad-score.html
  • median-score.html
root@lb-backend-example-4wmn:/var/www/html# echo '<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body><h1>You have a median score that we need a second verification.</h1></body></html>' > median-score.html
  1. Validate that you are able to access all the webpages by opening them in your browser. Make sure to replace [LB_IP_v4] with the IPv4 address of the load balancer.
  • Open http://[LB_IP_v4]/index.html. You will be able to verify that the reCAPTCHA implementation is working when you see "protected by reCAPTCHA" at the bottom right corner of the page -

d695ad23d91ae4e9.png

  • Click into each of the links.

4a2ad1b2f10b4c86.png

  • Validate you are able to access all the pages.

481f63bf5e6f244.png
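Editing the site key into a long echo command by hand is easy to get wrong. As an alternative, the landing page from the steps above could be generated by a small script. This is only a sketch: render_index and the SESSION_KEY and OUT_FILE variables are hypothetical names, and the HTML is the same markup used earlier in this section.

```shell
#!/bin/sh
# Sketch: generate index.html with the session token site key filled in.
# Assumption: SESSION_KEY holds the key returned by "gcloud recaptcha keys create".

# Prints the landing page HTML for the site key passed as the first argument.
render_index() {
  site_key="$1"
  cat <<EOF
<!doctype html><html><head><title>ReCAPTCHA Session Token</title><script src="https://www.google.com/recaptcha/enterprise.js?render=${site_key}&waf=session" async defer></script></head><body><h1>Main Page</h1><p><a href="/good-score.html">Visit allowed link</a></p><p><a href="/bad-score.html">Visit blocked link</a></p><p><a href="/median-score.html">Visit redirect link</a></p></body></html>
EOF
}

# Write the page only when a key was provided.
if [ -n "${SESSION_KEY:-}" ]; then
  render_index "$SESSION_KEY" > "${OUT_FILE:-index.html}"
fi
```

Run as root from /var/www/html, SESSION_KEY=your-site-key sh render-index.sh would overwrite index.html in place (the script name is a placeholder).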

8. Create Cloud Armor security policy rules for Bot Management

In this section, you will use Cloud Armor bot management rules to allow, deny and redirect requests based on the reCAPTCHA score. Remember that when you created the session token site key, you set a testing score of 0.5.

  1. In Cloud Shell (refer to "Start Cloud Shell" under "Setup and Requirements" for instructions on how to use Cloud Shell), create a security policy via gcloud:
gcloud compute security-policies create recaptcha-policy \
    --description "policy for bot management"
  1. To use reCAPTCHA Enterprise manual challenge to distinguish between human and automated clients, associate the reCAPTCHA WAF challenge site key we created for manual challenge with the security policy. Replace "CHALLENGE-PAGE-KEY" with the key we created -
gcloud compute security-policies update recaptcha-policy \
   --recaptcha-redirect-site-key "CHALLENGE-PAGE-KEY"
  1. Add a bot management rule to allow traffic if the url path matches good-score.html and has a score greater than 0.4.
gcloud compute security-policies rules create 2000 \
     --security-policy recaptcha-policy \
     --expression "request.path.matches('good-score.html') && token.recaptcha_session.score > 0.4" \
     --action allow
  1. Add a bot management rule to deny traffic if the url path matches bad-score.html and has a score less than 0.6.
gcloud compute security-policies rules create 3000 \
     --security-policy recaptcha-policy \
     --expression "request.path.matches('bad-score.html') && token.recaptcha_session.score < 0.6" \
     --action "deny-403"
  1. Add a bot management rule to redirect traffic to Google reCAPTCHA if the url path matches median-score.html and has a score equal to 0.5.
gcloud compute security-policies rules create 1000 \
     --security-policy recaptcha-policy \
     --expression "request.path.matches('median-score.html') && token.recaptcha_session.score == 0.5" \
     --action redirect \
     --redirect-type google-recaptcha
  1. Attach the security policy to the backend service http-backend:
gcloud compute backend-services update http-backend \
    --security-policy recaptcha-policy --global
  1. In the Console, navigate to Navigation menu > Network Security > Cloud Armor.
  2. Click recaptcha-policy. Your policy should resemble the following:

74852618aaa96786.png
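Cloud Armor evaluates the rules of a policy in priority order, lowest number first, and applies the first rule whose expression matches. Requests that match none of them fall through to the policy's default rule, which allows traffic unless you change it. The sketch below is a local simulation of the three rules created above for the testing score of 0.5. It does not call Cloud Armor: evaluate_request and the score helpers are hypothetical, and the path match is approximated with a substring test.

```shell
#!/bin/sh
# Sketch: simulate how the three bot management rules are evaluated.
# This is an illustration only and does not contact Cloud Armor.

# Floating-point comparisons via awk (the shell has integer arithmetic only).
score_gt() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a > b) }'; }
score_lt() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a < b) }'; }
score_eq() { awk -v a="$1" -v b="$2" 'BEGIN { exit !(a == b) }'; }

# Succeeds if the path in $1 contains the text in $2.
path_has() {
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: evaluate_request PATH SCORE
# Rules are checked in priority order: 1000, 2000, 3000, then the default.
evaluate_request() {
  if path_has "$1" "median-score.html" && score_eq "$2" 0.5; then
    echo "redirect (google-recaptcha)"   # priority 1000
  elif path_has "$1" "good-score.html" && score_gt "$2" 0.4; then
    echo "allow"                         # priority 2000
  elif path_has "$1" "bad-score.html" && score_lt "$2" 0.6; then
    echo "deny-403"                      # priority 3000
  else
    echo "allow (default rule)"
  fi
}

for page in good-score.html bad-score.html median-score.html index.html; do
  printf '%-20s -> %s\n' "/$page" "$(evaluate_request "/$page" 0.5)"
done
```

With a score of 0.5, each test page matches exactly one rule, which is why the single testing score is enough to exercise all three actions.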

9. Validate Bot Management with Cloud Armor

  1. Open up a browser and enter the url http://[LB_IP_v4]/index.html. Navigate to "Visit allowed link". You should be allowed through -

edf3e6ca238d2ee7.png

  1. Open a new window in Incognito mode to ensure we have a new session. Enter the url http://[LB_IP_v4]/index.html and navigate to "Visit blocked link". You should receive an HTTP 403 error -

ecef5655b291dbb0.png

  1. Open a new window in Incognito mode to ensure we have a new session. Enter the url http://[LB_IP_v4]/index.html and navigate to "Visit redirect link". You should see the redirection to Google reCAPTCHA and the manual challenge page as below -

53ed2b4067b55436.png

Verify Cloud Armor logs

Explore the security policy logs to validate bot management worked as expected.

  1. In the Console, navigate to Navigation menu > Network Security > Cloud Armor.
  2. Click recaptcha-policy.
  3. Click Logs.

46fd825d8506d355.png

  1. Click View policy logs.
  2. Below is a query, written in the Logging query language, that you can copy and paste into the query editor -
resource.type:(http_load_balancer) AND jsonPayload.enforcedSecurityPolicy.name:(recaptcha-policy)
  1. Now click Run Query.
  2. Look for a log entry in Query results where the request is for http://[LB_IP_v4]/good-score.html. Expand jsonPayload, then expand enforcedSecurityPolicy.

b7b1712642cf092b.png

  1. Repeat the same for http://[LB_IP_v4]/bad-score.html and http://[LB_IP_v4]/median-score.html

c28f96d83056725a.png

8c4803d75a77142c.png

Notice that the configuredAction is set to ALLOW, DENY or GOOGLE_RECAPTCHA with the name recaptcha-policy.
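The same log entries can be pulled from Cloud Shell with gcloud logging read, using the filter from the query editor step above. The sketch below is an illustration rather than part of the original lab: the run helper is a hypothetical dry-run wrapper that prints the command unless DRY_RUN=0 is set, and the chosen output columns are just one reasonable selection.

```shell
#!/bin/sh
# Sketch: read recent Cloud Armor policy log entries from the command line.

# Dry-run helper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

read_policy_logs() {
  # Show the requested URL next to the action the policy configured and applied.
  run gcloud logging read \
    'resource.type="http_load_balancer" AND jsonPayload.enforcedSecurityPolicy.name="recaptcha-policy"' \
    --limit=10 \
    --format='table(httpRequest.requestUrl,jsonPayload.enforcedSecurityPolicy.configuredAction,jsonPayload.enforcedSecurityPolicy.outcome)'
}

read_policy_logs
```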

Congratulations! You have completed this lab on Bot Management with Cloud Armor

©2020 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.

10. Lab Clean up

  1. Navigate to Network Security >> Cloud Armor >> recaptcha-policy and select delete -

2646f9c1df093f90.png

  1. Navigate to Networking >> Network services >> Load Balancing. Select the load balancer you created and click delete.

8ad4f55dc06513f7.png

Select the backend service and health check as additional resources to be deleted -

f6f02bb56add6420.png

  1. Navigate to Navigation menu > Compute Engine > Instance Groups. Select the managed instance group and click delete -

2116b286954fd6.png

Confirm deletion by typing "delete" into the textbox.

Wait for the managed instance group to be deleted. This also deletes the instance in the group. You can delete the templates only after the instance group has been deleted.

  1. Navigate to Instance templates from the left hand side pane. Select the instance template and click delete.
  2. Navigate to Navigation menu > VPC network > Firewall. Select the default-allow-health-check and allow-ssh rules and click delete.
  3. Navigate to Navigation menu > Security > reCAPTCHA Enterprise. Select the keys we created and delete them. Confirm deletion by typing "DELETE" into the textbox.

e71ecd11baf262ca.png
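The clean-up steps above can also be run from Cloud Shell. The sketch below deletes resources in dependency order: frontend first, then the backend service, and only then the security policy, since a policy cannot be deleted while it is still attached to a backend service. The forwarding rule and proxy names are assumptions, because the console generates them. Check the real names with gcloud compute forwarding-rules list and gcloud compute target-http-proxies list and override FORWARDING_RULE and TARGET_PROXY accordingly. The run helper is a hypothetical dry-run wrapper that prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: delete the lab resources with gcloud.
# Usage: sh cleanup.sh SESSION_KEY_ID CHALLENGE_PAGE_KEY_ID
# Assumption: the two default names below are guesses; verify them first.
FORWARDING_RULE="${FORWARDING_RULE:-http-lb-forwarding-rule}"
TARGET_PROXY="${TARGET_PROXY:-http-lb-target-proxy}"

# Dry-run helper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

cleanup() {
  # Load balancer frontend, then backend.
  run gcloud compute forwarding-rules delete "$FORWARDING_RULE" --global --quiet
  run gcloud compute target-http-proxies delete "$TARGET_PROXY" --global --quiet
  run gcloud compute url-maps delete http-lb --global --quiet
  run gcloud compute backend-services delete http-backend --global --quiet
  run gcloud compute health-checks delete http-health-check --global --quiet

  # The policy is no longer attached once the backend service is gone.
  run gcloud compute security-policies delete recaptcha-policy --quiet

  # The instance group must be deleted before its template.
  run gcloud compute instance-groups managed delete lb-backend-example \
    --zone=us-east1-b --quiet
  run gcloud compute instance-templates delete lb-backend-template --quiet
  run gcloud compute firewall-rules delete default-allow-health-check allow-ssh --quiet

  # reCAPTCHA keys are passed as arguments, one key ID each.
  for key in "$@"; do
    run gcloud recaptcha keys delete "$key" --quiet
  done
}

cleanup "$@"
```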

11. Congratulations!

You successfully implemented bot management with Cloud Armor. You configured an HTTP Load Balancer. Then, you created and implemented a reCAPTCHA session token site key on a webpage. You also learned to create a challenge-page site key. You set up a Cloud Armor bot management policy and validated how it handles requests based on the rules. You were able to explore the security policy logs to identify why the traffic was allowed, blocked or redirected.

What we've covered

  • How to set up instance templates and create managed instance groups.
  • How to set up an HTTP Load Balancer.
  • How to create a Cloud Armor bot management policy.
  • How to create and implement a reCAPTCHA session token site key.
  • How to create and implement a reCAPTCHA challenge-page site key.
  • How to validate that the Bot Management Policy is working as intended.

Next steps

  • Try setting up reCAPTCHA action tokens.