Bot Management with Google Cloud Armor + reCAPTCHA


About this codelab

Last updated Jul 15, 2024

Written by Navya Dwarakanath

1. Introduction

Google Cloud HTTP(S) load balancing is deployed at the edge of Google's network in Google points of presence (POP) around the world. User traffic directed to an HTTP(S) load balancer enters the POP closest to the user and is then load balanced over Google's global network to the closest backend that has sufficient capacity available.

Cloud Armor is Google's distributed denial of service and web application firewall (WAF) detection system. Cloud Armor is tightly coupled with the Google Cloud HTTP Load Balancer and safeguards applications of Google Cloud customers from attacks from the internet. reCAPTCHA Enterprise is a service that protects your site from spam and abuse, building on the existing reCAPTCHA API which uses advanced risk analysis techniques to tell humans and bots apart. Cloud Armor Bot Management provides an end-to-end solution integrating reCAPTCHA Enterprise bot detection and scoring with enforcement by Cloud Armor at the edge of the network to protect downstream applications.

In this lab, you configure an HTTP Load Balancer with a backend, as shown in the diagram below. You then set up a reCAPTCHA session token site key and embed it in your website, and set up redirection to the reCAPTCHA Enterprise manual challenge. Finally, you configure a Cloud Armor bot management policy to see how bot detection protects your application from malicious bot traffic.

What you'll learn

How to set up an HTTP Load Balancer with appropriate health checks.
How to create a reCAPTCHA WAF challenge-page site key and associate it with a Cloud Armor security policy.
How to create a reCAPTCHA session token site key and install it on your web pages.
How to create a Cloud Armor bot management policy.
How to validate that the bot management policy is handling traffic based on the rules configured.

What you'll need

Basic knowledge of networking and HTTP
Basic Unix/Linux command line knowledge

2. Setup and Requirements

Self-paced environment setup

Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

The Project name is the display name for this project's participants. It is a character string not used by Google APIs, and you can update it at any time.
The Project ID must be unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference the Project ID (and it is typically identified as PROJECT_ID), so if you don't like it, generate another random one, or, you can try your own and see if it's available. Then it's "frozen" after the project is created.
There is a third value, a Project Number which some APIs use. Learn more about all three of these values in the documentation.

Next, you'll need to enable billing in the Cloud Console in order to use Cloud resources/APIs. Running through this codelab shouldn't cost much, if anything at all. To shut down resources so you don't incur billing beyond this tutorial, follow any "clean-up" instructions found at the end of the codelab. New users of Google Cloud are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the GCP Console click the Cloud Shell icon on the top right toolbar:

It should only take a few moments to provision and connect to the environment. When it is finished, you should see something like this:

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory, and runs on Google Cloud, greatly enhancing network performance and authentication. All of your work in this lab can be done with simply a browser.

Before you begin

Inside Cloud Shell, make sure that your project ID is set. Replace [YOUR-PROJECT-ID] with the Project ID (not the display name) -

gcloud config list project
gcloud config set project [YOUR-PROJECT-ID]
PROJECT_ID=[YOUR-PROJECT-ID]
echo $PROJECT_ID
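
Because later commands depend on the project setting, a quick format check can catch an empty or mistyped value early. The sketch below uses a hypothetical helper, is_valid_project_id, which is not part of gcloud. It only checks the format of a project ID (6 to 30 characters; lowercase letters, digits and hyphens; starting with a letter and not ending with a hyphen), not whether the project actually exists.

```shell
# Hypothetical helper (not a gcloud feature): returns 0 when the argument
# has the shape of a Google Cloud project ID.
is_valid_project_id() {
  printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# Usage:
#   is_valid_project_id "$PROJECT_ID" || echo "PROJECT_ID looks wrong" >&2
```
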

Enable APIs

Enable all necessary services

gcloud services enable compute.googleapis.com
gcloud services enable logging.googleapis.com
gcloud services enable monitoring.googleapis.com
gcloud services enable recaptchaenterprise.googleapis.com

3. Configure firewall rules to allow HTTP and SSH traffic to backends

Configure firewall rules to allow HTTP traffic to the backends from the Google Cloud health checks and the Load Balancer. Also, configure a firewall rule to allow SSH into the instances.

We will be using the default VPC network created in your project. Create a firewall rule to allow HTTP traffic to the backends. Health checks determine which instances of a load balancer can receive new connections. For HTTP load balancing, the health check probes to your load balanced instances come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16. Your VPC firewall rules must allow these connections. Also, the load balancers talk to the backend on the same IP range.

In the Cloud Console, navigate to Navigation menu ( ) > VPC network > Firewall.

Notice the existing ICMP, internal, RDP, and SSH firewall rules. Each Google Cloud project starts with the default network and these firewall rules.
Click Create Firewall Rule.
Set the following values, leave all other values at their defaults:

Property	Value (type value or select option as specified)
Name	default-allow-health-check
Network	default
Targets	Specified target tags
Target tags	allow-health-check
Source filter	IP Ranges
Source IP ranges	130.211.0.0/22, 35.191.0.0/16
Protocols and ports	Specified protocols and ports, and then check tcp. Type 80 for port number

Click Create.

Alternatively, if you are using the gcloud command line, run the following command -

gcloud compute firewall-rules create default-allow-health-check --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:80 --source-ranges=130.211.0.0/22,35.191.0.0/16 --target-tags=allow-health-check

Similarly, create a firewall rule to allow SSH access to the instances -

gcloud compute firewall-rules create allow-ssh --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:22 --source-ranges=0.0.0.0/0 --target-tags=allow-health-check
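
To make the health-check source ranges concrete, the following sketch tests whether an address falls inside 130.211.0.0/22 or 35.191.0.0/16. This can help, for example, when deciding whether a request seen in the backend's access log came from a health-check prober. The function names are hypothetical helpers written for this illustration, and the code assumes well-formed IPv4 input without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Word splitting on "." is intended here.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 when the address ($1) is inside the CIDR block ($2).
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Return 0 when the address is in one of the two ranges allowed by the
# default-allow-health-check rule.
is_health_check_source() {
  in_cidr "$1" 130.211.0.0/22 || in_cidr "$1" 35.191.0.0/16
}
```

A /22 leaves 10 host bits, so 130.211.0.0/22 spans 130.211.0.0 through 130.211.3.255. An address such as 130.211.4.1 is therefore outside the range.
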

4. Configure instance templates and create managed instance groups

A managed instance group uses an instance template to create a group of identical instances. Use these to create the backend of the HTTP Load Balancer.

Configure the instance templates

An instance template is a resource that you use to create VM instances and managed instance groups. Instance templates define the machine type, boot disk image, subnet, labels, and other instance properties. Create an instance template as indicated below.

In the Cloud Console, navigate to Navigation menu ( ) > Compute Engine > Instance templates, and then click Create instance template.
For Name, type lb-backend-template.
For Series, select N1.
Click Networking, Disks, Security, Management, Sole-Tenancy.

Go to the Management section and insert the following script into the Startup script field.

#! /bin/bash
sudo apt-get update
sudo apt-get install apache2 -y
sudo a2ensite default-ssl
sudo a2enmod ssl
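# Ask the metadata server for this VM's name. The assignment must not be
# prefixed with sudo, otherwise the shell variable is never set.
vm_hostname="$(curl -s -H "Metadata-Flavor: Google" \
http://169.254.169.254/computeMetadata/v1/instance/name)"
# Write a landing page that identifies the serving VM.
echo "Page served from: $vm_hostname" | \
sudo tee /var/www/html/index.html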

Click on the Networking tab, add the network tags: allow-health-check
Set the following values and leave all other values at their defaults -

Property	Value (type value or select option as specified)
Network (Under Network Interfaces)	default
Subnet (Under Network Interfaces)	default (us-east1)
Network tags	allow-health-check

Click Create.
Wait for the instance template to be created.

Create the managed instance group

Still on the Compute Engine page, click Instance groups in the left menu.

Click Create instance group. Select New managed instance group (stateless).
Set the following values, leave all other values at their defaults:

Property	Value (type value or select option as specified)
Name	lb-backend-example
Location	Single zone
Region	us-east1
Zone	us-east1-b
Instance template	lb-backend-template
Autoscaling	Don't autoscale
Number of instances	1

Click Create.
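
If you prefer the command line, the template and the instance group can be created roughly as follows. This is a sketch under two assumptions that are not stated in the Console steps: the startup script shown earlier has been saved as startup.sh in the current directory, and n1-standard-1 is the N1 machine type you want. The commands are wrapped in a function (create_backend, a name chosen here) so nothing runs until you call it.

```shell
# Sketch of a CLI equivalent for the template and instance group steps.
# Assumptions: ./startup.sh holds the startup script, and n1-standard-1
# is an acceptable machine type from the N1 series.
create_backend() {
  gcloud compute instance-templates create lb-backend-template \
      --machine-type=n1-standard-1 \
      --network=default \
      --subnet=default \
      --region=us-east1 \
      --tags=allow-health-check \
      --metadata-from-file=startup-script=startup.sh &&
  gcloud compute instance-groups managed create lb-backend-example \
      --template=lb-backend-template \
      --size=1 \
      --zone=us-east1-b
}

# Usage: create_backend
```
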

Add a named port to the instance group

For your instance group, define an HTTP service and map a port name to the relevant port. The load balancing service forwards traffic to the named port.

gcloud compute instance-groups set-named-ports lb-backend-example \
    --named-ports http:80 \
    --zone us-east1-b

5. Configure the HTTP Load Balancer

Configure the HTTP Load Balancer to send traffic to your backend lb-backend-example:

Start the configuration

In the Cloud Console, click Navigation menu ( ) > click Network Services > Load balancing, and then click Create load balancer.
Under HTTP(S) Load Balancing, click on Start configuration.

Select From Internet to my VMs, Classic HTTP(S) Load Balancer and click Continue.
Set the Name to http-lb.

Configure the backend

Backend services direct incoming traffic to one or more attached backends. Each backend is composed of an instance group and additional serving capacity metadata.

Click on Backend configuration.
For Backend services & backend buckets, click Create a backend service.
Set the following values, leave all other values at their defaults:

Property	Value (select option as specified)
Name	http-backend
Protocol	HTTP
Named Port	http
Instance group	lb-backend-example
Port numbers	80

Click Done.
Click Add backend.
For Health Check, select Create a health check.

Set the following values, leave all other values at their defaults:

Property	Value (select option as specified)
Name	http-health-check
Protocol	TCP
Port	80

Click Save.
Check the Enable Logging box.
Set the Sample Rate to 1:

Click Create to create the backend service.

Configure the frontend

The host and path rules determine how your traffic will be directed. For example, you could direct video traffic to one backend and static traffic to another backend. However, you are not configuring the Host and path rules in this lab.

Click on Frontend configuration.
Specify the following, leaving all other values at their defaults:

Property	Value (type value or select option as specified)
Protocol	HTTP
IP version	IPv4
IP address	Ephemeral
Port	80

Click Done.

Review and create the HTTP Load Balancer

Click on Review and finalize.

Review the Backend services and Frontend.
Click on Create.
Wait for the load balancer to be created.
Click on the name of the load balancer (http-lb).
Note the IPv4 address of the load balancer for the next task. We will refer to it as [LB_IP_v4].
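
The Console steps above correspond roughly to the gcloud commands in the sketch below. The names http-lb-proxy and http-lb-forwarding-rule are hypothetical: the Console generates its own names for the target proxy and forwarding rule, so treat them as placeholders. The commands are wrapped in functions so that nothing runs until you call them.

```shell
# Sketch of a CLI equivalent for the load balancer steps.
# http-lb-proxy and http-lb-forwarding-rule are assumed names.
create_http_lb() {
  gcloud compute health-checks create tcp http-health-check --port=80 &&
  gcloud compute backend-services create http-backend \
      --protocol=HTTP --port-name=http \
      --health-checks=http-health-check \
      --enable-logging --logging-sample-rate=1.0 \
      --global &&
  gcloud compute backend-services add-backend http-backend \
      --instance-group=lb-backend-example \
      --instance-group-zone=us-east1-b \
      --global &&
  gcloud compute url-maps create http-lb --default-service=http-backend &&
  gcloud compute target-http-proxies create http-lb-proxy --url-map=http-lb &&
  gcloud compute forwarding-rules create http-lb-forwarding-rule \
      --global --target-http-proxy=http-lb-proxy --ports=80
}

# Print the load balancer's IPv4 address, referred to as [LB_IP_v4].
lb_ip() {
  gcloud compute forwarding-rules describe http-lb-forwarding-rule \
      --global --format='value(IPAddress)'
}

# Usage: create_http_lb && lb_ip
```
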

6. Test the HTTP Load Balancer

Now that you created the HTTP Load Balancer for your backends, verify that traffic is forwarded to the backend service. To test IPv4 access to the HTTP Load Balancer, open a new tab in your browser and navigate to http://[LB_IP_v4]. Make sure to replace [LB_IP_v4] with the IPv4 address of the load balancer.

7. Create and deploy reCAPTCHA session token and challenge-page site key

reCAPTCHA Enterprise for WAF and Google Cloud Armor integration offers the following features: reCAPTCHA challenge page, reCAPTCHA action-tokens, and reCAPTCHA session-tokens. In this codelab, we will be implementing the reCAPTCHA session token site key and the reCAPTCHA WAF challenge-page site key.

Create reCAPTCHA session token and WAF challenge-page site key

Before creating the session token site key and challenge page site key, double check that you have enabled the reCAPTCHA Enterprise API as indicated in the "Enable APIs" section at the beginning.

The reCAPTCHA JavaScript sets a reCAPTCHA session-token as a cookie on the end-user's browser after the assessment. The end-user's browser attaches the cookie and refreshes the cookie as long as the reCAPTCHA JavaScript remains active.

Create the reCAPTCHA session token site key and enable the WAF feature for the key. We will also be setting the WAF service to Cloud Armor to enable the Cloud Armor integration.

gcloud recaptcha keys create --display-name=test-key-name \
   --web --allow-all-domains --integration-type=score --testing-score=0.5 \
   --waf-feature=session-token --waf-service=ca

The output of the above command shows the key that was created. Make a note of it, as we will add it to your website in the next step.
Create the reCAPTCHA WAF challenge-page site key and enable the WAF feature for the key. You can use the reCAPTCHA challenge page feature to redirect incoming requests to reCAPTCHA Enterprise to determine whether each request is potentially fraudulent or legitimate. We will later associate this key with the Cloud Armor security policy to enable the manual challenge. We will refer to this key as CHALLENGE-PAGE-KEY in the later steps.

gcloud recaptcha keys create --display-name=challenge-page-key \
   --web --allow-all-domains --integration-type=INVISIBLE \
   --waf-feature=challenge-page --waf-service=ca

Navigate to Navigation menu ( ) > Security > reCAPTCHA Enterprise. You should see the keys you created under Enterprise Keys -

Implement reCAPTCHA session token site key

Navigate to Navigation menu ( ) > Compute Engine > VM Instances. Locate the VM in your instance group and SSH to it.

Go to the webserver root directory and change user to root -

@lb-backend-example-4wmn:~$ cd /var/www/html/
@lb-backend-example-4wmn:/var/www/html$ sudo su

Update the landing index.html page and embed the reCAPTCHA session token site key. The session token site key is set in the head section of your landing page as below -

Remember to replace <REPLACE_TOKEN_HERE> with your session token site key before running the command below -

root@lb-backend-example-4wmn:/var/www/html# echo '<!doctype html><html><head><title>ReCAPTCHA Session Token</title><script src="https://www.google.com/recaptcha/enterprise.js?render=<REPLACE_TOKEN_HERE>&waf=session" async defer></script></head><body><h1>Main Page</h1><p><a href="/good-score.html">Visit allowed link</a></p><p><a href="/bad-score.html">Visit blocked link</a></p><p><a href="/median-score.html">Visit redirect link</a></p></body></html>' > index.html
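
As an alternative to editing the one-line echo by hand, the page can be generated from a small function that takes the site key as an argument. This is only a sketch: render_index is a hypothetical helper name, and the markup is the same as in the command above, laid out over several lines.

```shell
# Print the landing page with the given session token site key embedded
# in the reCAPTCHA script URL.
render_index() {
  site_key=$1
  cat <<EOF
<!doctype html>
<html>
<head>
  <title>ReCAPTCHA Session Token</title>
  <script src="https://www.google.com/recaptcha/enterprise.js?render=${site_key}&waf=session" async defer></script>
</head>
<body>
  <h1>Main Page</h1>
  <p><a href="/good-score.html">Visit allowed link</a></p>
  <p><a href="/bad-score.html">Visit blocked link</a></p>
  <p><a href="/median-score.html">Visit redirect link</a></p>
</body>
</html>
EOF
}

# Usage (as root in /var/www/html):
#   render_index "YOUR_SESSION_TOKEN_SITE_KEY" > index.html
```
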

Create three other sample pages to test out the bot management policies -

good-score.html

root@lb-backend-example-4wmn:/var/www/html# echo '<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body><h1>Congrats! You have a good score!!</h1></body></html>' > good-score.html

bad-score.html

root@lb-backend-example-4wmn:/var/www/html# echo '<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body><h1>Sorry, You have a bad score!</h1></body></html>' > bad-score.html

median-score.html

root@lb-backend-example-4wmn:/var/www/html# echo '<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body><h1>You have a median score that we need a second verification.</h1></body></html>' > median-score.html

Validate that you are able to access all the webpages by opening them in your browser. Make sure to replace [LB_IP_v4] with the IPv4 address of the load balancer.

Open http://[LB_IP_v4]/index.html. You will be able to verify that the reCAPTCHA implementation is working when you see "protected by reCAPTCHA" at the bottom right corner of the page -

Click into each of the links.

Validate you are able to access all the pages.
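
The same check can be scripted with curl. The sketch below defines a hypothetical helper, check_pages, that requests each page through the load balancer and prints the HTTP status code. At this point in the lab no security policy is attached yet, so each page is expected to be served normally.

```shell
# Request every test page through the load balancer and print
# "<page> <status code>" for each one.
check_pages() {
  lb_ip=$1
  for page in index.html good-score.html bad-score.html median-score.html; do
    # -o /dev/null discards the body; -w prints only the status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' "http://${lb_ip}/${page}")
    echo "${page} ${code}"
  done
}

# Usage: check_pages [LB_IP_v4]
```
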

8. Create Cloud Armor security policy rules for Bot Management

In this section, you will use Cloud Armor bot management rules to allow, deny and redirect requests based on the reCAPTCHA score. Remember that when you created the session token site key, you set a testing score of 0.5.

In Cloud Shell (refer to "Start Cloud Shell" under "Setup and Requirements" for instructions on how to use Cloud Shell), create a security policy via gcloud:

gcloud compute security-policies create recaptcha-policy \
    --description "policy for bot management"

To use reCAPTCHA Enterprise manual challenge to distinguish between human and automated clients, associate the reCAPTCHA WAF challenge site key we created for manual challenge with the security policy. Replace "CHALLENGE-PAGE-KEY" with the key we created -

gcloud compute security-policies update recaptcha-policy \
   --recaptcha-redirect-site-key "CHALLENGE-PAGE-KEY"

Add a bot management rule to allow traffic if the url path matches good-score.html and has a score greater than 0.4.

gcloud compute security-policies rules create 2000 \
     --security-policy recaptcha-policy \
     --expression "request.path.matches('good-score.html') && token.recaptcha_session.score > 0.4" \
     --action allow

Add a bot management rule to deny traffic if the url path matches bad-score.html and has a score less than 0.6.

gcloud compute security-policies rules create 3000 \
     --security-policy recaptcha-policy \
     --expression "request.path.matches('bad-score.html') && token.recaptcha_session.score < 0.6" \
     --action "deny-403"

Add a bot management rule to redirect traffic to Google reCAPTCHA if the url path matches median-score.html and has a score equal to 0.5

gcloud compute security-policies rules create 1000 \
     --security-policy recaptcha-policy \
     --expression "request.path.matches('median-score.html') && token.recaptcha_session.score == 0.5" \
     --action redirect \
     --redirect-type google-recaptcha
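
Cloud Armor evaluates rules in priority order, lowest number first, and applies the first rule whose expression matches. If none match, the policy's default rule applies, which is allow for a newly created policy. The sketch below is a local simulation of the three rules above, not Cloud Armor itself. It simplifies path matching to a substring test and assumes a session token with a numeric score is present. The function names are hypothetical.

```shell
# Return 0 when the score ($1) satisfies the comparison ($2), e.g. '> 0.4'.
score_is() {
  awk -v s="$1" "BEGIN { exit !(s $2) }"
}

# Print the action the policy would take for a request path and score.
evaluate_policy() {
  path=$1
  score=$2
  # Priority 1000: redirect median-score.html when the score equals 0.5.
  case $path in
    *median-score.html*) score_is "$score" '== 0.5' && { echo redirect; return; } ;;
  esac
  # Priority 2000: allow good-score.html when the score is above 0.4.
  case $path in
    *good-score.html*) score_is "$score" '> 0.4' && { echo allow; return; } ;;
  esac
  # Priority 3000: deny bad-score.html when the score is below 0.6.
  case $path in
    *bad-score.html*) score_is "$score" '< 0.6' && { echo deny-403; return; } ;;
  esac
  # Default rule of a newly created policy.
  echo allow
}
```

With the testing score of 0.5 set on the session token site key, each page hits exactly one rule: evaluate_policy /good-score.html 0.5 prints allow, evaluate_policy /bad-score.html 0.5 prints deny-403, and evaluate_policy /median-score.html 0.5 prints redirect.
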

Attach the security policy to the backend service http-backend:

gcloud compute backend-services update http-backend \
    --security-policy recaptcha-policy --global

In the Console, navigate to Navigation menu > Network Security > Cloud Armor.
Click recaptcha-policy. Your policy should resemble the following:

9. Validate Bot Management with Cloud Armor

Open up a browser and enter the url http://[LB_IP_v4]/index.html. Navigate to "Visit allowed link". You should be allowed through -

Open a new window in Incognito mode to ensure we have a new session. Enter the url http://[LB_IP_v4]/index.html and navigate to "Visit blocked link". You should receive an HTTP 403 error -

Open a new window in Incognito mode to ensure we have a new session. Enter the url http://[LB_IP_v4]/index.html and navigate to "Visit redirect link". You should see the redirection to Google reCAPTCHA and the manual challenge page as below -

Verify Cloud Armor logs

Explore the security policy logs to validate bot management worked as expected.

In the Console, navigate to Navigation menu > Network Security > Cloud Armor.
Click recaptcha-policy.
Click Logs.

Click View policy logs.
Below is the query, written in the Logging query language, that you can copy and paste into the query editor -

resource.type:(http_load_balancer) AND jsonPayload.enforcedSecurityPolicy.name:(recaptcha-policy)

Now click Run Query.
Look for a log entry in Query results where the request is for http://[LB_IP_v4]/good-score.html. Expand jsonPayload. Expand enforcedSecurityPolicy.

Repeat the same for http://[LB_IP_v4]/bad-score.html and http://[LB_IP_v4]/median-score.html

Notice that the configuredAction is set to ALLOW, DENY or GOOGLE_RECAPTCHA with the name recaptcha-policy.
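
The same log entries can also be read from Cloud Shell with gcloud logging read. The sketch below is one possible way to do it: read_policy_logs is a hypothetical helper name, and the table columns are a selection of fields from the log entry chosen for this illustration.

```shell
# Print recent load balancer log entries enforced by recaptcha-policy.
# $1 is the maximum number of entries to return (default 10).
read_policy_logs() {
  gcloud logging read \
    'resource.type="http_load_balancer" AND jsonPayload.enforcedSecurityPolicy.name="recaptcha-policy"' \
    --limit="${1:-10}" \
    --format='table(httpRequest.requestUrl, jsonPayload.enforcedSecurityPolicy.configuredAction, jsonPayload.enforcedSecurityPolicy.outcome)'
}

# Usage: read_policy_logs 20
```
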

Congratulations! You have completed this lab on Bot Management with Cloud Armor

©2020 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.

10. Lab Clean up

Navigate to Network Security >> Cloud Armor >> %POLICY NAME% and select delete -

Navigate to Networking >> Network services >> Load Balancing. Select the load balancer you created and click delete.

Select the backend service and health check as additional resources to be deleted -

Navigate to Navigation menu ( ) > Compute Engine > Instance Groups. Select the managed instance group and click delete -

Confirm deletion by typing "delete" into the textbox.

Wait for the managed instance group to be deleted. This also deletes the instance in the group. You can delete the templates only after the instance group has been deleted.

Navigate to Instance templates from the left hand side pane. Select the instance template and click delete.
Navigate to Navigation menu ( ) > VPC network > Firewall. Select the default-allow-health-check and allow-ssh rules and click delete.
Navigate to Navigation menu ( ) > Security > reCAPTCHA Enterprise. Select the keys we created and delete them. Confirm deletion by typing "DELETE" into the textbox.
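
For the resources whose names are fixed by this lab, the clean-up can also be scripted. The sketch below assumes the load balancer and its backend service have already been deleted in the Console as described above, since a security policy cannot be deleted while a backend service still uses it. cleanup_named_resources is a hypothetical helper name, and the reCAPTCHA key IDs are passed as arguments because they are specific to your project.

```shell
# Delete the lab resources that have fixed names, then any reCAPTCHA keys
# given as arguments. Assumes the load balancer and http-backend are gone.
cleanup_named_resources() {
  gcloud compute security-policies delete recaptcha-policy --quiet || return 1
  gcloud compute instance-groups managed delete lb-backend-example \
      --zone=us-east1-b --quiet || return 1
  # The template can only be deleted once the instance group is gone.
  gcloud compute instance-templates delete lb-backend-template --quiet || return 1
  gcloud compute firewall-rules delete default-allow-health-check allow-ssh \
      --quiet || return 1
  for key in "$@"; do
    gcloud recaptcha keys delete "$key" --quiet || return 1
  done
}

# Usage: cleanup_named_resources SESSION_TOKEN_KEY CHALLENGE-PAGE-KEY
```
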

11. Congratulations!

You successfully implemented bot management with Cloud Armor. You configured an HTTP Load Balancer, then created a reCAPTCHA session token site key and implemented it on a web page. You also learned to create a challenge-page site key. You set up a Cloud Armor bot management policy and validated how its rules handle requests. Finally, you explored the security policy logs to identify why traffic was allowed, blocked or redirected.

What we've covered

How to set up instance templates and create managed instance groups.
How to set up an HTTP Load Balancer.
How to create a Cloud Armor bot management policy.
How to create and implement reCAPTCHA session token site key.
How to create and implement reCAPTCHA challenge page site key.
How to validate that the Bot Management Policy is working as intended.

Next steps

Try setting up reCAPTCHA action tokens.
