Per-Instance Weighted Network Load Balancing

About this codelab

Last updated Mar 18, 2025

Written by Sahana P, Babi Seal

1. Introduction

With weighted load balancing, you can configure a network load balancer to distribute traffic across its backend instances based on the weights reported by an HTTP health check.

Weighted load balancing requires that you configure both of the following:

You must set the locality load balancer policy (localityLbPolicy) of the backend service to WEIGHTED_MAGLEV.
You must configure the backend service with an HTTP/HTTP2/HTTPS health check. The HTTP health check responses must contain a custom HTTP response header field X-Load-Balancing-Endpoint-Weight to specify the weights with integer values from 0 to 1000 in decimal representation for each backend instance.
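The value range described above can be checked before a backend publishes its weight. The following is a minimal sketch for a POSIX shell, not part of any Google Cloud tooling; the function name is_valid_weight is hypothetical.

```shell
# Hypothetical helper: succeed only if $1 is a decimal integer from 0 to 1000,
# the range allowed in the X-Load-Balancing-Endpoint-Weight header.
is_valid_weight() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;   # empty, signed, or non-decimal input
  esac
  # The length guard keeps very long digit strings out of the numeric test.
  [ "${#1}" -le 4 ] && [ "$1" -le 1000 ]
}
```

For example, is_valid_weight 900 succeeds, while is_valid_weight 1001 and is_valid_weight 1.5 fail.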

If you use the same instance group as a backend for multiple backend service-based network load balancers using weighted load balancing, it's recommended to use a unique request-path for each health check of the backend service. For more information, see Success criteria for HTTP, HTTPS, and HTTP/2 health checks.

The HTTP health check must return an HTTP 200 (OK) response for the health check to pass and for the backend instance to be regarded as healthy. If all backend instances pass their health checks and return X-Load-Balancing-Endpoint-Weight with a weight of zero, the load balancer distributes new connections among the healthy backends, treating them with equal weight. The load balancer can also distribute new connections among unhealthy backends. For more information, see Traffic distribution.
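To build intuition for how reported weights might translate into traffic, the sketch below computes an approximate share of new connections per backend. It assumes that shares are roughly proportional to the reported weights and applies the equal-weight fallback described above when every weight is zero. The function name expected_shares is hypothetical, and the real distribution is decided by the load balancer, not by this arithmetic.

```shell
# Hypothetical helper: print the approximate share of new connections for
# each weight given as an argument (all backends assumed healthy).
expected_shares() {
  awk 'BEGIN {
    n = ARGC - 1
    for (i = 1; i <= n; i++) total += ARGV[i]
    for (i = 1; i <= n; i++) {
      # If every weight is zero, fall back to equal shares.
      share = (total > 0) ? ARGV[i] / total : 1 / n
      printf "%.1f%%\n", share * 100
    }
  }' "$@"
}
```

Running expected_shares 0 100 900 prints 0.0%, 10.0%, and 90.0% on separate lines; expected_shares 0 0 0 prints 33.3% three times.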

For examples of weighted load balancing, see Backend selection and connection tracking.

Weighted load balancing can be used in the following scenarios:

If some connections process more data than others, or some connections live longer than others, the backend load distribution might get uneven. By signaling a lower per-instance weight, an instance with high load can reduce its share of new connections, while it keeps servicing existing connections.
If a backend is overloaded and assigning more connections might break existing connections, the backend assigns zero weight to itself. By signaling zero weight, a backend instance stops servicing new connections, but continues to service existing ones.
If a backend is draining existing connections before maintenance, it assigns zero weight to itself. By signaling zero weight, the backend instance stops servicing new connections, but continues to service existing ones.
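As a rough illustration of the draining and overload scenarios, a backend could rewrite the header configuration that its web server uses for health check responses. This sketch assumes Apache with mod_headers and the configuration path used by the VM startup scripts later in this codelab; the function name write_weight_conf is hypothetical.

```shell
# Hypothetical helper: write an Apache directive that sets the reported weight.
# $1 is the weight; $2 is the config file (defaults to the path that the
# startup scripts in this codelab write to).
write_weight_conf() {
  weight="$1"
  conf="${2:-/etc/apache2/conf-enabled/headers.conf}"
  printf 'Header set X-Load-Balancing-Endpoint-Weight "%s"\n' "$weight" > "$conf"
}

# Example, run as root on a backend VM before maintenance (assumption:
# a graceful reload is enough for Apache to pick up the new header value):
#   write_weight_conf 0 && systemctl reload apache2
```

The load balancer only sees the new weight after subsequent health check probes, so the change is not instantaneous.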

What you'll learn

How to use weighted load balancing to configure a network load balancer that distributes traffic across its backend instances based on the weights reported by an HTTP health check.

Self-paced environment setup

Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

The Project name is the display name for this project's participants. It is a character string not used by Google APIs. You can update it at any time.
The Project ID is unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference the Project ID (it is typically identified as PROJECT_ID). If you don't like the generated ID, you may generate another random one. Alternatively, you can try your own and see if it's available. It cannot be changed after this step and will remain for the duration of the project.
For your information, there is a third value, a Project Number which some APIs use. Learn more about all three of these values in the documentation.

Next, you'll need to enable billing in the Cloud Console to use Cloud resources/APIs. Running through this codelab shouldn't cost much, if anything at all. To shut down resources so you don't incur billing beyond this tutorial, you can delete the resources you created or delete the whole project. New users of Google Cloud are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the Google Cloud Console, click the Cloud Shell icon on the top right toolbar:

It should only take a few moments to provision and connect to the environment. When it is finished, you should see something like this:

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory, and runs on Google Cloud, greatly enhancing network performance and authentication. All of your work in this codelab can be done within a browser. You do not need to install anything.

2. Start Configuration

This codelab requires a single project.

In this tutorial, you create an instance group with three VM instances and assign a weight to each instance. You create an HTTP health check to report the backend instance weights. You then enable weighted load balancing on the backend service by setting its locality load balancer policy to WEIGHTED_MAGLEV.

Before you begin

Read the Backend service-based external Network Load Balancing overview.
Install the Google Cloud CLI. For a complete overview of the tool, see the gcloud CLI overview. You can find commands related to load balancing in the API and gcloud CLI reference. If you haven't run the Google Cloud CLI previously, first run gcloud init to authenticate.
Enable the compute API.

gcloud services enable compute.googleapis.com

Note: You cannot use the Google Cloud console to configure the locality load balancer policy and assign weights to VM instances. Use the Google Cloud CLI instead.

Create VPC network, subnets, and firewall rules

Create a VPC network, subnet, and ingress allow firewall rules to allow connections to the backend VMs of your load balancer.

Create a VPC network and subnet.

a. To create the VPC network, run the gcloud compute networks create command:

gcloud compute networks create NETWORK_NAME --subnet-mode custom

b. To create the subnet, run the gcloud compute networks subnets create command. In this example, the subnet's primary IPv4 address range is 10.10.0.0/24:

gcloud compute networks subnets create SUBNET_NAME \
  --network=NETWORK_NAME \
  --range=10.10.0.0/24 \
  --region=us-central1

Replace the following:

NETWORK_NAME: the name of the VPC network to create.
SUBNET_NAME: the name of the subnetwork to create.

Create an ingress allow firewall rule so that packets sent to destination TCP ports 80 and 443 are delivered to the backend VMs. In this example, the firewall rule allows connections from any source IP address and applies to VMs with the network tag network-lb-tag. To create the firewall rule, run the gcloud compute firewall-rules create command:

gcloud compute firewall-rules create FIREWALL_RULE_NAME \
   --direction=INGRESS \
   --priority=1000 \
   --network=NETWORK_NAME \
   --action=ALLOW \
   --rules=tcp:80,tcp:443 \
   --source-ranges=0.0.0.0/0 \
   --target-tags=network-lb-tag

Replace FIREWALL_RULE_NAME with the name of the firewall rule to create.

Create VM instances and assign weights

Create three VM instances and assign weights:

Configure three backend VM instances to return their weights in the X-Load-Balancing-Endpoint-Weight header of their HTTP responses. For this tutorial, you configure one backend instance to report a weight of zero, a second backend instance to report a weight of 100, and a third backend instance to report a weight of 900. To create the instances, run the gcloud compute instances create command:

gcloud compute instances create instance-0 \
  --zone=us-central1-a \
  --tags=network-lb-tag \
  --image-family=debian-10 \
  --image-project=debian-cloud \
  --subnet=SUBNET_NAME \
  --metadata=load-balancing-weight=0,startup-script='#! /bin/bash
  apt-get update
  apt-get install apache2 -y
  ln -sr /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled/headers.load
  vm_hostname="$(curl -H "Metadata-Flavor:Google" \
  http://169.254.169.254/computeMetadata/v1/instance/name)"
  echo "Page served from: $vm_hostname" | \
  tee /var/www/html/index.html
  lb_weight="$(curl -H "Metadata-Flavor:Google" \
  http://169.254.169.254/computeMetadata/v1/instance/attributes/load-balancing-weight)"
  echo "Header set X-Load-Balancing-Endpoint-Weight \"$lb_weight\"" | \
  tee /etc/apache2/conf-enabled/headers.conf
  systemctl restart apache2'

gcloud compute instances create instance-100 \
  --zone=us-central1-a \
  --tags=network-lb-tag \
  --image-family=debian-10 \
  --image-project=debian-cloud \
  --subnet=SUBNET_NAME \
  --metadata=load-balancing-weight=100,startup-script='#! /bin/bash
  apt-get update
  apt-get install apache2 -y
  ln -sr /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled/headers.load
  vm_hostname="$(curl -H "Metadata-Flavor:Google" \
  http://169.254.169.254/computeMetadata/v1/instance/name)"
  echo "Page served from: $vm_hostname" | \
  tee /var/www/html/index.html
  lb_weight="$(curl -H "Metadata-Flavor:Google" \
  http://169.254.169.254/computeMetadata/v1/instance/attributes/load-balancing-weight)"
  echo "Header set X-Load-Balancing-Endpoint-Weight \"$lb_weight\"" | \
  tee /etc/apache2/conf-enabled/headers.conf
  systemctl restart apache2'

gcloud compute instances create instance-900 \
  --zone=us-central1-a \
  --tags=network-lb-tag \
  --image-family=debian-10 \
  --image-project=debian-cloud \
  --subnet=SUBNET_NAME \
  --metadata=load-balancing-weight=900,startup-script='#! /bin/bash
    apt-get update
    apt-get install apache2 -y
    ln -sr /etc/apache2/mods-available/headers.load /etc/apache2/mods-enabled/headers.load
    vm_hostname="$(curl -H "Metadata-Flavor:Google" \
    http://169.254.169.254/computeMetadata/v1/instance/name)"
    echo "Page served from: $vm_hostname" | \
    tee /var/www/html/index.html
    lb_weight="$(curl -H "Metadata-Flavor:Google" \
    http://169.254.169.254/computeMetadata/v1/instance/attributes/load-balancing-weight)"
    echo "Header set X-Load-Balancing-Endpoint-Weight \"$lb_weight\"" | \
    tee /etc/apache2/conf-enabled/headers.conf
    systemctl restart apache2'
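After an instance finishes its startup script, you may want to confirm that Apache actually returns the weight header. The following sketch pulls the header value out of raw HTTP response headers read from standard input. The function name extract_weight is hypothetical, and the commented usage assumes curl is run on the VM itself.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of X-Load-Balancing-Endpoint-Weight (header names are matched
# case-insensitively; carriage returns are stripped first).
extract_weight() {
  tr -d '\r' |
    awk -F': *' 'tolower($1) == "x-load-balancing-endpoint-weight" { print $2 }'
}

# Example, run on a backend VM (assumes Apache is listening on port 80):
#   curl -sI http://localhost/ | extract_weight
```

On instance-900, for example, the expected value is 900 if the startup script completed successfully.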

Create an instance group

In this tutorial, you create an unmanaged instance group containing all three VM instances (instance-0, instance-100, and instance-900).

To create the instance group, run the gcloud compute instance-groups unmanaged create command:

gcloud compute instance-groups unmanaged create INSTANCE_GROUP \
  --zone=us-central1-a

gcloud compute instance-groups unmanaged add-instances INSTANCE_GROUP \
  --zone=us-central1-a \
  --instances=instance-0,instance-100,instance-900

Replace INSTANCE_GROUP with the name of the instance group to create.

Create an HTTP health check

In this tutorial, you create an HTTP health check to read the HTTP response containing the backend VM's weight.

To create the HTTP health check, run the gcloud compute health-checks create command:

gcloud compute health-checks create http HTTP_HEALTH_CHECK_NAME \
  --region=us-central1

Replace HTTP_HEALTH_CHECK_NAME with the name of the HTTP health check to create.

Create a backend service

The following example provides instructions to create a regional external backend service configured to use weighted load balancing.

Create a backend service with the HTTP health check and set the locality load balancer policy to WEIGHTED_MAGLEV.

To create the backend service, run the gcloud compute backend-services create command:

gcloud compute backend-services create BACKEND_SERVICE_NAME \
  --load-balancing-scheme=external \
  --protocol=tcp \
  --region=us-central1 \
  --health-checks=HTTP_HEALTH_CHECK_NAME \
  --health-checks-region=us-central1 \
  --locality-lb-policy=WEIGHTED_MAGLEV

Replace BACKEND_SERVICE_NAME with the name of the backend service to create.

Add the instance group to the backend service.

To add the instance group, run the gcloud compute backend-services add-backend command:

gcloud compute backend-services add-backend BACKEND_SERVICE_NAME \
  --instance-group=INSTANCE_GROUP \
  --instance-group-zone=us-central1-a \
  --region=us-central1

Reserve a regional external IP address for the load balancer.

To reserve one or more IP addresses, run the gcloud compute addresses create command:

gcloud compute addresses create ADDRESS_NAME \
 --region us-central1

Replace ADDRESS_NAME with the name of the IP address to create. Use the gcloud compute addresses describe command to view the result. Note the reserved static external IP address (IP_ADDRESS).

gcloud compute addresses describe ADDRESS_NAME \
  --region=us-central1

Create a forwarding rule using the reserved regional external IP address (IP_ADDRESS). Connect the forwarding rule to the backend service.

To create the forwarding rule, run the gcloud compute forwarding-rules create command:

gcloud compute forwarding-rules create FORWARDING_RULE \
  --region=us-central1 \
  --ports=80 \
  --address=IP_ADDRESS \
  --backend-service=BACKEND_SERVICE_NAME

Replace the following:

FORWARDING_RULE: the name of the forwarding rule to create.
IP_ADDRESS: the IP address to assign to the forwarding rule. Use the reserved static external IP address, not the address name.
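Once the forwarding rule is in place, one way to observe the traffic split is to send repeated requests to the load balancer and count which backend answered. Each backend's index page contains the line "Page served from: <instance name>", written by the startup script. The sketch below tallies those lines; the function name tally_backends is hypothetical, and the commented loop assumes IP_ADDRESS is the reserved address and that the backends are already healthy.

```shell
# Hypothetical helper: read response bodies on stdin and print
# "<instance name> <count>" for each backend that served a page.
tally_backends() {
  sed -n 's/^Page served from: //p' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example (each curl invocation opens a new connection):
#   for i in $(seq 1 100); do
#     curl -s --max-time 5 http://IP_ADDRESS/
#   done | tally_backends
```

With the weights used in this tutorial, you would expect most responses to come from instance-900, fewer from instance-100, and none from instance-0, although exact counts vary from run to run.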

Verify backend weights using backend service API

Verify that the backend service reports the backend weights collected by the HTTP health check.

To get backend weights (along with health statuses) from a backend service, run the gcloud compute backend-services get-health command:

gcloud compute backend-services get-health BACKEND_SERVICE_NAME \
  --region=us-central1

The output should be like the following:

backend: https://www.googleapis.com/compute/projects/{project}/zones/us-central1-a/instanceGroups/{instance-group-name}
status:
  healthStatus:
  - forwardingRule: https://www.googleapis.com/compute/projects/{project}/regions/us-central1/forwardingRules/{forwarding-rule-name}
    forwardingRuleIp: 34.135.46.66
    healthState: HEALTHY
    instance: https://www.googleapis.com/compute/projects/{project}/zones/us-central1-a/instances/instance-0
    ipAddress: 10.10.0.5
    port: 80
    weight: '0'
  - forwardingRule: https://www.googleapis.com/compute/projects/{project}/regions/us-central1/forwardingRules/{forwarding-rule-name}
    forwardingRuleIp: 34.135.46.66
    healthState: HEALTHY
    instance: https://www.googleapis.com/compute/projects/{project}/zones/us-central1-a/instances/instance-100
    ipAddress: 10.10.0.6
    port: 80
    weight: '100'
  - forwardingRule: https://www.googleapis.com/compute/projects/{project}/regions/us-central1/forwardingRules/{forwarding-rule-name}
    forwardingRuleIp: 34.135.46.66
    healthState: HEALTHY
    instance: https://www.googleapis.com/compute/projects/{project}/zones/us-central1-a/instances/instance-900
    ipAddress: 10.10.0.7
    port: 80
    weight: '900'
  kind: compute#backendServiceGroupHealth
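For a quicker read of this output, the per-instance lines can be condensed into one row per backend. The following sketch assumes the output layout shown above, where healthState, instance, and weight appear in that order for each entry; the function name summarize_weights is hypothetical.

```shell
# Hypothetical helper: read get-health output on stdin and print
# "<instance name> <health state> <weight>" for each backend.
summarize_weights() {
  awk -v q="'" '
    $1 == "healthState:" { state = $2 }
    $1 == "instance:"    { n = split($2, parts, "/"); name = parts[n] }
    $1 == "weight:"      { gsub(q, "", $2); print name, state, $2 }
  '
}

# Example:
#   gcloud compute backend-services get-health BACKEND_SERVICE_NAME \
#     --region=us-central1 | summarize_weights
```

For the sample output above, this prints one line for each of instance-0, instance-100, and instance-900 with its health state and weight.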
