Alerts: Log-Based Errors to Pub/Sub Topics

1. Introduction

Last Updated: Jun 21, 2023

Alerting on Log-Based Errors for Availability

Log-based alerts can be used to determine an application's availability by monitoring for specific events or patterns in the logs. By being alerted to outages or other user-facing issues, you can take steps to minimize the impact on your users and customers.

While uptime checks provide a general snapshot of availability, it may be more accurate to use error messages derived from logs as indicators of more specific types of unavailability, and to get a sense of what proportion of users are experiencing an issue.

Errors can arise from any number of causes, ranging from user mistakes to systems maintenance, upgrades, and even factors external to the system, such as bad weather. The key in alerting is not to try to anticipate all possible causes, but rather to pick a few key symptoms that can serve as the start for troubleshooting.

Pub/Sub Topics as an Alert Notification Channel

A Pub/Sub topic can be used as a Google Cloud Monitoring notification channel to send alerts to a Pub/Sub subscription. This allows you to integrate your Cloud Monitoring alerts with other systems, including third-party notification services.

To use a Pub/Sub topic as a notification channel, you first need to create a Pub/Sub topic and a Pub/Sub subscription. Then, you need to create a Cloud Monitoring notification channel that uses the Pub/Sub topic as the destination.

When an alert is triggered, Cloud Monitoring will send a message to the Pub/Sub topic. The subscriber of the Pub/Sub subscription can then process the message and take appropriate action.
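To make the subscriber side concrete, here is a minimal sketch of how a handler might unpack an alert message. It assumes a push subscription, where the payload arrives base64-encoded inside a JSON envelope. The incident field names (`policy_name`, `state`, `url`) and all sample values are assumptions for illustration, so check a real notification before relying on them.

```python
import base64
import json


def decode_push_envelope(envelope: dict) -> dict:
    """Return the JSON payload carried in a Pub/Sub push envelope."""
    # Push delivery wraps the message; its data field is base64-encoded.
    raw = base64.b64decode(envelope["message"]["data"])
    return json.loads(raw.decode("utf-8"))


def summarize_incident(notification: dict) -> str:
    """Build a one-line summary from an alert notification payload."""
    # Field names are assumptions about the notification schema;
    # .get() keeps the sketch tolerant of missing keys.
    incident = notification.get("incident", {})
    policy = incident.get("policy_name", "<unknown policy>")
    state = incident.get("state", "<unknown state>")
    url = incident.get("url", "")
    return f"[{state}] {policy} {url}".strip()


# Hypothetical payload and envelope, built locally for illustration only.
sample_notification = {
    "incident": {
        "policy_name": "inventory-incorrect-productid",
        "state": "open",
        "url": "https://console.cloud.google.com/monitoring/alerting",
    }
}
sample_envelope = {
    "message": {
        "data": base64.b64encode(
            json.dumps(sample_notification).encode("utf-8")
        ).decode("ascii"),
        "messageId": "1",
    },
    "subscription": "projects/my-project/subscriptions/my-subscription",
}

print(summarize_incident(decode_push_envelope(sample_envelope)))
```

A pull subscriber would receive the same payload bytes without the base64 envelope, so only `decode_push_envelope` would change.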

What you'll build

In this codelab, you're going to deploy an app, create a Pub/Sub topic, and create a log-based alert that checks for errors in a specific part of the app and uses the Pub/Sub topic as a notification channel.

What you'll learn

  • How to create a Pub/Sub topic
  • How to create a log-based alert

This codelab is focused on creating an alert for errors. Non-relevant concepts and application code are glossed over and are provided for you to simply copy and paste.

What you'll need

  • A Google Cloud account with permissions to:
      • Deploy Cloud Run applications
      • Create Pub/Sub topics
      • Create alerts

2. Getting set up

Select or Create a Google Cloud Project

To select an existing project, use the dropdown:

b35bf95b8bf3d5d8.png

To create a new project in Google Cloud, you can follow these steps:

  1. Go to the Google Cloud Platform Console.
  2. Click the Create Project button.
  3. Enter a name for your project.
  4. Select a billing account for your project.
  5. Click the Create button.

Your project will be created and you will be taken to the project dashboard. From there, you can start using Google Cloud services.

Here are some additional details about each step:

  • Name: A human-readable name for your project. The project ID that the console derives from it must be globally unique.
  • Billing account: You can use an existing billing account or create a new one.
  • Create: Once you have entered all the required information, click the Create button to create your project.

For more information, please see the Google Cloud documentation on creating projects.

3. Deploy the API Application

What is the sample application or API about?

Our application is a simple Inventory API that exposes a REST endpoint with two operations: listing all inventory items and getting the inventory count for a specific item.

Once we deploy the API and assuming that it is hosted at https://<somehost>, we can access the API endpoints as follows:

https://<somehost>/inventory

This will list all the product items with their on-hand inventory levels.

https://<somehost>/inventory/{productid}

This will provide a single record with the productid and on-hand inventory level for that product.

The returned response data is in JSON format.

Note: This API application is for demo purposes only and does not represent a secure and robust API implementation. Its only purpose is to give us a quick application for exploring the focus of this lab: Google Cloud Operations.

Sample Data and API Request/Response

To keep things simple, the application is not backed by a database. It contains 3 sample product IDs and their on-hand inventory levels.

Product Id    On-Hand Inventory Level
I-1           10
I-2           20
I-3           30

Sample API requests and responses are shown below:

API Request                            API Response
https://<somehost>/inventory           [ { "I-1": 10, "I-2": 20, "I-3": 30 } ]
https://<somehost>/inventory/I-1       { "productid": "I-1", "qty": 10 }
https://<somehost>/inventory/I-2       { "productid": "I-2", "qty": 20 }
https://<somehost>/inventory/I-200     { "productid": "I-200", "qty": -1 }
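The request and response behavior above can be sketched in a few lines of plain Python. This is not the code from the sample repository, only an illustration of the lookup rule the table implies: known product IDs return their quantity, and unknown IDs return a quantity of -1. The function names are hypothetical.

```python
# Sample data from this section; the real app's internals may differ.
INVENTORY = {"I-1": 10, "I-2": 20, "I-3": 30}


def list_inventory() -> list:
    """Mirror the /inventory response: a list holding one id-to-quantity map."""
    return [dict(INVENTORY)]


def get_inventory(productid: str) -> dict:
    """Mirror the /inventory/{productid} response; unknown ids get qty -1."""
    return {"productid": productid, "qty": INVENTORY.get(productid, -1)}


print(list_inventory())
print(get_inventory("I-1"))
print(get_inventory("I-200"))
```

The -1 quantity for unknown IDs is the case this codelab later alerts on, since the application logs a warning whenever it happens.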

Clone the Repository

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the GCP Console click the Cloud Shell icon on the top right toolbar:

bce75f34b2c53987.png

It should only take a few moments to provision and connect to the environment. When it is finished, you should see something like this:

f6ef2b5f13479f3a.png

This virtual machine is loaded with all the development tools you need. It offers a persistent 5GB home directory, and runs on Google Cloud, greatly enhancing network performance and authentication. All your work in this lab can be done with simply a browser.

Set up gcloud

In Cloud Shell, set your project ID and save it as the PROJECT_ID variable.

PROJECT_ID=[YOUR-PROJECT-ID]

gcloud config set project $PROJECT_ID

Now, execute the following command:

$ git clone https://github.com/rominirani/cloud-code-sample-repository.git

This will create a folder named cloud-code-sample-repository in the current directory.

(Optional) Run the application on Cloud Shell

You can run the application locally by following these steps:

  1. From the terminal, navigate to the Python version of the API with the following commands:

$ cd cloud-code-sample-repository

$ cd python-flask-api

  2. Start the Python server locally with the following command. At the time of writing, Cloud Shell comes with Python 3.9.x installed, and we will use that default version. If you plan to run it on your laptop instead, use Python 3.8 or later.

$ python app.py

1f798fbddfdc2c8e.png 46edf454cc70c5a6.png

  3. Click on Preview on port 8080.
  4. This will open a browser window. You will see a 404 error, and that is fine. Modify the URL so that it has just /inventory after the host name.

For example, on my machine, it looks like this:

https://8080-cs-557561579860-default.cs-asia-southeast1-yelo.cloudshell.dev/inventory

This will display the list of inventory items as explained earlier:

709d57ee2f0137e4.png

  5. You can stop the server now by going to the terminal and pressing Ctrl-C.

Deploy the application

We will now deploy this API application to Cloud Run using the gcloud command-line client.

From the terminal, run the following gcloud command:

$ gcloud run deploy --source .

This will ask you multiple questions and some of the points are mentioned below:

  1. Service name (python-flask-api): Either go with this default or choose something like my-inventory-api
  2. API [run.googleapis.com] not enabled on project [613162942481]. Would you like to enable and retry (this will take a few minutes)? (y/N)? Y
  3. Please specify a region: Choose 31 (us-west1)
  4. API [artifactregistry.googleapis.com] not enabled on project [613162942481]. Would you like to enable and retry (this will take a few minutes)? (y/N)? Y
  5. Deploying from source requires an Artifact Registry Docker repository to store built containers. A repository named [cloud-run-source-deploy] in region [us-west1] will be created.
  6. Do you want to continue (Y/n)? Y
  7. Allow unauthenticated invocations to [my-inventory-api] (y/N)? Y

This will kick off the process that takes your source code, containerizes it, pushes it to Artifact Registry, and then deploys the Cloud Run service and revision. Be patient, as this can take 3-4 minutes. When the process completes, the Service URL is shown to you.

A sample run is shown below:

87ba8dbf88e8cfa4.png

Test the application

Now that we have deployed the application to Cloud Run, you can access the API application as follows:

  1. Note the Service URL from the previous step. For example, on my setup, it is shown as https://my-inventory-api-bt2r5243dq-uw.a.run.app. Let's call this <SERVICE_URL>.
  2. Open a browser and access the following 3 URLs for the API endpoints:
       • <SERVICE_URL>/inventory
       • <SERVICE_URL>/inventory/I-1
       • <SERVICE_URL>/inventory/I-100

The responses should match the sample API requests and responses shown in the earlier section.
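If you prefer a script to a browser, the same checks can be sketched with the Python standard library. The `SERVICE_URL` value below is a placeholder and the helper names are hypothetical. The sketch only prints the URLs; call `fetch_json` on each one after you set `SERVICE_URL` to your own service.

```python
import json
import urllib.request


def inventory_urls(service_url: str, product_ids: list) -> list:
    """Build the list endpoint URL plus one URL per product id."""
    base = service_url.rstrip("/")
    return [f"{base}/inventory"] + [f"{base}/inventory/{pid}" for pid in product_ids]


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder: replace with the Service URL printed by the deploy step.
SERVICE_URL = "https://my-inventory-api-example.a.run.app"

for url in inventory_urls(SERVICE_URL, ["I-1", "I-100"]):
    # Once SERVICE_URL points at your service, print(fetch_json(url)) here.
    print(url)
```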

Get Service Details from Cloud Run

We deployed our API Service to Cloud Run, a serverless compute environment. We can visit the Cloud Run service via Google Cloud console at any point in time.

From the main menu, navigate to Cloud Run. This will display the list of services that you have running in Cloud Run. You should see the service that you just deployed. Depending on the name that you selected, you should see something like this:

2633965c4bc957cc.png

Click the Service name to view the details. The sample details are shown below:

33042ae64322ce07.png

Notice the URL. This is the service URL that you can enter in the browser to access the Inventory API that we just deployed. Check out Metrics and the other details.

Let's now start with the Google Cloud Operations Suite.

4. Create a Pub/Sub Topic to Receive the Alert Notification

To create a Pub/Sub topic, you can follow these steps in the Google Cloud Console:

  1. Search for Pub/Sub in the search box, and navigate to Pub/Sub. 935028bd8f6328ef.png
  2. Click the Topics tab if you are not already there. 7fd8bf91386a88fd.png
  3. Click the Create Topic button. cd9d197f9023c41b.png
  4. Enter a recognizable name for your topic.

173f313b4a3c4934.png

  5. Click the Create button. ca9a02477da21a44.png
  6. Copy the Topic name using the copy icon button. You will need it for the next section.

20848252ee83df93.png
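The name you copy is the fully qualified topic name, which combines your project ID and the topic ID. The sketch below shows that format together with the topic ID rules as I understand them from the Pub/Sub documentation. Treat the exact rules as an assumption; `topic_path` is a hypothetical helper, not a library function.

```python
import re

# Assumed topic ID rules: starts with a letter, 3-255 characters,
# limited punctuation, and must not start with "goog".
_TOPIC_ID = re.compile(r"[A-Za-z][A-Za-z0-9\-_.~+%]{2,254}")


def topic_path(project_id: str, topic_id: str) -> str:
    """Return the fully qualified name for a topic in a project."""
    if not _TOPIC_ID.fullmatch(topic_id) or topic_id.startswith("goog"):
        raise ValueError(f"invalid topic id: {topic_id!r}")
    return f"projects/{project_id}/topics/{topic_id}"


# Hypothetical project and topic names, for illustration only.
print(topic_path("my-project", "inventory-alerts"))
```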

5. Create an Alert Policy for Errors

Exploring Error Logs

To see the error logs for the application:

Click the Logging tab.

This will display a log interface where you can select or deselect various resources (project, Google Cloud resource, service names, etc.) along with log levels to filter the log messages as needed.

6605b68395185b89.png

Simulate a few invalid requests to the Inventory Service by providing product IDs other than I-1, I-2, and I-3. For example, an incorrect request is:

<SERVICE_URL>/inventory/I-999

We will now search for all the WARNING entries that our API generates when a request contains an incorrect product ID.
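The query used in the next part of this section relies on the `=~` operator, which performs a regular-expression search on the log text. As a rough local analogy, the sketch below applies the same pattern with Python's `re.search`. The sample log lines are hypothetical and only imitate what the application might write.

```python
import re

# Same pattern text as the Logs Explorer query used later in this section.
PATTERN = re.compile(
    r"WARNING in app: Received inventory request for incorrect productid"
)


def matches_filter(resource_type: str, text_payload: str) -> bool:
    """Approximate the two-line query: resource type AND regex search."""
    return (
        resource_type == "cloud_run_revision"
        and PATTERN.search(text_payload) is not None
    )


# Hypothetical log lines, for illustration only.
bad_lookup = (
    "[2023-06-21 10:00:00,000] WARNING in app: "
    "Received inventory request for incorrect productid:I-999"
)
normal_line = "[2023-06-21 10:00:01,000] INFO in app: Received inventory request"

print(matches_filter("cloud_run_revision", bad_lookup))
print(matches_filter("cloud_run_revision", normal_line))
```

Because the match is a search rather than a full-line match, any timestamp or prefix before the warning text does not prevent a match.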

Creating a Custom Log-Based Alert Policy for Errors

Suppose we want to watch for a very specific error message from one part of the application, for example a high number of errors when looking up product IDs. This issue is a symptom of many possible problems (a bad link, a database inconsistency, or a bot enumerating our site). While it would be difficult or impossible to imagine every potential cause, the application sending this message even once is a high-level issue we want to be aware of. To alert on it, we need to create a policy based on data in our error logs.

  1. In the Query Box, insert the following query parameters:

resource.type="cloud_run_revision"

textPayload =~ "WARNING in app: Received inventory request for incorrect productid"

It should look something like this:

f672154cfebf0051.png

  2. Click on Run Query. This will show you all the incoming requests that have this issue.

77c190e3a2fab6bf.png

  3. To convert the above to an alert, click on the Create alert button that you see in the Logs Explorer just beneath the query field, to the right:

4cd3fcf142189376.png

  4. This will bring up the form to create a log-based alert policy.

b82446854bad87fc.png

  5. Use the initial query for logs to include in the alert:

resource.type="cloud_run_revision"

textPayload =~ "WARNING in app: Received inventory request for incorrect productid"

764227db73ec3de6.png

  6. Set the notification frequency and incident duration. For the purpose of the example, you can use the minimum values for each:

bb3d96448ec998a1.png

  7. Finally, for "Who should be notified?" select the Pub/Sub notification channel that uses the topic you created earlier:

3593c48c29d4b76c.png

  8. Click Save. To view and manage the alert policy, visit the Alerting page and check under Policies: ca08ea380fb37c91.png
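For reference, the form above corresponds roughly to an alert policy document like the one sketched below. This only builds the policy as a Python dictionary and does not call any API. The field names follow the Cloud Monitoring AlertPolicy resource as I understand it, and the durations, display names, and channel name are placeholders, so treat all of them as assumptions to verify against the API reference.

```python
import json

# The two query lines from the steps above, joined with an explicit AND.
LOG_FILTER = (
    'resource.type="cloud_run_revision" AND '
    'textPayload =~ "WARNING in app: Received inventory request for incorrect productid"'
)


def build_log_alert_policy(
    display_name: str,
    notification_channel: str,
    rate_limit_seconds: int = 300,
    auto_close_seconds: int = 1800,
) -> dict:
    """Assemble a log-based alert policy document (assumed field names)."""
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Incorrect productid warning",
                "conditionMatchedLog": {"filter": LOG_FILTER},
            }
        ],
        "alertStrategy": {
            # Placeholder durations for notification frequency and auto-close.
            "notificationRateLimit": {"period": f"{rate_limit_seconds}s"},
            "autoClose": f"{auto_close_seconds}s",
        },
        "notificationChannels": [notification_channel],
    }


# Hypothetical channel name, for illustration only.
policy = build_log_alert_policy(
    "inventory-incorrect-productid",
    "projects/my-project/notificationChannels/1234567890",
)
print(json.dumps(policy, indent=2))
```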

6. Congratulations

Congratulations, you've successfully configured your log-based alert to send notifications to Pub/Sub!