Uptime Checks is a service of Cloud Monitoring. You configure the service to check your system's health by sending requests to your applications, services, or URLs from various locations around the world. You can use the results of the checks as conditions in your alert policies, so you will be notified if system health is degraded.

An Alert Policy is a set of rules that determine whether your resources or groups are operating normally. The rules are logical conditions involving metric thresholds and uptime checks. For example, you can create a rule that your web site's average response latency must not exceed five seconds over a period of two minutes.
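To make the latency rule above concrete, here is a minimal Python sketch of how such a threshold condition could be evaluated. It only illustrates the idea and is not how Cloud Monitoring is implemented. The `Sample` type, the `condition_violated` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    timestamp: float   # seconds since an arbitrary reference point
    latency_s: float   # observed response latency, in seconds


def condition_violated(samples, now, window_s=120.0, threshold_s=5.0):
    """Return True if the mean latency over the last `window_s` seconds exceeds `threshold_s`."""
    recent = [s.latency_s for s in samples if now - window_s <= s.timestamp <= now]
    if not recent:
        return False  # assumption: no data in the window counts as "not violated"
    return mean(recent) > threshold_s


# Hypothetical measurements taken every 30 seconds over two minutes.
samples = [Sample(0, 1.0), Sample(30, 6.0), Sample(60, 7.0), Sample(90, 8.0)]
print(condition_violated(samples, now=120))  # mean is 5.5 s, so this prints True
```

Averaging over a window rather than reacting to a single slow response is what keeps a rule like this from firing on a momentary spike.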

An alert occurs when an alert policy's conditions are met, causing an Incident to appear in the Incidents section of the Cloud Monitoring Console. Incidents remain open until the alert policy rules are no longer in violation or until the incident is manually closed.
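The incident lifecycle described above can be pictured as a small state machine. The following Python sketch is a hypothetical model for illustration only. The `Incident` class and its method names are assumptions and are not part of any Cloud Monitoring API.

```python
from enum import Enum


class IncidentState(Enum):
    OPEN = "open"
    CLOSED = "closed"


class Incident:
    """Hypothetical model of one incident raised by a single alert policy."""

    def __init__(self, policy_name):
        self.policy_name = policy_name
        self.state = IncidentState.OPEN  # an incident exists only once the condition was met

    def update(self, condition_met):
        # The incident closes on its own once the condition is no longer met.
        if self.state is IncidentState.OPEN and not condition_met:
            self.state = IncidentState.CLOSED

    def close_manually(self):
        # An operator can also close the incident by hand.
        self.state = IncidentState.CLOSED


incident = Incident("NGINX")
incident.update(condition_met=True)    # still in violation: stays open
print(incident.state.value)
incident.update(condition_met=False)   # recovered: closes automatically
print(incident.state.value)
```

In this model a closed incident never reopens. A later violation would be represented by a new `Incident` object.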

You can associate notifications with alert policies. For example, alerts can send email or SMS notifications to people or services.

In this codelab, you'll learn how to create an Uptime Check on a Compute Engine instance and attach an Alerting Policy to it, so that an incident is created and you are notified when the machine goes down.

What you'll learn

What you'll need

Self-paced environment setup

If you don't already have a Google Account (Gmail or Google Apps), you must create one. Sign in to the Google Cloud Platform console (console.cloud.google.com) and create a new project:


Remember the project ID, a unique name across all Google Cloud projects (the name above has already been taken and will not work for you, sorry!). It will be referred to later in this codelab as PROJECT_ID.

Next, you'll need to enable billing in the Cloud Console in order to use Google Cloud resources.

Running through this codelab shouldn't cost you more than a few dollars, but it could be more if you decide to use more resources or if you leave them running (see "cleanup" section at the end of this document).

New users of Google Cloud Platform are eligible for a $300 free trial.

Before we can enable monitoring, we will need some kind of infrastructure within this Google Cloud Platform project to actually monitor, so let us create that now.

We will create a Compute Engine instance running NGINX through the GCP Marketplace, so that we have a URL we can hit with an HTTP request to see whether our resource is up and running.

Note: The first time you access Compute Engine, it will need to be enabled. This can take a minute or two, so please be patient.

Starting NGINX

To create the virtual machine:

  1. Visit the Marketplace.
  2. Type "nginx open source bitnami" in the search bar.
  3. Click Nginx Open Source Certified by Bitnami.
  4. Click Launch on Compute Engine (this may take a couple of minutes).
  5. Leave the default values for all of the options.
  6. Accept the GCP Marketplace Terms of Service.
  7. Click Deploy.
  8. The NGINX instance will now start up.
  9. We can leave this to continue in the background while we move on to enabling Cloud Monitoring, so there is no need to watch it start up.

We now have a resource that we can monitor!

Before we can use Cloud Monitoring (formerly Stackdriver Monitoring), it must first be enabled for your project.

To use Cloud Monitoring with one of your projects, do the following:

  1. Visit the Cloud Monitoring Console. You may be prompted to log in again. If so, click Log in with Google, and then choose your account.
  2. On the Add your project to a Workspace page, click Add

You are now looking at the Cloud Monitoring Console. The information shown will vary depending on the Google (and AWS) services you are using and the monitoring features you have set up.

Now that monitoring is enabled, we want to create an Uptime Check. An uptime check periodically sends a request to a given resource to verify that it is up and responding. Uptime checks can use several protocols, including HTTP, HTTPS, UDP, and TCP.
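For intuition, the simplest of these probes, a TCP check, amounts to trying to open a connection within a time limit. The Python sketch below shows that idea using only the standard library. The `tcp_check` function and its 10-second default timeout are assumptions made for illustration, and the sketch does not reflect how Cloud Monitoring runs its own probes.

```python
import socket
import time


def tcp_check(host, port, timeout_s=10.0):
    """Try to open a TCP connection; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects, raising OSError
        # (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            ok = True
    except OSError:
        ok = False
    return ok, time.monotonic() - start
```

For example, `tcp_check("203.0.113.10", 80)` would report whether port 80 on that (placeholder) address accepts connections. Replace the address with the external IP of your own instance.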

For the purposes of this codelab, we will create an HTTP uptime check to monitor our recently created NGINX web server.

To create the Uptime Check, click Uptime Checks > Uptime Checks Overview in the left bar. Then click the Add Uptime Check button at the top right.

From there, select the following options:

Click Test to make sure that your Uptime Check works correctly. You should get back a message with "Responded with 200 (OK) in ...".
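If you are curious what a test like this does in principle, the following Python sketch sends a single GET request and reports the status code and elapsed time. It is an approximation for illustration only. The `http_check` and `describe` helpers are hypothetical, and the message format only loosely mirrors the one shown in the console.

```python
import time
import urllib.error
import urllib.request


def http_check(url, timeout_s=10.0):
    """Send one GET request; return (passed, status_code, elapsed_seconds)."""
    start = time.monotonic()
    status = None
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code   # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None       # no HTTP response at all (DNS failure, refused, timeout)
    elapsed = time.monotonic() - start
    return status == 200, status, elapsed


def describe(url, timeout_s=10.0):
    """Format the result of one check as a short human-readable message."""
    _, status, elapsed = http_check(url, timeout_s)
    if status is None:
        return f"No response in {elapsed * 1000:.0f} ms"
    return f"Responded with {status} in {elapsed * 1000:.0f} ms"
```

Calling `describe("http://203.0.113.10/")` with your instance's external IP in place of the placeholder address should produce a "Responded with 200" message once NGINX is running.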

Click Save to save your Uptime Check.

Click No Thanks on the Alerting Policy question - we will do this in the next section.

Congratulations, you have now successfully created an Uptime Check!

Creating an Uptime Check is only half the battle. You will need something to notify you when an Uptime Check fails. This is where an Alerting Policy comes in.

There are multiple ways to create an Alerting Policy (as we saw earlier), but to create an Alerting Policy directly from your Uptime Check:

  1. Navigate to Alerting > Create a Policy
  2. Click Add Condition
  3. Click UPTIME CHECK on the top

Under Target, select the following:

Under Configuration, select the following:

Click Save.

Add Notifications

Now we need to configure how we want to be notified. There are many options, including PagerDuty integration, SMS, Slack, and HipChat, but the easiest option for now is Email, so let's configure that:

Under Notification, select Email from the drop-down, and enter an email address at which you are happy to receive notifications.

Documentation

It is often useful to include documentation with your alerts, outlining what the alert is for and possible fixes or troubleshooting steps. For this codelab we will not add any documentation, but it is something you should consider for production systems.

Set the Policy Name

This gives a convenient name to the Alerting Policy, so it can be recognisable when it creates an Incident.

  1. Give the Policy a name of "NGINX"
  2. Click Save Policy
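For reference, a policy like the one just created in the console can also be described as a JSON document for the Cloud Monitoring API. The Python sketch below builds such a document as a plain dictionary and prints it. Treat it as an approximation: `CHECK_ID` and `CHANNEL_ID` are hypothetical placeholders, and the durations and threshold are illustrative values rather than the settings chosen in this codelab.

```python
import json

PROJECT_ID = "PROJECT_ID"          # placeholder: your own project ID
CHECK_ID = "nginx-uptime-check"    # hypothetical uptime check ID
CHANNEL_ID = "CHANNEL_ID"          # hypothetical notification channel ID

policy = {
    "displayName": "NGINX",
    "combiner": "OR",
    "conditions": [
        {
            "displayName": f"Failure of uptime check {CHECK_ID}",
            "conditionThreshold": {
                # Select the pass/fail time series of one uptime check on a VM.
                "filter": (
                    'metric.type="monitoring.googleapis.com/uptime_check/check_passed" '
                    f'AND metric.label.check_id="{CHECK_ID}" '
                    'AND resource.type="gce_instance"'
                ),
                # Count how many checker locations report a failure.
                "aggregations": [
                    {
                        "alignmentPeriod": "1200s",
                        "perSeriesAligner": "ALIGN_NEXT_OLDER",
                        "crossSeriesReducer": "REDUCE_COUNT_FALSE",
                        "groupByFields": ["resource.label.*"],
                    }
                ],
                # The condition is met when more than one location fails
                # for the whole duration (illustrative values).
                "comparison": "COMPARISON_GT",
                "thresholdValue": 1,
                "duration": "600s",
                "trigger": {"count": 1},
            },
        }
    ],
    "notificationChannels": [
        f"projects/{PROJECT_ID}/notificationChannels/{CHANNEL_ID}"
    ],
}

print(json.dumps(policy, indent=2))
```

Keeping the policy as data like this makes it easier to review and to recreate in another project, although for this codelab the console steps above are all you need.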

You now have a Compute Engine instance whose uptime is monitored by an Uptime Check and an Alerting Policy.

What we've covered

Next Steps

Learn More