Pic-a-daily: Lab 5—Cleanup after image deletion

In this code lab, you create a new Cloud Run service, image garbage collector, that will be triggered by Eventarc, a new service for receiving events in Cloud Run. When a picture is deleted from the pictures bucket, the service receives an event from Eventarc. Then, it deletes the image from the thumbnails bucket and also removes it from the Firestore pictures collection.

What you'll learn

  • Cloud Run
  • Cloud Storage
  • Cloud Firestore
  • Eventarc

Self-paced environment setup

  1. Sign in to Cloud Console and create a new project or reuse an existing one. (If you don't already have a Gmail or Google Workspace account, you must create one.)

Remember the project ID, a unique name across all Google Cloud projects. It will be referred to later in this codelab as PROJECT_ID.

  2. Next, you'll need to enable billing in Cloud Console in order to use Google Cloud resources.

Running through this codelab shouldn't cost much, if anything at all. Be sure to follow any instructions in the "Cleaning up" section, which advises you how to shut down resources so you don't incur billing beyond this tutorial. New users of Google Cloud are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the GCP Console, click the Cloud Shell icon on the top-right toolbar.

It should only take a few moments to provision and connect to the environment.

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs on Google Cloud, greatly enhancing network performance and authentication. All of your work in this lab can be done with just a browser.

Introduction to Eventarc

Eventarc makes it easy to connect Cloud Run services with events from a variety of sources. It takes care of event ingestion, delivery, security, authorization and error handling for you.

You can draw events from Google Cloud sources and from custom applications publishing to Cloud Pub/Sub, and deliver them to Cloud Run sinks.

Events from a breadth of Google Cloud sources are delivered by way of Cloud Audit Logs. The latency and availability of event delivery from these sources are tied to those of Cloud Audit Logs. Whenever an event from a Google Cloud source is fired, a corresponding Cloud Audit Log entry is created.

Custom applications publishing to Cloud Pub/Sub can publish messages to a Pub/Sub topic they specify in any format.

Event triggers are the filtering mechanism to specify which events to deliver to which sink.

All events are delivered in the CloudEvents v1.0 format for cross-service interoperability.

Enable APIs

You will need the Eventarc service to trigger the Cloud Run service. Make sure it is enabled:

gcloud services enable eventarc.googleapis.com

You should see the operation finish successfully:

Operation "operations/acf.5c5ef4f6-f734-455d-b2f0-ee70b5a17322" finished successfully.

Enable Audit Logs for Cloud Storage

To receive events from a service via Eventarc, you need to enable Cloud Audit Logs for that service. In this case, you want to receive events from Cloud Storage. From the Cloud Console, select IAM & Admin and then Audit Logs from the upper left-hand menu.

In the list of services, check Google Cloud Storage.

On the right-hand side, make sure Admin Read, Data Read and Data Write are selected, then click Save.

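If you prefer to verify this from the command line, the project's IAM policy contains an auditConfigs section once audit logs are enabled (an optional check; the exact output layout may vary):

gcloud projects get-iam-policy ${GOOGLE_CLOUD_PROJECT}

Look for an auditConfigs entry for storage.googleapis.com listing the ADMIN_READ, DATA_READ and DATA_WRITE log types.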

Configure service account

Give the default compute service account the eventarc.eventReceiver role:

export PROJECT_NUMBER="$(gcloud projects list --filter=$(gcloud config get-value project) --format='value(PROJECT_NUMBER)')"
gcloud projects add-iam-policy-binding ${GOOGLE_CLOUD_PROJECT} \
    --member=serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com \
    --role=roles/eventarc.eventReceiver

You will use this service account in the Audit Log trigger later.

Get the code

Clone the code, if you haven't already done so in the previous code lab:

git clone https://github.com/GoogleCloudPlatform/serverless-photosharing-workshop

You can then go to the directory containing the service:

cd serverless-photosharing-workshop/services/garbage-collector/nodejs

You will have the following file layout for the service:

services
 |
 ├── garbage-collector
      |
      ├── nodejs
           |
           ├── Dockerfile
           ├── index.js
           ├── package.json

Inside the folder, you have 3 files:

  • index.js contains the Node.js code
  • package.json defines the library dependencies
  • Dockerfile defines the container image

Dependencies

The package.json file defines the needed library dependencies:

{
  "name": "garbage_collector_service",
  "version": "0.0.1",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "@google-cloud/storage": "^5.5.0",
    "@google-cloud/firestore": "^4.7.1",
    "@google/events": "^3.1.0",
    "body-parser": "^1.19.0",
    "cloudevents": "^3.2.0",
    "express": "^4.17.1",
    "bluebird": "^3.7.2"
  }
}

We depend on the Cloud Storage library to delete images from Cloud Storage, and on the Cloud Firestore library to delete the picture metadata we stored previously. Additionally, we depend on the CloudEvents SDK and Google Events libraries to read the CloudEvents sent by Eventarc. Express is a JavaScript/Node web framework. Bluebird is used for handling promises.

Dockerfile

The Dockerfile defines the container image for the application:

FROM node:12-slim

WORKDIR /picadaily/services/garbage-collector
COPY package*.json ./
RUN npm install --production
COPY . .
CMD [ "npm", "start" ]

We're using a lightweight Node 12 base image, installing the NPM modules needed by our code, and running the application with npm start.
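
If you have Docker available locally, you can sanity-check the image before handing it to Cloud Build (an optional sketch; the image tag is arbitrary, and the container still needs Google Cloud credentials plus the bucket environment variables, exported later in this lab, to do real work):

docker build -t garbage-collector-service .
docker run -p 8080:8080 \
    -e BUCKET_IMAGES=${BUCKET_IMAGES} \
    -e BUCKET_THUMBNAILS=${BUCKET_THUMBNAILS} \
    garbage-collector-service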

index.js

Let's have a closer look at our index.js code:

const express = require('express');
const bodyParser = require('body-parser');
const Promise = require("bluebird");
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();
const Firestore = require('@google-cloud/firestore');
const { HTTP } = require("cloudevents");
const {toLogEntryData} = require('@google/events/cloud/audit/v1/LogEntryData');

We require the various dependencies needed for our program to run: Express is the Node web framework we will be using, Bluebird is a library for handling JavaScript promises, and Storage and Firestore are for working with Google Cloud Storage (our buckets of images) and the Cloud Firestore datastore, respectively. Additionally, we require HTTP from the CloudEvents SDK to read the CloudEvent sent by Eventarc, and toLogEntryData from the Google Events library to parse the AuditLog body of the CloudEvent.

const app = express();

app.post('/', async (req, res) => {
    try {
        const cloudEvent = HTTP.toEvent({ headers: req.headers, body: req.body });
        console.log(cloudEvent);


        /* ... */

    } catch (err) {
        console.log(`Error: ${err}`);
        res.status(500).send(err);
    }
});

Above, we have the structure of our Node handler: our app responds to HTTP POST requests, reads the CloudEvent from the HTTP request, and does a bit of error handling in case something goes wrong. Let's now have a look at what goes inside this structure.

// the Audit Log event type sent by Eventarc
const EVENT_TYPE_AUDITLOG = 'google.cloud.audit.log.v1.written';

if (EVENT_TYPE_AUDITLOG != cloudEvent.type)
{
    console.log(`Event type '${cloudEvent.type}' is not '${EVENT_TYPE_AUDITLOG}', ignoring.`);
    res.status(200).send();
    return;
}

The service only cares about CloudEvents of type google.cloud.audit.log.v1.written. This is the event type that will be sent by Cloud Storage when an image is deleted.

The next step is to parse the CloudEvent body and retrieve the bucket name and object name:

//"protoPayload" : {"resourceName":"projects/_/buckets/events-atamel-images-input/objects/atamel.jpg}";
const logEntryData = toLogEntryData(cloudEvent.data);
console.log(logEntryData);

const tokens = logEntryData.protoPayload.resourceName.split('/');
const bucket = tokens[3];
const objectName = tokens[5];

The service is going to receive events from all buckets, so we need to ignore events from buckets we are not interested in:

if (bucketImages != bucket)
{
    console.log(`Bucket '${bucket}' is not same as '${bucketImages}', ignoring.`);
    res.status(200).send();
    return;
}

Once we know that the image was from the images bucket, we can delete it from the thumbnails bucket:

try {
    await storage.bucket(bucketThumbnails).file(objectName).delete();
    console.log(`Deleted '${objectName}' from bucket '${bucketThumbnails}'.`);
}
catch(err) {
    console.log(`Failed to delete '${objectName}' from bucket '${bucketThumbnails}': ${err}.`);
}

As a last step, delete the picture metadata from the Firestore collection as well:

try {
    const pictureStore = new Firestore().collection('pictures');
    const docRef = pictureStore.doc(objectName);
    await docRef.delete();

    console.log(`Deleted '${objectName}' from Firestore collection 'pictures'`);
}
catch(err) {
    console.log(`Failed to delete '${objectName}' from Firestore: ${err}.`);
}

res.status(200).send(`Processed '${objectName}'.`);

Now it's time to make our Node script listen to incoming requests, and to check that the required environment variables are set:

// bucket names and port come from environment variables
const bucketImages = process.env.BUCKET_IMAGES;
const bucketThumbnails = process.env.BUCKET_THUMBNAILS;
const PORT = process.env.PORT || 8080;

app.listen(PORT, () => {
    if (!bucketImages) throw new Error("BUCKET_IMAGES not set");
    if (!bucketThumbnails) throw new Error("BUCKET_THUMBNAILS not set");
    console.log(`Started service on port ${PORT}`);
});

Test locally

Test the code locally to make sure it works before deploying it to the cloud.

Inside the garbage-collector/nodejs folder, install the npm dependencies and start the server:

export BUCKET_IMAGES=uploaded-pictures-${GOOGLE_CLOUD_PROJECT}
export BUCKET_THUMBNAILS=thumbnails-${GOOGLE_CLOUD_PROJECT}
npm install; npm start

If everything went well, it should start the server on port 8080:

Started service on port 8080
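
From a second terminal (or another Cloud Shell tab), you can send a hand-crafted CloudEvent to the running service (a minimal sketch: the ce-source and ce-id values and the object name test.jpg are made up, and a real Eventarc payload carries many more AuditLog fields):

curl localhost:8080 \
  -H "ce-specversion: 1.0" \
  -H "ce-type: google.cloud.audit.log.v1.written" \
  -H "ce-source: //cloudaudit.googleapis.com/projects/_" \
  -H "ce-id: test-123" \
  -H "Content-Type: application/json" \
  -d "{\"protoPayload\": {\"resourceName\": \"projects/_/buckets/${BUCKET_IMAGES}/objects/test.jpg\"}}"

The service should log the incoming event; the actual deletions will fail (and be logged) unless test.jpg really exists.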

Use CTRL-C to exit.

Build and deploy

Inside the garbage-collector/nodejs folder where the Dockerfile is, issue the following command to build the container image with Cloud Build:

gcloud builds submit --tag gcr.io/${GOOGLE_CLOUD_PROJECT}/garbage-collector-service

Before deploying to Cloud Run, set the Cloud Run region to one of the supported regions and platform to managed:

gcloud config set run/region europe-west1
gcloud config set run/platform managed

You can check that the configuration is set:

gcloud config list

...
[run]
platform = managed
region = europe-west1

Run the following command to deploy the container image on Cloud Run:

export BUCKET_IMAGES=uploaded-pictures-${GOOGLE_CLOUD_PROJECT}
export BUCKET_THUMBNAILS=thumbnails-${GOOGLE_CLOUD_PROJECT}
export SERVICE_NAME=garbage-collector-service
gcloud run deploy ${SERVICE_NAME} \
    --image gcr.io/${GOOGLE_CLOUD_PROJECT}/${SERVICE_NAME} \
    --no-allow-unauthenticated \
    --update-env-vars BUCKET_IMAGES=${BUCKET_IMAGES},BUCKET_THUMBNAILS=${BUCKET_THUMBNAILS}

Note the --no-allow-unauthenticated flag. This makes the Cloud Run service an internal service that can only be triggered by specific service accounts. Later, you will create a trigger that uses the default compute service account, which has the run.invoker role needed to call internal Cloud Run services.
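
Once the deployment finishes, you can confirm the service is up and retrieve its URL (an optional check):

gcloud run services describe ${SERVICE_NAME} --format='value(status.url)'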

Create a trigger

In Eventarc, a trigger defines which service should receive which kind of events. In this case, you want the service to receive events when a file is deleted in a bucket.

Set the location of the trigger to the same region as the Cloud Run service:

gcloud config set eventarc/location europe-west1

Create an AuditLog trigger to filter for storage.objects.delete events and send them to the Cloud Run service:

gcloud eventarc triggers create trigger-${SERVICE_NAME} \
  --destination-run-service=${SERVICE_NAME} \
  --destination-run-region=europe-west1 \
  --event-filters="type=google.cloud.audit.log.v1.written" \
  --event-filters="serviceName=storage.googleapis.com" \
  --event-filters="methodName=storage.objects.delete" \
  --service-account=${PROJECT_NUMBER}-compute@developer.gserviceaccount.com

You can double-check that the trigger was created with this command:

gcloud eventarc triggers list
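
For more detail on a single trigger, such as its event filters and destination, you can also describe it (optional):

gcloud eventarc triggers describe trigger-${SERVICE_NAME}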

Test the trigger

To test that the setup is working, go to the uploaded-pictures bucket and delete one of the pictures. You should see in the logs of the service that it deleted the corresponding picture in the thumbnails bucket and also deleted its document from the pictures Firestore collection.
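
You can also do this from the command line (a sketch; replace your-picture.jpg, a made-up name, with an object that actually exists in the bucket):

gsutil rm gs://${BUCKET_IMAGES}/your-picture.jpg

Then read the service's logs (optional; Cloud Logging can take a few moments to show new entries):

gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=${SERVICE_NAME}" --limit=20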

Clean up (optional)

If you don't intend to continue with the other labs in the series, clean up resources to save costs and to be an overall good cloud citizen. You can delete resources individually as follows.

Delete the service:

gcloud run services delete ${SERVICE_NAME} -q

Delete the Eventarc trigger:

gcloud eventarc triggers delete trigger-${SERVICE_NAME} -q
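
You may also want to delete the container image you built (an optional step):

gcloud container images delete gcr.io/${GOOGLE_CLOUD_PROJECT}/${SERVICE_NAME} -q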

Alternatively, you can delete the whole project:

gcloud projects delete ${GOOGLE_CLOUD_PROJECT} 

Congratulations!

You created a Cloud Run service, image garbage collector, that is triggered by Eventarc, a new service for receiving events in Cloud Run. When a picture is deleted from the pictures bucket, the service receives an event from Eventarc. Then, it deletes the image from the thumbnails bucket and also removes it from the Firestore pictures collection.

What we've covered

  • Cloud Run
  • Cloud Storage
  • Cloud Firestore
  • Eventarc