Pic-a-daily: Lab 5—Cleanup after image deletion

1. Overview

In this codelab, you create a new Cloud Run service, the image garbage collector, which will be triggered by Eventarc, a new service for receiving events in Cloud Run. When a picture is deleted from the pictures bucket, the service receives an event from Eventarc. It then deletes the image from the thumbnails bucket and also removes it from the Firestore pictures collection.

What you'll learn

  • Cloud Run
  • Cloud Storage
  • Cloud Firestore
  • Eventarc

2. Setup and Requirements

Self-paced environment setup

  1. Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

  • The Project name is the display name for this project's participants. It is a character string not used by Google APIs, and you can update it at any time.
  • The Project ID must be unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference the Project ID (and it is typically identified as PROJECT_ID), so if you don't like it, generate another random one, or, you can try your own and see if it's available. Then it's "frozen" after the project is created.
  • There is a third value, a Project Number which some APIs use. Learn more about all three of these values in the documentation.
  2. Next, you'll need to enable billing in the Cloud Console in order to use Cloud resources/APIs. Running through this codelab shouldn't cost much, if anything at all. To shut down resources so you don't incur billing beyond this tutorial, follow any "clean-up" instructions found at the end of the codelab. New users of Google Cloud are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the GCP Console click the Cloud Shell icon on the top right toolbar:

It should only take a few moments to provision and connect to the environment. When it is finished, you should see something like this:

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs on Google Cloud, greatly enhancing network performance and authentication. All of your work in this lab can be done simply with a browser.

3. Introduction to Eventarc

Eventarc makes it easy to connect Cloud Run services with events from a variety of sources. It takes care of event ingestion, delivery, security, authorization and error-handling for you.

You can receive events from Google Cloud sources and from custom applications publishing to Cloud Pub/Sub, and deliver them to Google Cloud Run sinks.

Events from a breadth of Google Cloud sources are delivered by way of Cloud Audit Logs. The latency and availability of event delivery from these sources are tied to those of Cloud Audit Logs. Whenever an event from a Google Cloud source is fired, a corresponding Cloud Audit Log entry is created.

Custom applications publishing to Cloud Pub/Sub can publish messages to a Pub/Sub topic they specify in any format.

Event triggers are the filtering mechanism to specify which events to deliver to which sink.

All events are delivered in the CloudEvents v1.0 format for cross service interoperability.
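
As an illustration, here is roughly the shape of the event the garbage collector service will receive in this lab. This is a hand-written sketch, not captured output: the attribute values are made up, and a real event carries more fields.

// Illustrative CloudEvent for a Cloud Storage object deletion (values are made up).
// Eventarc delivers the event over HTTP; the CloudEvents SDK exposes it as an object like this.
const exampleCloudEvent = {
    specversion: '1.0',                                   // CloudEvents spec version
    id: '1234567890123456',                               // unique event id
    type: 'google.cloud.storage.object.v1.deleted',       // what happened
    source: '//storage.googleapis.com/projects/_/buckets/uploaded-pictures-example', // where it happened
    subject: 'objects/my-picture.jpg',                    // which object
    time: '2021-05-04T10:00:00.000Z',
    data: {                                               // Cloud Storage object metadata
        bucket: 'uploaded-pictures-example',
        name: 'my-picture.jpg'
    }
};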

4. Before you begin

Enable APIs

You will need the Eventarc service to trigger the Cloud Run service. Make sure it is enabled:

gcloud services enable eventarc.googleapis.com

You should see the operation finish successfully:

Operation "operations/acf.5c5ef4f6-f734-455d-b2f0-ee70b5a17322" finished successfully.

Configure service accounts

The default compute service account will be used in triggers. Grant the eventarc.eventReceiver role to the default compute service account:

PROJECT_NUMBER=$(gcloud projects describe $GOOGLE_CLOUD_PROJECT --format='value(projectNumber)')

gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
    --member serviceAccount:$PROJECT_NUMBER-compute@developer.gserviceaccount.com \
    --role roles/eventarc.eventReceiver

Grant the pubsub.publisher role to the Cloud Storage service account. This is needed for the Eventarc Cloud Storage trigger:

SERVICE_ACCOUNT=$(gsutil kms serviceaccount -p $PROJECT_NUMBER)

gcloud projects add-iam-policy-binding $PROJECT_NUMBER \
    --member serviceAccount:$SERVICE_ACCOUNT \
    --role roles/pubsub.publisher

If you enabled the Pub/Sub service account on or before April 8, 2021, grant the iam.serviceAccountTokenCreator role to the Pub/Sub service account:

gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
  --member serviceAccount:service-$PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com \
  --role roles/iam.serviceAccountTokenCreator

5. Clone the code

Clone the code, if you haven't already done so in a previous lab:

git clone https://github.com/GoogleCloudPlatform/serverless-photosharing-workshop

You can then go to the directory containing the service:

cd serverless-photosharing-workshop/services/garbage-collector/nodejs

You will have the following file layout for the service:

services
 |
 ├── garbage-collector
      |
      ├── nodejs
           |
           ├── index.js
           ├── package.json

Inside the folder, you have the following files:

  • index.js contains the Node.js code
  • package.json defines the library dependencies

6. Explore the code

Dependencies

The package.json file defines the needed library dependencies:

{
  "name": "garbage_collector_service",
  "version": "0.0.1",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "cloudevents": "^4.0.1",
    "express": "^4.17.1",
    "@google/events": "^3.1.0",
    "@google-cloud/firestore": "^4.9.9",
    "@google-cloud/storage": "^5.8.3"
  }
}

We depend on the Cloud Storage library to delete images within Cloud Storage, and on Cloud Firestore to also delete the picture metadata that we stored previously. Additionally, we depend on the CloudEvents SDK and Google Events libraries to read the CloudEvents sent by Eventarc. Express is the JavaScript / Node web framework used to handle incoming requests.

index.js

Let's have a closer look at our index.js code:

const express = require('express');
const {Storage} = require('@google-cloud/storage');
const Firestore = require('@google-cloud/firestore');
const { HTTP } = require("cloudevents");
const {toStorageObjectData} = require('@google/events/cloud/storage/v1/StorageObjectData');

We require the various dependencies needed for our program to run: Express is the Node web framework we will be using; Storage and Firestore are for working with Google Cloud Storage (our buckets of images) and the Cloud Firestore datastore, respectively. Additionally, we require HTTP from the CloudEvents SDK to read the CloudEvent sent by Eventarc, and toStorageObjectData from the Google Events library to read the Cloud Storage payload of that CloudEvent.
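
Note that the excerpt above only shows the require statements. The service also needs a few top-level declarations before the handler can use them. Here is a minimal sketch of what they might look like; the names bucketThumbnails, PORT and storage are assumed from how they are used further down, with the bucket name coming from the BUCKET_THUMBNAILS environment variable set later in this lab.

// Sketch of the remaining top-level declarations (names assumed from their usage below).
const PORT = process.env.PORT || 8080;                  // Cloud Run injects PORT at runtime
const bucketThumbnails = process.env.BUCKET_THUMBNAILS; // thumbnails bucket to clean up
const storage = new Storage();                          // Cloud Storage client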

const app = express();
app.use(express.json());

app.post('/', async (req, res) => {
    try {
        const cloudEvent = HTTP.toEvent({ headers: req.headers, body: req.body });
        console.log(cloudEvent);


        /* ... */

    } catch (err) {
        console.log(`Error: ${err}`);
        res.status(500).send(err);
    }
});

Above, we have the structure of our Node handler: our app responds to HTTP POST requests, reads the CloudEvent from the HTTP request, and does a bit of error handling in case something goes wrong. Let's now have a look at what goes inside this structure.

The next step is to parse the CloudEvent body into a StorageObjectData object and retrieve the object name:

const storageObjectData = toStorageObjectData(cloudEvent.data);
console.log(storageObjectData);

const objectName = storageObjectData.name;

Once we know the image name, we can delete it from the thumbnails bucket:

try {
    await storage.bucket(bucketThumbnails).file(objectName).delete();
    console.log(`Deleted '${objectName}' from bucket '${bucketThumbnails}'.`);
}
catch(err) {
    console.log(`Failed to delete '${objectName}' from bucket '${bucketThumbnails}': ${err}.`);
}

As a last step, delete the picture metadata from the Firestore collection as well:

try {
    const pictureStore = new Firestore().collection('pictures');
    const docRef = pictureStore.doc(objectName);
    await docRef.delete();

    console.log(`Deleted '${objectName}' from Firestore collection 'pictures'`);
}
catch(err) {
    console.log(`Failed to delete '${objectName}' from Firestore: ${err}.`);
}

res.status(200).send(`Processed '${objectName}'.`);

Now it's time to make our Node script listen to incoming requests, and to check that the required environment variable is set:

app.listen(PORT, () => {
    if (!bucketThumbnails) throw new Error("BUCKET_THUMBNAILS not set");
    console.log(`Started service on port ${PORT}`);
});

7. Test locally

Test the code locally to make sure it works before deploying it to the cloud.

Inside the garbage-collector/nodejs folder, install the npm dependencies and start the server:

export BUCKET_THUMBNAILS=thumbnails-$GOOGLE_CLOUD_PROJECT

npm install; npm start

If everything went well, it should start the server on port 8080:

Started service on port 8080
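
Optionally, you can exercise the handler by sending it a fake deletion event from a second terminal. The snippet below is not part of the lab repository; it is a minimal sketch that POSTs a binary-mode CloudEvent (attributes mapped to ce-* headers) to the local server. It assumes Node 18+ for the built-in fetch, and uses made-up bucket and object names, so the actual deletions are expected to fail and simply be logged.

// send-test-event.js (hypothetical helper, not in the lab repo):
// POST a fake "object deleted" CloudEvent to the service running locally.
fetch('http://localhost:8080/', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'ce-specversion': '1.0',                              // CloudEvents binary mode:
        'ce-id': 'test-event-1',                              // required attributes travel
        'ce-type': 'google.cloud.storage.object.v1.deleted',  // as ce-* HTTP headers
        'ce-source': '//storage.googleapis.com/projects/_/buckets/fake-bucket'
    },
    // Minimal Cloud Storage object payload: just the fields index.js reads.
    body: JSON.stringify({ name: 'fake-picture.jpg', bucket: 'fake-bucket' })
})
    .then(res => res.text())
    .then(body => console.log(body))    // e.g. "Processed 'fake-picture.jpg'."
    .catch(err => console.error(err));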

Use CTRL-C to exit.

8. Build and deploy to Cloud Run

Before deploying to Cloud Run, set the Cloud Run region to one of the supported regions and platform to managed:

REGION=europe-west1
gcloud config set run/region $REGION
gcloud config set run/platform managed

You can check that the configuration is set:

gcloud config list

...
[run]
platform = managed
region = europe-west1

Instead of building and publishing the container image using Cloud Build manually, you can also rely on Cloud Run to build the container image for you using Google Cloud Buildpacks.

Run the following command to build the container image using Google Cloud Buildpacks and then deploy the container image to Cloud Run:

SERVICE_NAME=garbage-collector-service

gcloud run deploy $SERVICE_NAME \
    --source . \
    --no-allow-unauthenticated \
    --update-env-vars BUCKET_THUMBNAILS=$BUCKET_THUMBNAILS

Note the --source flag. It tells Cloud Run to use Google Cloud Buildpacks to build the container image, without a Dockerfile. The --no-allow-unauthenticated flag makes the Cloud Run service an internal service that can only be invoked by specific service accounts. Later, you will create a trigger that uses the default compute service account, which has the run.invoker role needed to call internal Cloud Run services.

9. Create a Trigger

In Eventarc, a Trigger defines what service should get what kind of events. In this case, you want the service to receive events when a file is deleted in a bucket.

Set the location of the trigger to the same location as the uploaded pictures bucket:

gcloud config set eventarc/location eu

Create a trigger to filter for Cloud Storage object deletion events (type google.cloud.storage.object.v1.deleted) on the uploaded pictures bucket and send them to the Cloud Run service:

BUCKET_IMAGES=uploaded-pictures-$GOOGLE_CLOUD_PROJECT

gcloud eventarc triggers create trigger-$SERVICE_NAME \
  --destination-run-service=$SERVICE_NAME \
  --destination-run-region=$REGION \
  --event-filters="type=google.cloud.storage.object.v1.deleted" \
  --event-filters="bucket=$BUCKET_IMAGES" \
  --service-account=$PROJECT_NUMBER-compute@developer.gserviceaccount.com

You can double-check that the trigger was created with this command:

gcloud eventarc triggers list

10. Test the service

To test that the service is working, go to the uploaded-pictures bucket and delete one of the pictures. In the logs of the service, you should see that it deleted the corresponding picture in the thumbnails bucket and also deleted its document from the pictures Firestore collection.

11. Clean up (Optional)

If you don't intend to continue with the other labs in the series, you can clean up resources to save costs and to be an overall good cloud citizen. You can clean up resources individually as follows.

Delete the service:

gcloud run services delete $SERVICE_NAME -q

Delete the Eventarc trigger:

gcloud eventarc triggers delete trigger-$SERVICE_NAME -q

Alternatively, you can delete the whole project:

gcloud projects delete $GOOGLE_CLOUD_PROJECT

12. Congratulations!

Congratulations! You created a Cloud Run service, the image garbage collector, that is triggered by Eventarc, a new service for receiving events in Cloud Run. When a picture is deleted from the pictures bucket, the service receives an event from Eventarc. It then deletes the image from the thumbnails bucket and also removes it from the Firestore pictures collection.

What we've covered

  • Cloud Run
  • Cloud Storage
  • Cloud Firestore
  • Eventarc