Pic-a-daily: Lab 2—Create thumbnails of pictures

1. Overview

In this codelab, you build on the previous lab and add a thumbnail service. The thumbnail service is a web container that takes large pictures and creates thumbnails of them.

When a picture is uploaded to Cloud Storage, a notification is sent via Cloud Pub/Sub to a Cloud Run web container, which resizes the image and saves it to another bucket in Cloud Storage.

31fa4f8a294d90df.png

What you'll learn

  • Cloud Run
  • Cloud Storage
  • Cloud Pub/Sub

2. Setup and Requirements

Self-paced environment setup

  1. Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

96a9c957bc475304.png

b9a10ebdf5b5a448.png

a1e3c01a38fa61c2.png

  • The Project name is the display name for this project's participants. It is a character string not used by Google APIs, and you can update it at any time.
  • The Project ID must be unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference the Project ID (and it is typically identified as PROJECT_ID), so if you don't like it, generate another random one, or you can try your own and see if it's available. It is then "frozen" after the project is created.
  • There is a third value, a Project Number, which some APIs use. Learn more about all three of these values in the documentation.
  2. Next, you'll need to enable billing in the Cloud Console in order to use Cloud resources/APIs. Running through this codelab shouldn't cost much, if anything at all. To shut down resources so you don't incur billing beyond this tutorial, follow any "clean-up" instructions found at the end of the codelab. New users of Google Cloud are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

From the GCP Console, click the Cloud Shell icon on the top-right toolbar:

bce75f34b2c53987.png

It should only take a few moments to provision and connect to the environment. When it is finished, you should see something like this:

f6ef2b5f13479f3a.png

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs on Google Cloud, greatly enhancing network performance and authentication. All of your work in this lab can be done with just a browser.
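
Once connected to Cloud Shell, you can check that you are already authenticated and that the project is set to your project ID:

gcloud auth list
gcloud config list project

If, for some reason, the project is not set, run gcloud config set project <PROJECT_ID> with your own project ID.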

3. Enable APIs

In this lab, you will need Cloud Build to build container images and Cloud Run to deploy the container.

Enable both APIs from Cloud Shell:

gcloud services enable cloudbuild.googleapis.com \
  run.googleapis.com

You should see the operation finish successfully:

Operation "operations/acf.5c5ef4f6-f734-455d-b2f0-ee70b5a17322" finished successfully.

4. Create another bucket

You will store thumbnails of the uploaded pictures in another bucket. Let's use gsutil to create the second bucket.

Inside Cloud Shell, set a variable for the unique bucket name. Cloud Shell already has GOOGLE_CLOUD_PROJECT set to your unique project ID, so you can append it to the bucket name. Then, create a public multi-region bucket in Europe with uniform bucket-level access:

BUCKET_THUMBNAILS=thumbnails-$GOOGLE_CLOUD_PROJECT
gsutil mb -l EU gs://$BUCKET_THUMBNAILS
gsutil uniformbucketlevelaccess set on gs://$BUCKET_THUMBNAILS
gsutil iam ch allUsers:objectViewer gs://$BUCKET_THUMBNAILS

In the end, you should have a new public bucket:

8e75c8099938e972.png
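
You can also inspect the bucket configuration from the command line; among other metadata, the output shows the bucket's location and whether uniform bucket-level access is enabled:

gsutil ls -L -b gs://$BUCKET_THUMBNAILS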

5. Clone the code

Clone the code and go to the directory containing the service:

git clone https://github.com/GoogleCloudPlatform/serverless-photosharing-workshop
cd serverless-photosharing-workshop/services/thumbnails/nodejs

You will have the following file layout for the service:

services
 |
 ├── thumbnails
      |
      ├── nodejs
           |
           ├── Dockerfile
           ├── index.js
           ├── package.json

Inside the thumbnails/nodejs folder, you have 3 files:

  • index.js contains the Node.js code
  • package.json defines the library dependencies
  • Dockerfile defines the container image

6. Explore the code

To explore the code, you can use the built-in text editor by clicking on the Open Editor button at the top of the Cloud Shell window:

3d145fe299dd8b3e.png

You can also open the editor in a dedicated browser window, for more screen real estate.

Dependencies

The package.json file defines the needed library dependencies:

{
  "name": "thumbnail_service",
  "version": "0.0.1",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "bluebird": "^3.7.2",
    "express": "^4.17.1",
    "imagemagick": "^0.1.3",
    "@google-cloud/firestore": "^4.9.9",
    "@google-cloud/storage": "^5.8.3"
  }
}

The Cloud Storage library is used to read and save image files in Cloud Storage, and Firestore is used to update the picture metadata. Express is a JavaScript / Node.js web framework; incoming requests are parsed with its built-in JSON parser (express.json()), so no separate body-parser module is needed. Bluebird is used for handling promises, and imagemagick is a library for manipulating images.

Dockerfile

Dockerfile defines the container image for the application:

FROM node:14-slim

# installing Imagemagick
RUN set -ex; \
  apt-get -y update; \
  apt-get -y install imagemagick; \
  rm -rf /var/lib/apt/lists/*; \
  mkdir /tmp/original; \
  mkdir /tmp/thumbnail;

WORKDIR /picadaily/services/thumbnails
COPY package*.json ./
RUN npm install --production
COPY . .
CMD [ "npm", "start" ]

The base image is Node 14, and the ImageMagick package is installed for image manipulation (the imagemagick NPM module invokes its command-line tools under the hood). Temporary directories are created for holding the original and thumbnail picture files. Then the NPM modules needed by our code are installed before the application is started with npm start.

index.js

Let's explore the code in pieces, so that we can better understand what this program is doing.

const express = require('express');
const im = require('imagemagick');
const Promise = require("bluebird");
const path = require('path');
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();
const Firestore = require('@google-cloud/firestore');

const app = express();
app.use(express.json());

We first require the needed dependencies and create our Express web application, indicating that we want to use the built-in JSON body parser, as incoming requests are JSON payloads sent via POST requests to our application. We also instantiate the Cloud Storage client that is used further down.

app.post('/', async (req, res) => {
    try {
        // ...
    } catch (err) {
        console.log(`Error: creating the thumbnail: ${err}`);
        console.error(err);
        res.status(500).send(err);
    }
});

We receive those incoming payloads on the / base URL, and wrap our code with error-handling logic so that the logs, visible in the Cloud Logging interface of the Google Cloud Console, give us better information about why something might be failing.

const pubSubMessage = req.body;
console.log(`PubSub message: ${JSON.stringify(pubSubMessage)}`);

const fileEvent = JSON.parse(Buffer.from(pubSubMessage.message.data, 'base64').toString().trim());
console.log(`Received thumbnail request for file ${fileEvent.name} from bucket ${fileEvent.bucket}`);

On the Cloud Run platform, Pub/Sub messages are sent via HTTP POST requests, as JSON payloads of the form:

{
  "message": {
    "attributes": {
      "bucketId": "uploaded-pictures",
      "eventTime": "2020-02-27T09:22:43.255225Z",
      "eventType": "OBJECT_FINALIZE",
      "notificationConfig": "projects/_/buckets/uploaded-pictures/notificationConfigs/28",
      "objectGeneration": "1582795363255481",
      "objectId": "IMG_20200213_181159.jpg",
      "payloadFormat": "JSON_API_V1"
    },
    "data": "ewogICJraW5kIjogInN0b3JhZ2Ujb2JqZWN...FQUU9Igp9Cg==",
    "messageId": "1014308302773399",
    "message_id": "1014308302773399",
    "publishTime": "2020-02-27T09:22:43.973Z",
    "publish_time": "2020-02-27T09:22:43.973Z"
  },
  "subscription": "projects/serverless-picadaily/subscriptions/gcs-events-subscription"
}

What is really interesting in this JSON document is the message.data attribute: a plain string that encodes the actual payload in Base64. That's why our code above decodes the Base64 content of this attribute. Once decoded, the data attribute contains another JSON document representing the Cloud Storage event details, which, among other metadata, include the file name and the bucket name.

{
  "kind": "storage#object",
  "id": "uploaded-pictures/IMG_20200213_181159.jpg/1582795363255481",
  "selfLink": "https://www.googleapis.com/storage/v1/b/uploaded-pictures/o/IMG_20200213_181159.jpg",
  "name": "IMG_20200213_181159.jpg",
  "bucket": "uploaded-pictures",
  "generation": "1582795363255481",
  "metageneration": "1",
  "contentType": "image/jpeg",
  "timeCreated": "2020-02-27T09:22:43.255Z",
  "updated": "2020-02-27T09:22:43.255Z",
  "storageClass": "STANDARD",
  "timeStorageClassUpdated": "2020-02-27T09:22:43.255Z",
  "size": "4944335",
  "md5Hash": "QzBIoPJBV2EvqB1EVk1riw==",
  "mediaLink": "https://www.googleapis.com/download/storage/v1/b/uploaded-pictures/o/IMG_20200213_181159.jpg?generation=1582795363255481&alt=media",
  "crc32c": "hQ3uHg==",
  "etag": "CLmJhJu08ecCEAE="
}

We're interested in the image and bucket names, as our code is going to fetch that image from the bucket to generate its thumbnail:

const bucket = storage.bucket(fileEvent.bucket);
const thumbBucket = storage.bucket(process.env.BUCKET_THUMBNAILS);

const originalFile = path.resolve('/tmp/original', fileEvent.name);
const thumbFile = path.resolve('/tmp/thumbnail', fileEvent.name);

await bucket.file(fileEvent.name).download({
    destination: originalFile
});
console.log(`Downloaded picture into ${originalFile}`);

We are retrieving the name of the output storage bucket from an environment variable.

We have the origin bucket, whose file creation triggered our Cloud Run service, and the destination bucket where we'll store the resulting image. We use the built-in path module for local file handling, as the imagemagick library creates the thumbnail locally in the /tmp temporary directory. Finally, we await an asynchronous call to download the uploaded image file.

const resizeCrop = Promise.promisify(im.crop);
await resizeCrop({
        srcPath: originalFile,
        dstPath: thumbFile,
        width: 400,
        height: 400         
});
console.log(`Created local thumbnail in ${thumbFile}`);

The imagemagick module is not very async / await friendly, so we wrap its crop function in a JavaScript promise (provided by the Bluebird module). Then we call the asynchronous resizing / cropping function we created, passing the source and destination files as well as the dimensions of the thumbnail we want to create.

await thumbBucket.upload(thumbFile);
console.log(`Uploaded thumbnail to Cloud Storage bucket ${process.env.BUCKET_THUMBNAILS}`);

Once the thumbnail file is uploaded to Cloud Storage, we also update the metadata in Cloud Firestore, adding a boolean flag indicating that the thumbnail for this image has indeed been generated:

const pictureStore = new Firestore().collection('pictures');
const doc = pictureStore.doc(fileEvent.name);
await doc.set({
    thumbnail: true
}, {merge: true});
console.log(`Updated Firestore about thumbnail creation for ${fileEvent.name}`);

res.status(204).send(`${fileEvent.name} processed`);

Once processing is complete, we reply to the HTTP POST request, indicating that the file was properly processed.

const PORT = process.env.PORT || 8080;

app.listen(PORT, () => {
    console.log(`Started thumbnail generator on port ${PORT}`);
});

At the end of our source file, we have the instructions for Express to start our web application, listening on the port defined by the PORT environment variable and defaulting to 8080.

7. Test locally

Test the code locally to make sure it works before deploying it to the cloud.

Inside the thumbnails/nodejs folder, install the npm dependencies and start the server:

npm install; npm start

If everything went well, it should start the server on port 8080:

Started thumbnail generator on port 8080
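
If you want to exercise the endpoint by hand, you can post it a hand-crafted Pub/Sub-style payload from another Cloud Shell tab. This is only a sketch: test.jpg is a placeholder, and unless it names a real object in your uploaded-pictures bucket (and the server sees the BUCKET_THUMBNAILS variable), the service will log an error and return a 500:

# Base64-encode a minimal, fake Cloud Storage event (test.jpg is a placeholder)
DATA=$(echo -n "{\"bucket\": \"uploaded-pictures-$GOOGLE_CLOUD_PROJECT\", \"name\": \"test.jpg\"}" | base64 -w 0)
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d "{\"message\": {\"data\": \"$DATA\"}}"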

Use CTRL-C to exit.

8. Build and publish the container image

Cloud Run runs containers, but you first need to build the container image (defined in the Dockerfile). Google Cloud Build can be used to build container images and push them to Google Container Registry.

Inside the thumbnails/nodejs folder, where the Dockerfile is, issue the following command to build the container image:

gcloud builds submit --tag gcr.io/$GOOGLE_CLOUD_PROJECT/thumbnail-service

After a minute or two, the build should succeed:

b354b3a9a3631097.png

The Cloud Build "history" section should show the successful build as well:

df00f198dd2bf6bf.png

Click on the build ID to open the details view; in the "Build Artifacts" tab, you should see that the container image has been uploaded to Container Registry (GCR):

a4577ce0744f73e2.png
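
You can also list the images stored in your project's registry from the command line; the thumbnail-service image should be there:

gcloud container images list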

If you wish, you can double-check that the container image runs locally in Cloud Shell:

docker run -p 8080:8080 gcr.io/$GOOGLE_CLOUD_PROJECT/thumbnail-service

It should start the server on port 8080 in the container:

Started thumbnail generator on port 8080

Use CTRL-C to exit.

9. Deploy to Cloud Run

Before deploying to Cloud Run, set the Cloud Run region to one of the supported regions, and the platform to managed:

gcloud config set run/region europe-west1
gcloud config set run/platform managed

You can check that the configuration is set:

gcloud config list

...
[run]
platform = managed
region = europe-west1

Run the following command to deploy the container image on Cloud Run:

SERVICE_NAME=thumbnail-service
gcloud run deploy $SERVICE_NAME \
    --image gcr.io/$GOOGLE_CLOUD_PROJECT/thumbnail-service \
    --no-allow-unauthenticated \
    --update-env-vars BUCKET_THUMBNAILS=$BUCKET_THUMBNAILS

Note the --no-allow-unauthenticated flag. This makes the Cloud Run service a private service that can only be invoked by identities granted the run.invoker role, such as the service account set up in the next section.

If the deployment is successful, you should see the following output:

c0f28e7d6de0024.png

If you go to the cloud console UI, you should also see that the service was successfully deployed:

9bfe48e3c8b597e5.png
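
You can also confirm the deployment from the command line; the thumbnail-service should appear with its HTTPS URL:

gcloud run services list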

10. Cloud Storage events to Cloud Run via Pub/Sub

The service is ready, but you still need to wire Cloud Storage events to the newly created Cloud Run service. Cloud Storage can send file creation events via Cloud Pub/Sub, but there are a few steps to get this working.

Create a Pub/Sub topic as the communication pipeline:

TOPIC_NAME=cloudstorage-cloudrun-topic
gcloud pubsub topics create $TOPIC_NAME

Create a notification so that Pub/Sub messages are sent to the topic when files are stored in the bucket:

BUCKET_PICTURES=uploaded-pictures-$GOOGLE_CLOUD_PROJECT
gsutil notification create -t $TOPIC_NAME -f json gs://$BUCKET_PICTURES
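
You can verify that the notification was created by listing the notification configurations on the bucket:

gsutil notification list gs://$BUCKET_PICTURES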

Create a service account for the Pub/Sub subscription that we will create later:

SERVICE_ACCOUNT=$TOPIC_NAME-sa
gcloud iam service-accounts create $SERVICE_ACCOUNT \
     --display-name "Cloud Run Pub/Sub Invoker"

Give the service account permission to invoke a Cloud Run service:

SERVICE_NAME=thumbnail-service
gcloud run services add-iam-policy-binding $SERVICE_NAME \
   --member=serviceAccount:$SERVICE_ACCOUNT@$GOOGLE_CLOUD_PROJECT.iam.gserviceaccount.com \
   --role=roles/run.invoker

If you enabled the Pub/Sub service account on or before April 8, 2021, grant the iam.serviceAccountTokenCreator role to the Pub/Sub service account:

PROJECT_NUMBER=$(gcloud projects describe $GOOGLE_CLOUD_PROJECT --format='value(projectNumber)')
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
     --member=serviceAccount:service-$PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com \
     --role=roles/iam.serviceAccountTokenCreator

It can take a few minutes for the IAM changes to propagate.

Finally, create a Pub/Sub subscription with the service account:

SERVICE_URL=$(gcloud run services describe $SERVICE_NAME --format 'value(status.url)')
gcloud pubsub subscriptions create $TOPIC_NAME-subscription --topic $TOPIC_NAME \
   --push-endpoint=$SERVICE_URL \
   --push-auth-service-account=$SERVICE_ACCOUNT@$GOOGLE_CLOUD_PROJECT.iam.gserviceaccount.com

You can check that the subscription was created. Go to Pub/Sub in the console, select the cloudstorage-cloudrun-topic topic, and at the bottom you should see the subscription:

e8ab86dccb8d890.png
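
Alternatively, you can describe the subscription from the command line and verify its push endpoint and service account:

gcloud pubsub subscriptions describe $TOPIC_NAME-subscription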

11. Test the service

To test that the setup is working, upload a new picture to the uploaded-pictures bucket and check that the resized picture appears in the thumbnails bucket as expected.
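
For example, from Cloud Shell (my-picture.jpg is a placeholder for any local image file):

gsutil cp my-picture.jpg gs://$BUCKET_PICTURES

# After a few seconds, the thumbnail should show up in the second bucket
gsutil ls gs://$BUCKET_THUMBNAILS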

You can also double-check the logs to see the logging messages appear as the Cloud Run service works through its various steps:

42c025e2d7d6ca3a.png
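
If you prefer the command line, you can read the same logs with a filter along these lines (a sketch, assuming the thumbnail-service name used earlier):

gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=thumbnail-service" --limit 20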

12. Clean up (Optional)

If you don't intend to continue with the other labs in the series, you can clean up resources to save costs and to be an overall good cloud citizen. You can clean up resources individually as follows.

Delete the bucket:

gsutil rb gs://$BUCKET_THUMBNAILS

Delete the service:

gcloud run services delete $SERVICE_NAME -q

Delete the Pub/Sub topic:

gcloud pubsub topics delete $TOPIC_NAME

Alternatively, you can delete the whole project:

gcloud projects delete $GOOGLE_CLOUD_PROJECT

13. Congratulations!

Everything is now in place:

  • Created a notification in Cloud Storage that sends Pub/Sub messages on a topic when a new picture is uploaded.
  • Defined the required IAM bindings and service accounts (unlike Cloud Functions, where this is all automated, here it is configured manually).
  • Created a subscription so that our Cloud Run service receives the Pub/Sub messages.
  • Whenever a new picture is uploaded to the bucket, the picture is resized thanks to the new Cloud Run service.

What we've covered

  • Cloud Run
  • Cloud Storage
  • Cloud Pub/Sub

Next Steps