Migrate from App Engine Task Queue Push Tasks to Cloud Tasks (Module 8)

1. Overview

The Serverless Migration Station series of codelabs (self-paced, hands-on tutorials) and related videos aim to help Google Cloud serverless developers modernize their applications by guiding them through one or more migrations, primarily moving away from legacy services. Doing so makes your apps more portable and gives you more options and flexibility, enabling you to integrate with and access a wider range of Cloud products and more easily upgrade to newer language releases. While initially focusing on the earliest Cloud users, primarily App Engine (standard environment) developers, this series is broad enough to include other serverless platforms like Cloud Functions and Cloud Run, or elsewhere if applicable.

The purpose of this codelab is to show Python 2 App Engine developers how to migrate from App Engine Task Queue (push tasks) to Cloud Tasks. There is also an implicit migration from App Engine NDB to Cloud NDB for Datastore access (primarily covered in Module 2).

Use of push tasks was added to the sample app in Module 7; here in Module 8, that usage is migrated to Cloud Tasks, and Module 9 continues on to Python 3 and Cloud Datastore. Those using Task Queues for pull tasks will migrate to Cloud Pub/Sub and should refer to Modules 18-19 instead.

You'll learn how to

  • Migrate from App Engine Task Queue (push tasks) to Cloud Tasks
  • Migrate from App Engine NDB to Cloud NDB (repeating the Module 2 migration)

What you'll need

  • A Google Cloud project with an active billing account
  • Basic Python skills
  • Knowledge of common Linux commands
  • Basic knowledge of developing and deploying App Engine apps

Survey

How will you use this tutorial?

  • Only read through it
  • Read it and complete the exercises

How would you rate your experience with Python?

  • Novice
  • Intermediate
  • Proficient

How would you rate your experience with using Google Cloud services?

  • Novice
  • Intermediate
  • Proficient

2. Background

App Engine Task Queue supports both push and pull tasks. To improve application portability, the Google Cloud team recommends migrating from legacy bundled services like Task Queue to other Cloud standalone or 3rd-party equivalent services.

Pull task migration is covered in Migration Modules 18-19, while Modules 7-9 focus on push task migration. To migrate from App Engine Task Queue push tasks, we added push task usage to the existing Python 2 App Engine sample app that registers new page visits and displays the most recent ones. The Module 7 codelab adds a push task to delete the oldest visits: they will never be shown again, so why should they take up extra storage in Datastore? This Module 8 codelab preserves the same functionality but migrates the underlying queueing mechanism from Task Queue push tasks to Cloud Tasks, and it also repeats the Module 2 migration from App Engine NDB to Cloud NDB for Datastore access.

This tutorial features the following steps:

  1. Setup/Prework
  2. Update configuration
  3. Modify application code

3. Setup/Prework

This section explains how to:

  1. Set up your Cloud project
  2. Get baseline sample app
  3. (Re)Deploy and validate baseline app
  4. Enable new Google Cloud services/APIs

These steps ensure you're starting with working code and that your sample app is ready for migrating to Cloud services.

1. Setup project

If you completed the Module 7 codelab, reuse that same project (and code). Alternatively, create a brand new project or reuse another existing project. Ensure the project has an active billing account and an enabled App Engine app. Find your project ID and keep it handy; use it whenever you encounter the PROJECT_ID variable in this codelab.

2. Get baseline sample app

One of the prerequisites is a working Module 7 App Engine app: complete the Module 7 codelab (recommended) or copy the Module 7 app from the repo. Whether you use yours or ours, the Module 7 code is where we'll begin ("START"). This codelab walks you through the migration, concluding with code that resembles what's in the Module 8 repo folder ("FINISH").

Regardless of which Module 7 app you use, the folder should look like the listing below, possibly with a lib folder as well:

$ ls
README.md               appengine_config.py     requirements.txt
app.yaml                main.py                 templates

3. (Re)Deploy and validate baseline app

Execute the following steps to deploy the Module 7 app:

  1. Delete the lib folder if there is one and run pip install -t lib -r requirements.txt to repopulate lib. You may need to use pip2 instead if you have both Python 2 and 3 installed on your development machine.
  2. Ensure you've installed and initialized the gcloud command-line tool and reviewed its usage.
  3. (optional) Set your Cloud project with gcloud config set project PROJECT_ID if you don't want to enter the PROJECT_ID with each gcloud command you issue.
  4. Deploy the sample app with gcloud app deploy.
  5. Confirm the app runs as expected without issue. If you completed the Module 7 codelab, the app displays the top visitors along with the most recent visits (illustrated below). At the bottom is an indication of the older tasks which will be deleted.

[Screenshot: sample app output showing the most recent visits, with the older visits to be deleted indicated at the bottom]

4. Enable new Google Cloud services/APIs

The old app used App Engine bundled services which don't require additional setup, but standalone Cloud services do, and the updated app will employ both Cloud Tasks and Cloud Datastore (via the Cloud NDB client library). A number of Cloud products have "Always Free" tier quotas, including App Engine, Cloud Datastore, and Cloud Tasks. So long as you stay under those limits, you shouldn't incur charges while completing this tutorial. Cloud APIs can be enabled from either the Cloud Console or the command-line, depending on your preference.

From the Cloud Console

Go to the API Manager's Library page (for the correct project) in the Cloud Console, and search for the Cloud Datastore and Cloud Tasks APIs using the search bar in the middle of the page:

[Screenshot: searching for APIs in the Cloud Console API Library]

Click the Enable button for each API separately; you may be prompted for billing information. This example features the Cloud Pub/Sub API Library page (do not enable the Pub/Sub API for this codelab, just Cloud Tasks and Datastore):

[Screenshot: Cloud Pub/Sub API Library page with its Enable button]

From the command-line

While enabling APIs from the console is visually informative, some prefer the command-line. Issue the gcloud services enable cloudtasks.googleapis.com datastore.googleapis.com command to enable both APIs at the same time:

$ gcloud services enable cloudtasks.googleapis.com datastore.googleapis.com
Operation "operations/acat.p2-aaa-bbb-ccc-ddd-eee-ffffff" finished successfully.

You may be prompted for billing information. If you wish to enable other Cloud APIs and want to know their "URIs," you can find them at the bottom of each API's Library page as the "Service name." For example, observe pubsub.googleapis.com at the bottom of the Pub/Sub page just above.

After the steps are complete, your project will be able to access the APIs. Now it's time to update the app to use those APIs.

4. Update configuration

The configuration updates in this section are due explicitly to the use of Cloud client libraries; the same changes apply regardless of which one(s) you use. Apps that do not use any Cloud client libraries do not require these changes.

requirements.txt

Module 8 exchanges the use of App Engine NDB and Task Queue with Cloud NDB and Cloud Tasks. Append both google-cloud-ndb and google-cloud-tasks to requirements.txt so they join flask from Module 7:

flask
google-cloud-ndb
google-cloud-tasks

This requirements.txt file doesn't feature any version numbers, meaning the latest versions are selected. If any incompatibilities arise, specify version numbers to lock in working versions for the app.
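
For illustration, a pinned requirements.txt might look like the following; the version numbers shown are examples only, not recommendations from this codelab, so verify which versions work for your app before locking anything in:

flask==1.1.4
google-cloud-ndb==1.11.1
google-cloud-tasks==1.5.0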

app.yaml

When using Cloud client libraries, the Python 2 App Engine runtime requires specific third-party packages, namely grpcio and setuptools. Python 2 users must list built-in libraries like these along with an available version or "latest" in app.yaml. If you don't have a libraries section yet, create one and add both libraries like this:

libraries:
- name: grpcio
  version: latest
- name: setuptools
  version: latest

When migrating your app, it may already have a libraries section. If it does, and either grpcio or setuptools is missing, just add the missing entries to your existing libraries section. The updated app.yaml should now look like this:

runtime: python27
threadsafe: yes
api_version: 1

handlers:
- url: /.*
  script: main.app

libraries:
- name: grpcio
  version: latest
- name: setuptools
  version: latest

appengine_config.py

The google.appengine.ext.vendor.add() call in appengine_config.py connects the 3rd-party libraries you copied into lib (sometimes called "vendoring" or "self-bundling") to your app. Above in app.yaml, we added built-in 3rd-party libraries, and those need setuptools' pkg_resources.working_set.add_entry() to tie your app to those built-in packages. Below are the original Module 1 appengine_config.py and the version after you've made the Module 8 updates:

BEFORE:

from google.appengine.ext import vendor

# Set PATH to your libraries folder.
PATH = 'lib'
# Add libraries installed in the PATH folder.
vendor.add(PATH)

AFTER:

import pkg_resources
from google.appengine.ext import vendor

# Set PATH to your libraries folder.
PATH = 'lib'
# Add libraries installed in the PATH folder.
vendor.add(PATH)
# Add libraries to pkg_resources working set to find the distribution.
pkg_resources.working_set.add_entry(PATH)

A similar description can also be found in the App Engine migration documentation.

5. Modify application code

This section features updates to the main application file, main.py, replacing use of App Engine Task Queue push queues with Cloud Tasks. There are no changes to the web template, templates/index.html—both apps should operate identically, displaying the same data. The modifications to the main application are broken down into these four "to-do"s:

  1. Update imports and initialization
  2. Update data model functionality (Cloud NDB)
  3. Migrate to Cloud Tasks (and Cloud NDB)
  4. Update (push) task handler

1. Update imports and initialization

  1. Replace App Engine NDB (google.appengine.ext.ndb) and Task Queue (google.appengine.api.taskqueue) with Cloud NDB (google.cloud.ndb) and Cloud Tasks (google.cloud.tasks), respectively.
  2. Cloud client libraries require initialization and creation of "API clients"; assign them to ds_client and ts_client, respectively.
  3. The Task Queue documentation states: "App Engine provides a default push queue, named default, which is configured and ready to use with default settings." Cloud Tasks does not provide a default queue (because it is a standalone Cloud product independent of App Engine), so new code is required to create a Cloud Tasks queue named default; see the one-time creation sketch after the code below.
  4. App Engine Task Queue does not require you to specify a region because it uses the region your app runs in. However, because Cloud Tasks is now an independent product, it does require a region, and that region must match the region your app runs in. The region name and Cloud project ID are required to create a "fully-qualified pathname" as the queue's unique identifier.

The updates described in items 3 and 4 above make up the bulk of the additional constants and initialization required. See the "before" and "after" below, and make these changes at the top of main.py.

BEFORE:

from datetime import datetime
import logging
import time
from flask import Flask, render_template, request
from google.appengine.api import taskqueue
from google.appengine.ext import ndb

app = Flask(__name__)

AFTER:

from datetime import datetime
import json
import logging
import time
from flask import Flask, render_template, request
import google.auth
from google.cloud import ndb, tasks

app = Flask(__name__)
ds_client = ndb.Client()
ts_client = tasks.CloudTasksClient()

_, PROJECT_ID = google.auth.default()
REGION_ID = 'REGION_ID'    # replace w/your own
QUEUE_NAME = 'default'     # replace w/your own
QUEUE_PATH = ts_client.queue_path(PROJECT_ID, REGION_ID, QUEUE_NAME)
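
As noted in item 3 above, the default queue must exist before tasks can be enqueued. Below is a minimal one-time creation sketch (an assumption of this writeup, not code from the sample app); it reuses the ts_client, PROJECT_ID, REGION_ID, and QUEUE_PATH definitions above and tolerates re-runs:

from google.api_core import exceptions

# Build the queue's parent path, then create the queue if needed.
PARENT = 'projects/{}/locations/{}'.format(PROJECT_ID, REGION_ID)
try:
    ts_client.create_queue(parent=PARENT, queue={'name': QUEUE_PATH})
except exceptions.AlreadyExists:
    pass    # queue already exists; nothing to do

Alternatively, the queue can be created once from the command-line with gcloud tasks queues create default.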

2. Update data model functionality (Cloud NDB)

App Engine NDB and Cloud NDB work nearly identically. There are no major changes to either the data model or the store_visit() function. The only noticeable difference is that the creation of the Visit entity in store_visit() is now encapsulated inside a Python with block. Cloud NDB requires that all Datastore access be controlled within its context manager, hence the with statement. The code snippets below illustrate this minor difference when migrating to Cloud NDB. Implement this change.

BEFORE:

class Visit(ndb.Model):
    'Visit entity registers visitor IP address & timestamp'
    visitor   = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)

def store_visit(remote_addr, user_agent):
    'create new Visit entity in Datastore'
    Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()

AFTER:

class Visit(ndb.Model):
    'Visit entity registers visitor IP address & timestamp'
    visitor   = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)

def store_visit(remote_addr, user_agent):
    'create new Visit entity in Datastore'
    with ds_client.context():
        Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()
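
If opening a context in each function feels repetitive, the Cloud NDB documentation also describes a WSGI middleware pattern that wraps every incoming request in a single context. This codelab doesn't use it, but a sketch with the ds_client defined earlier looks like this:

def ndb_wsgi_middleware(wsgi_app):
    'wrap every incoming request in a Cloud NDB context'
    def middleware(environ, start_response):
        with ds_client.context():
            return wsgi_app(environ, start_response)
    return middleware

app.wsgi_app = ndb_wsgi_middleware(app.wsgi_app)

With this in place, handler code can drop its own with ds_client.context() blocks.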

3. Migrate to Cloud Tasks (and Cloud NDB)

The most critical change in this migration switches the underlying queueing infrastructure. This takes place in the fetch_visits() function, where a (push) task to delete old visits is created and enqueued for execution. The original functionality from Module 7 stays intact:

  1. Query for the most recent visits.
  2. Instead of returning those visits immediately, save the timestamp of the last Visit, the oldest displayed—it is safe to delete all visits older than this.
  3. Tease out the timestamp as a float and a string using standard Python utilities and use both in various capacities, for example, display to user, add to logs, pass to handler, etc.
  4. Create a push task with this timestamp as its payload along with /trim as the URL.
  5. The task handler is eventually called via an HTTP POST to that URL.

This workflow is illustrated by the "before" code snippet:

BEFORE:

def fetch_visits(limit):
    'get most recent visits & add task to delete older visits'
    data = Visit.query().order(-Visit.timestamp).fetch(limit)
    oldest = time.mktime(data[-1].timestamp.timetuple())
    oldest_str = time.ctime(oldest)
    logging.info('Delete entities older than %s' % oldest_str)
    taskqueue.add(url='/trim', params={'oldest': oldest})
    return data, oldest_str

While the functionality remains the same, Cloud Tasks becomes the execution platform. The updates to effect this change include:

  1. Wrap the Visit query inside a Python with block (repeating the Module 2 migration to Cloud NDB).
  2. Create the Cloud Tasks task metadata, including expected attributes such as the timestamp payload and the URL, plus the MIME type, and JSON-encode the payload.
  3. Use the Cloud Tasks API client to create the task with the metadata and the fully-qualified pathname of the queue.

These changes to fetch_visits() are illustrated below:

AFTER:

def fetch_visits(limit):
    'get most recent visits & add task to delete older visits'
    with ds_client.context():
        data = Visit.query().order(-Visit.timestamp).fetch(limit)
    oldest = time.mktime(data[-1].timestamp.timetuple())
    oldest_str = time.ctime(oldest)
    logging.info('Delete entities older than %s' % oldest_str)
    task = {
        'app_engine_http_request': {
            'relative_uri': '/trim',
            'body': json.dumps({'oldest': oldest}).encode(),
            'headers': {
                'Content-Type': 'application/json',
            },
        }
    }
    ts_client.create_task(parent=QUEUE_PATH, task=task)
    return data, oldest_str
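
The app_engine_http_request target above routes the task back to this App Engine app. Cloud Tasks can also dispatch to any HTTP endpoint via an http_request target; here is a sketch of the equivalent task structure, where the URL is a hypothetical external handler, not part of this codelab:

# Alternative (sketch): dispatch the task to an arbitrary HTTP endpoint.
task = {
    'http_request': {
        'http_method': 'POST',
        'url': 'https://example.com/trim',    # hypothetical endpoint
        'body': json.dumps({'oldest': oldest}).encode(),
        'headers': {'Content-Type': 'application/json'},
    }
}
ts_client.create_task(parent=QUEUE_PATH, task=task)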

4. Update (push) task handler

The (push) task handler function requires only minor updates; it just needs to execute, whether invoked by Task Queue or Cloud Tasks. "The code is the code," as they say. The minor changes are:

  1. The timestamp payload was passed verbatim to Task Queue, but it is JSON-encoded for Cloud Tasks and therefore must be JSON-parsed upon arrival.
  2. The HTTP POST call to /trim with Task Queue had an implicit MIME type of application/x-www-form-urlencoded, but with Cloud Tasks, it is designated explicitly as application/json, so the payload is extracted in a slightly different way.
  3. Use the Cloud NDB API client context manager (Module 2 migration to Cloud NDB).

Below are the code snippets before and after making these changes to the task handler, trim():

BEFORE:

@app.route('/trim', methods=['POST'])
def trim():
    '(push) task queue handler to delete oldest visits'
    oldest = request.form.get('oldest', type=float)
    keys = Visit.query(
            Visit.timestamp < datetime.fromtimestamp(oldest)
    ).fetch(keys_only=True)
    nkeys = len(keys)
    if nkeys:
        logging.info('Deleting %d entities: %s' % (
                nkeys, ', '.join(str(k.id()) for k in keys)))
        ndb.delete_multi(keys)
    else:
        logging.info('No entities older than: %s' % time.ctime(oldest))
    return ''   # need to return SOME string w/200

AFTER:

@app.route('/trim', methods=['POST'])
def trim():
    '(push) task queue handler to delete oldest visits'
    oldest = float(request.get_json().get('oldest'))
    with ds_client.context():
        keys = Visit.query(
                Visit.timestamp < datetime.fromtimestamp(oldest)
        ).fetch(keys_only=True)
        nkeys = len(keys)
        if nkeys:
            logging.info('Deleting %d entities: %s' % (
                    nkeys, ', '.join(str(k.id()) for k in keys)))
            ndb.delete_multi(keys)
        else:
            logging.info(
                    'No entities older than: %s' % time.ctime(oldest))
    return ''   # need to return SOME string w/200
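
To sanity-check the migrated handler locally, you can simulate the JSON POST that Cloud Tasks issues with Flask's built-in test client. This is a sketch only; the Datastore calls inside trim() still require valid credentials or an emulator to succeed:

import time
from main import app

# POST a JSON payload to /trim, just as Cloud Tasks would.
with app.test_client() as client:
    resp = client.post('/trim', json={'oldest': time.time()})
    print(resp.status_code)    # expect 200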

There are no updates to the main application handler root() or the web template templates/index.html.

6. Summary/Cleanup

This section wraps up the codelab: deploy the app and verify that it works as intended, as reflected in its output. After validating the app, perform any clean-up and consider next steps.

Deploy and verify application

Deploy your app with gcloud app deploy. The output should be identical to the Module 7 app, but realize you've switched to a completely different push queue product, making your app more portable than before!

[Screenshot: sample app output, identical to the Module 7 app]

Clean up

General

If you are done for now, we recommend you disable your App Engine app to avoid incurring charges. However, if you wish to test or experiment some more, the App Engine platform has a free quota, and so long as you don't exceed that usage tier, you shouldn't be charged. That covers compute, but there may also be charges for relevant App Engine services, so check its pricing page for more information. If this migration involves other Cloud services, those are billed separately. In either case, if applicable, see the "Specific to this codelab" section below.

For full disclosure, deploying to a Google Cloud serverless compute platform like App Engine incurs minor build and storage costs. Cloud Build has its own free quota, as does Cloud Storage, and storage of your app's container image uses up some of that quota. However, you might live in a region that does not have such a free tier, so be aware of your storage usage to minimize potential costs. Specific Cloud Storage "folders" you should review include:

  • console.cloud.google.com/storage/browser/LOC.artifacts.PROJECT_ID.appspot.com/containers/images
  • console.cloud.google.com/storage/browser/staging.PROJECT_ID.appspot.com
  • The storage links above depend on your PROJECT_ID and LOCation, for example, "us" if your app is hosted in the USA.

On the other hand, if you're not going to continue with this application or other related migration codelabs and want to delete everything completely, shut down your project.

Specific to this codelab

The services listed below are unique to this codelab. Refer to each product's documentation for more information:

  • Cloud Tasks
  • Cloud Datastore (accessed via Cloud NDB)

Next steps

This concludes our migration from App Engine Task Queue push tasks to Cloud Tasks. If you're interested in continuing to port this app to Python 3 and migrate even further to Cloud Datastore from Cloud NDB, then consider Module 9.

Cloud NDB exists specifically for Python 2 App Engine developers, providing a near-identical user experience, but Cloud Datastore has its own native client library made for non-App Engine users or new (Python 3) App Engine users. However, because Cloud NDB is available for Python 2 and 3, there's no requirement to migrate to Cloud Datastore.

Cloud NDB and Cloud Datastore both access Datastore (albeit in different ways), so the only reason to consider moving to Cloud Datastore is if you already have other apps, notably non-App Engine apps, using Cloud Datastore and desire to standardize on a single Datastore client library. This optional migration from Cloud NDB to Cloud Datastore is also covered on its own (without Task Queue or Cloud Tasks) in Module 3.

Beyond Modules 3, 8, and 9, other migration modules focusing on moving away from App Engine legacy bundled services to consider include:

  • Module 2: migrate from App Engine NDB to Cloud NDB
  • Modules 12-13: migrate from App Engine Memcache to Cloud Memorystore
  • Modules 15-16: migrate from App Engine Blobstore to Cloud Storage
  • Modules 18-19: migrate from App Engine Task Queue (pull tasks) to Cloud Pub/Sub

App Engine is no longer the only serverless platform in Google Cloud. If you have a small App Engine app, or one with limited functionality that you wish to turn into a standalone microservice, or you want to break up a monolithic app into multiple reusable components, these are good reasons to consider moving to Cloud Functions. If containerization has become part of your application development workflow, particularly if it consists of a CI/CD (continuous integration/continuous delivery or deployment) pipeline, consider migrating to Cloud Run. These scenarios are covered by the following modules:

  • Migrate from App Engine to Cloud Functions: see Module 11
  • Migrate from App Engine to Cloud Run: see Module 4 to containerize your app with Docker, or Module 5 to do it without containers, Docker knowledge, or Dockerfiles

Switching to another serverless platform is optional, and we recommend considering the best options for your apps and use cases before making any changes.

Regardless of which migration module you consider next, all Serverless Migration Station content (codelabs, videos, source code [when available]) can be accessed at its open source repo. The repo's README also provides guidance on which migrations to consider and any relevant "order" of Migration Modules.

7. Additional resources

Listed below are additional resources for developers further exploring this or related Migration Modules, as well as the related products. This includes places to provide feedback on this content, links to the code, and various pieces of documentation you may find useful.

Codelabs issues/feedback

If you find any issues with this codelab, please search for existing reports before filing a new one. Links to search and create new issues:

Migration resources

Links to the repo folders for Module 7 (START) and Module 8 (FINISH) can be found in the table below.

Codelab                     Python 2    Python 3
Module 7                    code        code (not featured in this tutorial)
Module 8 (this codelab)     code        (n/a)

Online resources

Below are online resources which may be relevant for this tutorial:

App Engine Task Queue and Cloud Tasks

App Engine NDB and Cloud NDB (Datastore)

App Engine platform

Other Cloud information

Videos

License

This work is licensed under a Creative Commons Attribution 2.0 Generic License.