Module 8: Migrate from App Engine ndb and taskqueue to Cloud NDB and Cloud Tasks

1. Overview

This series of codelabs (self-paced, hands-on tutorials) aims to help Google App Engine (Standard) developers modernize their apps by guiding them through a series of migrations. The most significant step is to move away from the original runtime's bundled services, because the next-generation runtimes are more flexible and give users a greater variety of service options. Moving to the newer runtime generation enables you to integrate with Google Cloud products more easily, use a wider range of supported services, and support current language releases.

This codelab helps users migrate from App Engine push tasks, and the taskqueue API/library, to Cloud Tasks. If your app does not use Task Queues, you can still use this codelab as an exercise to learn how App Engine push tasks are migrated to Cloud Tasks.

You'll learn how to

  • Migrate from App Engine taskqueue to Cloud Tasks
  • Create push tasks with Cloud Tasks
  • Migrate from App Engine ndb to Cloud NDB (same as Module 2)

What you'll need

  • A Google Cloud project with an active billing account and App Engine enabled
  • A working copy of the Module 7 sample app (yours or ours; see Setup/Prework below)
  • Basic familiarity with the gcloud command-line tool

2. Background

Since we added App Engine push tasks to the sample app in the previous (Module 7) codelab, we can now migrate that code to Cloud Tasks. This tutorial's migration features these primary steps:

  1. Setup/Prework
  2. Update configuration files
  3. Update main application

3. Setup/Prework

Before we get going with the main part of the tutorial, let's set up our project, get the code, and deploy the baseline app so we know we're starting from working code.

1. Setup project

We recommend reusing the same project as the one you used for completing the Module 7 codelab. Alternatively, you can create a brand new project or reuse another existing project. Ensure the project has an active billing account and App Engine (app) is enabled.

2. Get baseline sample app

One of the prerequisites to this codelab is to have a working Module 7 sample app. If you don't have one, we recommend completing the Module 7 tutorial (link above) before moving ahead here. Otherwise if you're already familiar with its contents, you can just start by grabbing the Module 7 code below.

Whether you use yours or ours, the Module 7 code is where we'll START. This Module 8 codelab walks you through each step, and when you're done, your code should resemble the code at the FINISH point (including an optional port from Python 2 to 3).

The directory of Module 7 files (yours or ours) should look like this:

$ ls
README.md               appengine_config.py     requirements.txt
app.yaml                main.py                 templates

If you completed the Module 7 tutorial, you'll also have a lib folder with Flask and its dependencies.

3. (Re)Deploy Module 7 app

Your remaining prework steps to execute now:

  1. Re-familiarize yourself with the gcloud command-line tool (if necessary)
  2. (Re)deploy the Module 7 code to App Engine (if necessary; see the commands sketched below)
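
If it's been a while, the basic deploy-and-verify flow looks roughly like this; a minimal sketch that assumes the gcloud CLI is already initialized and pointed at your project:

$ gcloud app deploy     # deploy from the Module 7 folder containing app.yaml
$ gcloud app browse     # open the deployed app in your browser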

Once you've successfully executed those steps and confirmed the app is operational, we'll move ahead in this tutorial, starting with the configuration files.

4. Update configuration files

requirements.txt

The requirements.txt from Module 7 only lists Flask as a required package. Cloud NDB and Cloud Tasks have their own client libraries, so in this step, add those packages to requirements.txt so it looks like this:

Flask==1.1.2
google-cloud-ndb==1.7.1
google-cloud-tasks==1.5.0

We recommend using the latest versions of each library; the version numbers above are the latest for Python 2 at the time of this writing. (The Python 3 equivalents of these packages will likely be at higher versions.) The code in the FINISH repo folder is updated more frequently and may list newer releases, although that is less likely for Python 2 libraries, which are generally frozen.
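
Because Python 2 App Engine apps vendor third-party packages into the lib folder (tied in by appengine_config.py below), refresh that folder after editing requirements.txt. A minimal sketch, assuming pip is set up for Python 2 on your machine:

$ pip install -t lib -r requirements.txt    # re-vendor dependencies into lib/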

app.yaml

Reference the grpcio and setuptools built-in libraries in app.yaml in a libraries section:

libraries:
- name: grpcio
  version: 1.0.0
- name: setuptools
  version: 36.6.0
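
For context, a minimal sketch of the complete app.yaml after this change might look like the following; the runtime, threadsafe, and handlers entries are assumptions based on a typical Python 2 Flask app routed through main.app, so keep whatever your Module 7 app.yaml already has and only add the libraries section:

runtime: python27
threadsafe: yes
api_version: 1

handlers:
- url: /.*
  script: main.app

libraries:
- name: grpcio
  version: 1.0.0
- name: setuptools
  version: 36.6.0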

appengine_config.py

Update appengine_config.py to use pkg_resources to tie those built-in libraries to the copied 3rd-party libraries like Flask and the Google Cloud client libraries:

import pkg_resources
from google.appengine.ext import vendor

# Set PATH to your libraries folder.
PATH = 'lib'
# Add libraries installed in the PATH folder.
vendor.add(PATH)
# Add libraries to pkg_resources working set to find the distribution.
pkg_resources.working_set.add_entry(PATH)

5. Update application files

There is only one application file, main.py, so all changes in this section affect just that file.

Update imports and initialization

Our app is currently using the built-in google.appengine.api.taskqueue and google.appengine.ext.ndb libraries:

  • BEFORE:
from datetime import datetime
import logging
import time
from flask import Flask, render_template, request
from google.appengine.api import taskqueue
from google.appengine.ext import ndb

Replace both with google.cloud.ndb and google.cloud.tasks. Furthermore, Cloud Tasks requires you to JSON-encode the task's payload, so also import json. When you're done, here's what the import section of main.py should look like:

  • AFTER:
from datetime import datetime
import json
import logging
import time
from flask import Flask, render_template, request
from google.cloud import ndb, tasks

Migrate to Cloud Tasks (and Cloud NDB)

  • BEFORE:
def store_visit(remote_addr, user_agent):
    'create new Visit entity in Datastore'
    Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()

There is no change to store_visit() other than what you did in Module 2: add a context manager around all Datastore access. Here, that means wrapping the creation of the new Visit entity in a with statement.

  • AFTER:
def store_visit(remote_addr, user_agent):
    'create new Visit entity in Datastore'
    with ds_client.context():
        Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()

Cloud Tasks currently requires App Engine to be enabled for your Google Cloud project in order for you to use it (even if you don't have any App Engine code); otherwise, task queues will not function. (See this section in the docs for more information.) Cloud Tasks supports tasks running on App Engine (App Engine "targets"), but tasks can also run against any HTTP endpoint (HTTP targets) with a public IP address, such as Cloud Functions, Cloud Run, GKE, Compute Engine, or even an on-prem web server. Our simple app uses an App Engine target for its tasks.
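
If you're working in a project where App Engine or the Cloud Tasks API isn't enabled yet, the commands look roughly like this; this is a minimal sketch, and the region shown is an assumption you should replace with your own:

$ gcloud app create --region=us-central               # one-time: enable App Engine for the project
$ gcloud services enable cloudtasks.googleapis.com    # enable the Cloud Tasks API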

Some setup is needed to use Cloud NDB and Cloud Tasks. At the top of main.py under Flask initialization, initialize Cloud NDB and Cloud Tasks. Also define some constants that indicate where your push tasks will execute.

app = Flask(__name__)
ds_client = ndb.Client()
ts_client = tasks.CloudTasksClient()

PROJECT_ID = 'PROJECT_ID'  # replace w/your own
REGION = 'REGION'    # replace w/your own
QUEUE_NAME = 'default'     # replace w/your own if desired
QUEUE_PATH = ts_client.queue_path(PROJECT_ID, REGION, QUEUE_NAME)

Once you've created your task queue, fill in your project's PROJECT_ID, the REGION where your tasks will run (it should be the same as your App Engine region), and the name of your push queue. App Engine features a "default" queue, so we'll use that name (but you don't have to).

The default queue is special and created automatically under certain circumstances, one of which is when using App Engine APIs, so if you (re)use the same project as Module 7, default will already exist. However if you created a new project specifically for Module 8, you'll need to create default manually. More info on the default queue can be found in the queue.yaml documentation.
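
If you do need to create the queue yourself, one way to do so is with the gcloud CLI; a minimal sketch (you can also pick a different queue name, as long as QUEUE_NAME matches):

$ gcloud tasks queues create default      # create the push queue this app uses
$ gcloud tasks queues describe default    # confirm it exists and inspect its settings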

The purpose of ts_client.queue_path() is to create the task queue's "fully-qualified path name" (QUEUE_PATH), a string of the form projects/PROJECT_ID/locations/REGION/queues/QUEUE_NAME, which is needed when creating a task. Also needed is a JSON structure specifying task parameters:

task = {
    'app_engine_http_request': {
        'relative_uri': '/trim',
        'body': json.dumps({'oldest': oldest}).encode(),
        'headers': {
            'Content-Type': 'application/json',
        },
    }
}

What are you looking at above?

  1. Supply the task target information:
    • For App Engine targets, specify app_engine_http_request as the request type, where relative_uri is the App Engine task handler's route.
    • For HTTP targets, use http_request and url instead (see the sketch after this list).
  2. body: the JSON- and Unicode string-encoded parameter(s) to send to the (push) task
  3. Explicitly specify a Content-Type header of application/json

Refer to the documentation for more info on your options here.
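
For comparison, here is a hypothetical HTTP-target version of the same task; the URL is a placeholder you would replace with your own endpoint, and the http_method string assumes the dict-based request format shown above:

task = {
    'http_request': {
        'http_method': 'POST',               # HTTP targets must name the HTTP method
        'url': 'https://example.com/trim',   # placeholder endpoint for illustration only
        'body': json.dumps({'oldest': oldest}).encode(),
        'headers': {
            'Content-Type': 'application/json',
        },
    }
}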

With setup out of the way, let's update fetch_visits(). Here is what it looks like from the previous tutorial:

  • BEFORE:
def fetch_visits(limit):
    'get most recent visits and add task to delete older visits'
    data = Visit.query().order(-Visit.timestamp).fetch(limit)
    oldest = time.mktime(data[-1].timestamp.timetuple())
    oldest_str = time.ctime(oldest)
    logging.info('Delete entities older than %s' % oldest_str)
    taskqueue.add(url='/trim', params={'oldest': oldest})
    return (v.to_dict() for v in data), oldest_str

The required updates:

  1. Switch from App Engine ndb to Cloud NDB (wrap the Datastore query in the context manager)
  2. Build the task payload (the JSON structure described above) around the timestamp of the oldest visit displayed
  3. Use Cloud Tasks to create the new task instead of App Engine taskqueue

Here's what your new fetch_visits() should look like:

  • AFTER:
def fetch_visits(limit):
    'get most recent visits and add task to delete older visits'
    with ds_client.context():
        data = Visit.query().order(-Visit.timestamp).fetch(limit)
    oldest = time.mktime(data[-1].timestamp.timetuple())
    oldest_str = time.ctime(oldest)
    logging.info('Delete entities older than %s' % oldest_str)
    task = {
        'app_engine_http_request': {
            'relative_uri': '/trim',
            'body': json.dumps({'oldest': oldest}).encode(),
            'headers': {
                'Content-Type': 'application/json',
            },
        }
    }
    ts_client.create_task(parent=QUEUE_PATH, task=task)
    return (v.to_dict() for v in data), oldest_str

Summarizing the code update:

  • Switching to Cloud NDB means moving Datastore code inside a with statement
  • Switching to Cloud Tasks means calling ts_client.create_task() instead of taskqueue.add()
  • Pass in the queue's full path and task payload (described earlier)
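
Optionally, if you want to confirm in your logs that tasks are being enqueued, note that create_task() returns the created task object; logging its name is a minimal sketch, not part of the official sample:

response = ts_client.create_task(parent=QUEUE_PATH, task=task)
logging.info('Created task %s' % response.name)  # e.g. projects/.../queues/default/tasks/TASK_ID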

Update (push) task handler

There are very few changes that need to be made to the (push) task handler function.

  • BEFORE:
@app.route('/trim', methods=['POST'])
def trim():
    '(push) task queue handler to delete oldest visits'
    oldest = request.form.get('oldest', type=float)
    keys = Visit.query(
            Visit.timestamp < datetime.fromtimestamp(oldest)
    ).fetch(keys_only=True)
    nkeys = len(keys)
    if nkeys:
        logging.info('Deleting %d entities: %s' % (
                nkeys, ', '.join(str(k.id()) for k in keys)))
        ndb.delete_multi(keys)
    else:
        logging.info('No entities older than: %s' % time.ctime(oldest))
    return ''   # need to return SOME string w/200

Two small changes are needed: place all Datastore access (both the query and the delete request) inside the context manager's with statement, and read oldest from the task's JSON-encoded payload rather than from form fields. With this in mind, update your trim() handler like this:

  • AFTER:
@app.route('/trim', methods=['POST'])
def trim():
    '(push) task queue handler to delete oldest visits'
    oldest = float(request.get_json().get('oldest'))
    with ds_client.context():
        keys = Visit.query(
                Visit.timestamp < datetime.fromtimestamp(oldest)
        ).fetch(keys_only=True)
        nkeys = len(keys)
        if nkeys:
            logging.info('Deleting %d entities: %s' % (
                    nkeys, ', '.join(str(k.id()) for k in keys)))
            ndb.delete_multi(keys)
        else:
            logging.info('No entities older than: %s' % time.ctime(oldest))
    return ''   # need to return SOME string w/200

There are no changes to templates/index.html in this codelab nor in the next one.

6. Summary/Cleanup

Deploy application

Double-check all your changes, make sure the code runs without errors, and redeploy. Confirm the app (still) works; you should see output identical to Module 7's. You've only rewired things under the hood, so everything should still work as expected.

If you jumped into this tutorial without doing the Module 7 codelab: the app itself doesn't change. It registers every visit to the main web page (/), and once you've visited the site enough times, it looks like the screenshot below and reports that it has deleted all visits older than the tenth most recent:

Module 7 visitme app

That concludes this codelab. Your code should now match what's in the Module 8 repo. Congratulations on completing the most critical of the push task migrations! Module 9 (codelab link below) is optional; it helps users move to Python 3 and Cloud Datastore.

Optional: Clean up

What about cleaning up to avoid being billed until you're ready to move on to the next migration codelab? As existing App Engine developers, you're likely already up to speed on App Engine's pricing information.

Optional: Disable app

If you're not ready to go on to the next tutorial yet, disable your app to avoid incurring charges. When you're ready to move on to the next codelab, you can re-enable it. While your app is disabled, it won't get any traffic and therefore won't incur charges. However, you can still be billed for Datastore usage that exceeds the free quota, so delete enough data to stay under that limit.

On the other hand, if you're not going to continue with the migrations and want to delete everything completely, you can shut down your project.

Next steps

Beyond this tutorial, the next step is Module 9 and its codelab, which covers porting to Python 3. That step is optional, as not everyone is ready to make that move yet. There's also an optional port from Cloud NDB to Cloud Datastore; that one is only for those who want to move off NDB entirely and consolidate on code that uses Cloud Datastore, and it is identical to the Module 3 migration codelab.

  • Module 9 Migrate from Python 2 to 3 and Cloud NDB to Cloud Datastore
    • Optional migration module porting to Python 3
    • Also includes optional migration from Cloud NDB to Cloud Datastore (same as Module 3), and
    • A minor migration from Cloud Tasks v1 to v2 (as its client library is frozen for Python 2)
  • Module 4: Migrate to Cloud Run with Docker
    • Containerize your app to run on Cloud Run with Docker
    • This migration allows you to stay on Python 2.
  • Module 5: Migrate to Cloud Run with Cloud Buildpacks
    • Containerize your app to run on Cloud Run with Cloud Buildpacks
    • You do not need to know anything about Docker, containers, or Dockerfiles.
    • Requires your app to have already migrated to Python 3 (Buildpacks doesn't support Python 2)
  • Module 6: Migrate to Cloud Firestore
    • Migrate to Cloud Firestore to access Firebase features
    • While Cloud Firestore supports Python 2, this codelab is available only in Python 3.

7. Additional resources

App Engine migration module codelabs issues/feedback

If you find any issues with this codelab, please search for your issue first before filing. Links to search and create new issues:

Migration resources

Links to the repo folders for Module 7 (START) and Module 8 (FINISH) can be found in the table below. They can also be accessed from the repo for all App Engine codelab migrations, which you can clone or download as a ZIP file.

Codelab      Python 2    Python 3
Module 7     code        (n/a)
Module 8     code        (n/a)

App Engine resources

Below are additional resources regarding this specific migration: