1. Overview
The Serverless Migration Station series of codelabs (self-paced, hands-on tutorials) and related videos aim to help Google Cloud serverless developers modernize their applications by guiding them through one or more migrations, primarily moving away from legacy services. Doing so makes your apps more portable and gives you more options and flexibility, enabling you to integrate with and access a wider range of Cloud products and more easily upgrade to newer language releases. While initially focusing on the earliest Cloud users, primarily App Engine (standard environment) developers, this series is broad enough to include other serverless platforms like Cloud Functions and Cloud Run, or elsewhere if applicable.
The purpose of this codelab is to port the Module 8 sample app to Python 3 as well as switch Datastore (Cloud Firestore in Datastore mode) access from using Cloud NDB to the native Cloud Datastore client library and upgrade to the latest version of the Cloud Tasks client library.
We added use of Task Queue for push tasks in Module 7, then migrated that usage to Cloud Tasks in Module 8. Here in Module 9, we continue on to Python 3 and Cloud Datastore. Those using Task Queues for pull tasks will migrate to Cloud Pub/Sub and should refer to Modules 18-19 instead.
You'll learn how to
- Port the Module 8 sample app to Python 3
- Switch Datastore access from Cloud NDB to Cloud Datastore client libraries
- Upgrade to the latest Cloud Tasks client library version
What you'll need
- A Google Cloud Platform project with an active GCP billing account
- Basic Python skills
- Working knowledge of common Linux commands
- Basic knowledge of developing and deploying App Engine apps
- A working Module 8 App Engine app: complete the Module 8 codelab (recommended) or copy the Module 8 app from the repo
2. Background
Module 7 demonstrates how to use App Engine Task Queue push tasks in Python 2 Flask App Engine apps. In Module 8, you migrate that app from Task Queue to Cloud Tasks. Here in Module 9, you continue that journey and port that app to Python 3 as well as switch Datastore access from using Cloud NDB to the native Cloud Datastore client library.
Since Cloud NDB works for both Python 2 and 3, it suffices for App Engine users porting their apps from Python 2 to 3. An additional migration of client libraries to Cloud Datastore is completely optional, and there is only one reason to consider it: you have non-App Engine apps (and/or Python 3 App Engine apps) already using the Cloud Datastore client library and want to consolidate your codebase to accessing Datastore with just one client library. Cloud NDB was created specifically for Python 2 App Engine developers as a Python 3 migration tool, so if you don't already have code using the Cloud Datastore client library, you don't need to consider this migration.
Finally, the development of the Cloud Tasks client library continues only in Python 3, so we are "migrating" from one of the final Python 2 versions to its Python 3 contemporary. Fortunately, there are no breaking changes from Python 2, meaning that there's nothing else you need to do here.
This tutorial features the following steps:
- Setup/Prework
- Update configuration
- Modify application code
3. Setup/Prework
This section explains how to:
- Set up your Cloud project
- Get baseline sample app
- (Re)Deploy and validate baseline app
These steps ensure you're starting with working code and that it's ready for migration to Cloud services.
1. Setup project
If you completed the Module 8 codelab, reuse that same project (and code). Alternatively, create a brand new project or reuse another existing project. Ensure the project has an active billing account and an enabled App Engine app. Find your project ID and keep it handy during this codelab, using it whenever you encounter the PROJECT_ID variable.
2. Get baseline sample app
One of the prerequisites is a working Module 8 App Engine app: complete the Module 8 codelab (recommended) or copy the Module 8 app from the repo. Whether you use yours or ours, the Module 8 code is where we'll begin ("START"). This codelab walks you through the migration, concluding with code that resembles what's in the Module 9 repo folder ("FINISH").
- START: Module 8 repo
- FINISH: Module 9 repo
- Entire repo (clone or download ZIP)
Regardless of which Module 8 app you use, the folder should look like the below, possibly with a lib folder as well:
$ ls
README.md       appengine_config.py     requirements.txt
app.yaml        main.py                 templates
3. (Re)Deploy and validate baseline app
Execute the following steps to deploy the Module 8 app:
- Delete the lib folder if there is one and run pip install -t lib -r requirements.txt to repopulate lib. You may need to use pip2 instead if you have both Python 2 and 3 installed on your development machine.
- Ensure you've installed and initialized the gcloud command-line tool and reviewed its usage.
- (optional) Set your Cloud project with gcloud config set project PROJECT_ID if you don't want to enter the PROJECT_ID with each gcloud command you issue.
- Deploy the sample app with gcloud app deploy
- Confirm the app runs as expected without issue. If you completed the Module 8 codelab, the app displays the top visitors along with the most recent visits (illustrated below). At the bottom is an indication of the older tasks which will be deleted.
4. Update configuration
requirements.txt
The new requirements.txt is nearly the same as the one for Module 8, with only one big change: replace google-cloud-ndb with google-cloud-datastore. Make this change so your requirements.txt file looks like this:
flask
google-cloud-datastore
google-cloud-tasks
This requirements.txt file doesn't feature any version numbers, meaning the latest versions are selected. If any incompatibilities arise, pinning version numbers to lock in known-working versions for an app is standard practice.
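As a rough sketch of the version-locking practice mentioned above, the helper below produces name==version lines for whatever versions happen to be installed in your local environment. The function name pinned_lines is hypothetical (it is not part of the codelab), and it relies only on the standard library's importlib.metadata module (Python 3.8 or newer). Packages that are not installed locally are left unpinned.

```python
from importlib import metadata


def pinned_lines(packages):
    """Return 'name==version' for locally installed packages, bare name otherwise."""
    lines = []
    for name in packages:
        try:
            lines.append('{}=={}'.format(name, metadata.version(name)))
        except metadata.PackageNotFoundError:
            # not installed in this environment: leave the entry unpinned
            lines.append(name)
    return lines


# The three packages listed in this module's requirements.txt
for line in pinned_lines(['flask', 'google-cloud-datastore', 'google-cloud-tasks']):
    print(line)
```

The printed lines could be pasted into requirements.txt once you have confirmed that combination of versions works for your app.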
app.yaml
Second generation App Engine runtimes support neither built-in 3rd-party libraries (as the Python 2 runtime did) nor copying of non-built-in libraries. The only requirement for 3rd-party packages is to list them in requirements.txt. As a result, the entire libraries section of app.yaml can be deleted.
Another update is that the Python 3 runtime requires use of web frameworks that do their own routing. As a result, all script handlers must be changed to auto. However, since every route would be set to auto, and there are no static files served from this sample app, there is no need for any handlers, so remove the entire handlers section as well.
The only thing needed in app.yaml is to set the runtime to a supported version of Python 3, say 3.10. Make this change so the new, abbreviated app.yaml is just this single line:
runtime: python310
Delete appengine_config.py and lib
Next generation App Engine runtimes revamp 3rd-party package usage:
- Built-in libraries are those vetted by Google and made available on App Engine servers, likely because they contain C/C++ code which developers aren't allowed to deploy to the cloud—these are no longer available in the 2nd generation runtimes.
- Copying non-built-in libraries (sometimes called "vendoring" or "self-bundling") is no longer needed in 2nd generation runtimes. Instead, they should be listed in requirements.txt where the build system automatically installs them on your behalf at deploy time.
As a result of those changes to 3rd-party package management, neither the appengine_config.py file nor the lib folder is needed, so delete them. In 2nd generation runtimes, App Engine automatically installs third-party packages listed in requirements.txt. Summarizing:
- No self-bundled or copied 3rd-party libraries; list them in requirements.txt
- No pip install into a lib folder, meaning no lib folder period
- No listing built-in 3rd-party libraries (thus no libraries section) in app.yaml; list them in requirements.txt
- No 3rd-party libraries to reference from your app means no appengine_config.py file
Listing all desired 3rd-party libraries in requirements.txt is the only developer requirement.
5. Update application files
There is only one application file, main.py, so all changes in this section affect just that file. Below is a "diffs" illustration of the overall changes that need to be made to refactor the existing code into the new app. Readers are not expected to read the code line-by-line, as its purpose is simply to give a pictorial overview of what's required in this refactor (but feel free to open in a new tab or download and zoom in if desired).
Update imports and initialization
The import section in main.py for Module 8 uses Cloud NDB and Cloud Tasks; it should look as follows:
BEFORE:
from datetime import datetime
import json
import logging
import time
from flask import Flask, render_template, request
import google.auth
from google.cloud import ndb, tasks
app = Flask(__name__)
ds_client = ndb.Client()
ts_client = tasks.CloudTasksClient()
Logging is simplified and enhanced in the second generation runtimes like Python 3:
- For comprehensive logging experience, use Cloud Logging
- For simple logging, just send to stdout (or stderr) via print()
- There's no need to use the Python logging module (so remove it)
As such, delete the import of logging and swap google.cloud.ndb with google.cloud.datastore. Similarly, change ds_client to point to a Datastore client instead of an NDB client. With these changes made, the top of your new app now looks like this:
AFTER:
from datetime import datetime
import json
import time
from flask import Flask, render_template, request
import google.auth
from google.cloud import datastore, tasks
app = Flask(__name__)
ds_client = datastore.Client()
ts_client = tasks.CloudTasksClient()
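To illustrate the simple logging option described above (sending messages to stdout or stderr via print()), here is a minimal sketch. The helper names log_info and log_error are hypothetical and not part of the sample app, which simply calls print() directly.

```python
import sys


def log_info(msg):
    """Send an informational message to stdout."""
    print(msg)


def log_error(msg):
    """Send an error message to stderr."""
    print(msg, file=sys.stderr)


log_info('Delete entities older than some timestamp')
log_error('something went wrong')
```

Wrapping print() this way is optional; it only makes it easier to swap in Cloud Logging later if you need the more comprehensive experience.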
Migrate to Cloud Datastore
Now it's time to replace NDB client library usage with Datastore. Both App Engine NDB and Cloud NDB require a data model (class); for this app, it's Visit. The store_visit() function works the same in all other migration modules: it registers a visit by creating a new Visit record, saving a visiting client's IP address and user agent (browser type).
BEFORE:
class Visit(ndb.Model):
    'Visit entity registers visitor IP address & timestamp'
    visitor   = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)

def store_visit(remote_addr, user_agent):
    'create new Visit entity in Datastore'
    with ds_client.context():
        Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()
However, Cloud Datastore does not use a data model class, so delete the class. Furthermore, Cloud Datastore does not automatically create a timestamp when records are created, so you must set it yourself; this is done with the datetime.now() call.
Without the data class, your modified store_visit() should look like this:
AFTER:
def store_visit(remote_addr, user_agent):
    'create new Visit entity in Datastore'
    entity = datastore.Entity(key=ds_client.key('Visit'))
    entity.update({
        'timestamp': datetime.now(),
        'visitor': '{}: {}'.format(remote_addr, user_agent),
    })
    ds_client.put(entity)
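Because the timestamp is now set by your own code rather than by the data model, it can help to see that step in isolation. The sketch below builds the same property dictionary that store_visit() passes to entity.update(), without touching Datastore. The helper name visit_properties and its optional now parameter are hypothetical additions for illustration only.

```python
from datetime import datetime


def visit_properties(remote_addr, user_agent, now=None):
    """Build the properties of a new Visit, stamping the time ourselves."""
    return {
        # Cloud NDB's auto_now_add=True used to fill this in automatically
        'timestamp': now or datetime.now(),
        'visitor': '{}: {}'.format(remote_addr, user_agent),
    }


props = visit_properties('127.0.0.1', 'curl/8.0')
print(props['visitor'], props['timestamp'])
```

Accepting an explicit now value is one possible design choice that makes the function easy to test with a fixed time.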
The key function is fetch_visits(). Not only does it perform the original query for the latest Visits, but it also grabs the timestamp of the last Visit displayed and creates the push task that calls /trim (thus trim()) to mass-delete the old Visits. Here it is using Cloud NDB:
BEFORE:
def fetch_visits(limit):
    'get most recent visits & add task to delete older visits'
    with ds_client.context():
        data = Visit.query().order(-Visit.timestamp).fetch(limit)
    oldest = time.mktime(data[-1].timestamp.timetuple())
    oldest_str = time.ctime(oldest)
    logging.info('Delete entities older than %s' % oldest_str)
    task = {
        'app_engine_http_request': {
            'relative_uri': '/trim',
            'body': json.dumps({'oldest': oldest}).encode(),
            'headers': {
                'Content-Type': 'application/json',
            },
        }
    }
    ts_client.create_task(parent=QUEUE_PATH, task=task)
    return (v.to_dict() for v in data), oldest_str
The primary changes:
- Swap out the Cloud NDB query for the Cloud Datastore equivalent; the query styles differ slightly.
- Datastore neither requires use of a context manager nor makes you extract its data (with to_dict()) like Cloud NDB does.
- Replace logging calls with print()
After those changes, fetch_visits() looks like this:
AFTER:
def fetch_visits(limit):
    'get most recent visits & add task to delete older visits'
    query = ds_client.query(kind='Visit')
    query.order = ['-timestamp']
    visits = list(query.fetch(limit=limit))
    oldest = time.mktime(visits[-1]['timestamp'].timetuple())
    oldest_str = time.ctime(oldest)
    print('Delete entities older than %s' % oldest_str)
    task = {
        'app_engine_http_request': {
            'relative_uri': '/trim',
            'body': json.dumps({'oldest': oldest}).encode(),
            'headers': {
                'Content-Type': 'application/json',
            },
        }
    }
    ts_client.create_task(parent=QUEUE_PATH, task=task)
    return visits, oldest_str
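The task payload is the only link between fetch_visits() and the /trim handler, so it may help to see that round trip on its own. The sketch below builds the same task dictionary and then decodes its body the way a handler could. The helper name make_trim_task and the sample timestamp value are hypothetical; no Cloud Tasks call is made here.

```python
import json


def make_trim_task(oldest):
    """Build the App Engine HTTP request task that POSTs to /trim."""
    return {
        'app_engine_http_request': {
            'relative_uri': '/trim',
            # the body must be bytes, hence the encode()
            'body': json.dumps({'oldest': oldest}).encode(),
            'headers': {'Content-Type': 'application/json'},
        }
    }


task = make_trim_task(1700000000.0)
body = task['app_engine_http_request']['body']
payload = json.loads(body.decode())
print(type(body).__name__, float(payload['oldest']))
```

The Content-Type header is what lets the handler read the value back with request.get_json().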
This would normally be all that's necessary. Unfortunately there's one major issue.
(Possibly) Create a new (push) queue
In Module 7, we added use of App Engine taskqueue to the existing Module 1 app. One key benefit of having push tasks as a legacy App Engine feature is that a "default" queue is automatically created. When that app was migrated to Cloud Tasks in Module 8, that default queue was already there, so we still didn't need to be concerned about it then. That changes here in Module 9.
One critical aspect to consider is that the new App Engine application no longer uses App Engine services, and as such, you can no longer assume that App Engine automatically creates a task queue in a different product (Cloud Tasks). As written, creating a task in fetch_visits() (for a non-existing queue) will fail. A new function is needed to check whether the ("default") queue exists, and if not, create one.
Call this function _create_queue_if(), and add it to your application just above fetch_visits() because that is where it is called. The body of the function to add:
def _create_queue_if():
    'app-internal function creating default queue if it does not exist'
    try:
        ts_client.get_queue(name=QUEUE_PATH)
    except Exception as e:
        if 'does not exist' in str(e):
            ts_client.create_queue(parent=PATH_PREFIX,
                    queue={'name': QUEUE_PATH})
    return True
The Cloud Tasks create_queue() function requires the parent path of the queue, that is, its full pathname minus the queue name. For simplicity, create another variable PATH_PREFIX representing the QUEUE_PATH minus the queue name (QUEUE_PATH.rsplit('/', 2)[0]). Add its definition near the top so the code block with all the constant assignments looks like this:
_, PROJECT_ID = google.auth.default()
REGION_ID = 'REGION_ID' # replace w/your own
QUEUE_NAME = 'default' # replace w/your own
QUEUE_PATH = ts_client.queue_path(PROJECT_ID, REGION_ID, QUEUE_NAME)
PATH_PREFIX = QUEUE_PATH.rsplit('/', 2)[0]
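To see what the rsplit() call does, the sketch below applies it to a sample queue path. The queue_path function here is a hypothetical stand-in that mirrors the projects/.../locations/.../queues/... resource-name format, and the project and region values are placeholders, not values from the codelab.

```python
def queue_path(project, location, queue):
    """Stand-in mirroring the resource-name format of a Cloud Tasks queue."""
    return 'projects/{}/locations/{}/queues/{}'.format(project, location, queue)


QUEUE_PATH = queue_path('my-project', 'us-central1', 'default')

# Split off the last two components ('queues' and the queue name)
PATH_PREFIX = QUEUE_PATH.rsplit('/', 2)[0]

print(QUEUE_PATH)
print(PATH_PREFIX)
```

Splitting from the right with a limit of 2 removes exactly the trailing queues/NAME portion, leaving the parent location path.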
Now modify the last line in fetch_visits() to use _create_queue_if(), first creating the queue if necessary, then creating the task afterwards:
    if _create_queue_if():
        ts_client.create_task(parent=QUEUE_PATH, task=task)
    return visits, oldest_str
Both _create_queue_if() and fetch_visits() should now look like this in aggregate:
def _create_queue_if():
    'app-internal function creating default queue if it does not exist'
    try:
        ts_client.get_queue(name=QUEUE_PATH)
    except Exception as e:
        if 'does not exist' in str(e):
            ts_client.create_queue(parent=PATH_PREFIX,
                    queue={'name': QUEUE_PATH})
    return True

def fetch_visits(limit):
    'get most recent visits & add task to delete older visits'
    query = ds_client.query(kind='Visit')
    query.order = ['-timestamp']
    visits = list(query.fetch(limit=limit))
    oldest = time.mktime(visits[-1]['timestamp'].timetuple())
    oldest_str = time.ctime(oldest)
    print('Delete entities older than %s' % oldest_str)
    task = {
        'app_engine_http_request': {
            'relative_uri': '/trim',
            'body': json.dumps({'oldest': oldest}).encode(),
            'headers': {
                'Content-Type': 'application/json',
            },
        }
    }
    if _create_queue_if():
        ts_client.create_task(parent=QUEUE_PATH, task=task)
    return visits, oldest_str
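The check-then-create logic can be tried out without a Cloud project by substituting a fake client. In the sketch below, FakeTasksClient is a hypothetical in-memory stand-in (not the real Cloud Tasks client), and the error text it raises is an assumption chosen to match the 'does not exist' string the codelab function looks for. The function takes the client and paths as parameters here purely so it can run on its own.

```python
class FakeTasksClient:
    """Hypothetical in-memory stand-in for the Cloud Tasks client."""

    def __init__(self):
        self.queues = set()
        self.create_calls = 0

    def get_queue(self, name):
        if name not in self.queues:
            # assumed wording; the real service's message may differ
            raise Exception('Queue does not exist: ' + name)
        return {'name': name}

    def create_queue(self, parent, queue):
        self.create_calls += 1
        self.queues.add(queue['name'])
        return queue


def create_queue_if(client, queue_path, path_prefix):
    """Create the queue only when looking it up fails with 'does not exist'."""
    try:
        client.get_queue(name=queue_path)
    except Exception as e:
        if 'does not exist' in str(e):
            client.create_queue(parent=path_prefix, queue={'name': queue_path})
    return True


client = FakeTasksClient()
path = 'projects/my-project/locations/us-central1/queues/default'
prefix = path.rsplit('/', 2)[0]
create_queue_if(client, path, prefix)  # first call creates the queue
create_queue_if(client, path, prefix)  # second call finds it, creates nothing
print(sorted(client.queues), client.create_calls)
```

Note that, as in the codelab version, any other exception is silently ignored and the function still returns True, so a failure unrelated to a missing queue would only surface later when the task is created.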
Other than having to add this extra code, the rest of the Cloud Tasks code is mostly intact from Module 8. The final piece of code to look at is the task handler.
Update (push) task handler
In the task handler, trim(), the Cloud NDB code queries for visits older than the oldest displayed. It uses a keys-only query to speed things up: why fetch all the data if you only need the Visit IDs? Once you have all the visit IDs, delete them all in a batch with Cloud NDB's delete_multi() function.
BEFORE:
@app.route('/trim', methods=['POST'])
def trim():
    '(push) task queue handler to delete oldest visits'
    oldest = float(request.get_json().get('oldest'))
    with ds_client.context():
        keys = Visit.query(
                Visit.timestamp < datetime.fromtimestamp(oldest)
        ).fetch(keys_only=True)
        nkeys = len(keys)
        if nkeys:
            logging.info('Deleting %d entities: %s' % (
                    nkeys, ', '.join(str(k.id()) for k in keys)))
            ndb.delete_multi(keys)
        else:
            logging.info(
                    'No entities older than: %s' % time.ctime(oldest))
    return ''   # need to return SOME string w/200
Like fetch_visits(), the bulk of the changes involve swapping out Cloud NDB code for Cloud Datastore, tweaking the query styles, removing use of its context manager, and changing the logging calls to print().
AFTER:
@app.route('/trim', methods=['POST'])
def trim():
    '(push) task queue handler to delete oldest visits'
    oldest = float(request.get_json().get('oldest'))
    query = ds_client.query(kind='Visit')
    query.add_filter('timestamp', '<', datetime.fromtimestamp(oldest))
    query.keys_only()
    keys = list(visit.key for visit in query.fetch())
    nkeys = len(keys)
    if nkeys:
        print('Deleting %d entities: %s' % (
                nkeys, ', '.join(str(k.id) for k in keys)))
        ds_client.delete_multi(keys)
    else:
        print('No entities older than: %s' % time.ctime(oldest))
    return ''   # need to return SOME string w/200
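The cutoff passed between fetch_visits() and trim() goes through time.mktime() and datetime.fromtimestamp(), which drops microseconds along the way. The sketch below traces that round trip in plain Python to show which timestamps fall before the cutoff. It assumes naive local datetimes as stand-ins for entity timestamps and uses an in-memory list instead of a Datastore query, so it only approximates what the handler does.

```python
from datetime import datetime
import time

# timestamp of the oldest visit still displayed (sample value)
shown_oldest = datetime(2023, 6, 15, 12, 30, 45, 123456)

# what fetch_visits() computes and places in the task body
oldest = time.mktime(shown_oldest.timetuple())

# what trim() reconstructs from the payload
cutoff = datetime.fromtimestamp(oldest)

visits = [
    datetime(2023, 6, 15, 12, 30, 44),  # older than the cutoff
    shown_oldest,                       # the oldest visit displayed
    datetime(2023, 6, 15, 12, 31, 0),   # newer
]

# in-memory analog of the 'timestamp < cutoff' filter
to_delete = [ts for ts in visits if ts < cutoff]
print(cutoff, len(to_delete))
```

Because the cutoff is truncated to whole seconds, it is never later than the displayed visit's own timestamp, so that visit is not selected for deletion.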
There are no changes to the main application handler root().
Port to Python 3
This sample app was designed to run on both Python 2 and 3. Any Python 3-specific changes were covered earlier in relevant sections of this tutorial. There are no additional steps nor compatibility libraries required.
Cloud Tasks update
The final version of the Cloud Tasks client library supporting Python 2 is 1.5.0. At the time of this writing, the latest version of the client library for Python 3 is fully compatible with that version, thus no further updates are required.
HTML template update
No changes are needed in the HTML template file, templates/index.html, either, so this wraps up all the necessary changes to arrive at the Module 9 app.
6. Summary/Cleanup
Deploy and verify application
Once you've completed the code updates, mainly the port to Python 3, deploy your app with gcloud app deploy. The output should be identical to that of the Module 7 and 8 apps, except that you've moved the database access to the Cloud Datastore client library and have upgraded to Python 3.
This step completes the codelab. We invite you to compare your code to what's in the Module 9 folder. Congratulations!
Clean up
General
If you are done for now, we recommend you disable your App Engine app to avoid incurring billing. However if you wish to test or experiment some more, the App Engine platform has a free quota, and so as long as you don't exceed that usage tier, you shouldn't be charged. That's for compute, but there may also be charges for relevant App Engine services, so check its pricing page for more information. If this migration involves other Cloud services, those are billed separately. In either case, if applicable, see the "Specific to this codelab" section below.
For full disclosure, deploying to a Google Cloud serverless compute platform like App Engine incurs minor build and storage costs. Cloud Build has its own free quota as does Cloud Storage. Storage of that image uses up some of that quota. However, you might live in a region that does not have such a free tier, so be aware of your storage usage to minimize potential costs. Specific Cloud Storage "folders" you should review include:
console.cloud.google.com/storage/browser/LOC.artifacts.PROJECT_ID.appspot.com/containers/images
console.cloud.google.com/storage/browser/staging.PROJECT_ID.appspot.com
- The storage links above depend on your PROJECT_ID and LOCation, for example, "us" if your app is hosted in the USA.
On the other hand, if you're not going to continue with this application or other related migration codelabs and want to delete everything completely, shut down your project.
Specific to this codelab
The services listed below are unique to this codelab. Refer to each product's documentation for more information:
- Cloud Tasks has a free tier; see its pricing page for more details.
- The App Engine Datastore service is provided by Cloud Datastore (Cloud Firestore in Datastore mode) which also has a free tier; see its pricing page for more information.
Next steps
This concludes our migration from App Engine Task Queue push tasks to Cloud Tasks. The optional migration from Cloud NDB to Cloud Datastore is also covered on its own (without Task Queue or Cloud Tasks) in Module 3. In addition to Module 3, other migration modules that focus on moving away from App Engine legacy bundled services include:
- Module 2: migrate from App Engine NDB to Cloud NDB
- Module 3: migrate from Cloud NDB to Cloud Datastore
- Modules 12-13: migrate from App Engine Memcache to Cloud Memorystore
- Modules 15-16: migrate from App Engine Blobstore to Cloud Storage
- Modules 18-19: App Engine Task Queue (pull tasks) to Cloud Pub/Sub
App Engine is no longer the only serverless platform in Google Cloud. If you have a small App Engine app or one that has limited functionality and wish to turn it into a standalone microservice, or you want to break-up a monolithic app into multiple reusable components, these are good reasons to consider moving to Cloud Functions. If containerization has become part of your application development workflow, particularly if it consists of a CI/CD (continuous integration/continuous delivery or deployment) pipeline, consider migrating to Cloud Run. These scenarios are covered by the following modules:
- Migrate from App Engine to Cloud Functions: see Module 11
- Migrate from App Engine to Cloud Run: see Module 4 to containerize your app with Docker, or Module 5 to do it without containers, Docker knowledge, or Dockerfiles
Switching to another serverless platform is optional, and we recommend considering the best options for your apps and use cases before making any changes.
Regardless of which migration module you consider next, all Serverless Migration Station content (codelabs, videos, source code [when available]) can be accessed at its open source repo. The repo's README also provides guidance on which migrations to consider and any relevant "order" of Migration Modules.
7. Additional resources
Codelabs issues/feedback
If you find any issues with this codelab, please search for your issue first before filing. Links to search and create new issues:
Migration resources
Links to the repo folders for Module 8 (START) and Module 9 (FINISH) can be found in the table below. They can also be accessed from the repo for all App Engine codelab migrations, which you can clone or download as a ZIP file.
Codelab | Python 2 | Python 3
Module 8 | code | (n/a)
Module 9 | (n/a) | code
Online resources
Below are online resources which may be relevant for this tutorial:
App Engine
- App Engine documentation
- Python 2 App Engine (standard environment) runtime
- Python 3 App Engine (standard environment) runtime
- Differences between Python 2 & 3 App Engine (standard environment) runtimes
- Python 2 to 3 App Engine (standard environment) migration guide
- App Engine pricing and quotas information
Cloud NDB
Cloud Datastore
Cloud Tasks
Other Cloud information
- Python on Google Cloud Platform
- Google Cloud Python client libraries
- Google Cloud "Always Free" tier
- Google Cloud SDK (gcloud command-line tool)
- All Google Cloud documentation
License
This work is licensed under a Creative Commons Attribution 2.0 Generic License.