While Cloud NDB is a great Datastore solution for long-time App Engine developers and helps with transitioning to Python 3, it is not the only way App Engine developers can access Datastore.
When App Engine's Datastore became its own product, Google Cloud Datastore, in 2013, a new client library was created so that all users could access Datastore.
Python 3 App Engine developers, as well as non-App Engine developers, are directed to use the Cloud Datastore (not Cloud NDB) client library. Python 2 App Engine developers are encouraged to migrate from ndb to Cloud NDB and port to Python 3 from there.
If you have (App Engine or non-App Engine) apps using Cloud Datastore, there are several reasons to consider migrating your Cloud NDB apps to Cloud Datastore. Moving to Cloud Datastore:
- Allows developers to focus on a single codebase for Datastore access
- Avoids maintaining some code using Cloud NDB & others using Cloud Datastore
- Brings more consistency to the codebase and better code reusability
- Lowers overall maintenance cost through common/shared libraries
You'll learn how to
- Use Cloud NDB (if you're unfamiliar with it)
- Migrate from Cloud NDB to Cloud Datastore
- Further migrate your app to Python 3
What you'll need
- A Google Cloud Platform project with an active GCP billing account
- Basic Python skills
- Working knowledge of basic Linux commands
- Basic knowledge of developing & deploying App Engine apps
- A working Module 2 App Engine 2.x or 3.x app.
App Engine's Datastore has been the platform's bundled NoSQL data storage solution since its original launch in 2008. Since then, as mentioned above, Datastore has grown into its own product, Cloud Datastore, so developers can use it outside App Engine. In 2017, the next generation of Datastore launched, rebranded as Cloud Firestore to signal its feature integration with the Firebase real-time database. For backwards-compatibility reasons, Cloud Firestore operates in "Cloud Firestore in Datastore mode" when accessed from the Cloud NDB or Cloud Datastore client libraries.
After completing this migration, you can then:
- Migrate to Python 3 & the next-gen App Engine runtime
- Containerize your Python 2 (or 3) app and migrate to Cloud Run
- Add use of App Engine (push) task queues then migrate to Cloud Tasks
- Migrate to Cloud Firestore (using Firestore in native mode)
But let's migrate to Cloud Datastore first. This migration features these primary steps:
- Setup/Prework
- Replace Cloud NDB with Cloud Datastore client libraries
- Update application
Before we get going with the main part of the tutorial, let's set up our project, get the code, then deploy the baseline app so we know we're starting with working code.
1. Setup project
If you completed the Module 2 codelab, we recommend reusing that same project (and code). Alternatively, you can create a brand new project or reuse another existing project. Ensure the project has an active billing account and App Engine (app) is enabled.
2. "Get" baseline sample app
One of the prerequisites is to have a working Module 2 sample app. Here are some options on where to get it from:
- The working sample you created after completing the Module 2 codelab
- Doing the Module 2 codelab now before starting this one
- Copy the Module 2 repo (link below)
Whether you use yours or ours, the Module 2 code is where we'll START. This Module 3 codelab walks you through each step, and when complete, it should resemble code at the FINISH point. There are Python 2 and 3 versions of this tutorial, so grab the correct code repo below.
Python 2
- START: Module 2 code
- FINISH: Module 3 code
- Entire repo (to clone or download ZIP)
The directory of Python 2 Module 2 STARTing files (yours or ours) should look like this:
$ ls
README.md appengine_config.py requirements.txt
app.yaml main.py templates
If you completed the Module 2 tutorial, you'll also have a lib folder with Flask and its dependencies. If you don't have a lib folder, create it with the pip install -t lib -r requirements.txt command so that we can deploy this baseline app in the next step. If you have both Python 2 and 3 installed, we recommend using pip2 instead of pip to avoid confusion with Python 3.
Python 3
- START: Module 2 repo
- FINISH: Module 3 repo
- Entire repo (to clone or download ZIP)
The directory of Python 3 Module 2 STARTing files (yours or ours) should look like this:
$ ls
README.md main.py templates
app.yaml requirements.txt
Neither lib nor appengine_config.py is used for Python 3.
3. (Re)Deploy Module 2 app
Your remaining prework steps to execute now:
- Re-familiarize yourself with the gcloud command-line tool (if nec.)
- (Re)deploy the Module 2 code to App Engine (if nec.)
Once you've successfully executed those steps and confirmed the app is operational, we'll move ahead in this tutorial, starting with the configuration files.
The only configuration change is a minor package swap in your requirements.txt file.
1. Update requirements.txt
Upon completing Module 2, your requirements.txt file looked like this:
- BEFORE (Python 2 & 3):
Flask==1.1.2
google-cloud-ndb==1.7.1
Update requirements.txt by replacing the Cloud NDB library (google-cloud-ndb) with the latest version of the Cloud Datastore library (google-cloud-datastore), leaving the entry for Flask intact. Bear in mind that the final version of Cloud Datastore that's Python 2 compatible is 1.15.3:
- AFTER (Python 2):
Flask==1.1.2
google-cloud-datastore==1.15.3
- AFTER (Python 3):
Flask==1.1.2
google-cloud-datastore==2.1.0
Keep in mind that the repo is maintained more regularly than this tutorial, so the requirements.txt there may reflect newer versions. We recommend using the latest versions of each library, but if they don't work, you can roll back to an older release. The version numbers above were the latest when this codelab was last updated.
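If you maintain several apps, the package swap can also be scripted. The sketch below is only an illustration: swap_ndb_for_datastore and the PINS table are hypothetical names, not part of any Google tool, and the pinned versions are simply the ones quoted above.

```python
# Hypothetical helper: the function name and PINS table are illustrative only.
PINS = {2: '1.15.3', 3: '2.1.0'}  # versions quoted in this tutorial

def swap_ndb_for_datastore(lines, python_major):
    """Return requirement lines with google-cloud-ndb replaced by
    google-cloud-datastore, leaving every other entry untouched."""
    pin = 'google-cloud-datastore=={}'.format(PINS[python_major])
    out = []
    for line in lines:
        # Compare on the package name only, ignoring the pinned version.
        name = line.split('==')[0].strip().lower()
        out.append(pin if name == 'google-cloud-ndb' else line.strip())
    return out

before = ['Flask==1.1.2', 'google-cloud-ndb==1.7.1']
print(swap_ndb_for_datastore(before, 2))
print(swap_ndb_for_datastore(before, 3))
```

Adjust the pins if newer releases work for your app.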
2. Other configuration files
The other configuration files, app.yaml and appengine_config.py, should remain unchanged from the previous migration step:
- app.yaml should (still) reference the 3rd-party bundled packages grpcio and setuptools.
- appengine_config.py should (still) point pkg_resources and google.appengine.ext.vendor to the 3rd-party resources in lib.
Now let's move to the application files.
There are no changes to templates/index.html, but there are a few updates for main.py.
1. Imports
The starting code for the import section should look as follows:
- BEFORE:
from flask import Flask, render_template, request
from google.cloud import ndb
Replace the google.cloud.ndb import with one for Cloud Datastore: google.cloud.datastore. Because the Datastore client library does not support auto-creation of a timestamp field in an Entity, also import the standard library datetime module to create one manually. By convention, standard library imports go above third-party package imports. When you're done with these changes, it should look like this:
- AFTER:
from datetime import datetime
from flask import Flask, render_template, request
from google.cloud import datastore
2. Initialization and data model
After initializing Flask, the Module 2 sample app creates an NDB data model class and its fields:
- BEFORE:
app = Flask(__name__)
ds_client = ndb.Client()

class Visit(ndb.Model):
    visitor = ndb.StringProperty()
    timestamp = ndb.DateTimeProperty(auto_now_add=True)
The Cloud Datastore library does not have such a class, so delete the Visit class declaration. You still need a client to talk to Datastore, so change ndb.Client() to datastore.Client(). The Datastore library is more "flexible," allowing you to create Entities without "pre-declaring" their structure as NDB requires. After this update, this part of main.py should look like:
- AFTER:
app = Flask(__name__)
ds_client = datastore.Client()
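One consequence of deleting the Visit class is that nothing in the code declares which fields an entity should carry. If you want a light check before writing, something like the following sketch could stand in. It is an assumption on our part rather than part of the sample app: VISIT_FIELDS and validate_visit are hypothetical names, and the field list simply mirrors the old model.

```python
from datetime import datetime

# Hypothetical schema: field names and types mirror the removed Visit model.
VISIT_FIELDS = {'visitor': str, 'timestamp': datetime}

def validate_visit(record):
    """Raise ValueError if a dict is missing a field or has the wrong type."""
    for field, expected in VISIT_FIELDS.items():
        if field not in record:
            raise ValueError('missing field: {}'.format(field))
        if not isinstance(record[field], expected):
            raise ValueError('{} must be {}'.format(field, expected.__name__))
    return record

record = {'visitor': '127.0.0.1: curl', 'timestamp': datetime(2020, 1, 1)}
print(validate_visit(record) is record)
```

The check operates on a plain dict, so it can run before the values are copied into a Datastore entity.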
3. Datastore access
Migrating to Cloud Datastore requires changing how you create, store, and query Datastore entities (at the user-level). For your applications, the difficulty of this migration depends on how complex your Datastore code is. In our sample app, we attempted to make the update as straightforward as possible. Here is our starting code:
- BEFORE:
def store_visit(remote_addr, user_agent):
    with ds_client.context():
        Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()

def fetch_visits(limit):
    with ds_client.context():
        return (v.to_dict() for v in Visit.query().order(
                -Visit.timestamp).fetch_page(limit)[0])
With Cloud Datastore, create a generic entity, identifying grouped objects in your Entity with a "key". Create the data record with a JSON object (Python dict) of key-value pairs, then write it to Datastore with the expected put(). Querying is similar but more straightforward with Datastore. Here you can see how the equivalent Datastore code differs:
- AFTER:
def store_visit(remote_addr, user_agent):
    entity = datastore.Entity(key=ds_client.key('Visit'))
    entity.update({
        'timestamp': datetime.now(),
        'visitor': '{}: {}'.format(remote_addr, user_agent),
    })
    ds_client.put(entity)

def fetch_visits(limit):
    query = ds_client.query(kind='Visit')
    query.order = ['-timestamp']
    return query.fetch(limit=limit)
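To see what the new code asks of Datastore without contacting the service, here is a pure-Python sketch. build_visit and newest_first are hypothetical helpers, and the list of dicts is only a stand-in for stored entities; in the app itself the work is still done by ds_client.put() and query.fetch().

```python
from datetime import datetime, timedelta

def build_visit(remote_addr, user_agent, now=None):
    """Build the key-value payload that store_visit() passes to entity.update()."""
    return {
        'timestamp': now or datetime.now(),
        'visitor': '{}: {}'.format(remote_addr, user_agent),
    }

def newest_first(records, limit):
    """Emulate query.order = ['-timestamp'] followed by fetch(limit=limit)."""
    return sorted(records, key=lambda r: r['timestamp'], reverse=True)[:limit]

# Five fake visits, one minute apart (in-memory stand-ins, not real entities).
start = datetime(2020, 1, 1)
visits = [build_visit('10.0.0.{}'.format(i), 'curl', start + timedelta(minutes=i))
          for i in range(5)]
for visit in newest_first(visits, 2):
    print(visit['visitor'])
```

The leading minus sign in '-timestamp' means descending order, so the two most recent visits come back first.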
Update the function bodies for store_visit() and fetch_visits() as above, keeping their signatures identical to the previous version. There are no changes at all to the main handler root(). After completing these changes, your
Deploy application
Re-deploy your app with gcloud app deploy, and confirm the app works. Your code should now match what's in the Module 3 repo folders:
Congrats on completing this Module 3 codelab. You've just crossed the finish line, since this is the last of the strongly recommended migrations in this series as far as Datastore goes.
Optional: Clean up
What about cleaning up to avoid being billed until you're ready to move onto the next migration codelab? As existing developers, you're likely already up-to-speed on App Engine's pricing information.
Optional: Disable app
If you're not ready to go to the next tutorial yet, disable your app to avoid incurring charges. When you're ready to move onto the next codelab, you can re-enable it. While your app is disabled, it won't get any traffic to incur charges. You can still be billed for Datastore usage that exceeds the free quota, however, so delete enough data to fall under that limit.
On the other hand, if you're not going to continue with migrations and want to delete everything completely, you can shut down your project.
Next steps
From here, feel free to explore these next migration modules:
- Module 3 Bonus: Continue below to the bonus part of this tutorial to explore porting to Python 3 and the next generation App Engine runtime.
- Module 7: App Engine Push Task Queues (required if you use [push] Task Queues)
  - Adds App Engine taskqueue push tasks to Module 1 app
  - Prepares users for migrating to Cloud Tasks in Module 8
- Module 4: Migrate to Cloud Run with Docker
  - Containerize your app to run on Cloud Run with Docker
  - Allows you to stay on Python 2
- Module 5: Migrate to Cloud Run with Cloud Buildpacks
  - Containerize your app to run on Cloud Run with Cloud Buildpacks
  - Do not need to know anything about Docker, containers, or Dockerfiles
  - Requires you to have already migrated your app to Python 3
- Module 3:
  - Modernize Datastore access from Cloud NDB to Cloud Datastore
  - This is the library used for Python 3 App Engine apps and non-App Engine apps
- Module 6: Migrate to Cloud Firestore
  - Migrate to Cloud Firestore to access Firebase features
  - While Cloud Firestore supports Python 2, this codelab is available only in Python 3.
To access the latest App Engine runtime and features, we recommend that you migrate to Python 3. In our sample app, Datastore was the only built-in service we used, and since we've migrated from ndb to Cloud NDB, we can now port to App Engine's Python 3 runtime.
Overview
While porting to Python 3 is not within the scope of a Google Cloud tutorial, this part of the codelab gives developers an idea of how the Python 3 App Engine runtime differs. One outstanding feature of the next-gen runtime is simplified access to third-party packages: There's no need to specify built-in packages in app.yaml nor a requirement to copy or upload non-built-in libraries; they are implicitly installed from being listed in requirements.txt.
Because our sample is so basic and Cloud Datastore is Python 2-3 compatible, no application code needs to be explicitly ported to 3.x: The app runs on 2.x & 3.x unmodified, meaning the only required changes are in configuration in this case:
- Simplify app.yaml to reference Python 3 and remove reference to bundled 3rd-party libraries.
- Delete appengine_config.py and the lib folder as they're no longer necessary.
The main.py and templates/index.html application files remain unchanged.
Update requirements.txt
The final version of the Cloud Datastore library supporting Python 2 is 1.15.3. Update requirements.txt with the latest version for Python 3 (which may be newer by now). When this tutorial was written, the latest version was 2.1.0, so edit that line to look like this (or whatever the latest version is):
google-cloud-datastore==2.1.0
Simplify app.yaml
BEFORE:
The only real change for this sample app is to significantly shorten app.yaml. As a reminder, here's what we had in app.yaml at the conclusion of Module 3:
runtime: python27
threadsafe: yes
api_version: 1

handlers:
- url: /.*
  script: main.app

libraries:
- name: grpcio
  version: 1.0.0
- name: setuptools
  version: 36.6.0
AFTER:
In Python 3, the threadsafe, api_version, and libraries directives are all deprecated; all apps are presumed threadsafe and api_version isn't used in Python 3. There are no longer built-in third-party packages preinstalled on App Engine services, so libraries is also deprecated. Check the documentation on changes to app.yaml for more information on these changes. As a result, you should delete all three from app.yaml and update to a supported Python 3 version (see below).
Use of handlers directive
In addition, the handlers directive, which directs traffic to App Engine applications, has also been deprecated. Since the next-gen runtime expects web frameworks to manage app routing, all "handler scripts" must be changed to "auto". Combining the changes from above, you arrive at this app.yaml:
runtime: python38

handlers:
- url: /.*
  script: auto
Learn more about "script: auto" from its documentation page.
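The app.yaml changes described so far can be summarized as a small transformation. The sketch below assumes app.yaml has already been parsed into a Python dict (for example with a YAML library, not shown here); to_python3_config and DEPRECATED are hypothetical names used only for illustration.

```python
import copy

# Directives this tutorial lists as deprecated for Python 3.
DEPRECATED = ('threadsafe', 'api_version', 'libraries')

def to_python3_config(config, runtime='python38'):
    """Return a copy of a parsed app.yaml with Python 2-only directives removed."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    for key in DEPRECATED:
        updated.pop(key, None)
    updated['runtime'] = runtime
    # Routing is left to the web framework, so every handler script is "auto".
    for handler in updated.get('handlers', []):
        if 'script' in handler:
            handler['script'] = 'auto'
    return updated

py2_config = {
    'runtime': 'python27',
    'threadsafe': True,
    'api_version': 1,
    'handlers': [{'url': '/.*', 'script': 'main.app'}],
    'libraries': [{'name': 'grpcio', 'version': '1.0.0'},
                  {'name': 'setuptools', 'version': '36.6.0'}],
}
print(to_python3_config(py2_config))
```

For a file as short as the sample's, editing by hand is just as easy; the function is mainly a compact statement of which keys go and which stay.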
Removing handlers directive
Since handlers is deprecated, you can remove the entire section too, leaving a single-line app.yaml:
runtime: python38
By default, this will launch the Gunicorn WSGI web server, which is available for all applications. If you're familiar with gunicorn, this is the command executed when it's started by default with the barebones app.yaml:
gunicorn main:app --workers 2 -c /config/gunicorn.py
Use of entrypoint directive
If, however, your application requires a specific start-up command, that can be specified with an entrypoint directive:
runtime: python38
entrypoint: python main.py
This example specifically requests the Flask development server be used instead of gunicorn. Code that starts the development server must also be added to your app to launch on the 0.0.0.0 interface on port 8080 by adding this small section to the bottom of main.py:
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)
Learn more about entrypoint from its documentation page. More examples & best practices can be found here and here.
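The hard-coded port above works for the sample. As a variation not taken from the sample app, the port could instead be read from a PORT environment variable, which our understanding is that the newer App Engine runtimes set for the app. resolve_port is a hypothetical helper name.

```python
import os

def resolve_port(environ=None, default=8080):
    """Return the port from a PORT variable, falling back to the default."""
    environ = os.environ if environ is None else environ
    try:
        port = int(environ.get('PORT', ''))
    except ValueError:
        # PORT is unset or not a number.
        return default
    return port if 0 < port < 65536 else default

print(resolve_port({'PORT': '9090'}))
print(resolve_port({}))
```

In main.py this would be used as app.run(host='0.0.0.0', port=resolve_port(), debug=True).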
Delete appengine_config.py and lib
Delete the appengine_config.py file and the lib folder. In migrating to Python 3, App Engine acquires and installs packages listed in requirements.txt.
The appengine_config.py config file is used to recognize third-party libraries/packages, whether you've copied them yourself or use ones already available on App Engine servers (built-in). When moving to Python 3, a summary of the big changes is:
- No bundling of copied third-party libraries (listed in requirements.txt)
- No pip install into a lib folder, meaning no lib folder period
- No listing built-in third-party libraries in app.yaml
- No need to reference app to third-party libraries, so no appengine_config.py file
Listing all required third-party libraries in requirements.txt is all that's needed.
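The deletions can be done by hand. For anyone who prefers to script them across several projects, here is a minimal sketch. remove_py2_artifacts is a hypothetical helper, and the demonstration runs against a throwaway temporary directory rather than a real project.

```python
import os
import shutil
import tempfile

def remove_py2_artifacts(project_dir):
    """Delete appengine_config.py and lib/ if present; return what was removed."""
    removed = []
    config = os.path.join(project_dir, 'appengine_config.py')
    lib = os.path.join(project_dir, 'lib')
    if os.path.isfile(config):
        os.remove(config)
        removed.append('appengine_config.py')
    if os.path.isdir(lib):
        shutil.rmtree(lib)
        removed.append('lib')
    return removed

# Demonstrate on a throwaway directory, not on your actual app folder.
demo = tempfile.mkdtemp()
open(os.path.join(demo, 'appengine_config.py'), 'w').close()
os.mkdir(os.path.join(demo, 'lib'))
print(remove_py2_artifacts(demo))
shutil.rmtree(demo)
```

Because the function removes files permanently, commit or back up your project before pointing it at a real directory.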
Deploy application
Re-deploy your app to ensure that it works. You can also confirm how close your solution is to the Module 3 sample Python 3 code. To visualize the differences with Python 2, compare the code with its Python 2 version.
Congrats on finishing the bonus step in Module 3! Visit the documentation on preparing configuration files for the Python 3 runtime. Finally, review the earlier summary above for next steps and cleanup.
Preparing your application
When it is time to migrate your application, you will have to port your main.py and other application files to 3.x, so a best practice is to try your best to make your 2.x application as "forward-compatible" as possible.
There are plenty of online resources to help you accomplish that, but some of the key tips:
- Ensure all application dependencies are fully 3.x-compatible
- Ensure your application runs on at least 2.6 (preferably 2.7)
- Ensure application passes entire test suite (and minimum 80% coverage)
- Use compatibility libraries such as six, Future, and/or Modernize
- Educate yourself on key backwards-incompatible 2.x vs. 3.x differences
- Any I/O will likely lead to Unicode vs. byte string incompatibilities
The sample app was designed with all this in mind, hence why it runs on 2.x and 3.x right out of the box so we can focus on showing you what needs to be changed in order to use the next-gen platform.
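As an illustration of the Unicode vs. byte string tip, a common forward-compatible habit is to convert at the I/O boundary and work with text everywhere else. The helpers below are a hypothetical sketch (to_text and to_bytes are illustrative names, and UTF-8 is an assumed default), not code from the sample app.

```python
def to_text(value, encoding='utf-8'):
    """Return text (unicode on 2.x, str on 3.x) from bytes or text input."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

def to_bytes(value, encoding='utf-8'):
    """Return bytes from bytes or text input."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

raw = b'caf\xc3\xa9'           # bytes as they might arrive from a socket or file
text = to_text(raw)            # decode once, at the boundary
print(to_bytes(text) == raw)   # encode again only when writing out
```

On 2.x, bytes is an alias for str, so the same functions decode native strings to unicode there.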
App Engine migration module codelabs issues/feedback
If you find any issues with this codelab, please search for your issue first before filing. Links to search and create new issues:
Migration resources
Links to the repo folders for Module 2 (START) and Module 3 (FINISH) can be found in the table below. They can also be accessed from the repo for all App Engine migrations.
Codelab | Python 2 | Python 3 |
Module 3 |
App Engine resources
Below are additional resources regarding this specific migration:
- Python Cloud NDB and Cloud Datastore references
- Migrating to Python 3 & GAE next-generation runtime
- General