Coral provides hardware and software tools for developers to build applications using the Edge TPU — a small ASIC designed by Google that provides high performance machine learning (ML) inferencing for low-power devices.

TensorFlow™ is an open source software library for numerical computation using data flow graphs. TensorFlow has become a popular framework for training machine learning models and using those models to solve real-world problems that require fast decision-making software.

What you'll build

In this codelab, you will use the Edge TPU Python API to build a device that streams image frames from the camera and locally classifies them against a pre-trained Inception v2 model.

What you'll learn

What you'll need

Flash the OS

If your board is brand new, you need to flash it with the Mendel system image.

Connect peripherals

Connect a USB camera to the USB-A host port, or follow these instructions to connect the CSI camera module.

Connect to the device shell

  1. Plug a 2-3A power cable into the USB-C port labeled PWR
  2. Use a USB-C cable to connect your workstation to the USB-C data port labeled OTG
  3. Connect the HDMI port to an external display
  4. Install the Mendel Development Tool on your workstation:
$ pip3 install --user mendel-development-tool
  5. Verify that your workstation can see your device using mdt:
$ mdt devices
device-name                (192.168.100.2)

Click the following link to download the starter project for this codelab:

Download source code

...or you can clone the GitHub repository from the command line:

$ git clone https://github.com/googlecodelabs/edgetpu-classifier

About the project

The starter project contains the main.py script you will modify in this codelab, a lib directory of helper modules (including lib.gstreamer), and a models directory containing the pre-trained inception_v2_224_quant.tflite model and its imagenet_labels.txt labels file.

The lib.gstreamer module in the starter project creates a gstreamer-based pipeline that captures camera frames, resizes them, processes each image, and renders the preview to the display.

Much of the pipeline is handled internally, but the Image Processing step triggers a callback function (frame_callback) exposed within the main script.
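The exact lib.gstreamer interface isn't reproduced in this codelab, but the callback pattern it uses can be illustrated with a small, self-contained sketch. The run_pipeline stand-in and the simulated frames below are assumptions for illustration only, not the project's actual code:

import numpy as np

def run_pipeline(frame_callback, num_frames=3, size=(224, 224)):
    """Stand-in for the real pipeline: delivers resized frames to a callback
    and prints whatever overlay text the callback returns."""
    for _ in range(num_frames):
        # Simulate a resized RGB frame flattened into a 1-D uint8 tensor,
        # which is the form the Image Processing step hands to frame_callback.
        frame = np.random.randint(0, 256, size[0] * size[1] * 3, dtype=np.uint8)
        print(frame_callback(frame))

def frame_callback(tensor):
    # In main.py, this is where each frame is classified and an overlay is built.
    return 'received frame with %d values' % tensor.size

run_pipeline(frame_callback)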

Deploy and run the starter project

Navigate to the classifier-start directory. Push the starter script and model files to the device:

$ cd classifier-start/
$ mdt exec mkdir /home/mendel/lib
$ mdt push lib/* /home/mendel/lib

$ mdt push models/* /home/mendel
$ mdt push main.py /home/mendel
...

Open a second terminal window, connect to the device shell and verify the files are all present:

$ mdt shell
...
mendel@device-name:~$ ls
imagenet_labels.txt  inception_v2_224_quant.tflite  lib  main.py

Run the starter script in the device shell window using the provided model and labels file:

mendel@device-name:~$ python3 main.py \
  --model=inception_v2_224_quant.tflite \
  --labels=imagenet_labels.txt

Using a USB Camera

The main script defaults to the Coral camera connected over CSI for input. If you are using a USB camera, follow these instructions to determine the proper input source, resolution, and frame rate to use for your camera. Pass these values to the script as follows:

mendel@device-name:~$ python3 main.py \
  --source=/dev/video1 \
  --resolution=1280x720 \
  --frames=30 \
  --model=inception_v2_224_quant.tflite \
  --labels=imagenet_labels.txt

Verify the starter project

You should see a live camera preview on the connected HDMI display, which confirms that the camera pipeline is set up correctly. However, the TensorFlow Lite model is not actually running yet. In the next step, we will add code to classify each image from the camera using the Edge TPU Python API.

The starter project does not yet process the images before they are displayed (the classify_image() function currently does nothing). In this section, we will instantiate a ClassificationEngine using the Edge TPU Python API and use it to run inference on each frame.
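For reference, the starter version of classify_image() is effectively a no-op; it looks roughly like the following sketch (the exact starter code may differ):

def classify_image(tensor, engine, labels):
    """Runs inference on the provided input tensor and
    returns an overlay to display the inference results
    """
    # Starter placeholder: no inference is run yet, so there is nothing to overlay.
    return []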

Create the inference engine

The main.py script passes the model file provided as an argument to the init_engine() function. Locate this function and return a ClassificationEngine based on that model.

main.py

from edgetpu.classification.engine import ClassificationEngine

def init_engine(model):
    """Returns an Edge TPU classifier for the model"""
    return ClassificationEngine(model)

Determine the input tensor shape

The Image Resize stage of the gstreamer pipeline scales the images you give it so they match the input dimensions required by the classification model. You simply need to specify what those required dimensions are.

The get_input_tensor_shape() method returns the expected tensor shape of the model in the format [batch size, height, width, channels]. Use this to implement the input_size() function and return the required image dimensions. The main script then calls input_size() to pass the necessary image scaling dimensions to the gstreamer pipeline.

main.py

def input_size(engine):
    """Returns the required input size for the model"""
    _, h, w, _ = engine.get_input_tensor_shape()
    return w, h
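
For the inception_v2_224_quant model used in this codelab, the input tensor shape should be [1, 224, 224, 3] (a single 224x224 RGB image), so input_size() returns (224, 224). A quick sanity check, assuming the model file is in the current directory and the functions above are in scope:

engine = init_engine('inception_v2_224_quant.tflite')
print(engine.get_input_tensor_shape())  # e.g. [  1 224 224   3]
print(input_size(engine))               # (224, 224)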

Profile the inference

To evaluate performance, we need to know how long it took to process each frame. Implement the inference_time() function to report this value for the last inference run. The ClassificationEngine provides the time taken by the last inference, in milliseconds, through the get_inference_time() method.

main.py

def inference_time(engine):
    """Returns the time taken to run inference"""
    return engine.get_inference_time()
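
Because get_inference_time() reports milliseconds, the frames-per-second figure shown in the overlay later in this codelab can be derived as 1000 divided by the inference time. A sketch of that calculation, assuming engine is the ClassificationEngine created earlier (the exact overlay formatting lives in the starter script and may differ):

ms = inference_time(engine)
print('Inference time: %.2f ms (%.2f fps)' % (ms, 1000.0 / ms))
# e.g. 655.08 ms per inference corresponds to roughly 1.53 fps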

Classify the image

Now we can implement classify_image(). The gstreamer pipeline invokes this callback (via the frame_callback method) to deliver each resized image frame as a flattened 1-D tensor. Use the ClassifyWithInputTensor() method of ClassificationEngine to process the image frame. In addition to the input tensor, you must provide a confidence threshold (the minimum score a result needs in order to be returned) and top_k (the maximum number of classification results to return).

The engine returns a list of results, each a tuple of two values: the label index and a confidence score (how closely the image matched that label). Remap the indices to the actual labels using the text file that accompanies the model.

main.py

def classify_image(tensor, engine, labels):
    """Runs inference on the provided input tensor and
    returns an overlay to display the inference results
    """
    results = engine.ClassifyWithInputTensor(
        tensor, threshold=0.1, top_k=3)
    return [(labels[i], score) for i, score in results]
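
The labels argument here is expected to map each label index to a human-readable name. The starter project already takes care of this, but a minimal sketch of such a loader might look like the following, assuming each line of imagenet_labels.txt has the form "<index>  <label text>" (the load_labels name is an assumption, not part of the starter project):

def load_labels(path):
    """Parses a labels file into a dict mapping index to label text."""
    labels = {}
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            parts = line.strip().split(maxsplit=1)
            if len(parts) == 2:
                labels[int(parts[0])] = parts[1]
    return labels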

Verify the result

Deploy the updated script to the device:

$ mdt push main.py /home/mendel

Run the script again from the device shell terminal window:

mendel@device-name:~$ python3 main.py \
  --model=inception_v2_224_quant.tflite \
  --labels=imagenet_labels.txt

You should now see the inference results overlaid on top of the camera preview. Try pointing the device camera at various objects and watch the inference results update automatically.

Verify that you see an expected output result similar to the following:

Inference time: 655.08 ms (1.53 fps)
goldfish, Carassius auratus (0.96)

Improving performance

At this point, model inference runs entirely on the device's CPU, which is not optimized for these operations and shares its workload with the rest of the system. In the next section, we'll see how the Edge TPU can help us improve performance.

Now let's see how we can improve performance by using the dedicated Edge TPU chip to accelerate model inference. All we need to do is recompile our TensorFlow Lite model: this process replaces the CPU-bound inference operations with equivalents supported by the Edge TPU. We don't even need to change any of our code!

Compile the model for Edge TPU

Navigate to the Edge TPU Model Compiler on the Coral site, agree to the terms, and upload the models/inception_v2_224_quant.tflite file from the project.

Review the compilation results after the process is complete. The report indicates how much of the model was successfully compiled for the Edge TPU and whether any operations will fall back to the CPU. Our model should run 100% on the Edge TPU.

Click Download model and save the resulting file in your project directory as inception_v2_224_quant_edgetpu.tflite.

Test the new model

Deploy the new model file to the device:

$ mdt push inception_v2_224_quant_edgetpu.tflite /home/mendel

From your device shell terminal window, run main.py again but now using the Edge TPU file as the --model argument:

mendel@device-name:~$ python3 main.py \
  --model=inception_v2_224_quant_edgetpu.tflite \
  --labels=imagenet_labels.txt

Notice how the inference time has dropped by more than 10x from before! Model inference is no longer the performance bottleneck, and we can easily keep up with the camera's frame rate.

Point the device camera at the same objects you used in the previous step to verify the classification results are still valid.

How it works

Notice that we didn't change any code to enable the Edge TPU acceleration — this process is governed by the model that we use. The Edge TPU Model Compiler processes the nodes in the model graph and converts supported operations into a new custom operation understood by the Edge TPU runtime. During inference, the runtime determines whether individual operations run on the Edge TPU or the CPU.

Models do not have to be fully composed of Edge TPU supported operations to take advantage of acceleration. At the first point in the model graph where an unsupported operation occurs, the compiler partitions the graph into two parts. The first part of the graph that contains only supported operations executes on the Edge TPU, and everything else executes on the CPU.

See TensorFlow models on the Edge TPU for more details on supported operations.

Congratulations! You have successfully built an application to classify images in real-time using the Inception v2 model and the Edge TPU accelerator. Here are some things you can do to go deeper:

What we've covered