TensorFlow is a multipurpose machine learning framework. TensorFlow can be used anywhere from training huge models across clusters in the cloud, to running models locally on an embedded system like your phone.

What you will build

A simple camera app that runs a TensorFlow image recognition program to identify flowers.

Image CC-BY, by Orazio Puccio

Most of this codelab will be using the terminal. Open it now.

Install TensorFlow

Before you can begin the tutorial, you need to install TensorFlow.

Use the git repository from the first codelab

This codelab uses files generated during the TensorFlow for Poets 1 codelab. If you have not completed that codelab we recommend you go do it now. If you prefer not to, instructions for downloading the missing files are given in the next sub-section.


In TensorFlow for Poets 1, you also cloned the relevant files for this codelab. We will be working in that same git directory. Ensure that it is your current working directory, then check the contents, as follows:

cd tensorflow-for-poets-2
ls

This directory should contain three subdirectories: android/, scripts/, and tf_files/. The tf_files/ subdirectory should contain the files you generated in the first codelab:

ls tf_files/
retrained_graph.pb  retrained_labels.txt

Otherwise (if you don't have the files from Part 1)

Clone the Git repository

The following command will clone the Git repository containing the files required for this codelab:

git clone https://github.com/googlecodelabs/tensorflow-for-poets-2

Now cd into the directory of the clone you just created. That's where you will be working for the rest of this codelab:

cd tensorflow-for-poets-2

The repo contains three directories: android/, scripts/, and tf_files/.

Check out the branch with the required files

git checkout end_of_first_codelab
ls tf_files/

Next, verify that the model is producing sane results before you start modifying it.

The scripts/ directory contains a simple command line script, label_image.py, to test the network. Now we'll test label_image.py on this picture of some daisies:

flower_photos/daisy/3475870145_685a19116d.jpg

Image CC-BY, by Fabrizio Sciami

Now test the model. If you are using a different architecture, you will need to set the "--input_size" flag.

python -m scripts.label_image \
  --graph=tf_files/retrained_graph.pb  \
  --image=tf_files/flower_photos/daisy/3475870145_685a19116d.jpg

The script will print the probability the model has assigned to each flower type. Something like this:

daisy 0.94237
roses 0.0487475
sunflowers 0.00510139
dandelion 0.00343337
tulips 0.00034759

This should hopefully produce a sensible top label for your example. You'll be using this command to make sure you're still getting sensible results as you do further processing on the model file to prepare it for use in a mobile app.

Mobile devices have significant limitations, so any pre-processing that can be done to reduce an app's footprint is worth considering.

Limited libraries on mobile

One way the TensorFlow library is kept small for mobile is by only supporting the subset of operations that are commonly used during inference. This is a reasonable approach, as training is rarely conducted on mobile platforms. Similarly, it excludes support for operations with large external dependencies. You can see the list of supported ops in the tensorflow/contrib/makefile/tf_op_files.txt file.

By default, most graphs contain training ops that the mobile version of TensorFlow doesn't support. TensorFlow won't load a graph that contains an unsupported operation (even if the unsupported operation is irrelevant for inference).
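
If you're curious which op types a graph actually contains, a few lines of Python will list them. This is a minimal sketch, assuming a TensorFlow 1.x installation (where tf.GraphDef is available):

import tensorflow as tf

# Parse the frozen graph file into a GraphDef protocol buffer.
graph_def = tf.GraphDef()
with open('tf_files/retrained_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Print the distinct op types used by the graph's nodes.
print('\n'.join(sorted({node.op for node in graph_def.node})))

Any op type printed here that isn't listed in tf_op_files.txt would prevent the mobile build from loading the graph.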

Optimize for inference

To avoid problems caused by unsupported training ops, the TensorFlow installation includes a tool, optimize_for_inference, that removes all nodes that aren't needed for a given set of inputs and outputs.

The script also does a few other optimizations that help speed up the model, such as merging explicit batch normalization operations into the convolutional weights to reduce the number of calculations. This can give a 30% speed up, depending on the input model. Here's how you run the script:

python -m tensorflow.python.tools.optimize_for_inference \
  --input=tf_files/retrained_graph.pb \
  --output=tf_files/optimized_graph.pb \
  --input_names="input" \
  --output_names="final_result"

Running this script creates a new file at tf_files/optimized_graph.pb.

Verify the optimized model

To check that optimize_for_inference hasn't altered the output of the network, compare the label_image output for retrained_graph.pb with that of optimized_graph.pb:

python -m scripts.label_image \
  --graph=tf_files/retrained_graph.pb \
  --image=tf_files/flower_photos/daisy/3475870145_685a19116d.jpg
python -m scripts.label_image \
  --graph=tf_files/optimized_graph.pb \
  --image=tf_files/flower_photos/daisy/3475870145_685a19116d.jpg

When I run these commands I see no change in the output probabilities to 5 decimal places.

Now run it yourself to confirm that you see similar results.

Investigate the changes with TensorBoard

If you followed along with the first codelab, you should have a tf_files/training_summaries/ directory (otherwise, create it by issuing the following command: mkdir tf_files/training_summaries).

The following two commands will kill any running TensorBoard instances and launch a new instance in the background, watching that directory:

pkill -f tensorboard
tensorboard --logdir tf_files/training_summaries &

TensorBoard, running in the background, may occasionally print the following warning to your terminal. You can safely ignore it:

WARNING:tensorflow:path ../external/data/plugin/text/runs not found, sending 404.

Now add your two graphs as TensorBoard logs:

python -m scripts.graph_pb2tb tf_files/training_summaries/retrained \
  tf_files/retrained_graph.pb 

python -m scripts.graph_pb2tb tf_files/training_summaries/optimized \
  tf_files/optimized_graph.pb 

Now open TensorBoard and navigate to the "Graph" tab. Then, from the pick-list labeled "Run" on the left side, select "Retrained".

Explore the graph a little, then select "Optimized" from the "Run" menu.

From here you can confirm some nodes have been merged to simplify the graph. You can expand the various blocks by double-clicking them.

Check the compression baseline

The retrained model is still about 5.4MB in size at this point. That download size may be a limiting factor for any app that includes it.

Every mobile app distribution system compresses the package before distribution. So test how much the graph can be compressed using the gzip command:

gzip -c tf_files/optimized_graph.pb > tf_files/optimized_graph.pb.gz

gzip -l tf_files/optimized_graph.pb.gz
         compressed        uncompressed  ratio uncompressed_name
            5028302             5460013   7.9% tf_files/optimized_graph.pb

Not much!

On its own, compression is not a huge help. For me this only shaves 8% off the model size. If you're familiar with how neural networks and compression work this should be unsurprising.

The majority of the space taken up by the graph is by the weights, which are large blocks of floating point numbers. Each weight has a slightly different floating point value, with very little regularity.

But compression works by exploiting regularity in the data, and the weights offer almost none to exploit.
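
You can see this effect in a standalone Python experiment (an illustration, not one of the codelab scripts): a block of distinct random floats barely compresses, while the same number of floats drawn from only 256 distinct values compresses dramatically.

import random
import struct
import zlib

N = 100000

# Distinct random floats: very little byte-level regularity.
random_bytes = struct.pack('%df' % N, *(random.random() for _ in range(N)))

# The same number of floats, but limited to 256 distinct values.
levels = [i / 255.0 for i in range(256)]
rounded_bytes = struct.pack('%df' % N, *(random.choice(levels) for _ in range(N)))

print('random  floats compress to %2.0f%% of original size'
      % (100.0 * len(zlib.compress(random_bytes)) / len(random_bytes)))
print('rounded floats compress to %2.0f%% of original size'
      % (100.0 * len(zlib.compress(rounded_bytes)) / len(rounded_bytes)))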

Example: Quantize an Image

Images can also be thought of as large blocks of numbers. One simple technique for compressing images is to reduce the number of colors. You will do the same thing to your network weights, after I demonstrate the effect on an image.

Below I've used ImageMagick's convert utility to reduce an image to 32 colors. This reduces the image size by more than a factor of 5 (PNG has built-in compression), but degrades the image quality.

24 bit color: 290KB

32 colors: 55KB

Image CC-BY, by Fabrizio Sciami
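
For reference, a color reduction like the one above can be produced with an ImageMagick command along these lines (the filenames here are placeholders):

convert flower.jpg -colors 32 flower_32colors.png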

Quantize the network weights

Applying an almost identical process to your neural network weights has a similar effect. It gives the compression algorithm a lot more repetition to take advantage of, while reducing the precision only slightly (typically less than a 1% drop).

It does this without any changes to the structure of the network; it simply quantizes the constants in place.
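
To get a feel for what the weights_rounded mode does, here is a rough sketch of the idea in Python: snap each weight to the nearest of 256 evenly spaced levels between its minimum and maximum. This illustrates the concept only; it is not the actual quantize_graph implementation:

import numpy as np

def round_weights(weights, num_levels=256):
    # Snap each weight to the nearest of num_levels evenly spaced values.
    lo, hi = weights.min(), weights.max()
    step = (hi - lo) / (num_levels - 1)
    return lo + np.round((weights - lo) / step) * step

weights = np.random.randn(1000).astype(np.float32)
rounded = round_weights(weights)
print('distinct values before:', np.unique(weights).size)
print('distinct values after: ', np.unique(rounded).size)

Because the result contains at most 256 distinct values, gzip finds far more repetition in the serialized graph.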

Now use the quantize_graph script to apply these changes:

(This script is from the TensorFlow repository, but it is not included in the default installation)

python -m scripts.quantize_graph \
  --input=tf_files/optimized_graph.pb \
  --output=tf_files/rounded_graph.pb \
  --output_node_names=final_result \
  --mode=weights_rounded

Now try compressing this quantized model:

gzip -c tf_files/rounded_graph.pb > tf_files/rounded_graph.pb.gz

gzip -l tf_files/rounded_graph.pb.gz
         compressed        uncompressed  ratio uncompressed_name
            1633131             5460032  70.1% tf_files/rounded_graph.pb

You should see a significant improvement. I get 70% compression instead of the 8% that gzip provided for the original model.

Now before you continue, verify that the quantization process hasn't had too negative an effect on the model's performance.

First manually compare the two models on an example image.

python -m scripts.label_image \
  --image=tf_files/flower_photos/daisy/3475870145_685a19116d.jpg \
  --graph=tf_files/optimized_graph.pb
python -m scripts.label_image \
  --image=tf_files/flower_photos/daisy/3475870145_685a19116d.jpg \
  --graph=tf_files/rounded_graph.pb

For me, on this input image, the output probabilities have each changed by less than one tenth of a percent (absolute).

Next, verify the change on a larger slice of the data to see how it affects overall performance.

First evaluate the performance of the baseline model on the validation set. The last two lines of the output show the average performance. It may take a minute or two to get the results back.

python -m scripts.evaluate  tf_files/optimized_graph.pb

For me, optimized_graph.pb scores 96.1% accuracy, and 0.258 for cross entropy error (lower is better).

Now compare that with the performance of the model in rounded_graph.pb:

python -m scripts.evaluate  tf_files/rounded_graph.pb

You should see less than a 1% change in the model accuracy.

These differences are far from statistically significant. The goal is simply to confirm that this change clearly hasn't broken the model.

Add your model to the project

The demo project is configured to look for graph.pb and labels.txt files in the android/assets directory. Copy the files you just created into that location, renaming them in the process:

cp tf_files/rounded_graph.pb android/assets/graph.pb
cp tf_files/retrained_labels.txt android/assets/labels.txt 

Install Android Studio

If you don't have it installed already, go install Android Studio.

Open the project with Android Studio

Open the project in Android Studio by taking the following steps:

  1. Open Android Studio. After it loads, select "Open an existing Android Studio project" from this popup:

  1. In the file selector, choose tensorflow-for-poets-2/android from your working directory.
  1. The first time you open the project, you will get a "Gradle Sync" popup asking about using the Gradle wrapper. Click "OK".

Change the OUTPUT_NAME in ClassifierActivity.java

The app is currently set up to run the baseline MobileNet, but the output node of our model has a different name. Open ClassifierActivity.java and update the OUTPUT_NAME variable:

ClassifierActivity.java

  private static final String INPUT_NAME = "input";
  private static final String OUTPUT_NAME = "final_result";

Set up the Android device

You can't load the app from Android Studio onto your phone unless you activate "developer mode" and "USB Debugging". This is a one-time setup process.

Follow the instructions at How-To Geek.

Build and install the app

Plug in your phone, then hit the play button in Android Studio to start the build and install process.

Next you will need to select your phone from this popup:

Now allow the TensorFlow Demo to access your camera and files:

Now that the app is installed, tap the app icon to launch it. You can hold the power and volume-down buttons to take a screenshot.

Now try a web search for flowers, point the camera at the computer screen, and see if those pictures are correctly classified.

It should look something like this:

Image CC-BY by Orazio Puccio

Or have a friend take a picture of you and find out what kind of TensorFlower you are!

So now that you have the app running, let's look at the TensorFlow specific code.

TensorFlow-Android AAR

This app uses a pre-compiled Android Archive (AAR) for its TensorFlow dependencies. This AAR is hosted on jcenter. The code to build the AAR lives in tensorflow.contrib.android.

The following lines in the build.gradle file include the AAR in the project.

build.gradle

repositories {
   jcenter()
}

dependencies {
   compile 'org.tensorflow:tensorflow-android:+'
}

Using the TensorFlow Inference Interface

The code that interfaces with TensorFlow is all contained in TensorFlowImageClassifier.java.

Create the Interface

The first block of interest simply creates a TensorFlowInferenceInterface, which loads the named TensorFlow graph using the assetManager.

This is similar to a tf.Session (for those familiar with TensorFlow in Python).

TensorFlowImageClassifier.java

// load the model into a TensorFlowInferenceInterface.
c.inferenceInterface = new TensorFlowInferenceInterface(
    assetManager, modelFilename);
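
For comparison, here is roughly the same load-and-run flow in Python. This is a sketch assuming TensorFlow 1.x; the node names match the ones used in this codelab, and the 224x224 input size assumes the default MobileNet configuration:

import numpy as np
import tensorflow as tf

# Load the frozen GraphDef, much as TensorFlowInferenceInterface
# loads it from the app's assets.
graph_def = tf.GraphDef()
with open('tf_files/rounded_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

# The tf.Session plays the role of the inference interface.
with tf.Session(graph=graph) as sess:
    image = np.zeros((1, 224, 224, 3), np.float32)  # stand-in input
    results = sess.run('final_result:0', feed_dict={'input:0': image})
    print(results)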

Inspect the output node

This model can be retrained with different numbers of output classes. To ensure that we build an output array with the right size, we inspect the TensorFlow operations:

TensorFlowImageClassifier.java

// Get the tensorflow node
final Operation operation = c.inferenceInterface.graphOperation(outputName);

// Inspect its shape
final int numClasses = (int) operation.output(0).shape().size(1);

// Build the output array with the correct size.
c.outputs = new float[numClasses];

Feed in the input

To run the network, we need to feed in our data. We use the feed method for that. To use feed, we must pass the name of the input node, the array containing the data, and the dimensions (shape) of that array.

The following lines execute the feed method.

TensorFlowImageClassifier.java

inferenceInterface.feed(
    inputName,   // The name of the node to feed. 
    floatValues, // The array to feed
    1, inputSize, inputSize, 3 ); // The shape of the array

Execute the calculation

Now that the inputs are in place, we can run the calculation.

Note how this run method takes an array of output names because you may want to pull more than one output. It also accepts a boolean flag to control logging.

TensorFlowImageClassifier.java

inferenceInterface.run(
    outputNames, // Names of all the nodes to calculate.
    logStats);   // Bool, enable stat logging.

Fetch the output

Now that the output has been calculated, we can pull it out of the model into a local variable.
The outputs array here is the one we sized by inspecting the output Operation earlier.

Call this fetch method once for each output you wish to fetch.

TensorFlowImageClassifier.java

inferenceInterface.fetch(
    outputName,  // Fetch this output.
    outputs);    // Into the prepared array.
