AI Speech Recognition with TensorFlow Lite for Microcontrollers and SparkFun Edge

1. Introduction

What you'll build

In this codelab, we'll learn to use TensorFlow Lite for Microcontrollers to run a deep learning model on the SparkFun Edge Development Board. We'll be working with the board's built-in speech detection model, which uses a convolutional neural network to detect the words "yes" and "no" being spoken via the board's two microphones.


Machine Learning on Microcontrollers

Machine learning can be used to create intelligent tools that make users' lives easier, like Google Assistant. But these experiences often require a lot of computation, which can mean a powerful cloud server or a desktop computer. However, it's now possible to run machine learning inference on tiny, low-powered hardware, like microcontrollers.

Microcontrollers are extremely common, cheap, and reliable, and they require very little energy. They are part of all sorts of household devices: think appliances, cars, and toys. In fact, there are around 30 billion microcontroller-powered devices produced each year.


By bringing machine learning to tiny microcontrollers, we can boost the intelligence of billions of devices that we use in our lives, without relying on expensive hardware or reliable internet connections. Imagine smart appliances that can adapt to your daily routine, intelligent industrial sensors that understand the difference between problems and normal operation, and magical toys that can help kids learn in fun and delightful ways.

TensorFlow Lite for Microcontrollers (Software)


TensorFlow is Google's open source machine learning framework for training and running models. TensorFlow Lite is a software framework, an optimized version of TensorFlow, designed to run TensorFlow models on small, relatively low-powered devices such as mobile phones.

TensorFlow Lite for Microcontrollers is a software framework, an optimized version of TensorFlow, designed to run TensorFlow models on tiny, low-powered hardware such as microcontrollers. It adheres to the constraints of these embedded environments: it has a small binary size, and it doesn't require operating system support, standard C or C++ libraries, or dynamic memory allocation.

SparkFun Edge (Hardware)

The SparkFun Edge is a microcontroller-based platform: a tiny computer on a single circuit board. It has a processor, memory, and I/O hardware that allows it to send and receive digital signals to other devices. It has four software-controllable LEDs, in your favorite Google colors.


Unlike a computer, a microcontroller doesn't run an operating system. Instead, the programs you write run directly on the hardware. You write your code on a computer and download it to the microcontroller via a device called a programmer.

Microcontrollers are not powerful computers. They have small processors, and not much memory. But because they are designed to be as simple as possible, a microcontroller can use very little energy. Depending on what your program does, the SparkFun Edge can run for weeks on a single coin cell battery!

What you'll learn

  • Compile the sample program for the SparkFun Edge on your computer
  • Deploy the program to your device
  • Make changes to the program and deploy it again

What you'll need

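You will need the following hardware:

  • Linux or MacOS computer
  • SparkFun Edge board
  • SparkFun USB-C Serial Basic programmer
  • USB-C cable to connect the programmer to your computer
  • Coin cell battery (optional; the programmer can also power the board)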

You will need the following software:

  • Git (check if it's installed by running git on the command line)
  • Python 3 (check if it's installed by running python3 or python --version on the command line)
  • Pip for Python 3 (check if it's installed by running pip3 --version on the command line)
  • Make 4.2.1 or higher (check if it's installed by running make --version on the command line)
  • SparkFun Serial Basic drivers

2. Set up your hardware

The SparkFun Edge microcontroller comes with a pre-installed binary that can run the speech model. Before we overwrite this with our own version, let's first run this model.

Power your board in one of two ways:

  1. Insert a coin cell battery into the battery connector on the back of the board, with the "+" side of the battery facing up. If your board came with a battery already inserted, pull out the plastic tab and push the battery to ensure it's fully inserted.


  2. If you don't have a coin cell battery, you can use the SparkFun USB-C Serial Basic programmer device to power the board. To attach this device to your board, perform the following steps:
  • Locate the six pin header on the side of the SparkFun Edge.
  • Plug the SparkFun USB-C Serial Basic into these pins, ensuring the pins labelled "BLK" and "GRN" on each device are lined up correctly.
  • Connect a USB-C cable between the SparkFun USB-C Serial Basic and your computer.


Once you've powered your board by inserting the battery or connecting the USB programmer, the board will wake up and begin listening with its microphones. The blue light should begin to flash.

The machine learning model on the board is trained to recognize the words "yes" and "no", and to detect the presence and absence of speech. It communicates its results by lighting colored LEDs. The following table shows the meaning of each LED color:

Detection result      LED color
"Yes"                 Yellow
"No"                  Red
Unknown speech        Green
No speech detected    No LEDs lit

Give it a try

Hold the board up to your mouth and say "yes" a few times. You'll see the yellow LED flash. If nothing happens when you say "yes", here are some things to try:

  • Hold the board about 10 inches (25 cm) from your mouth
  • Avoid excessive background noise
  • Repeat "yes" several times in quick succession (try saying "yes yes yes")

3. Set up your software

We're now going to download, install and run the speech model on the microcontroller ourselves. For this, we first download the source code for this program and the dependencies we need to build it. The program is written in C++, which must be compiled into a binary before being downloaded onto the board. A binary is a file that contains the program in a form that can be run directly by the SparkFun Edge hardware.

The following instructions are written for Linux or MacOS.

Download the TensorFlow repo

The code is available in the TensorFlow repository on GitHub, in the following location:

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/micro

Open a terminal on your computer, change to a directory where you usually store coding projects, download the TensorFlow repository and enter the directory created, as shown below:

cd ~  # change into your home (or any other) directory
git clone --depth 1 https://github.com/tensorflow/tensorflow.git
cd tensorflow

Download Python dependencies

We'll be using Python 3 to prepare our binary and flash it to the device. The Python scripts depend on certain libraries being available. Run the following command to install these dependencies:

pip3 install pycrypto pyserial --user

4. Build and prepare the binary

We're going to build the binary and run commands that prepare it for downloading to the device.

Build the binary

To download all required dependencies and create the binary, run the following command:

make -f tensorflow/lite/micro/tools/make/Makefile TARGET=sparkfun_edge micro_speech_bin

If the build works successfully, the final line of the output should appear as follows:

arm-none-eabi-objcopy tensorflow/lite/micro/tools/make/gen/sparkfun_edge_cortex-m4/bin/micro_speech tensorflow/lite/micro/tools/make/gen/sparkfun_edge_cortex-m4/bin/micro_speech.bin -O binary

To confirm that the binary was successfully created, run the following command:

test -f \
tensorflow/lite/micro/tools/make/gen/sparkfun_edge_cortex-m4/bin/micro_speech.bin && \
 echo "Binary was successfully created" || echo "Binary is missing"

You should see Binary was successfully created printed to the console! If you see Binary is missing, there was a problem with the build process that will require debugging.

Prepare the binary

The binary must be signed with cryptographic keys to be deployed to the device. We'll now run some commands that will sign our binary so it can be downloaded to the SparkFun Edge.

Enter the following command to set up some dummy cryptographic keys we can use for development:

cp tensorflow/lite/micro/tools/make/downloads/AmbiqSuite-Rel2.2.0/tools/apollo3_scripts/keys_info0.py tensorflow/lite/micro/tools/make/downloads/AmbiqSuite-Rel2.2.0/tools/apollo3_scripts/keys_info.py

Now, run the following command to create a signed binary:

python3 tensorflow/lite/micro/tools/make/downloads/AmbiqSuite-Rel2.2.0/tools/apollo3_scripts/create_cust_image_blob.py \
--bin tensorflow/lite/micro/tools/make/gen/sparkfun_edge_cortex-m4/bin/micro_speech.bin \
--load-address 0xC000 \
--magic-num 0xCB \
-o main_nonsecure_ota \
--version 0x0

This will create the file main_nonsecure_ota.bin. We'll now run another command to create a final version of the file that can be used to flash our device with the bootloader script we will use in the next step:

python3 tensorflow/lite/micro/tools/make/downloads/AmbiqSuite-Rel2.2.0/tools/apollo3_scripts/create_cust_wireupdate_blob.py \
--load-address 0x20000 \
--bin main_nonsecure_ota.bin \
-i 6 \
-o main_nonsecure_wire \
--options 0x1

You should now have a file called main_nonsecure_wire.bin in the directory where you ran the commands. This is the file we'll be flashing to the device.

5. Get ready to flash the binary

What is flashing?

The SparkFun Edge stores the program it is currently running in its 512 kilobytes of flash memory. If we want the board to run a new program, we have to send it to the board, which will store it in flash memory, overwriting any program that was previously saved.

This process is called "flashing", and we'll use it to send our program to the board.

Attach the programmer to the board

To download new programs to the board, we'll be using the SparkFun USB-C Serial Basic serial programmer. This device allows your computer to communicate with the microcontroller via USB.

To attach this device to your board, perform the following steps:

  1. Locate the six pin header on the side of the SparkFun Edge.
  2. Plug the SparkFun USB-C Serial Basic into these pins, ensuring the pins labelled "BLK" and "GRN" on each device are lined up correctly.


Attach the programmer to your computer

We'll be connecting the board to your computer via USB. To program the board, we'll need to know the name that your computer gives the device. The best way of doing this is to list all the computer's devices before and after attaching it, and look to see which device is new.

Before attaching the device via USB, run the following command:

If you are using Linux: ls /dev/tty*
If you are using MacOS: ls /dev/cu*

This should output a list of attached devices that looks something like the following:

/dev/cu.Bluetooth-Incoming-Port
/dev/cu.MALS
/dev/cu.SOC

Now, connect the programmer to your computer's USB port. Enter the following command again:

If you are using Linux: ls /dev/tty*
If you are using MacOS: ls /dev/cu*

You should see an extra item in the output, as in the example below. Your new item may have a different name. This new item is the name of the device.

/dev/cu.Bluetooth-Incoming-Port
/dev/cu.MALS
/dev/cu.SOC
/dev/cu.wchusbserial-1450
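
If you'd rather not compare the two listings by eye, the comparison can be automated. The following is a minimal C++ sketch and is not part of the codelab's tooling: the function name NewDevices is hypothetical, and it assumes you have already captured the "before" and "after" listings as lists of strings.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of the codelab): returns the entries that
// appear in `after` but not in `before`.
std::vector<std::string> NewDevices(std::vector<std::string> before,
                                    std::vector<std::string> after) {
  // std::set_difference requires both input ranges to be sorted.
  std::sort(before.begin(), before.end());
  std::sort(after.begin(), after.end());
  std::vector<std::string> added;
  std::set_difference(after.begin(), after.end(), before.begin(),
                      before.end(), std::back_inserter(added));
  return added;
}
```

With the example listings above, the only entry returned would be /dev/cu.wchusbserial-1450, which is the value to use as the device name.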

First, we'll create an environment variable to identify the device name:

export DEVICENAME=put your device name here

Next, we'll create an environment variable to specify the baud rate, which is the speed at which data will be sent to the device:

export BAUD_RATE=921600
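
To get a feel for what this baud rate means in practice, here is a rough C++ sketch that estimates how long a transfer takes. It is an illustration only: the function name is hypothetical, it assumes 8N1 serial framing (10 bits on the wire per data byte), and it ignores any protocol overhead added by the flashing script.

```cpp
#include <cstdint>

// Rough lower bound on the time needed to send `num_bytes` over a serial
// link running at `baud_rate` bits per second.
// Assumption: 8N1 framing, i.e. 10 bits on the wire per data byte
// (1 start bit + 8 data bits + 1 stop bit). Protocol overhead is ignored.
double EstimateTransferSeconds(uint32_t num_bytes, uint32_t baud_rate) {
  const double kBitsPerByteOnWire = 10.0;
  return (num_bytes * kBitsPerByteOnWire) / baud_rate;
}
```

For example, the sample flashing log later in this codelab sends a block of 0x158b0 (88,240) bytes. Under these assumptions, EstimateTransferSeconds(0x158b0, 921600) comes out at just under one second.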

6. Flash the binary

Run the script to flash your board

To flash the board, we have to put it into a special "bootloader" state that prepares it to receive the new binary. We'll then run a script to send the binary to the board.

Let's get familiar with the following buttons on the board:

64c620570b9d2f83.png

Perform the following steps to reset and flash the board:

  1. Ensure your board is connected to the programmer, and the entire setup is connected to your computer via USB.
  2. Start holding the button marked 14 on the board. Keep holding it until Step 6.
  3. Still holding the button marked 14, click the button marked RST to reset the board into its bootloader state.
  4. Still holding the button marked 14, paste the following command into your terminal and press Enter to run it. (For convenience, you can paste this command into your terminal before you start holding the button, but don't press Enter until you reach this step.)
python3 tensorflow/lite/micro/tools/make/downloads/AmbiqSuite-Rel2.2.0/tools/apollo3_scripts/uart_wired_update.py -b ${BAUD_RATE} ${DEVICENAME} -r 1 -f main_nonsecure_wire.bin -i 6
  5. Still holding the button marked 14, you should now see something like the following appearing on-screen:
Connecting with Corvette over serial port /dev/cu.usbserial-1440...
Sending Hello.
Received response for Hello
Received Status
length =  0x58
version =  0x3
Max Storage =  0x4ffa0
Status =  0x2
State =  0x7
AMInfo =
0x1
0xff2da3ff
0x55fff
0x1
0x49f40003
0xffffffff
[...lots more 0xffffffff...]
Sending OTA Descriptor =  0xfe000
Sending Update Command.
number of updates needed =  1
Sending block of size  0x158b0  from  0x0  to  0x158b0
Sending Data Packet of length  8180
Sending Data Packet of length  8180
[...lots more Sending Data Packet of length  8180...]
  6. You can stop holding the button marked 14 once you see Sending Data Packet of length 8180 (it's also fine to keep holding it). The program will continue to print lines on the terminal. It will eventually look something like the following:
[...lots more Sending Data Packet of length  8180...]
Sending Data Packet of length  8180
Sending Data Packet of length  6440
Sending Reset Command.
Done.

If you see Done, flashing was successful. If the program output ends with an error, check whether Sending Reset Command was printed. If so, flashing was likely successful despite the error.

On a Linux machine, you may encounter a NoResponse Error. This happens when the ch34x serial driver has been installed alongside the existing serial driver. You can resolve it as follows:

Step 1: Re-install the correct version of the ch34x library. Ensure that the device is unplugged from the computer during the installation.

git clone https://github.com/juliagoda/CH341SER.git
cd CH341SER/
make
sudo insmod ch34x.ko
sudo rmmod ch341

Step 2: Plug the board USB in and run:

dmesg | grep "ch34x"

You should see a message like this:

[ 1299.444724]  ch34x_attach+0x1af/0x280 [ch34x]
[ 1299.445386] usb 2-13.1: ch34x converter now attached to ttyUSB0

If the driver used is not "ch34x" (e.g., ch341), try disabling the other driver by running:

rmmod <non-ch34x driver name>

Unplug and replug the device and ensure that the driver being used is "ch34x".

7. Demo

Try the program out

Once your board has been successfully flashed, hit the button marked RST to restart the board and start the program. If the blue LED starts blinking, flashing was successful. If not, scroll down to the "What if it didn't work?" section below.


The machine learning model on the board is trained to recognize the words "yes" and "no", and to detect the presence and absence of speech. It communicates its results by lighting colored LEDs. The following table shows the meaning of each LED color:

Detection result      LED color
"Yes"                 Yellow
"No"                  Red
Unknown speech        Green
No speech detected    No LEDs lit

Give it a try

Hold the board up to your mouth and say "yes" a few times. You'll see the yellow LED flash. If nothing happens when you say "yes", here are some things to try:

  • Hold the board about 10 inches (25 cm) from your mouth
  • Avoid excessive background noise
  • Repeat "yes" several times in quick succession (try saying "yes yes yes")

What if it didn't work?

Here are some possible issues and how to debug them:

Problem: After flashing, none of the LEDs are coming on.

Solution: Try hitting the RST button, or disconnecting and reconnecting the board from the programmer. If none of these work, try flashing the board again.

Problem: The blue LED is lighting up, but it's very dim.

Solution: Replace the battery, as it's running low. Alternatively, power the board from your computer using the programmer and a USB-C cable.

8. Read the debug output (optional)

Review this section if you face issues and need to debug your code in detail. In order to understand what is going on in a microcontroller when your code runs, you can print debugging information through the board's serial connection. You use your computer to connect to the board and display the data that the board is sending.

Open a serial connection

By default, our SparkFun Edge sample code logs any spoken commands, along with their confidence. To see the board's output you can run the following command:

screen ${DEVICENAME} 115200

You may initially see output that looks something like the following. This banner only appears if the board is reset after you connect; otherwise, you may go straight to seeing debug information.

Apollo3 Burst Mode is Available

                               Apollo3 operating in Burst Mode (96MHz)

Try issuing some commands by saying "yes" or "no". You should see the board printing debug information for each command:

 Heard yes (202) @65536ms

In the above log, yes refers to the command. The number 202 refers to the level of confidence that the command was heard (with 200 being the minimum). Finally, 65536ms refers to the amount of time that has elapsed since the microcontroller was last reset.
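
If you capture this output on your computer, each line can be split back into its three fields. The following is a minimal host-side C++ sketch under stated assumptions: the HeardEvent struct and ParseHeardLine function are hypothetical names, and the line format is assumed to match the example above exactly.

```cpp
#include <cstdio>

// Hypothetical host-side record for one "Heard ..." debug line.
struct HeardEvent {
  char command[16];  // the command name, e.g. "yes" or "no"
  int score;         // confidence score printed in parentheses
  int time_ms;       // milliseconds since the board was last reset
};

// Parses a line such as " Heard yes (202) @65536ms".
// Returns true only if all three fields were read.
bool ParseHeardLine(const char* line, HeardEvent* event) {
  // %15s leaves room for the terminating '\0' in command[16].
  return std::sscanf(line, " Heard %15s (%d) @%dms", event->command,
                     &event->score, &event->time_ms) == 3;
}
```

Lines that don't match the pattern, such as the Burst Mode banner, make the function return false, so they can simply be skipped.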

To stop viewing the debug output, hit Ctrl+A, immediately followed by the K key, then hit the Y key.

Write debug logs

You can see the code that logs this information in the command_responder.cc file:

tensorflow/lite/micro/examples/micro_speech/sparkfun_edge/command_responder.cc

To log data, you can call the error_reporter->Report() method. It supports the standard printf tokens for string interpolation, which you can use to include important information in your logs:

error_reporter->Report("Heard %s (%d) @%dms", found_command, score, current_time);

This method should come in handy when you are making your own changes to the code in the next section.
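
To see how the printf tokens are filled in without flashing the board, you can reproduce the formatting on your computer. This is a host-side sketch only: FormatHeardLine is a hypothetical name, and it uses std::snprintf as a stand-in for the board's error reporter.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Host-side sketch: builds the same text that the Report() call above would
// log, using std::snprintf in place of the board's error reporter.
std::string FormatHeardLine(const char* found_command, uint8_t score,
                            int32_t current_time) {
  char buffer[64];
  // %s takes the command name; the two %d tokens take the score and the
  // timestamp, each passed as int.
  std::snprintf(buffer, sizeof(buffer), "Heard %s (%d) @%dms", found_command,
                static_cast<int>(score), static_cast<int>(current_time));
  return std::string(buffer);
}
```

Calling FormatHeardLine("yes", 202, 65536) yields the same text as the example log line shown earlier in this section.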

9. Extend the code (optional)

Now that you know how to build and flash your SparkFun Edge, you can start playing with the code and deploying it to your device to see the results.

Read the code

A good place to start reading the code is the following file, command_responder.cc.

tensorflow/lite/micro/examples/micro_speech/sparkfun_edge/command_responder.cc

You can also view this file in the TensorFlow repository on GitHub.

The method in this file, RespondToCommand, is called when a voice command is detected. The existing code turns on a different LED depending on whether "yes", "no", or an unknown command was heard. The following snippet shows how this works:

if (found_command[0] == 'y') {
  am_hal_gpio_output_set(AM_BSP_GPIO_LED_YELLOW);
}
if (found_command[0] == 'n') {
  am_hal_gpio_output_set(AM_BSP_GPIO_LED_RED);
}
if (found_command[0] == 'u') {
  am_hal_gpio_output_set(AM_BSP_GPIO_LED_GREEN);
}

The found_command argument contains the name of the command that was detected. By checking the first character, this set of if statements determines which LED to light.
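
The same decision can be written as a small function that you can test on your computer, without the board. This is a sketch for illustration: the Led enum and LedForCommand function are hypothetical and do not appear in command_responder.cc, which calls am_hal_gpio_output_set() directly.

```cpp
// Hypothetical enum used only for this sketch; the board code calls
// am_hal_gpio_output_set() with AM_BSP_GPIO_LED_* constants instead.
enum class Led { kNone, kYellow, kRed, kGreen };

// Mirrors the if statements above: only the first character of
// found_command is inspected.
Led LedForCommand(const char* found_command) {
  switch (found_command[0]) {
    case 'y': return Led::kYellow;  // "yes"
    case 'n': return Led::kRed;     // "no"
    case 'u': return Led::kGreen;   // unknown speech
    default:  return Led::kNone;    // any other label: no LED
  }
}
```

One consequence of checking only the first character is that any two commands starting with the same letter would light the same LED. If you add more words to the model, a full string comparison would be a safer choice.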

The method RespondToCommand is called with several arguments:

void RespondToCommand(tflite::ErrorReporter* error_reporter,
    int32_t current_time, const char* found_command,
    uint8_t score, bool is_new_command);

  • error_reporter is used to log debug information (more on that later).
  • current_time represents the time that the command was detected.
  • found_command tells us which command was detected.
  • score tells us how confident we are that we detected a command.
  • is_new_command lets us know if this is the first time hearing the command.

The score is an integer number from 0-255 that represents the probability that a command was detected. The sample code only considers a command as valid if the score is greater than 200. Based on our testing, most valid commands fall within the range of 200-210.
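
The threshold check described above can be sketched as follows. The constant and function names are hypothetical, and the conversion to a 0.0-1.0 value assumes the score maps linearly onto that range, which is an assumption made here for illustration.

```cpp
#include <cstdint>

// Threshold taken from the text above; the constant name is hypothetical.
constexpr uint8_t kDetectionThreshold = 200;

// A command counts as valid only when its score is strictly greater than
// the threshold.
bool IsValidCommand(uint8_t score) { return score > kDetectionThreshold; }

// Assumption for illustration: the 0-255 score maps linearly onto 0.0-1.0.
float ScoreToFraction(uint8_t score) { return score / 255.0f; }
```

Under the linear assumption, the threshold of 200 corresponds to roughly 0.78, so the typical 200-210 range sits in a fairly narrow band just above it.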

Modify the code

The SparkFun Edge board has four LEDs. Currently, we're flashing the blue LED to indicate that recognition is occurring. You can see this in the command_responder.cc file:

static int count = 0;

// Toggle the blue LED every time an inference is performed.
++count;
if (count & 1) {
  am_hal_gpio_output_set(AM_BSP_GPIO_LED_BLUE);
} else {
  am_hal_gpio_output_clear(AM_BSP_GPIO_LED_BLUE);
}

Since we have a bank of four LEDs, let's modify the program to use them as a visual indicator of the score of a given command. A low score will merit a single lit LED, and a high score will result in multiple lights.

To ensure we have a way to know that the program is running, we'll make the red LED flash continually instead of the blue. The adjacent blue, green, and yellow LEDs will be used to show the strength of our most recent score. And for simplicity, we'll only light up those LEDs if the word "yes" is spoken. If another word is detected, the LEDs will clear.

To make this change, replace all the code in your command_responder.cc file with the following snippet:

#include "tensorflow/lite/micro/examples/micro_speech/command_responder.h"

#include "am_bsp.h"

// This implementation will light up the LEDs on the board in response to different commands.
void RespondToCommand(tflite::ErrorReporter* error_reporter,
                      int32_t current_time, const char* found_command,
                      uint8_t score, bool is_new_command) {
  static bool is_initialized = false;
  if (!is_initialized) {
    // Setup LEDs as outputs
    am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_RED, g_AM_HAL_GPIO_OUTPUT_12);
    am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_BLUE, g_AM_HAL_GPIO_OUTPUT_12);
    am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_GREEN, g_AM_HAL_GPIO_OUTPUT_12);
    am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_YELLOW, g_AM_HAL_GPIO_OUTPUT_12);
    // Ensure all pins are cleared
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_RED);
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_BLUE);
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_GREEN);
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_YELLOW);
    is_initialized = true;
  }
  static int count = 0;

  // Toggle the red LED every time an inference is performed.
  ++count;
  if (count & 1) {
    am_hal_gpio_output_set(AM_BSP_GPIO_LED_RED);
  } else {
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_RED);
  }

  if (is_new_command) {
    // Clear the last three LEDs
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_BLUE);
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_GREEN);
    am_hal_gpio_output_clear(AM_BSP_GPIO_LED_YELLOW);
    error_reporter->Report("Heard %s (%d) @%dms", found_command, score,
                           current_time);
    // Only indicate a 'yes'
    if (found_command[0] == 'y') {
      // Always light the blue LED
      am_hal_gpio_output_set(AM_BSP_GPIO_LED_BLUE);
      // Light the other LEDs depending on score
      if (score >= 205) {
        am_hal_gpio_output_set(AM_BSP_GPIO_LED_GREEN);
      }
      if (score >= 210) {
        am_hal_gpio_output_set(AM_BSP_GPIO_LED_YELLOW);
      }
    }
  }
}

If a new command is detected, is_new_command will be true. We'll clear the blue, green, and yellow LEDs, then light them up again depending on the values of found_command and score.
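
To check the thresholds before flashing, the LED logic can be pulled out into a pure function and exercised on your computer. This is a sketch only: ScoreLedCount is a hypothetical helper that mirrors the snippet above and is not part of the sample code.

```cpp
#include <cstdint>

// Hypothetical helper: how many of the blue, green, and yellow LEDs the
// modified RespondToCommand lights for a new command.
int ScoreLedCount(const char* found_command, uint8_t score) {
  if (found_command[0] != 'y') return 0;  // only "yes" is indicated
  int count = 1;                          // blue is always lit for "yes"
  if (score >= 205) ++count;              // green
  if (score >= 210) ++count;              // yellow
  return count;
}
```

For instance, a "yes" with a score of 202 lights one LED, a score of 207 lights two, and a score of 210 or more lights all three. Any other command lights none of them.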

Rebuild and flash

Once you've made your code changes, test them by repeating all the steps, starting from "Build and prepare the binary".

10. Next Steps

Congratulations, you've successfully built your first speech detector on a microcontroller!

We hope you've enjoyed this brief introduction to development with TensorFlow Lite for Microcontrollers. The idea of deep learning on microcontrollers is new and exciting, and we encourage you to go out and experiment!

Reference docs


Thanks, and have fun building!