TensorFlow Lite for Microcontrollers in the Gecko SDK

TensorFlow Lite for Microcontrollers is a framework that provides a set of tools for running neural network inference on microcontrollers. The framework is limited to model inference and does not support training. It contains a wide selection of kernel operators with good support for 8-bit quantized networks.

Silicon Labs integrates TensorFlow Lite for Microcontrollers into the Gecko SDK through a set of software components. This overview explains how to get started with TensorFlow Lite for Microcontrollers in the Gecko SDK.

SDK Component Overview

The components required to use TensorFlow Lite for Microcontrollers are found under Platform|Machine Learning|TensorFlow. The most important components are described below.

TensorFlow Third Party Dependencies

A specific version of the CMSIS-NN library is used as a part of TensorFlow Lite for Microcontrollers to optimize certain kernels. This library is included in the project together with TensorFlow Lite for Microcontrollers. Because of a mismatch between this version and the CMSIS library included by the Gecko SDK, avoid using functions from the Gecko SDK version of CMSIS-DSP and CMSIS-NN elsewhere in the project.

Sample Applications

The following applications demonstrate the TensorFlow Lite for Microcontrollers framework with the Gecko SDK.

Voice Control Light

This application uses a neural network, running with TensorFlow Lite for Microcontrollers in a Micrium OS kernel task, to detect the spoken words "on" and "off" in audio data recorded from the microphone.

The detected keywords are used to control an LED on the board. The audio data is sampled continuously and preprocessed using the Audio Feature Generator component. Inference is run every 200ms on the past ~1s of audio data.

TensorFlow Lite Micro - Hello World

This application demonstrates a model trained to replicate a sine function and uses the inference results to fade an LED. The application was originally written by the TensorFlow team and has been ported to the Gecko SDK.

The model used is approximately 2.5KB. The entire application takes around 157KB flash and 15KB RAM. This application uses large amounts of flash memory because it does not manually specify which operations are used in the model, and, as a result, compiles all kernel implementations.

The application illustrates a minimal inference application and serves as a good starting point for understanding the TensorFlow Lite for Microcontrollers model interpretation flow.
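The interpretation flow this sample follows can be sketched with the TensorFlow Lite for Microcontrollers C++ API. This is a minimal sketch, not the sample's actual source: the g_model symbol, the arena size, and the function names are illustrative assumptions, and the AllOpsResolver shown here is what causes every kernel to be compiled in.

```cpp
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Assumed: the model flatbuffer compiled into the binary from model.tflite.
extern const unsigned char g_model[];

// Illustrative arena size; the real value must be tuned per model.
constexpr int kTensorArenaSize = 2 * 1024;
static uint8_t tensor_arena[kTensorArenaSize];

static tflite::MicroErrorReporter error_reporter;
static tflite::AllOpsResolver resolver;  // Registers every kernel (large flash cost).
static tflite::MicroInterpreter* interpreter = nullptr;

void SetupModel() {
  // Map the flatbuffer into a usable model representation.
  const tflite::Model* model = tflite::GetModel(g_model);
  static tflite::MicroInterpreter static_interpreter(
      model, resolver, tensor_arena, kTensorArenaSize, &error_reporter);
  interpreter = &static_interpreter;
  // Allocate input/output/intermediate tensors from the arena.
  interpreter->AllocateTensors();
}

float PredictSine(float x) {
  // Copy the input value in, run inference, and read the result out.
  interpreter->input(0)->data.f[0] = x;
  interpreter->Invoke();
  return interpreter->output(0)->data.f[0];
}
```

The returned value can then be mapped to a PWM duty cycle to fade the LED.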

TensorFlow Lite Micro - Micro Speech

This application demonstrates a 20KB model trained to detect simple words in speech data recorded from a microphone. The application was originally written by the TensorFlow team and has been ported to the Gecko SDK.

This application uses around 100KB flash and 37KB of RAM. Around 10KB of the RAM usage is consumed by the FFT frontend and by audio data storage. With a clock speed of 38.4MHz and the optimized kernel implementations, inference on ~1s of audio data takes approximately 111ms.

This application illustrates the process of generating features from audio data and doing detections in real time. It also demonstrates how to manually specify which operations are used in the network, which saves a significant amount of flash.
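Manually specifying the operations can be sketched with a MicroMutableOpResolver. The four operators listed below are an assumption about what a small speech model might need; the real list must exactly match the layers in the model being deployed.

```cpp
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"

// The template parameter is the maximum number of registered operators.
// The operator list here is illustrative and must match the model.
static tflite::MicroMutableOpResolver<4> resolver;

void RegisterOps() {
  resolver.AddDepthwiseConv2D();
  resolver.AddFullyConnected();
  resolver.AddSoftmax();
  resolver.AddReshape();
}
```

Passing this resolver to the MicroInterpreter instead of an AllOpsResolver means the linker only pulls in the listed kernels, which is where the flash savings come from.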

TensorFlow Lite Micro - Magic Wand

This application demonstrates a 10KB model trained to recognize various hand gestures, using an accelerometer to detect the motion. The detected gestures are printed to the serial port. The application was originally written by the TensorFlow team and has been ported to the Gecko SDK.

This application uses around 104KB flash and 25KB of RAM. This application demonstrates how to use accelerometer data as inference input and also shows how to manually specify which operations are used in the network, which saves a significant amount of flash.

TensorFlow Model Profiler

This application profiles a TensorFlow Lite Micro model on Silicon Labs hardware. The model is provided as a TensorFlow Lite flatbuffer file named model.tflite in the config/tflite subdirectory. The profiler measures the number of CPU clock cycles and the elapsed time spent in each layer of the model while performing an inference, and produces a summary when the inference is done. The input layer of the model is filled with zeroes before a single inference is performed. Profiling results are transmitted over VCOM.

To run the application with a different .tflite model, replace the file called model.tflite with a new TensorFlow Lite Micro flatbuffer file. The new file must also be named "model.tflite" and placed inside the config/tflite subdirectory in order to be picked up by the sample application. After replacing the model, regenerate the project.

To load and perform inference on a TensorFlow Lite Micro model, a number of bytes must be allocated as a "tensor arena" to hold the state needed by TensorFlow Lite Micro. The size of this tensor arena depends on the size of the model and the number of operators it uses. The Model Profiler can be used to measure the amount of RAM the tensor arena requires for a specific model: it dynamically allocates RAM for the arena and reports the number of bytes needed over VCOM. This number can later be used to statically allocate memory when the model is used in a different application.
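The measurement the profiler performs can be sketched with the interpreter's arena_used_bytes() accessor. The oversized scratch arena, the function name, and the printf destination are assumptions for illustration; the profiler's actual implementation may differ.

```cpp
#include <cstdio>
#include "tensorflow/lite/micro/micro_interpreter.h"

// Deliberately oversized scratch arena, used only for measurement.
constexpr size_t kScratchArenaSize = 128 * 1024;
static uint8_t scratch_arena[kScratchArenaSize];

// interpreter is assumed to be a tflite::MicroInterpreter constructed
// with the model, an op resolver, and scratch_arena.
size_t MeasureArena(tflite::MicroInterpreter& interpreter) {
  if (interpreter.AllocateTensors() != kTfLiteOk) {
    return 0;  // Even the scratch arena was too small.
  }
  // Number of bytes actually consumed by tensors and scratch buffers.
  size_t used = interpreter.arena_used_bytes();
  printf("Tensor arena needs %u bytes\r\n", (unsigned)used);
  return used;
}
```

The reported value can then be used as the size of a statically allocated arena in the final application, typically with a small safety margin.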

Getting Started with Machine Learning on Silicon Labs Devices

Getting Started with Machine Learning provides step-by-step instructions on how to build a machine learning (ML) application using TensorFlow Lite for Microcontrollers on Silicon Labs devices.

Version

Commit 3e190e5389be49c94475e509452bdae245bd4fa6 of TensorFlow Lite for Microcontrollers is used in the Gecko SDK.