TensorFlow Lite for Microcontrollers in the Gecko SDK
TensorFlow Lite for Microcontrollers is a framework that provides a set of tools for running neural network inference on microcontrollers. The framework is limited to model inference and does not support training. It contains a wide selection of kernel operators with good support for 8-bit quantized networks.
Silicon Labs provides an integration of TensorFlow Lite for Microcontrollers with the Gecko SDK by introducing TensorFlow component support. This overview explains how to get started with TensorFlow Lite for Microcontrollers with the Gecko SDK.
SDK Component Overview
The components required to use TensorFlow Lite for Microcontrollers are found under Platform|Machine Learning|TensorFlow. The most important components are described here.
- TensorFlow Lite for Microcontrollers - This component contains the TensorFlow Lite for Microcontrollers framework and includes all necessary code to set up and perform model inference.
- TensorFlow Lite for Microcontrollers Init - This init component provides initialization of TensorFlow Lite Micro by creating an opcode resolver and interpreter for the given flatbuffer. In addition, it creates the tensor arena buffer, whose size can be set in the component configuration.
- TensorFlow Lite Micro Reference Kernels - This component provides the framework around all supported kernels and is automatically included with the TensorFlow Lite for Microcontrollers component. It also provides unoptimized software implementations of all kernels. These are the default implementations, which are easy to understand and can run on any platform, but they may run more slowly than optimized implementations.
- TensorFlow Lite Micro Optimized Kernels - Some kernels have implementations optimized for certain CPU architectures. Using these kernels when available can improve inference performance significantly. Enabling this component adds the available optimized kernel implementations to the project, replacing the corresponding reference implementations. The remaining kernels fall back to the reference implementations.
- TensorFlow Lite Micro Debug Logging IO Stream/None - Debug logging is used in TensorFlow to display debug and error information. Additionally, it can be used to display inference results. Debug logging is enabled by default, with an implementation that uses I/O Stream to print over UART to VCOM, and can be disabled by ensuring that the component "TensorFlow Lite Micro Debug Log - None" is included in the project.
- Audio Feature Generator - The audio feature generator can be used to extract time-frequency features from an audio signal for use with machine learning (ML) audio classification applications. The generated feature array is a mel-scaled spectrogram, representing the frequency information of the signal of a given sample length of audio.
TensorFlow Third Party Dependencies
A specific version of the CMSIS-NN library is used as a part of TensorFlow Lite for Microcontrollers to optimize certain kernels. This library is included in the project together with TensorFlow Lite for Microcontrollers. Because of a mismatch between this version and the CMSIS library included by the Gecko SDK, avoid using functions from the Gecko SDK version of CMSIS-DSP and CMSIS-NN elsewhere in the project.
Sample Applications
The following applications demonstrate the TensorFlow Lite for Microcontrollers framework with the Gecko SDK.
Voice Control Light
This application uses a neural network with TensorFlow Lite for Microcontrollers to detect the spoken words "on" and "off" in audio data recorded from the microphone in a Micrium OS kernel task.
The detected keywords are used to control an LED on the board. The audio data is sampled continuously and preprocessed using the Audio Feature Generator component. Inference is run every 200ms on the past ~1s of audio data.
TensorFlow Lite Micro - Hello World
This application demonstrates a model trained to replicate a sine function and uses the inference results to fade an LED. The application was originally written by the TensorFlow project and has been ported to the Gecko SDK.
The model used is approximately 2.5KB. The entire application takes around 157KB flash and 15KB RAM. This application uses large amounts of flash memory because it does not manually specify which operations are used in the model, and, as a result, compiles all kernel implementations.
The application illustrates a minimal inference application and serves as a good starting point for understanding the TensorFlow Lite for Microcontrollers model interpretation flow.
TensorFlow Lite Micro - Micro Speech
This application demonstrates a 20KB model trained to detect simple words in speech data recorded from a microphone. The application was originally written by the TensorFlow project and has been ported to the Gecko SDK.
This application uses around 100KB of flash and 37KB of RAM. Around 10KB of the RAM is used by the FFT frontend and for storing audio data. With a clock speed of 38.4MHz and the optimized kernel implementations, the inference time on ~1s of audio data is approximately 111ms.
This application illustrates the process of generating features from audio data and running detections in real time. It also demonstrates how to manually specify which operations are used in the network, which saves a significant amount of flash.
TensorFlow Lite Micro - Magic Wand
This application demonstrates a 10KB model trained to recognize various hand gestures, using an accelerometer to detect the motion. The detected gestures are printed to the serial port. The application was originally written by the TensorFlow project and has been ported to the Gecko SDK.
This application uses around 104KB of flash and 25KB of RAM. It demonstrates how to use accelerometer data as inference input and also shows how to manually specify which operations are used in the network, which saves a significant amount of flash.
TensorFlow Model Profiler
This application is designed to profile a TensorFlow Lite Micro model on Silicon Labs hardware. The model used by the application is provided as a TensorFlow Lite flatbuffer file called model.tflite in the config/tflite subdirectory. The profiler measures the number of CPU clock cycles and the elapsed time in each layer of the model when performing an inference, and produces a summary when inference is done. The input layer of the model is filled with all zeroes before performing a single inference. Profiling results are transmitted over VCOM.
To run the application with a different .tflite model, replace the file called model.tflite with a new TensorFlow Lite Micro flatbuffer file. The new file must also be named "model.tflite" and placed in the config/tflite subdirectory to be picked up by the sample application. After replacing the model, regenerate the project.
To load and perform inference on a TensorFlow Lite Micro model, a number of bytes must be allocated as a "tensor arena" to hold the state needed by TensorFlow Lite Micro. The size of this tensor arena depends on the size of the model and the number of operators. The TensorFlow Model Profiler can be used to measure the amount of RAM the tensor arena needs for a specific TensorFlow Lite Micro model: the application dynamically allocates RAM for the tensor arena and reports the number of bytes needed over VCOM. That number can later be used to statically allocate memory when the model is used in a different application.
Getting Started with Machine Learning on Silicon Labs Devices
Getting Started with Machine Learning provides step-by-step instructions on how to build a machine learning (ML) application using TensorFlow Lite for Microcontrollers on Silicon Labs devices.
Commit 3e190e5389be49c94475e509452bdae245bd4fa6 of TensorFlow Lite for Microcontrollers is used in the Gecko SDK.