Getting Started with Machine Learning on Silicon Labs Devices
Introduction
Silicon Labs integrates TensorFlow as a component within our Gecko SDK and Project Configurator for our EFx32 series microcontrollers, making it simple to add machine learning capability to any application. This guide covers how to get started using TensorFlow Lite for Microcontrollers on Silicon Labs' EFx32 devices.
TensorFlow Lite for Microcontrollers
TensorFlow is a widely used deep learning framework, with capability for developing and executing neural networks across a variety of platforms. TensorFlow Lite provides an optimized set of tools specifically catered towards machine learning for mobile and embedded devices.
TensorFlow Lite for Microcontrollers (TFLM) specifically provides a C++ library for running machine learning models in embedded environments with tight memory constraints. Silicon Labs provides tools and support for loading and running pre-trained models that are compatible with this library.
Gecko SDK TensorFlow Integration
The Gecko SDK includes TensorFlow as a third-party submodule, allowing for easy integration and testing with Silicon Labs' projects. Note that the included TensorFlow version may differ from the latest release of TensorFlow.
Additionally, TensorFlow Software Components in the Project Configurator simplify the process of including the necessary dependencies to use TFLM in a project.
Developing a Machine Learning Model in TFLM
Block Diagram of TensorFlow Lite Micro workflow
When developing and training neural networks for use in embedded systems, it is important to note the limitations that TFLM places on model architecture and training. Embedded platforms also have significant performance constraints that must be considered when designing and evaluating a model. The TFLM documentation links describe these limitations and considerations in detail.
Additionally, the TensorFlow Software Components in Studio require a quantized *.tflite representation of the trained model. Thus, TensorFlow and Keras are the recommended platforms for model development and training, as both platforms are supported by the TensorFlow Lite Converter that generates .tflite model representations.
Both TensorFlow and Keras provide guides on model development and training:
Once a model has been created and trained in TensorFlow or Keras, it needs to be converted and serialized into a *.tflite file. During model conversion, it is important to optimize the memory usage of the model by quantizing it. It is highly recommended to use integer quantization on Silicon Labs devices.
A complete example demonstrating the training, conversion, and quantization of a simple TFLM-compatible neural network is available from TensorFlow:
Developing an Inference Application using Simplicity Studio and the Gecko SDK
Once a trained and quantized TFLite model is obtained, the next step is to set up the TFLM libraries to run inference on the EFx32 device.
Project Configurator Setup
The Project Configurator includes the TFLM libraries as software components. These software components may be added to any existing project and are described in the SDK Component Overview. The core components needed for any machine learning project are:
- TensorFlow Lite Micro. This is the core software component that pulls in all the TFLM dependencies.
- A supported TFLM kernel implementation. A kernel is a specific hardware/platform implementation of a low-level operation used by TensorFlow. Kernel selection can drastically change the performance and computation time of a neural network.
- A supported TFLM debug logger. The Project Configurator defaults to using the IO Stream implementation of the logger. To disable logging entirely, add the TensorFlow Lite Micro Debug Log - None component.
In addition to these required TFLM components, software components for obtaining and pre-processing sensor data can be added to the project. For example, for audio applications TensorFlow provides a microphone frontend component that includes powerful DSP features to filter and extract features from raw audio data. Silicon Labs-developed drivers for microphones, accelerometers, and other sensors provide a simple interface for obtaining sensor data to feed to a network.
Model Inclusion
With TensorFlow Lite Micro added in the Project Configurator, the next step is to load the model file into the project. To do this, simply copy the .tflite model file into the config/tflite directory of the project. The Project Configurator provides a tool that automatically converts .tflite files into sl_ml_model source and header files. The full documentation for this tool is available at Flatbuffer Conversion.
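For reference, the generated header exposes the converted model data as a C array that application code can hand to TensorFlow. The exact contents depend on the SDK version; a minimal sketch of what sl_ml_model.h may declare is shown below (g_model is the symbol used later in this guide, while g_model_len is an illustrative name):
// Sketch of the generated model header contents (actual generated code may differ)
// Flatbuffer data for the converted .tflite model, passed to tflite::GetModel()
extern const unsigned char g_model[];
// Size of the model data in bytes (illustrative name)
extern const int g_model_len;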
TFLM Initialization and Inference
To instantiate and use the TensorFlow APIs, follow the steps below to add TFLM functionality to the application layer of a project. This guide closely follows the TFLM Getting Started Guide and is adapted for use with Silicon Labs' projects.
A special note applies to the operations used by TensorFlow. Operations are specific types of computations executed by a layer in the neural network. All operations may be included in a project at once, but doing so may increase the binary size dramatically (>100 kB). The more efficient option is to include only the operations necessary to run a specific model. Both options are described in the steps below.
1. Include the library headers
If using a custom, limited set of operations (recommended):
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"
If using all operations:
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"
2. Include the model header
The
autogen/
folder is always included in the project include paths, and so an imported model may be generically included in any project source file with:
#include "sl_ml_model.h"
3. Define a Memory Arena
TensorFlow requires a memory arena for runtime storage of input, output, and intermediate arrays. This arena should be statically allocated, and the size of the arena depends on the model used. It is recommended to start with a large arena size during prototyping.
constexpr int tensor_arena_size = 10 * 1024;
uint8_t tensor_arena[tensor_arena_size];
Note: After prototyping, it is recommended to manually tune the memory arena size to the model used. Once the model is finalized, start with a large arena size and incrementally decrease it until interpreter allocation (described below) fails, then use the smallest size for which allocation still succeeds.
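In addition, many upstream TFLM examples declare the arena with explicit 16-byte alignment to satisfy tensor alignment requirements. A minimal sketch of the same declaration with alignment added (an optional refinement, not required by the Gecko SDK components):
// Memory arena with explicit 16-byte alignment, as used in many TFLM examples
constexpr int tensor_arena_size = 10 * 1024;
alignas(16) uint8_t tensor_arena[tensor_arena_size];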
4. Set up Logging
This should be performed even if the TensorFlow Lite Micro Debug Log - None component is used. It is recommended to instantiate this statically and call the logger init functions during the app_init() sequence in app.c:
static tflite::MicroErrorReporter micro_error_reporter;
tflite::ErrorReporter* error_reporter = &micro_error_reporter;
5. Load the Model
Continuing during the app_init() sequence, the next step is to load the model into TFLM:
const tflite::Model* model = ::tflite::GetModel(g_model);
if (model->version() != TFLITE_SCHEMA_VERSION) {
TF_LITE_REPORT_ERROR(error_reporter,
"Model provided is schema version %d not equal "
"to supported version %d.\n",
model->version(), TFLITE_SCHEMA_VERSION);
}
6. Instantiate the Operations Resolver
If using all operations, this is very straightforward. During app_init(), statically instantiate the resolver via:
static tflite::AllOpsResolver resolver;
Note: loading all operations will result in large increases to the binary size. It is recommended to use a custom set of operations.
If using a custom set of operations, a mutable ops resolver must be configured and initialized. This will vary based on the model and application; to determine the operations utilized in a given .tflite file, third-party tools such as Netron may be used to visualize the network and inspect which operations are in use.
Netron visualization from the TensorFlow Lite Micro hello_world example
The example below loads the minimal operators required for the TensorFlow hello_world example model. As shown in the Netron visualization, this only requires fully connected layers:
#define NUM_OPS 1
static tflite::MicroMutableOpResolver<NUM_OPS> micro_op_resolver;
if (micro_op_resolver.AddFullyConnected() != kTfLiteOk) {
return;
}
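For models that use several operation types, the template parameter of MicroMutableOpResolver must be at least the number of registered operations. As a hypothetical illustration only (the actual operation set must be read from the model, for example with Netron), a small convolutional classifier might be configured as follows:
// Hypothetical resolver for a model using convolution, fully connected, reshape, and softmax layers
#define NUM_OPS 4
static tflite::MicroMutableOpResolver<NUM_OPS> micro_op_resolver;
if (micro_op_resolver.AddConv2D() != kTfLiteOk
|| micro_op_resolver.AddFullyConnected() != kTfLiteOk
|| micro_op_resolver.AddReshape() != kTfLiteOk
|| micro_op_resolver.AddSoftmax() != kTfLiteOk) {
return;
}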
7. Initialize the Interpreter
The final step during app_init() is to instantiate an interpreter and allocate buffers within the memory arena for the interpreter to use:
// static declaration at file scope
tflite::MicroInterpreter* interpreter = nullptr;
// initialization in app_init()
static tflite::MicroInterpreter interpreter_struct(model, micro_op_resolver, tensor_arena,
tensor_arena_size, error_reporter);
interpreter = &interpreter_struct;
TfLiteStatus allocate_status = interpreter->AllocateTensors();
if (allocate_status != kTfLiteOk) {
TF_LITE_REPORT_ERROR(error_reporter, "AllocateTensors() failed");
return;
}
The allocation will fail if the arena is too small to fit all the operations and buffers required by the model. Adjust tensor_arena_size accordingly to resolve the issue.
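When tuning the arena size after prototyping, the interpreter can also report how much of the arena was actually consumed. The sketch below assumes the arena_used_bytes() accessor provided by recent TFLM versions; availability may vary with the TensorFlow version bundled in the Gecko SDK:
// Log actual arena usage after a successful AllocateTensors() to guide tensor_arena_size tuning
size_t used_bytes = interpreter->arena_used_bytes();
TF_LITE_REPORT_ERROR(error_reporter, "Arena used: %d of %d bytes",
static_cast<int>(used_bytes), tensor_arena_size);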
8. Run the Model
For default behavior in a bare-metal application, it is recommended to run the model during app_process_action() in app.c so that periodic inferences occur during the standard event loop. Running the model involves three stages:
- Sensor data is pre-processed (if necessary) and then provided as input to the interpreter.
TfLiteTensor* input = interpreter->input(0);
// stores 0.0 to the input tensor of the model
input->data.f[0] = 0.;
It is important to match the shape of the incoming sensor data to the shape expected by the model. This can optionally be queried by checking properties defined in the input struct. An example of this for the hello_world example is shown below:
TfLiteTensor* input = interpreter->input(0);
if ((input->dims->size != 1) || (input->type != kTfLiteFloat32)) {
TF_LITE_REPORT_ERROR(error_reporter,
"Bad input tensor parameters in model");
return;
}
- The interpreter is then invoked to run all layers of the model.
TfLiteStatus invoke_status = interpreter->Invoke();
if (invoke_status != kTfLiteOk) {
TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed on x_val: %f\n",
static_cast<double>(x_val));
return;
}
- The output prediction is read from the interpreter.
TfLiteTensor* output = interpreter->output(0);
// Obtain the output value from the tensor
float value = output->data.f[0];
At this point, application-dependent behavior based on the output prediction should be performed. The application will run inference on each iteration of app_process_action().
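The snippets above use the float accessors (data.f) from the hello_world example. For a fully integer-quantized model, as recommended earlier in this guide, the input and output tensors are typically int8 and values must be scaled using each tensor's quantization parameters. A minimal sketch, assuming an int8-quantized model and a float input value x_val:
// Quantize the float input using the input tensor's scale and zero point
TfLiteTensor* input = interpreter->input(0);
input->data.int8[0] = static_cast<int8_t>(x_val / input->params.scale + input->params.zero_point);
// ... invoke the interpreter as shown above ...
// Dequantize the int8 output back to a float value
TfLiteTensor* output = interpreter->output(0);
float y = (output->data.int8[0] - output->params.zero_point) * output->params.scale;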
Full Code Snippet
After following the steps above and choosing to use the mutable ops resolver, the resulting app.cpp now appears as:
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"
#include "sl_ml_model.h"
#define NUM_OPS 1
constexpr int tensor_arena_size = 10 * 1024;
uint8_t tensor_arena[tensor_arena_size];
tflite::MicroInterpreter* interpreter = nullptr;
static tflite::ErrorReporter* error_reporter = nullptr;
/***************************************************************************//**
* Initialize application.
******************************************************************************/
void app_init(void)
{
static tflite::MicroErrorReporter micro_error_reporter;
error_reporter = &micro_error_reporter;
const tflite::Model* model = ::tflite::GetModel(g_model);
if (model->version() != TFLITE_SCHEMA_VERSION) {
TF_LITE_REPORT_ERROR(error_reporter,
"Model provided is schema version %d not equal "
"to supported version %d.\n",
model->version(), TFLITE_SCHEMA_VERSION);
}
static tflite::MicroMutableOpResolver<NUM_OPS> micro_op_resolver;
if (micro_op_resolver.AddFullyConnected() != kTfLiteOk) {
return;
}
static tflite::MicroInterpreter interpreter_struct(model, micro_op_resolver, tensor_arena,
tensor_arena_size, error_reporter);
interpreter = &interpreter_struct;
TfLiteStatus allocate_status = interpreter->AllocateTensors();
if (allocate_status != kTfLiteOk) {
TF_LITE_REPORT_ERROR(error_reporter, "AllocateTensors() failed");
return;
}
}
/***************************************************************************//**
* App ticking function.
******************************************************************************/
void app_process_action(void)
{
// stores 0.0 to the input tensor of the model
float x_val = 0.f;
TfLiteTensor* input = interpreter->input(0);
input->data.f[0] = x_val;
TfLiteStatus invoke_status = interpreter->Invoke();
if (invoke_status != kTfLiteOk) {
TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed on x_val: %f\n",
static_cast<double>(x_val));
return;
}
TfLiteTensor* output = interpreter->output(0);
float value = output->data.f[0];
}
Examples
As described in TensorFlow Lite for Microcontrollers in the Gecko SDK, TensorFlow-developed examples demonstrating the hello_world example described in this guide, as well as a simple speech recognition example, micro_speech, are included in the Gecko SDK.
Note that the micro_speech example demonstrates use of the MicroMutableOpResolver to only load required operations.
Third-party Tools and Partners
Tools:
- Netron is a visualization tool for neural networks, compatible with .tflite model formats. This is useful for viewing the operations used in a model that need to be included with the MicroMutableOpResolver.
AI/ML Partners:
Silicon Labs AI/ML partners provide expertise and platforms for data collection, model development, and training. See the partner pages below to learn more: