Getting Started with Machine Learning on Silicon Labs Devices

Introduction

Silicon Labs integrates TensorFlow as a component within our Gecko SDK and Project Configurator for our EFx32 series microcontrollers, making it simple to add machine learning capability to any application. This guide covers how to get started using TensorFlow Lite for Microcontrollers on Silicon Labs' EFx32 devices.

TensorFlow Lite for Microcontrollers

TensorFlow is a widely used deep learning framework, with capability for developing and executing neural networks across a variety of platforms. TensorFlow Lite provides an optimized set of tools specifically catered towards machine learning for mobile and embedded devices.

TensorFlow Lite for Microcontrollers (TFLM) specifically provides a C++ library for running machine learning models in embedded environments with tight memory constraints. Silicon Labs provides tools and support for loading and running pre-trained models that are compatible with this library.

Gecko SDK TensorFlow Integration

The Gecko SDK includes TensorFlow as a third-party submodule, allowing for easy integration and testing with Silicon Labs' projects. Note that the included TensorFlow version may differ from the latest release of TensorFlow.

Additionally, TensorFlow Software Components in the Project Configurator simplify the process of including the necessary dependencies to use TFLM in a project.

Developing a Machine Learning Model in TFLM

Block Diagram of TensorFlow Lite Micro workflow

When developing and training neural networks for use in embedded systems, it is important to note the limitations that TFLM places on model architecture and training. Embedded platforms also have significant performance constraints that must be considered when designing and evaluating a model. The embedded TFLM documentation links describe these limitations and considerations in detail.

Additionally, the TensorFlow Software Components in Studio require a quantized *.tflite representation of the trained model. Thus TensorFlow and Keras are the recommended platforms for model development and training, as both platforms are supported by the TensorFlow Lite Converter that generates .tflite model representations.

Both TensorFlow and Keras provide guides on model development and training:

Once a model has been created and trained in TensorFlow or Keras, it needs to be converted and serialized into a *.tflite file. During model conversion, it is important to optimize the memory usage of the model by quantizing it. It is highly recommended to use integer quantization on Silicon Labs devices.

A complete example demonstrating the training, conversion, and quantization of a simple TFLM-compatible neural network is available from TensorFlow:

Developing an Inference Application using Simplicity Studio and the Gecko SDK

Once a trained and quantized TFLite model is obtained, the next step is to set up the TFLM libraries to run inference on the EFx32 device.

Project Configurator Setup

The Project Configurator includes TFLM libraries as software components. These software components may be added to any existing project, and the TFLM Software Components are described in the SDK Component Overview. The core components needed for any machine learning project are:

  1. TensorFlow Lite Micro. This is the core software component that pulls in all the TFLM dependencies.

  2. A supported TFLM kernel implementation. A kernel is a specific hardware/platform implementation of a low-level operation used by TensorFlow. Kernel selection can drastically change the performance and computation time of a neural network.

  3. A supported TFLM debug logger. The Project Configurator defaults to using the IO Stream implementation of the logger. To disable logging entirely, add the TensorFlow Lite Micro Debug Log - None component.

In addition to these required TFLM components, software components for obtaining and pre-processing sensor data can be added to the project. For example, for audio applications, TensorFlow provides a microphone frontend component that includes powerful DSP features to filter and extract features from raw audio data. Silicon Labs-developed drivers for microphones, accelerometers, and other sensors provide a simple interface for obtaining sensor data to feed to a network.

Model Inclusion

With TensorFlow Lite Micro added in the Project Configurator, the next step is to load the model file into the project. To do this, simply copy the .tflite model file into the config/tflite directory of the project. The Project Configurator provides a tool that automatically converts .tflite files into sl_ml_model source and header files. The full documentation for this tool is available at Flatbuffer Conversion.

TFLM Initialization and Inference

To instantiate and use the TensorFlow APIs, follow the steps below to add TFLM functionality to the application layer of a project. This guide closely follows the TFLM Getting Started Guide, adapted for use with Silicon Labs' projects.

Special attention should be paid to the operations used by TensorFlow. Operations are specific types of computations executed by a layer in the neural network. All operations may be included in a project at once, but doing so may increase the binary size dramatically (>100 kB). The more efficient option is to include only the operations necessary to run a specific model. Both options are described in the steps below.

1. Include the library headers

If using a custom, limited set of operations (recommended):

#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"

If using all operations:

#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"

2. Include the model header

The autogen/ folder is always included in the project include paths, and so an imported model may be generically included in any project source file with:

#include "sl_ml_model.h"

3. Define a Memory Arena

TensorFlow requires a memory arena for runtime storage of input, output, and intermediate arrays. This arena should be statically allocated, and the size of the arena depends on the model used. It is recommended to start with a large arena size during prototyping.

constexpr int tensor_arena_size = 10 * 1024;
uint8_t tensor_arena[tensor_arena_size];

Note: After prototyping, it is recommended to manually tune the memory arena size to the model used. Once the model is finalized, start with a large arena size and incrementally decrease it until interpreter allocation (described below) fails, then use the smallest size for which allocation still succeeds.

4. Set up Logging

This should be performed even if the TensorFlow Lite Micro Debug Log - None component is used. It is recommended to instantiate the error reporter statically at file scope so that it is available to both app_init() and app_process_action():

static tflite::MicroErrorReporter micro_error_reporter;
tflite::ErrorReporter* error_reporter = &micro_error_reporter;

5. Load the Model

Continuing during the app_init() sequence, the next step is to load the model into tflite:

  const tflite::Model* model = ::tflite::GetModel(g_model);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    TF_LITE_REPORT_ERROR(error_reporter,
        "Model provided is schema version %d not equal "
        "to supported version %d.\n",
        model->version(), TFLITE_SCHEMA_VERSION);
  }

6. Instantiate the Operations Resolver

If using all operations, this is very straightforward. During app_init(), statically instantiate the resolver via:

  static tflite::AllOpsResolver resolver;

Note: loading all operations will result in large increases to the binary size. It is recommended to use a custom set of operations.

If using a custom set of operations, a mutable ops resolver must be configured and initialized. The required operations will vary based on the model and application; to determine the operations used by a given .tflite file, third-party tools such as Netron may be used to visualize the network and inspect which operations are in use.

Netron visualization from the TensorFlow Lite Micro hello_world example

The example below loads the minimal operators required for the TensorFlow hello_world example model. As shown in the Netron visualization, this only requires fully connected layers:

#define NUM_OPS 1

  static tflite::MicroMutableOpResolver<NUM_OPS> micro_op_resolver;
  if (micro_op_resolver.AddFullyConnected() != kTfLiteOk) {
    return;
  }

7. Initialize the Interpreter

The final step during app_init() is to instantiate an interpreter and allocate buffers within the memory arena for the interpreter to use:

// static declaration
tflite::MicroInterpreter* interpreter = nullptr;

  // initialization in app_init
  static tflite::MicroInterpreter interpreter_struct(model, micro_op_resolver,
                                                     tensor_arena, tensor_arena_size,
                                                     error_reporter);
  interpreter = &interpreter_struct;
  TfLiteStatus allocate_status = interpreter->AllocateTensors();
  if (allocate_status != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "AllocateTensors() failed");
    return;
  }

The allocation will fail if the arena is too small to fit all the operations and buffers required by the model. Adjust the tensor_arena_size accordingly to resolve the issue.
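As a starting point for this tuning, the interpreter can report how much of the arena it actually consumed after a successful allocation. The sketch below assumes the TFLM version bundled with the Gecko SDK provides MicroInterpreter::arena_used_bytes(); the reported value, plus some headroom, can then be used as the final tensor_arena_size.

  // After AllocateTensors() succeeds, report how much of the arena the model
  // actually required so that tensor_arena_size can be trimmed accordingly.
  size_t used_bytes = interpreter->arena_used_bytes();
  TF_LITE_REPORT_ERROR(error_reporter, "Arena used bytes: %d\n",
                       static_cast<int>(used_bytes));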

8. Run the Model

For default behavior in a bare-metal application, it is recommended to run the model during app_process_action() in app.c so that periodic inferences occur during the standard event loop. Running the model involves three stages:

  1. Sensor data is pre-processed (if necessary) and then is provided as input to the interpreter.
    TfLiteTensor* input = interpreter->input(0);
    // stores 0.0 to the input tensor of the model
    input->data.f[0] = 0.;

It is important to match the shape of the incoming sensor data to the shape expected by the model. This can be checked at runtime by inspecting the properties of the input tensor. An example of this check for the hello_world example is shown below:

  TfLiteTensor* input = interpreter->input(0);
  if ((input->dims->size != 1) || (input->type != kTfLiteFloat32)) {
    TF_LITE_REPORT_ERROR(error_reporter,
                         "Bad input tensor parameters in model");
    return;
  }
  2. The interpreter is then invoked to run all layers of the model.

    TfLiteStatus invoke_status = interpreter->Invoke();
    if (invoke_status != kTfLiteOk) {
      TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed\n");
      return;
    }
  3. The output prediction is read from the interpreter.

    TfLiteTensor* output = interpreter->output(0);
    // Obtain the output value from the tensor
    float value = output->data.f[0];

At this point, application-dependent behavior based on the output prediction should be performed. The application will run inference on each iteration of app_process_action().
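Note that the hello_world model above uses float32 input and output tensors. If the model was converted with the integer quantization recommended earlier, the tensors are int8 instead, and values must be converted using each tensor's scale and zero point. The sketch below illustrates this, using a hypothetical 0.5 threshold and handle_detection() callback to stand in for application-dependent behavior:

  // Quantize a float value into an int8 input tensor (real code should also
  // round and clamp to the int8 range)
  TfLiteTensor* input = interpreter->input(0);
  float sensor_value = 0.0f;  // placeholder for pre-processed sensor data
  input->data.int8[0] = static_cast<int8_t>(
      sensor_value / input->params.scale + input->params.zero_point);

  if (interpreter->Invoke() != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed\n");
    return;
  }

  // Dequantize the int8 output back to a float prediction
  TfLiteTensor* output = interpreter->output(0);
  float prediction = (output->data.int8[0] - output->params.zero_point)
                     * output->params.scale;

  // Hypothetical application behavior based on the prediction
  if (prediction > 0.5f) {
    handle_detection();  // hypothetical application callback
  }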

Full Code Snippet

After following the steps above and choosing to use the mutable ops resolver, the resulting app.cpp now appears as:

#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"

#include "sl_ml_model.h"

#define NUM_OPS 1

constexpr int tensor_arena_size = 10 * 1024;
uint8_t tensor_arena[tensor_arena_size];

tflite::MicroInterpreter* interpreter = nullptr;

static tflite::MicroErrorReporter micro_error_reporter;
tflite::ErrorReporter* error_reporter = &micro_error_reporter;

/***************************************************************************//**
 * Initialize application.
 ******************************************************************************/
void app_init(void)
{

  const tflite::Model* model = ::tflite::GetModel(g_model);
  if (model->version() != TFLITE_SCHEMA_VERSION) {
    TF_LITE_REPORT_ERROR(error_reporter,
        "Model provided is schema version %d not equal "
        "to supported version %d.\n",
        model->version(), TFLITE_SCHEMA_VERSION);
  }



  static tflite::MicroMutableOpResolver<NUM_OPS> micro_op_resolver;
  if (micro_op_resolver.AddFullyConnected() != kTfLiteOk) {
      return;
  }

  static tflite::MicroInterpreter interpreter_struct(model, micro_op_resolver,
                                                     tensor_arena, tensor_arena_size,
                                                     error_reporter);
  interpreter = &interpreter_struct;
  TfLiteStatus allocate_status = interpreter->AllocateTensors();
  if (allocate_status != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "AllocateTensors() failed");
    return;
  }
}

/***************************************************************************//**
 * App ticking function.
 ******************************************************************************/
void app_process_action(void)
{
  // stores 0.0 to the input tensor of the model
  TfLiteTensor* input = interpreter->input(0);
  input->data.f[0] = 0.;

  TfLiteStatus invoke_status = interpreter->Invoke();
  if (invoke_status != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed\n");
    return;
  }

  // Obtain the output value from the tensor; application-specific
  // handling of the prediction goes here
  TfLiteTensor* output = interpreter->output(0);
  float value = output->data.f[0];
  (void)value;
}

Examples

As described in TensorFlow Lite for Microcontrollers in the Gecko SDK, the Gecko SDK includes TensorFlow-developed examples: the hello_world application described in this guide, as well as a simple speech recognition example, micro_speech.

Note that the micro_speech example demonstrates use of the MicroMutableOpResolver to only load required operations.
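As a point of comparison with the single-operator hello_world resolver shown above, a resolver for a model like micro_speech registers several operators. The sketch below assumes the model uses depthwise convolution, fully connected, softmax, and reshape layers; always confirm the exact operator set for a given .tflite file with a tool such as Netron:

#define NUM_OPS 4

  // Register only the operators used by the speech model (assumed set)
  static tflite::MicroMutableOpResolver<NUM_OPS> micro_op_resolver;
  if ((micro_op_resolver.AddDepthwiseConv2D() != kTfLiteOk)
      || (micro_op_resolver.AddFullyConnected() != kTfLiteOk)
      || (micro_op_resolver.AddSoftmax() != kTfLiteOk)
      || (micro_op_resolver.AddReshape() != kTfLiteOk)) {
    return;
  }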

Third-party Tools and Partners

Tools:

AI/ML Partners:

Silicon Labs AI/ML partners provide expertise and platforms for data collection, model development, and training. See the partner pages below to learn more: