Interrupt and Thread Safety in RAIL#
In embedded software development, some of the hardest problems to debug are caused by calling non-reentrant functions in interrupt context. It is therefore in the developer's best interest to design the application carefully to avoid these scenarios.
Doing so, however, requires detailed knowledge of the interrupts and API functions, which is not accessible in a closed-source product like RAIL. This document aims to provide the information required to develop interrupt- and thread-safe applications in RAIL.
Thread Safety#
In an application without a task scheduler, only an interrupt request can interrupt the main program. If you use a preemptive scheduler, like the scheduler available in most embedded OSes, higher priority tasks can interrupt lower priority tasks as well. Regardless, when looking at the RAIL APIs, the same concerns are present in either case:
Is it safe to interrupt this API?
Is it safe to call this API from a thread/interrupt which interrupted something?
The Event Handler#
RAIL uses an event handler, which is set up by RAIL_Init(). In our examples, it's usually called sl_rail_util_on_event(). This function is called by the RAIL library, and it's almost always called from an interrupt handler. This means the event handler should be used with care:
Interrupts are disabled while the event handler is running, so the function must return quickly.
More importantly, the function might be interrupting the main loop (or some other task).
Note that the first point above might not be completely true if interrupt priorities are used, in which case only interrupts at the same and lower priorities are disabled. However, the event handler will never be interrupted by another event handler, as all RAIL interrupts must be configured at the same priority.
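Because the handler may interrupt the main loop at any point, data shared between the two needs a well-defined hand-off. The following is a minimal host-side sketch of one way to do this, not RAIL code: the handler only accumulates event bits, and the main loop fetches and clears them in a short window with interrupts masked. All demo_ names are hypothetical, and the interrupt mask is simulated with a flag so the example builds on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds on a host PC. On a target,
 * these would be the platform's interrupt disable/enable primitives. */
static volatile bool demo_irq_masked = false;
static void demo_irq_disable(void) { demo_irq_masked = true; }
static void demo_irq_enable(void)  { demo_irq_masked = false; }

/* Event bits accumulated by the handler and consumed by the main loop. */
static volatile uint64_t demo_pending_events = 0;

/* Handler side: record what happened and return immediately. */
void demo_on_event(uint64_t events)
{
  demo_pending_events |= events;
}

/* Main-loop side: fetch and clear the accumulated bits in one short window
 * with interrupts masked, so an event arriving between the read and the
 * clear cannot be lost. No radio API is called inside the window. */
uint64_t demo_take_events(void)
{
  assert(!demo_irq_masked);  /* this sketch does not support nesting */
  demo_irq_disable();
  uint64_t snapshot = demo_pending_events;
  demo_pending_events = 0;
  demo_irq_enable();
  return snapshot;
}
```

The main loop would call demo_take_events() once per iteration and act on the returned bits, which keeps the time spent in the handler to a single OR operation.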
General Rules for the RAIL API#
First, let's collect the general rules of the API, and we'll detail exceptions in later points:
1. Calling any RAIL API from the main thread (or a single OS thread) is safe.
2. Calling any API from multiple threads is unsafe, except for DMP.
3. Calling most APIs from an interrupt handler is safe (see exceptions below).
Dynamic Multiprotocol (DMP)#
In general, if you have a multi-threaded application, you should use RAIL from a single thread. The exception to this guidance is DMP, where in most cases each protocol runs in its own thread. In this scenario, using RAIL from each thread is safe, as each protocol has its own rail_handle. So, a more generalized wording of rule 2 is:
Calling any API from multiple threads is only safe if each thread has a dedicated rail_handle, and each thread only accesses RAIL with its own handle.
The few APIs that don't use rail_handle - like RAIL_GetTime(), RAIL_Sleep(), or RAIL_Wake() - can be called from any thread.
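One way to make the "own handle only" rule hard to violate is to bind each handle to the thread that owns it. The sketch below is an illustration under assumptions, not part of RAIL: demo_handle_t stands in for a RAIL handle, demo_thread_id_t for whatever thread identifier the OS provides, and every demo_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void *demo_handle_t;   /* stand-in for a RAIL handle */
typedef int demo_thread_id_t;  /* stand-in for an OS thread identifier */

/* One protocol context per thread: the handle and the thread that owns it. */
typedef struct {
  demo_handle_t handle;
  demo_thread_id_t owner;
} demo_protocol_t;

/* Returns the protocol's handle if the caller owns it, NULL otherwise, so a
 * thread cannot accidentally use another protocol's handle. */
demo_handle_t demo_handle_for(const demo_protocol_t *protocol,
                              demo_thread_id_t caller)
{
  assert(protocol != NULL);
  return (protocol->owner == caller) ? protocol->handle : NULL;
}
```

Each protocol thread would fetch its handle through demo_handle_for() with its own identifier and treat a NULL result as a programming error.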
Interrupt Safety in General#
In general, calling an API which changes the radio state (i.e., between Rx, Idle and Tx) can be risky. The simplest way to write an interrupt-safe application is to not call state changing APIs from any interrupt handler, including the RAIL event handler. This can be achieved by setting a flag or changing a state variable in the event handler instead of calling an API directly:
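#include "rail.h"

typedef enum {
  S_IDLE,
  S_START_RX,
  S_START_TX,
} state_t;

// Shared between the main loop and the event handler, hence volatile.
volatile state_t state;
volatile RAIL_Time_t last_event;

// Assumed to be set by the init code in main().
static RAIL_Handle_t rail_handle;

int main(void)
{
  // init code
  state = S_START_TX;
  while (1) {
    switch (state) {
      case S_START_TX:
        // Clear the request before starting the operation, so a request
        // set by the event handler in the meantime is not overwritten.
        state = S_IDLE;
        RAIL_StartTx(rail_handle, 0, RAIL_TX_OPTIONS_DEFAULT, NULL);
        break;
      case S_START_RX:
        state = S_IDLE;
        RAIL_StartRx(rail_handle, 0, NULL);
        break;
      default:
        break;
    }
  }
  return 0;
}

// Runs in interrupt context: only record what happened and request the next
// operation; the state changing API is called from the main loop.
void sl_rail_util_on_event(RAIL_Handle_t rail_handle, RAIL_Events_t events)
{
  last_event = RAIL_GetTime();
  if (events & RAIL_EVENTS_TX_COMPLETION) {
    state = S_START_RX;
    RAIL_SetTxPower(rail_handle, 200);
  }
}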
Note that the event handler still calls some RAIL APIs (RAIL_GetTime() and RAIL_SetTxPower()), but neither of them is a state changing API.
Interrupt Safety with State Changing APIs#
In some (usually time-critical) cases, however, it's not possible to avoid calling state changing APIs from the event handler (or another interrupt handler). State changing APIs are not always risky: some are safe, as long as they don't interrupt another specific API.
Hence, each item in the following list names the initially running (i.e., "interrupted") API, then the interrupting API, and states how risky that combination is. The list also includes some combinations that are "safe" but whose end result is not predictable - i.e., the radio might be in Rx or in Idle, depending on which API is called first.
Interrupting RAIL_Start<something>() with another RAIL_Start<something>() is risky, especially if they would start on different channels.
Interrupting RAIL_Idle(handle, <something>, true) with any RAIL_Start<something>() is risky.
Interrupting RAIL_Idle(handle, <something>, false) with any RAIL_Start<something>() is safe, but the end result is not predictable (i.e., the radio will either be in Idle, or start the requested operation).
Interrupting RAIL_Start<something>() with RAIL_Idle() is safe but the end result is not predictable, and might cause strange events (see the next section for details).
Interrupting RAIL_StopTxStream() with any RAIL_Start<something>() is very risky (the radio might remain in test configuration and start transmitting/receiving).
Interrupting RAIL_StopTx() is safe. Interrupting RAIL_StopTx() with RAIL_Start<something>() is safe but the end result is not predictable (i.e., the radio will either be in Idle, or start the requested operation).
Interrupting anything with RAIL_StopTx() is safe (see next section for important clarification). Interrupting RAIL_StartTx() with RAIL_StopTx(handle, RAIL_STOP_MODE_ACTIVE) is safe, but not predictable.
Interrupting anything with RAIL_StopTxStream() is safe. Interrupting RAIL_StartTxStream() with RAIL_StopTxStream() is safe but not predictable.
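For design reviews, it can help to encode the list above as a lookup that takes the interrupted and the interrupting API and returns the verdict. The following is a sketch of that idea only: every demo_ identifier is hypothetical, the API classes are a simplification of the list, and combinations the list does not mention are reported as DEMO_NOT_LISTED rather than guessed.

```c
#include <assert.h>
#include <stdbool.h>

/* API classes named after the list above; all identifiers are hypothetical. */
typedef enum {
  DEMO_START_TX,         /* RAIL_StartTx() */
  DEMO_START_TX_STREAM,  /* RAIL_StartTxStream() */
  DEMO_START_OTHER,      /* any other RAIL_Start<something>() */
  DEMO_IDLE_WAIT,        /* RAIL_Idle(handle, <something>, true) */
  DEMO_IDLE_NO_WAIT,     /* RAIL_Idle(handle, <something>, false) */
  DEMO_STOP_TX,          /* RAIL_StopTx() */
  DEMO_STOP_TX_STREAM    /* RAIL_StopTxStream() */
} demo_api_t;

typedef enum {
  DEMO_SAFE,
  DEMO_SAFE_UNPREDICTABLE,
  DEMO_RISKY,
  DEMO_VERY_RISKY,
  DEMO_NOT_LISTED        /* combination not covered by the list */
} demo_risk_t;

static bool demo_is_start(demo_api_t api)
{
  return api == DEMO_START_TX || api == DEMO_START_TX_STREAM
         || api == DEMO_START_OTHER;
}

demo_risk_t demo_classify(demo_api_t interrupted, demo_api_t interrupting)
{
  if (demo_is_start(interrupting)) {
    if (demo_is_start(interrupted)) {
      return DEMO_RISKY;
    }
    switch (interrupted) {
      case DEMO_IDLE_WAIT:      return DEMO_RISKY;
      case DEMO_IDLE_NO_WAIT:   return DEMO_SAFE_UNPREDICTABLE;
      case DEMO_STOP_TX:        return DEMO_SAFE_UNPREDICTABLE;
      case DEMO_STOP_TX_STREAM: return DEMO_VERY_RISKY;
      default:                  return DEMO_NOT_LISTED;
    }
  }
  switch (interrupting) {
    case DEMO_IDLE_WAIT:
    case DEMO_IDLE_NO_WAIT:
      return demo_is_start(interrupted) ? DEMO_SAFE_UNPREDICTABLE
                                        : DEMO_NOT_LISTED;
    case DEMO_STOP_TX:
      /* The list singles out RAIL_StartTx() interrupted by
       * RAIL_StopTx(handle, RAIL_STOP_MODE_ACTIVE); this sketch applies that
       * verdict to every stop mode to stay on the cautious side. */
      return (interrupted == DEMO_START_TX) ? DEMO_SAFE_UNPREDICTABLE
                                            : DEMO_SAFE;
    case DEMO_STOP_TX_STREAM:
      return (interrupted == DEMO_START_TX_STREAM) ? DEMO_SAFE_UNPREDICTABLE
                                                   : DEMO_SAFE;
    default:
      assert(0 && "invalid demo_api_t value");
      return DEMO_NOT_LISTED;
  }
}
```

For example, demo_classify(DEMO_STOP_TX_STREAM, DEMO_START_TX) returns DEMO_VERY_RISKY, matching the test-configuration warning in the list.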
RAIL_Idle in the Event Handler#
Calling RAIL_Idle() or RAIL_StopTx(rail_handle, RAIL_STOP_MODE_ACTIVE) from the event handler might cause strange results. For example, let's say you're receiving on a channel and want to detect preambles using the events RAIL_EVENT_RX_PREAMBLE_DETECT and RAIL_EVENT_RX_PREAMBLE_LOST. The following scenario may unfold:
1. Preamble lost interrupt is received, so (at least) other radio interrupts are temporarily disabled.
2. You enter the event handler with RAIL_EVENT_RX_PREAMBLE_LOST.
3. At this point, the radio detects a preamble. The interrupt is logged, but the handler cannot run since the interrupts are masked.
4. Still in the event handler, you decide to turn off the radio with RAIL_Idle(railHandle, RAIL_IDLE_ABORT, true).
5. The radio turning off will generate a preamble lost interrupt.
6. The radio is now off, and you return from the event handler.
7. Interrupts are enabled again, so the pending preamble detect interrupt handler starts running.
8. You enter the event handler with RAIL_EVENT_RX_PREAMBLE_DETECT and RAIL_EVENT_RX_PREAMBLE_LOST both set at the same time.
So you end up with a preamble detect event, even though the radio is off. This is usually harmless, since you always have the _LOST or _ABORTED event as well, but this demonstrates why your design must carefully consider in what order to handle events.
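The ordering concern can be illustrated with a small sketch. It assumes the application tracks a "preamble present" flag and that a detect event arriving together with a lost event should leave that flag cleared. The DEMO_ bit values are hypothetical stand-ins for RAIL_EVENT_RX_PREAMBLE_DETECT and RAIL_EVENT_RX_PREAMBLE_LOST, and an _ABORTED event would presumably need the same treatment as the lost event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions standing in for RAIL_EVENT_RX_PREAMBLE_DETECT
 * and RAIL_EVENT_RX_PREAMBLE_LOST; real values come from the RAIL headers. */
#define DEMO_EVENT_PREAMBLE_DETECT (UINT64_C(1) << 0)
#define DEMO_EVENT_PREAMBLE_LOST   (UINT64_C(1) << 1)

/* Updates the application's view of the preamble from one handler call.
 * LOST is evaluated last, so when both bits arrive together the result is
 * "no preamble", which matches a radio that has just been idled. */
void demo_update_preamble_state(bool *preamble_present, uint64_t events)
{
  assert(preamble_present != NULL);
  if (events & DEMO_EVENT_PREAMBLE_DETECT) {
    *preamble_present = true;
  }
  if (events & DEMO_EVENT_PREAMBLE_LOST) {
    *preamble_present = false;
  }
}
```

Had the two checks been written in the opposite order, the combined event from step 8 would leave the flag set while the radio is off.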
The easiest way to avoid this conflicted outcome is to disable the events that might cause problems when turning off the radio.
Another way to avoid this issue is to use RAIL_Idle(rail_handle, RAIL_IDLE_FORCE_SHUTDOWN_CLEAR_FLAGS, true), which will clear the pending interrupts. However, using RAIL_IDLE_FORCE_SHUTDOWN_CLEAR_FLAGS has other drawbacks. It forces the radio state machine into the idle state, and it might corrupt the transmit or receive FIFOs - in which case it must clear them, losing all data that might already be in there. It could also take more time to finish running than RAIL_IDLE_ABORT.
Critical Blocks#
A common way to avoid interrupt safety issues is to create critical blocks in the main thread (a.k.a. atomic blocks, although on EFR32 the two terms can mean different things): code segments in which interrupts are disabled, so that they are never interrupted. However, this can create other problems, so it should be used carefully. There's no general rule to avoid this kind of "collateral damage", but here's an example that should be avoided:
Suppose RSSI averaging is running and, just before it finishes, RAIL_StartTx() is called from a critical block. The following race condition could happen:
1. We enter the critical block, interrupts are disabled.
2. RSSI averaging done interrupt is received, but the interrupt handler won't start since interrupts are masked.
3. RAIL_StartTx() turns off the radio, prepares it for transmit, then starts transmitting.
4. We leave the critical block, interrupts are enabled again.
5. RSSI averaging done interrupt handler runs at this point, which will turn off the radio, aborting the current transmit.
One way to avoid the problem above is to clear interrupts in the critical block. This can be done by using RAIL_Idle(handle, RAIL_IDLE_FORCE_SHUTDOWN_CLEAR_FLAGS, true) at the beginning of the critical block, but the drawbacks of doing so (mentioned above) should be kept in mind.
In general, it's better to avoid risky interrupts without using critical blocks in the main thread.
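The race described in this section can be replayed with a toy model. This is an illustration of the sequence only, not of RAIL internals: the structure, the demo_ functions, and the way a masked interrupt is latched and later delivered are all assumptions made for the sketch. The clearing step stands in for the flag-clearing effect the text attributes to RAIL_IDLE_FORCE_SHUTDOWN_CLEAR_FLAGS.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the radio and interrupt controller; not RAIL internals. */
typedef struct {
  bool irq_masked;         /* true inside the critical block */
  bool rssi_done_pending;  /* interrupt latched while masked */
  bool transmitting;
} demo_radio_t;

/* "RSSI averaging done" handler: turns the radio off, as in step 5. */
static void demo_rssi_done_handler(demo_radio_t *radio)
{
  radio->transmitting = false;
}

/* Hardware raises the interrupt: latch it if masked, else run the handler. */
static void demo_raise_rssi_done(demo_radio_t *radio)
{
  if (radio->irq_masked) {
    radio->rssi_done_pending = true;
  } else {
    demo_rssi_done_handler(radio);
  }
}

/* Leaving the critical block delivers any interrupt latched in the meantime. */
static void demo_exit_critical(demo_radio_t *radio)
{
  radio->irq_masked = false;
  if (radio->rssi_done_pending) {
    radio->rssi_done_pending = false;
    demo_rssi_done_handler(radio);
  }
}

/* Replays steps 1-5. With clear_pending set, the latched interrupt is
 * discarded before the transmit starts. Returns true if the transmit
 * survives the critical block. */
bool demo_tx_survives(bool clear_pending)
{
  demo_radio_t radio = { false, false, false };

  radio.irq_masked = true;            /* 1: enter the critical block */
  demo_raise_rssi_done(&radio);       /* 2: interrupt arrives, gets latched */
  assert(radio.rssi_done_pending);
  if (clear_pending) {
    radio.rssi_done_pending = false;  /* stands in for the flag clearing */
  }
  radio.transmitting = true;          /* 3: start transmitting */
  demo_exit_critical(&radio);         /* 4 and 5: unmask, handler may run */
  return radio.transmitting;
}
```

In this model, demo_tx_survives(false) reports an aborted transmit and demo_tx_survives(true) reports a surviving one. The model does not capture the FIFO and timing drawbacks of the forced shutdown.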
Using FORCE_SHUTDOWN#
In the two sections above, we mentioned two use cases where RAIL_IDLE_FORCE_SHUTDOWN_CLEAR_FLAGS can be useful. In general, however, RAIL_IDLE or RAIL_IDLE_ABORT is a sufficient and preferred way to stop transmitting/receiving; the FORCE_SHUTDOWN modes should therefore only be used when they are really needed (as in the specific scenarios described here).
For more details, see the article on the idle modes.