In multi-core embedded Rust, can I use a static mut for one-way data sharing?


In multi-core embedded Rust, is it appropriate to use a static mut for one-way data sharing from one core to the other?

Here’s the code (using embassy):

#![no_std]
// …

static mut CORE1_STACK: Stack<4096> = Stack::new();
static EXECUTOR0: StaticCell<Executor> = StaticCell::new();
static EXECUTOR1: StaticCell<Executor> = StaticCell::new();

static mut one_way_data_exchange: u8 = 0;


#[cortex_m_rt::entry]
fn main() -> ! {
    spawn_core1(p.CORE1, unsafe { &mut CORE1_STACK }, move || {
        let executor1 = EXECUTOR1.init(Executor::new());
        executor1.run(|spawner| unwrap!(spawner.spawn(core1_task())));
    });

    let executor0 = EXECUTOR0.init(Executor::new());
    executor0.run(|spawner| unwrap!(spawner.spawn(core0_task())));
}

#[embassy_executor::task]
async fn core0_task() {
    info!("Hello from core 0");
    loop {
        unsafe { one_way_data_exchange = 128; } // sensor value
    }
}

#[embassy_executor::task]
async fn core1_task() {
    info!("Hello from core 1");
    let mut sensor_val: u8 = 0;
    loop {
        unsafe { sensor_val = one_way_data_exchange; }

        // continue with rest of program
    }
}

If I were to be writing to the static var from both cores, that would obviously create a race condition. But if I only ever write from one core and only ever read from the other core, does that solve the race condition? Or, is it still problematic for both cores to be accessing it in parallel, even if only one is writing?

The order of read->write or write->read, in this case, doesn’t matter. One core is just creating a stream of IO input and the other dips into that stream whenever it’s ready to process the loop again, even if it misses some intermittent inputs.

Accepted answer, by Finomnis:

No, it is still a problem.

  • Writes are not always guaranteed to be atomic. For example, on a 32-bit system a u64 takes multiple CPU cycles to write, so the reading side could observe a value that is only half updated.
  • Unsynchronized concurrent access breaks soundness: a data race is undefined behavior in Rust, and the compiler can no longer prove that your code is free of it.
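To make the tearing hazard concrete, here is a single-threaded sketch (the helper names and the interleaving are hypothetical; a real torn read is a data race and cannot be demonstrated soundly) of what a 32-bit CPU effectively does when it writes a u64 in two halves:

```rust
// Simulate a torn u64 write: a 32-bit CPU stores the two halves separately,
// so a reader running between the stores can observe a value that is
// neither the old value nor the new one.
fn store_low_half(shared: u64, new: u64) -> u64 {
    (shared & 0xFFFF_FFFF_0000_0000) | (new & 0x0000_0000_FFFF_FFFF)
}

fn store_high_half(shared: u64, new: u64) -> u64 {
    (new & 0xFFFF_FFFF_0000_0000) | (shared & 0x0000_0000_FFFF_FFFF)
}

fn main() {
    let old: u64 = 0;
    let new: u64 = 0xDEAD_BEEF_CAFE_F00D;

    // first machine store: only the low 32 bits are updated
    let after_first_store = store_low_half(old, new);

    // a reader at this instant sees a mix of old and new bits
    assert_ne!(after_first_store, old);
    assert_ne!(after_first_store, new);

    // second machine store completes the write
    let after_second_store = store_high_half(after_first_store, new);
    assert_eq!(after_second_store, new);
}
```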

It is true that the hardware handles accesses to very simple primitive types like this atomically. You don't need static mut for it, though - there are mechanisms built into the language / core library so you don't have to resort to static mut. In this case, the important one is the atomic types in core::sync::atomic.

Atomics provide something called interior mutability. This means your value can be a static without mut and can be shared normally, while the type itself provides the mutability.
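As a minimal sketch of what interior mutability means (the COUNTER/bump names are illustrative, not from the question): the atomic's methods take &self, so a shared, non-mut static can still be modified, with no unsafe anywhere:

```rust
use core::sync::atomic::{AtomicU8, Ordering};

// no `mut` and no `unsafe`: the AtomicU8 itself provides the mutability
static COUNTER: AtomicU8 = AtomicU8::new(0);

// note the shared reference: interior mutability means &self suffices to mutate
fn bump(counter: &AtomicU8) {
    counter.fetch_add(1, Ordering::Relaxed);
}

fn main() {
    bump(&COUNTER);
    bump(&COUNTER);
    assert_eq!(COUNTER.load(Ordering::Relaxed), 2);
}
```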

Let me demonstrate. As I don't have a microcontroller available right now, I rewrote your example for normal execution:

use core::time::Duration;

// the pattern from the question; note that this unsynchronized access is technically UB
static mut ONE_WAY_DATA_EXCHANGE: u8 = 0;

fn main() {
    std::thread::scope(|s| {
        s.spawn(thread0);
        s.spawn(thread1);
    });
}

fn thread0() {
    std::thread::sleep(Duration::from_millis(500));
    unsafe { ONE_WAY_DATA_EXCHANGE = 42 };
}

fn thread1() {
    for _ in 0..4 {
        std::thread::sleep(Duration::from_millis(200));
        let value = unsafe { ONE_WAY_DATA_EXCHANGE };
        println!("{}", value);
    }
}
Output:

0
0
42
42

Here is how it looks when implemented with an atomic:

use core::{
    sync::atomic::{AtomicU8, Ordering},
    time::Duration,
};

static ONE_WAY_DATA_EXCHANGE: AtomicU8 = AtomicU8::new(0);

fn main() {
    std::thread::scope(|s| {
        s.spawn(thread0);
        s.spawn(thread1);
    });
}

fn thread0() {
    std::thread::sleep(Duration::from_millis(500));
    ONE_WAY_DATA_EXCHANGE.store(42, Ordering::Release);
}

fn thread1() {
    for _ in 0..4 {
        std::thread::sleep(Duration::from_millis(200));
        let value = ONE_WAY_DATA_EXCHANGE.load(Ordering::Acquire);
        println!("{}", value);
    }
}
Output:

0
0
42
42

Note that the code no longer contains any unsafe; the compiler can verify that it is sound, and it has (almost) no runtime overhead.


To demonstrate how little overhead this really causes:

#![no_std]

use core::sync::atomic::{AtomicU8, Ordering};

static SHARED_VALUE_ATOMIC: AtomicU8 = AtomicU8::new(0);

pub fn write_static_atomic(val: u8){
    SHARED_VALUE_ATOMIC.store(val, Ordering::SeqCst)
}

pub fn read_static_atomic() -> u8 {
    SHARED_VALUE_ATOMIC.load(Ordering::SeqCst)
}

static mut SHARED_VALUE_STATICMUT: u8 = 0;

pub fn write_static_staticmut(val: u8){
    unsafe {
        SHARED_VALUE_STATICMUT = val;
    }
}

pub fn read_static_staticmut() -> u8 {
    unsafe {
        SHARED_VALUE_STATICMUT
    }
}

The code compiles to the following, using the flags -C opt-level=3 -C linker-plugin-lto --target=thumbv6m-none-eabi:

example::write_static_atomic:
        dmb     sy
        ldr     r1, .LCPI0_0
        strb    r0, [r1]
        dmb     sy
        bx      lr
.LCPI0_0:
        .long   example::SHARED_VALUE_ATOMIC.0

example::read_static_atomic:
        ldr     r0, .LCPI1_0
        ldrb    r0, [r0]
        dmb     sy
        bx      lr
.LCPI1_0:
        .long   example::SHARED_VALUE_ATOMIC.0

example::write_static_staticmut:
        ldr     r1, .LCPI2_0
        strb    r0, [r1]
        bx      lr
.LCPI2_0:
        .long   example::SHARED_VALUE_STATICMUT.0

example::read_static_staticmut:
        ldr     r0, .LCPI3_0
        ldrb    r0, [r0]
        bx      lr
.LCPI3_0:
        .long   example::SHARED_VALUE_STATICMUT.0

example::SHARED_VALUE_ATOMIC.0:
        .byte   0

example::SHARED_VALUE_STATICMUT.0:
        .byte   0

An AtomicU8 on thumbv6m-none-eabi seems to have almost zero overhead. The only difference is the dmb sy instructions, memory barriers that enforce the requested ordering guarantees (SeqCst here); using Ordering::Relaxed (if your problem allows it) should eliminate those, giving actually zero overhead. Other architectures should behave similarly.
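A sketch of that Relaxed variant (the SHARED_VALUE_RELAXED name mirrors the naming above and is mine); on thumbv6m-none-eabi this should compile down to the same plain ldr/strb sequence as the static mut version, with the dmb barriers gone:

```rust
use core::sync::atomic::{AtomicU8, Ordering};

static SHARED_VALUE_RELAXED: AtomicU8 = AtomicU8::new(0);

// Relaxed guarantees atomicity of the single access but imposes no ordering
// on surrounding memory operations, which is enough for a lone u8 stream
// where the reader only wants "some recent value"
pub fn write_static_relaxed(val: u8) {
    SHARED_VALUE_RELAXED.store(val, Ordering::Relaxed)
}

pub fn read_static_relaxed() -> u8 {
    SHARED_VALUE_RELAXED.load(Ordering::Relaxed)
}
```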