Till: Part 2

Implementing MVP and testing on an Arduino

January 08 2024

If you haven’t already, see Till: Part 1 for an introduction to the project.

In this part I’ll implement the traits written in the first part and then test them on an Arduino Nano.

Implementation

Marshall

I’ll introduce the Marshall implementation first since it’s quite short but does have some details worth pointing out.

/// An executor marshall that is only safe to be used from a single thread.
///
/// Its implementation of [Sync] is only so that you can have static values.
pub struct SingleThreadMarshall {
    ptr: Cell<*const Cell<bool>>,
}

// SAFETY: creating values of this type is unsafe and requires upholding a
// contract that ensures no UB can occur due to what would otherwise be an
// unsound impl of Sync for this type.
unsafe impl Sync for SingleThreadMarshall {}

impl SingleThreadMarshall {
    /// You must not share references to this value between threads. All method calls
    /// on this value must come from the same thread for its entire lifetime.
    ///
    /// If this was called outside of a static initialiser, the value can only be used
    /// from the thread that created it.
    ///
    /// Its implementation of [Sync] is only so that you can have static values.
    pub const unsafe fn new() -> Self {
        Self {
            ptr: Cell::new(core::ptr::null()),
        }
    }

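    /// # Safety
    ///
    /// `b` must stay valid and must not be moved or destroyed until
    /// [Self::unregister] has been called.
    pub unsafe fn register(&self, b: &Cell<bool>) {
        self.ptr.set(b);
    }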

    pub unsafe fn unregister(&self) {
        self.ptr.set(core::ptr::null());
    }
}

Unfortunately this implementation ends up with some unsafe code and, worse, a dangerous unsafe Sync impl for a type that clearly can’t be shared between threads. This is an example of Rust’s assumptions about the target environment being wrong and consequently constraining us more tightly than our target actually requires.

In Rust, static variables must be of types that implement Sync since the language can’t (or at least doesn’t) restrict access by thread. That lets us ‘fearlessly’ add multithreading to our programs without risk of UB, but it’s not a free lunch. Our types have to be designed to actually be thread-safe, and that’s on us. Since my goal is to test on an ATmega328, the Sync requirement is all downside and no upside, as there’s no threading on that target anyway.

You may wonder why I don’t use thread_local!, since that lets you avoid the Sync requirement. It’s because it’s not part of core, it’s part of std, and goal 1 was to support no_std targets.

This boxes me into the unsafe impl with little alternative. To try to make it a bit clearer how dangerous this type is for threaded targets, I’ve made the new method unsafe with a doc comment that describes a contract that undoes the Sync.

With that annoying detail out of the way, we can talk about why register and unregister are also unsafe. The marshall’s job is to provide a 'static lifetime stepping stone for a waker to send a signal to a task manager. The task manager might have a lifetime shorter than 'static, so the only way to keep the marshall 'static is to unsafely extend the reference lifetime or, equivalently, use a raw pointer. The task manager could be destroyed or moved while a waker tied to this marshall still exists, which in the &'static case would be instant UB and in the raw pointer case would be UB when the waker attempts to signal the task manager. Making registration with the marshall unsafe makes it clear that it is the programmer’s responsibility to unregister before moving or destroying the task manager.

This probably could be simplified with some sort of registration handle which borrows the task manager and unregisters it from the marshall when the handle is dropped, but it would probably have to involve pinning to guarantee that it is actually dropped and not forgotten. It might also cause issues with not being able to mutably borrow the task manager for anything else, so for now this manual and more error-prone approach will have to do.
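To make the registration-handle idea concrete, here is a minimal sketch. `DemoMarshall` and `Registration` are hypothetical names that are not part of Till, and the marshall is a cut-down stand-in for `SingleThreadMarshall`. The constructor stays `unsafe` because nothing stops a caller from leaking the handle with `core::mem::forget`, which is the gap that pinning would have to close.

```rust
use core::cell::Cell;
use core::marker::PhantomData;

/// Cut-down, hypothetical stand-in for `SingleThreadMarshall`.
pub struct DemoMarshall {
    ptr: Cell<*const Cell<bool>>,
}

impl DemoMarshall {
    pub const fn new() -> Self {
        Self { ptr: Cell::new(core::ptr::null()) }
    }

    pub fn is_registered(&self) -> bool {
        !self.ptr.get().is_null()
    }

    /// Sets the registered flag, or does nothing if no flag is registered.
    pub fn wake(&self) {
        let ptr = self.ptr.get();
        if !ptr.is_null() {
            // SAFETY: relies on the contract of `Registration::new`: a
            // non-null pointer is only present while the handle is alive.
            unsafe { (*ptr).set(true) };
        }
    }
}

/// Hypothetical handle: borrows the flag for `'a` and unregisters on drop.
pub struct Registration<'a> {
    marshall: &'a DemoMarshall,
    _flag: PhantomData<&'a Cell<bool>>,
}

impl<'a> Registration<'a> {
    /// # Safety
    ///
    /// The handle must be dropped, not leaked (for example with
    /// `core::mem::forget`), before the flag is moved or destroyed.
    pub unsafe fn new(marshall: &'a DemoMarshall, flag: &'a Cell<bool>) -> Self {
        marshall.ptr.set(flag);
        Self { marshall, _flag: PhantomData }
    }
}

impl Drop for Registration<'_> {
    fn drop(&mut self) {
        // Clear the pointer so later wakes become no-ops.
        self.marshall.ptr.set(core::ptr::null());
    }
}
```

In the real setup the marshall is 'static and the flag lives inside a pinned task, so the borrow held by the handle would also block other mutable access to that task, which is the ergonomic cost mentioned above.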

The implementation of the Marshall trait is straightforward:

impl Marshall for SingleThreadMarshall {
    fn wake(&'static self) {
        let ptr = self.ptr.get();
        if ptr.is_null() {
            return;
        }
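        // SAFETY: the contract of `register` requires the flag to stay valid
        // until it is unregistered, so a non-null pointer can be dereferenced.
        unsafe {
            (*ptr).set(true);
        }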
    }

    fn waker(&'static self) -> core::task::Waker {
        unsafe {
            core::task::Waker::from_raw(core::task::RawWaker::new(
                self as *const Self as *const (),
                &SINGLE_THREAD_MARSHALL_VTABLE,
            ))
        }
    }
}

The SINGLE_THREAD_MARSHALL_VTABLE just calls the marshall’s wake method for both wake and wake_by_ref.
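The vtable itself isn’t shown, so here is a sketch of what such a vtable could look like. All names are hypothetical, and `FlagMarshall` owns an atomic flag rather than pointing at a registered `Cell<bool>`, purely so the sketch runs on a desktop target without an `unsafe impl Sync`. The shape is the interesting part: one function serves both `wake` and `wake_by_ref`, cloning copies the pointer, and dropping does nothing because the waker owns nothing.

```rust
use core::sync::atomic::{AtomicBool, Ordering};
use core::task::{RawWaker, RawWakerVTable, Waker};

/// Hypothetical stand-in for `SingleThreadMarshall`. It owns an atomic flag
/// so the sketch runs on a desktop target without an `unsafe impl Sync`.
pub struct FlagMarshall {
    woken: AtomicBool,
}

impl FlagMarshall {
    pub const fn new() -> Self {
        Self { woken: AtomicBool::new(false) }
    }

    pub fn wake(&'static self) {
        self.woken.store(true, Ordering::SeqCst);
    }

    /// Returns whether a wake was recorded and clears the flag.
    pub fn take_woken(&self) -> bool {
        self.woken.swap(false, Ordering::SeqCst)
    }

    pub fn waker(&'static self) -> Waker {
        // SAFETY: the data pointer comes from a `&'static Self`, so it outlives
        // every waker, and the vtable functions only read it as `&FlagMarshall`.
        unsafe {
            Waker::from_raw(RawWaker::new(
                self as *const Self as *const (),
                &FLAG_MARSHALL_VTABLE,
            ))
        }
    }
}

// Cloning copies the pointer; there is no reference count to bump.
unsafe fn vtable_clone(data: *const ()) -> RawWaker {
    RawWaker::new(data, &FLAG_MARSHALL_VTABLE)
}

// Used for both `wake` and `wake_by_ref`, since waking consumes nothing.
unsafe fn vtable_wake(data: *const ()) {
    // SAFETY: `data` was created from a `&'static FlagMarshall` in `waker`.
    let marshall: &'static FlagMarshall = unsafe { &*(data as *const FlagMarshall) };
    marshall.wake();
}

// Nothing is owned by the waker, so dropping it is a no-op.
unsafe fn vtable_drop(_data: *const ()) {}

static FLAG_MARSHALL_VTABLE: RawWakerVTable =
    RawWakerVTable::new(vtable_clone, vtable_wake, vtable_wake, vtable_drop);
```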

Implementing FusedFutureWithWakeStatus

In the previous section I talked about how the marshall signals the task manager, but that isn’t really true. It signals the task directly, as it is the task which must hold its wake status (that’s what FusedFutureWithWakeStatus does). It’s worth introducing an implementation of FusedFutureWithWakeStatus to make that clearer.

#[pin_project::pin_project]
pub struct SingleThreadWithWakeStatus<F> {
    #[pin]
    f: F,
    status: core::cell::Cell<bool>,
}

impl<F> SingleThreadWithWakeStatus<F> {
    pub unsafe fn register(self: Pin<&mut Self>, marshall: &'static SingleThreadMarshall) {
        let projection = self.project();
        marshall.register(&projection.status);
    }
}

Notice that this implementation is specifically tied to SingleThreadMarshall as they both agree on using a Cell<bool> for storing the wake status.

We then have to implement the future traits for this type:

impl<F: Future> Future for SingleThreadWithWakeStatus<F> {
    type Output = F::Output;

    fn poll(
        self: Pin<&mut Self>,
        cx: &mut core::task::Context<'_>,
    ) -> core::task::Poll<Self::Output> {
        let projection = self.project();
        projection.f.poll(cx)
    }
}

impl<F: FusedFuture> FusedFuture for SingleThreadWithWakeStatus<F> {
    fn is_terminated(&self) -> bool {
        self.f.is_terminated()
    }
}

impl<F: FusedFuture> FusedFutureWithWakeStatus for SingleThreadWithWakeStatus<F> {
    fn status(&self) -> WakeStatus {
        if self.status.get() {
            WakeStatus::Woken
        } else {
            WakeStatus::Asleep
        }
    }

    fn set_status(self: Pin<&mut Self>, status: WakeStatus) {
        match status {
            WakeStatus::Woken => self.status.set(true),
            WakeStatus::Asleep => self.status.set(false),
        }
    }
}

Finally, an extension trait can make setting tasks up a bit more ergonomic:

pub trait FusedFutureExt {
    fn with_wake_status_st(self) -> SingleThreadWithWakeStatus<Self>
    where
        Self: Sized;
}

impl<F: futures::future::FusedFuture> FusedFutureExt for F {
    fn with_wake_status_st(self) -> SingleThreadWithWakeStatus<Self> {
        SingleThreadWithWakeStatus {
            f: self,
            status: core::cell::Cell::new(true),
        }
    }
}

Task Manager

pub struct ArrayTaskManager<'a, const N: usize, MarshallType: Marshall> {
    pub tasks: [(
        Pin<&'a mut dyn FusedFutureWithWakeStatus<Output = ()>>,
        &'static MarshallType,
    ); N],
}

Here’s a simple task manager type. It doesn’t support spawning tasks since it has a const generic fixed capacity. It also doesn’t own the tasks, but instead holds pinned mutable references to them. This makes everything a consistent size, without which we wouldn’t be able to use an array in this way. Each task must have its own marshall so tasks is an array of task-marshall pairs.
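The array-of-pinned-references idea can be tried on a desktop with plain futures. The sketch below is not Till’s executor: it ignores wake status and marshalls entirely, polls with a no-op waker, and `run_all` and `CountDown` are hypothetical names. It only shows that futures of different concrete types fit in one fixed-size array once each is behind a `Pin<&mut dyn Future>`.

```rust
use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A waker that does nothing, built by hand so the sketch needs only `core`.
unsafe fn noop_clone(_: *const ()) -> RawWaker {
    noop_raw()
}
unsafe fn noop(_: *const ()) {}
static NOOP_VTABLE: RawWakerVTable = RawWakerVTable::new(noop_clone, noop, noop, noop);
fn noop_raw() -> RawWaker {
    RawWaker::new(core::ptr::null(), &NOOP_VTABLE)
}

/// Hypothetical test future: returns `Pending` the given number of times.
pub struct CountDown(pub u32);

impl Future for CountDown {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 == 0 {
            Poll::Ready(())
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}

/// Polls the tasks round-robin until all are done and returns the number of
/// `poll` calls made. Finished tasks are never polled again.
pub fn run_all<const N: usize>(tasks: &mut [Pin<&mut dyn Future<Output = ()>>; N]) -> usize {
    // SAFETY: the no-op vtable functions never touch the data pointer.
    let waker = unsafe { Waker::from_raw(noop_raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut done = [false; N];
    let mut polls = 0;
    while done.iter().any(|d| !*d) {
        for (task, done) in tasks.iter_mut().zip(done.iter_mut()) {
            if !*done {
                polls += 1;
                *done = task.as_mut().poll(&mut cx).is_ready();
            }
        }
    }
    polls
}
```

The caller pins each future on the stack with `core::pin::pin!` and places the results in an array annotated with the `dyn` type, much as main does further down.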

The TaskManager trait is implemented as follows:

impl<'a, const N: usize, MarshallType: Marshall> TaskManager
    for ArrayTaskManager<'a, N, MarshallType>
{
    type Marshall = MarshallType;

    type TaskIterator<'b> = ArrayTaskManagerIter<'b, 'a, N, MarshallType>
    where
        Self: 'b;

    fn get_task(
        &mut self,
        i: usize,
    ) -> Option<(
        Pin<&mut dyn FusedFutureWithWakeStatus<Output = ()>>,
        &'static Self::Marshall,
    )> {
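        // The transmute only adjusts lifetimes (a workaround for the variance
        // problems mentioned below); the types are otherwise identical.
        self.tasks
            .get_mut(i)
            .map(|(task, marshall)| (unsafe { core::mem::transmute(task.as_mut()) }, *marshall))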
    }

    fn sleep_task(&mut self, i: usize) {
        self.tasks[i]
            .0
            .as_mut()
            .set_status(crate::WakeStatus::Asleep);
    }

    fn sleep_all(&mut self) {
        for task in &mut self.tasks {
            task.0.as_mut().set_status(crate::WakeStatus::Asleep);
        }
    }

    fn tasks<'b>(&'b mut self) -> Self::TaskIterator<'b> {
        ArrayTaskManagerIter {
            array: &mut self.tasks,
            i: 0,
        }
    }
}

I ran into all kinds of variance problems when writing this (the ArrayTaskManagerIter implementation, which I’ve omitted, has more), but besides that the implementation is unsurprising.

Testing

To test it out I want to build a simple program that blinks an LED and periodically prints over UART, with a different period for each.

Task Definitions

async fn blink_task(mut pin: impl OutputPin) {
    loop {
        let _ = pin.set_high();
        sleep(1000).await;
        let _ = pin.set_low();
        sleep(1000).await;
    }
}

async fn print_task() {
    for i in 0.. {
        println!("i={i}");
        sleep(2500).await;
    }
}

These tasks are very simple but should serve as a good first test. Before I can try them, I need to implement the sleep future, as well as println! since that normally comes from std.

sleep

I took a Rust implementation of Arduino’s millis from avr-hal and built this future around it:

#[pin_project::pin_project]
pub struct Sleep {
    end: u32
}

impl Future for Sleep {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let projection = self.project();
        let now = millis();
        if now.wrapping_sub(*projection.end) <= (*projection.end).wrapping_sub(now) {
            cx.waker().wake_by_ref();
            Poll::Ready(())
        } else {
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

pub fn sleep(sleep_millis: u32) -> Sleep {
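    // Wrapping addition matches the wrapping comparison in `poll` and avoids
    // an overflow panic when the millisecond counter is close to `u32::MAX`.
    Sleep { end: millis().wrapping_add(sleep_millis) }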
}

Because we are operating without an event source pool, the future has no way to tell the executor what it is actually waiting for, so it requests a wake-up with cx.waker().wake_by_ref() and then returns Poll::Pending. This causes the executor to poll this task repeatedly until sufficient time has passed, but it interleaves polling the other tasks so as not to effectively block, as that would defeat the purpose. This is a wasteful way to write a sleep future, but it should still work.
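The comparison in poll relies on wrapping arithmetic so that it keeps working when the millisecond counter overflows. Here is the same test pulled out into free functions that can be checked on a desktop. `deadline_after` and `deadline_reached` are hypothetical helper names, and the sketch assumes the deadline is always computed with wrapping addition.

```rust
/// Deadline `sleep_millis` after `start` on a wrapping 32-bit counter.
pub fn deadline_after(start: u32, sleep_millis: u32) -> u32 {
    start.wrapping_add(sleep_millis)
}

/// True once `now` has reached or passed `end`. The answer is only
/// meaningful while the two are less than 2^31 ms (roughly 24 days) apart.
pub fn deadline_reached(now: u32, end: u32) -> bool {
    // Whichever wrapping difference is smaller tells us which side of `end`
    // we are on: a small `now - end` means the deadline is behind us.
    now.wrapping_sub(end) <= end.wrapping_sub(now)
}
```

If `now` is one millisecond short of `end`, then `now - end` wraps to `u32::MAX` while `end - now` is 1, so the deadline has not been reached. One millisecond past `end` the two values swap and the test succeeds, and the same holds when `end` itself has wrapped past zero.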

println!

Though it isn’t obviously relevant to building an executor, I thought I’d include how I implemented println! as there are some interesting details.

pub struct GlobalSerial {
    serial: UnsafeCell<
        Option<
            avr_hal_generic::usart::Usart<
                Atmega,
                USART0,
                avr_hal_generic::port::Pin<Input, PD0>,
                avr_hal_generic::port::Pin<Output, PD1>,
                MHz16,
            >,
        >,
    >,
}

unsafe impl Sync for GlobalSerial {}

impl GlobalSerial {
    pub unsafe fn init(
        &self,
        usart: avr_device::atmega328p::USART0,
        rx: avr_hal_generic::port::Pin<Input, PD0>,
        tx: avr_hal_generic::port::Pin<Output, PD1>,
        baudrate: u32,
    ) {
        unsafe {
            *self.serial.get() = Some(avr_hal_generic::usart::Usart::new(
                usart,
                rx,
                tx,
                baudrate.into_baudrate::<MHz16>(),
            ))
        };
    }
}

impl core::fmt::Write for &'static GlobalSerial {
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        avr_device::interrupt::free(|_| -> core::fmt::Result {
            let Some(serial) = (unsafe { &mut *self.serial.get() }) else {
                return Err(core::fmt::Error::default());
            };
            for byte in s.as_bytes() {
                nb::block!(serial.write(*byte)).map_err(|_| Default::default())?
            }
            Ok(())
        })
    }
}

pub static SERIAL: GlobalSerial = GlobalSerial {
    serial: UnsafeCell::new(None),
};

That annoying unsafe impl Sync appears again. avr_device::interrupt::free is used to prevent UB in the event that an interrupt handler tries to print. If we were interrupted during a print, we may have already acquired a mutable reference to the USART through the UnsafeCell; a print in the interrupt handler would then acquire a second mutable reference while the first still exists, which is UB. The part of this implementation I like the least is probably the use of nb::block!, but its necessity highlights some issues with Rust’s fmt module.

Ideally I would like to write asynchronously, since writing over serial can take a long time, especially at low baud rates. One approach might be to define an asynchronous write trait, but we have no way to pass such an implementor to any of the formatting traits like Display, since they take a Formatter<'a> which encapsulates a &'a mut dyn core::fmt::Write among other things. We could replace the formatting traits too, but then we encounter another issue: core::fmt::Arguments<'a> is almost completely opaque. We’d have to replace not only the write trait and the formatting traits, but format_args! as well. That’s not really practical for an initial test, so for now we’ll have to settle for a synchronous blocking implementation, and nb::block! is here to stay.

The println! macro is then defined in terms of writeln! with the static SERIAL writer.

#[macro_export]
macro_rules! println {
    ($format_string:literal $(, $($name:ident = )? $e:expr)* $(,)?) => {
        writeln!(&crate::serial::SERIAL, $format_string $(, $($name = )? $e)*).unwrap()
    };
}
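The same pattern can be tried without hardware by swapping the UART for an in-memory sink. In this sketch `BufferSink` and `demo_println!` are hypothetical names, the 64-byte capacity is arbitrary, and the macro takes the writer as an argument instead of using a static so that the result can be inspected.

```rust
/// Hypothetical fixed-capacity sink standing in for the UART.
pub struct BufferSink {
    buf: [u8; 64],
    len: usize,
}

impl BufferSink {
    pub const fn new() -> Self {
        Self { buf: [0; 64], len: 0 }
    }

    /// Everything written so far. Only whole `&str` values are ever copied
    /// in, so the contents are always valid UTF-8.
    pub fn as_str(&self) -> &str {
        core::str::from_utf8(&self.buf[..self.len]).unwrap_or("")
    }
}

impl core::fmt::Write for BufferSink {
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        let bytes = s.as_bytes();
        let end = self.len.checked_add(bytes.len()).ok_or(core::fmt::Error)?;
        if end > self.buf.len() {
            // Out of space: report an error instead of truncating.
            return Err(core::fmt::Error);
        }
        self.buf[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
}

/// Like the `println!` above, but with an explicit writer argument.
macro_rules! demo_println {
    ($sink:expr, $($arg:tt)*) => {{
        use core::fmt::Write as _;
        writeln!($sink, $($arg)*).unwrap()
    }};
}
```

As with the real macro, all of the formatting work is done by `writeln!`; the sink only ever sees string fragments through `write_str`.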

main

Here’s the rest of the code so you can see how the executor is set up:

static SLEEP_TASK_MARSHALL: SingleThreadMarshall = unsafe { SingleThreadMarshall::new() };

static PRINT_TASK_MARSHALL: SingleThreadMarshall = unsafe { SingleThreadMarshall::new() };

#[arduino_hal::entry]
fn main() -> ! {
    let peripherals = Peripherals::take().expect("Failed to get peripherals");
    let pins = arduino_hal::pins!(peripherals);
    millis_init(peripherals.TC0);
    unsafe {
        SERIAL.init(
            peripherals.USART0,
            pins.d0.into_floating_input().forget_imode(),
            pins.d1.into_output(),
            115200,
        )
    };
    let mut blink_task = core::pin::pin!(blink_task(pins.d13.into_output())
        .fuse()
        .with_wake_status_st());
    unsafe { blink_task.as_mut().register(&SLEEP_TASK_MARSHALL) };
    let mut print_task = core::pin::pin!(print_task().fuse().with_wake_status_st());
    unsafe { print_task.as_mut().register(&PRINT_TASK_MARSHALL) };

    let mut tasks = ArrayTaskManager {
        tasks: [
            (blink_task, &SLEEP_TASK_MARSHALL),
            (print_task, &PRINT_TASK_MARSHALL),
        ],
    };
    let mut pool = DummyPool;

    let executor = Executor::new(&mut tasks, &mut pool);

    println!("Starting Execution");
    unsafe { avr_device::interrupt::enable() };
    executor.run_to_completion();
    println!("Execution Complete");

    loop {}
}

And… it works! The tasks blink the onboard LED and periodically print over UART. I’ve omitted the implementation of DummyPool since it doesn’t do anything and so isn’t interesting. This has proved to me that async Rust can be used even on tiny embedded systems like the ATmega328.

Next Steps

In the next part I’ll start to build the event source pool and try using it to improve the sleep future and to add a task that waits for a button to be pressed before doing something else.
