
Test Like You Fly: Implementing On-Target Testing for the Libuavcan Project

Adapted from: White, Julie and Tilney, Lindsay, “Applying the ‘Test Like You Fly’ (TLYF) Process to Flight Software Testing,” Workshop on Spacecraft Flight Software, 2013, http://flightsoftware.jhuapl.edu/files/2013/talks/FSW-13-TALKS/TLYF_Apply2FSW_Dec2013r1.pdf

Most vehicle systems divide testing into layers like those pictured in the Integration Test triangle above. The higher layers rely on the quality and interface guarantees of the layers below, and as you progress up the triangle, testing becomes more complex, expensive, and time consuming. The ideal is for each layer to be a trusted foundation for the next layer to build upon.

While libuavcan lives in the sub-system layer of an integrated vehicle system, on its own it is a software library that can be considered a “part”. As such, all the layers of a system-of-systems end up relying on it. Given that libuavcan is targeted at safety-critical, real-time, and resource-constrained applications, it is important that the project ensures its software adheres to specifications and honors its interface contracts.

The system should never experience expected operations, environments, stresses or their combinations for the first time during the mission

White, Julie and Tilney, Lindsay, “Applying the ‘Test Like You Fly’ (TLYF) Process to Flight Software Testing,” Workshop on Spacecraft Flight Software, 2013, http://flightsoftware.jhuapl.edu/files/2013/talks/FSW-13-TALKS/TLYF_Apply2FSW_Dec2013r1.pdf

In the above diagram the width represents relative complexity within a system. As you can see, libuavcan sits on top of a lot of lower-level logic and, especially with CAN, hardware that can have significant effects on it. The story goes like this: the hardware is verified to exhibit the required electrical characteristics, the peripherals are verified to exhibit the required logic when used with compliant hardware, the drivers are verified to work with the peripherals based on the behaviour published in datasheets, and now libuavcan must be verified to work with the drivers. The libuavcan project could simply specify driver behaviour and test only against abstract models of ideal drivers, but given the higher bar we are trying to set we feel we can do better by testing against real drivers running on real hardware connected to real busses. One of the reasons for this is to check for unexpected holes in the cheese.

For example, a small problem on a CAN bus might slip past a defective error check in the driver, which might expose a timing flaw in libuavcan, which might then violate one of its interface contracts. By testing our assumptions about the layers below libuavcan against real implementations we reveal blind spots and false assumptions that would otherwise be found only after the library was integrated into a real system. So while our on-target testing is limited to reference hardware (we obviously cannot test on every possible target platform given libuavcan’s inherent portability), we are increasing the fidelity of our tests, which reduces the size and number of holes in our part.

Test Like You Fly

Aerospace has long lived by the mantra of “Test Like You Fly.” The only reason we don’t always do it is that it can be difficult and expensive. Libuavcan is a very small open-source project and we can’t afford rocket stands, but we can afford a bit of cloud infrastructure and some Raspberry Pis. Better yet, a lot of the infrastructure is free for open-source and we take full advantage of that. To do this, the libuavcan project has set out to expose hardware test targets to our contributors as part of our automated build pipelines. We’ve started small by simply cross-compiling our googletest-based unit tests for armv7-m and running them on NXP dev kits.
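
To make this concrete, here is a minimal sketch of the kind of unit test we mean. The suite and test names below are invented for illustration, but because the body touches only portable C++ the identical source can be built for the host and cross-compiled for the target:

#include <cstdint>
#include "gtest/gtest.h"

// A purely illustrative test case. It exercises only portable C++, so the
// same source runs in the native build and on the Cortex-M4 eval board.
TEST(OnTargetSanityTest, FixedWidthTypesAreWhatWeExpect)
{
    ASSERT_EQ(4U, sizeof(std::uint32_t));
    ASSERT_EQ(8U, sizeof(std::uint64_t));
}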

Open Source Hardware and Services

The first enabling technology we deployed is one I’m pretty keen on: Buildkite. This continuous-integration service is factored in a way that makes it particularly well suited to firmware projects. Instead of hosting everything in a single cloud, Buildkite provides automation services in their cloud but allows you to attach your own build hosts using their portable “agent” software.


How Buildkite agents work (taken from https://buildkite.com/docs/agent/v3)

Buildkite’s architecture allows the libuavcan pipelines to compile, perform static analysis, run unit tests, and generate coverage analysis on EC2 hosts and then forward the resulting binaries to a farm of Raspberry Pis. The agents on the Pis pull the hex files and use a little bit of Python glue we call Nanaimo to flash these test binaries onto NXP S32K148 eval boards using a J-Link EDU Mini.

A Raspberry Pi 3 B+ running Buildkite’s Go agent, which coordinates loading and monitoring tests on an NXP eval board for a Cortex-M4 MCU. The agent pulls binaries uploaded by the AWS Buildkite agent running on our EC2 build hosts after each of our Docker builds completes.

So let’s tally up the services and hardware we’ve had to purchase to build this workflow: GitHub is free for open-source, Docker Hub is free for open-source, and Buildkite is free for open-source. EC2 does have a free tier, but even using beefier hosts the libuavcan bill is only tens of dollars (USD) per month.

While the dev kit here is a bit more expensive than is typical, the Pi 3 B+ is usually less than $40 USD and the J-Link EDU Mini runs about $20 USD (again, the EDU cannot be used for commercial purposes, so this is a huge discount for open-source). With most Cortex-M dev kits running less than $100 USD, one can build a full armv7-m test farm for less than $200 USD per node and operate it for the cost of electricity and bandwidth.

Googletest

At this point we’re simply re-running our googletest unit tests on Cortex-M4 CPUs. This falls far short of TLYF, but it is a first step. As such, let me take a moment to detail how we manage to get googletest to build and run on an MCU, since it’s a little bit involved.

Googletest doesn’t support no-sys builds in the CMake system delivered with its source, so we have to write our own CMake rules. The following are snippets from the libuavcan v1 CMake files (for full details see the source).

set(GOOGLETEST_SUBMODULE "${EXTERNAL_PROJECT_DIRECTORY}/googletest-src")

include_directories(
    ${GOOGLETEST_SUBMODULE}/googletest/include
    ${GOOGLETEST_SUBMODULE}/googlemock/include
)

# gtest-all.cc and gmock-all.cc each #include every other source file in
# their respective projects, so these two translation units are all we need
# to build googletest and googlemock into a single static library.
add_library(gmock_main STATIC EXCLUDE_FROM_ALL
    ${GOOGLETEST_SUBMODULE}/googletest/src/gtest-all.cc
    ${GOOGLETEST_SUBMODULE}/googlemock/src/gmock-all.cc
)

# The "-all" sources include their siblings relative to the project roots,
# so those roots must also be on the include path when compiling them.
target_include_directories(gmock_main PRIVATE
    ${GOOGLETEST_SUBMODULE}/googletest
    ${GOOGLETEST_SUBMODULE}/googlemock
)

Next we have to define a whole bunch of preprocessor symbols to turn off the various operating-system features googletest would otherwise use.

add_definitions(-DGTEST_HAS_POSIX_RE=0
                -DGTEST_HAS_PTHREAD=0
                -DGTEST_HAS_DEATH_TEST=0
                -DGTEST_HAS_STREAM_REDIRECTION=0
                -DGTEST_OS_NONE
                -DGTEST_HAS_RTTI=0
                -DGTEST_HAS_EXCEPTIONS=0
                -DGTEST_HAS_DOWNCAST_=0
                -DGTEST_HAS_MUTEX_AND_THREAD_LOCAL_=0
                -DGTEST_USES_POSIX_RE=0
                -DGTEST_USES_PCRE=0
                -DGTEST_LANG_CXX11=1
                -DGTEST_OS_WINDOWS=0
                -DGTEST_OS_WINDOWS_DESKTOP=0
                -DGTEST_OS_WINDOWS_MINGW=0
                -DGTEST_OS_WINDOWS_RT=0
                -DGTEST_OS_WINDOWS_MOBILE=0
                -DGTEST_OS_WINDOWS_PHONE=0
                -DGTEST_OS_LINUX_ANDROID=0
                -DGTEST_OS_CYGWIN=0
                -DGTEST_OS_QNX=0
                -DGTEST_OS_MAC=0
                -DGTEST_OS_AIX=0
                -DGTEST_OS_HPUX=0
                -DGTEST_OS_OPENBSD=0
                -DGTEST_OS_FREEBSD=0
                -DGTEST_OS_LINUX=0
                -DGTEST_OS_SOLARIS=0
                -DGTEST_OS_SYMBIAN=0
                -DGTEST_LINKED_AS_SHARED_LIBRARY=0
                -DGTEST_CREATE_SHARED_LIBRARY=0
                -DGTEST_DONT_DEFINE_FAIL=0
                -DGTEST_DONT_DEFINE_SUCCEED=0
                -DGTEST_DONT_DEFINE_ASSERT_EQ=0
                -DGTEST_DONT_DEFINE_ASSERT_NE=0
                -DGTEST_DONT_DEFINE_ASSERT_GT=0
                -DGTEST_DONT_DEFINE_ASSERT_LT=0
                -DGTEST_DONT_DEFINE_ASSERT_GE=0
                -DGTEST_DONT_DEFINE_ASSERT_LE=0
                -DGTEST_DONT_DEFINE_TEST=0
)

Here’s what a typical main() might look like:

#include "gmock/gmock.h"
#include "LPUART.h"

int main(void)
{
    /*
     * Init your MCU hardware...
     */
    LPUART1_init();
    LPUART1_transmit_string("Running googletest\n\r");

    char program_name[] = "libuavcan_ontarget";

    // The arguments array. Googletest expects at least argv[0].
    char* argv[] = {program_name};

    int argc = 1;

    // This also initializes googletest.
    testing::InitGoogleMock(&argc, argv);

    // The pass/fail summary is part of the test output streamed over the
    // UART, so we don't act on the return value here.
    (void) RUN_ALL_TESTS();

    // WDT_reset() and LowPowerSleep() come from the board support code.
    while (true)
    {
        WDT_reset();
        LowPowerSleep();
    }
}

You’ll need to implement a minimal system yourself. Don’t worry: the only real thing you need is some sort of character-based I/O. You can use a UART for that (as demonstrated here) or, optimally for Cortex-M, you can use SWO.

// Minimal "system" stubs: just enough of a libc backend for googletest to
// stream its results over the UART.
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/unistd.h>
#include <errno.h>
#include "LPUART.h"

char* getcwd(char* buf, size_t size)
{
    if (!buf || size <= 1)
    {
        errno = EINVAL;
        return 0;
    }
    buf[0] = '/';
    buf[1] = 0;
    return buf;
}

int mkdir(const char* path, mode_t mode)
{
    (void) path;
    (void) mode;
    errno = EACCES;
    return -1;
}

// Our replacement for the libc call (struct timeval comes from <sys/time.h>).
__attribute__((visibility("default"))) int _gettimeofday(struct timeval* tp, void* tzp)
{
    (void) tzp;
    if (tp)
    {
        // TODO: use SysTick or even an RTC if you really care about
        // printing accurate times in the test output.
        tp->tv_sec  = 0;
        tp->tv_usec = 0;
    }
    else
    {
        errno = EINVAL;
        return -1;
    }
    return 0;
}

int _write(int fd, const void* buf, size_t count)
{
    (void) fd;
    if (count == 0 || buf == 0)
    {
        errno = EINVAL;
        return -1;
    }
    LPUART1_transmit_string_len((const char*) buf, count);
    // write() is expected to return the number of bytes written.
    return (int) count;
}

int _open(const char* filename, int oflag, int pmode)
{
    (void) filename;
    (void) oflag;
    (void) pmode;
    errno = ENOENT;
    return -1;
}

int _close(int fd)
{
    (void) fd;
    errno = EIO;
    return -1;
}

void _putchar(int c)
{
    LPUART1_transmit_char((char) c);
}
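
As an aside, if you’d rather report results over SWO than a UART you can swap the LPUART-backed _write above for one built on the ITM. Here is a minimal sketch assuming a CMSIS device header for your part ("S32K148.h" is just a stand-in name); ITM_SendChar() is provided by the CMSIS-Core headers:

// Alternative _write that emits characters over SWO via the ITM instead of
// the LPUART. Requires the debug probe to have SWO tracing enabled.
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include "S32K148.h" // stand-in for your CMSIS device header

int _write(int fd, const void* buf, size_t count)
{
    (void) fd;
    if (count == 0 || buf == 0)
    {
        errno = EINVAL;
        return -1;
    }
    const char* out = (const char*) buf;
    for (size_t i = 0; i < count; ++i)
    {
        ITM_SendChar((uint32_t) out[i]);
    }
    return (int) count;
}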

Finally, note that you will need a lot of heap memory to run googletest. Our initial tests run on an S32K148, which has 256K of RAM, and we allocate 116K of that RAM to the heap!
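
Where that heap comes from depends on your toolchain. With newlib the usual approach is a small _sbrk stub that doles out a region reserved by the linker script; the sketch below assumes linker-provided symbols named __HeapBase and __HeapLimit, which are illustrative names rather than anything defined by the libuavcan build:

// Hypothetical newlib _sbrk stub. __HeapBase and __HeapLimit are assumed to
// be defined in the linker script to bound the heap region (about 116K in
// our S32K148 configuration).
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

extern uint8_t __HeapBase;
extern uint8_t __HeapLimit;

void* _sbrk(ptrdiff_t increment)
{
    static uint8_t* heap_end = &__HeapBase;
    uint8_t* const previous_end = heap_end;

    if ((heap_end + increment) > &__HeapLimit)
    {
        errno = ENOMEM;
        return (void*) -1;
    }
    heap_end += increment;
    return previous_end;
}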

Next Steps

As the libuavcan v1 project continues we will be evolving our Pi farm to include more realistic scenarios, different target hardware, and different test types. The first increase in fidelity we are planning is to add a real CAN bus to each Pi worker in the farm. This will allow us to run tests that exchange data between a test running on the Pi and libuavcan running on-target, or vice-versa.

Eventually we will define a simple test protocol that does not require googletest. We started with googletest because we can simply cross-compile unit tests written for the native build and run them again on-target but, as stated in the previous section, googletest is big and not at all optimized for resource-constrained targets. Along the same lines, we’ll be expanding Nanaimo to parse SWO output, which allows for greater fidelity since the tests no longer require a UART to report results.

We also want to add more integrated targets, like a Pixhawk running NuttX, to our farm. This would allow us to validate changes in even more TLYF ways by, for example, loading libuavcan onto ESCs and spinning motors. Additional targets that are more resource constrained are also of interest. For these we would implement non-googletest performance tests to verify both that we can slim libuavcan down to fit on an M0, for example, and that it fulfills its real-time guarantees even when stripped down.

Finally, we would like to purchase some JLink Trace probes and build the infrastructure necessary to collect and report on-target coverage using Cortex ETM.

I’ll try to update this post over time as this part of the project evolves. Please check back for updates.

Also, below is a talk I gave at the 2019 PX4 Dev Summit that was derived from this blog post.