I have finally managed to get ACADOS and the
main_pendulum_ode example up and running on the STM32H743, a 32-bit ARM Cortex-M7 device running at 480 MHz with a double-precision floating-point unit (FPU).
This example, and I believe any useful ACADOS problem, is quite memory-hungry. It consumed 468 KB of RAM, which is why I had to use the D1 memory bank of the STM32H7. That RAM is still inside the MCU but sits outside the CPU core, so accessing it is a slight bottleneck. Luckily the STM32H7 has a D-Cache to speed this up.
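For readers who want to try something similar, here is a minimal sketch of placing a large buffer in D1 RAM with GCC. The section name `.axi_sram` is hypothetical and has to match whatever the linker script defines, the 512 KB figure is the AXI SRAM size I assume for the STM32H743, and 468 KB is taken as 468 * 1024 bytes. The 32-byte alignment matches the Cortex-M7 cache line size. The D-Cache itself can be switched on with the CMSIS call `SCB_EnableDCache()`, which is left out here so the snippet also compiles on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SOLVER_RAM_BYTES (468u * 1024u) /* figure from the post, taken as KiB */
#define AXI_SRAM_BYTES   (512u * 1024u) /* assumed size of the D1 AXI SRAM */

/* On the target, ask the linker to put the buffer in the D1 region.
   ".axi_sram" is a hypothetical section name; on a PC the macro is empty. */
#if defined(__arm__) && defined(__GNUC__)
#define IN_D1_RAM __attribute__((section(".axi_sram")))
#else
#define IN_D1_RAM
#endif

/* Fail at compile time if the buffer cannot fit in the region. */
_Static_assert(SOLVER_RAM_BYTES <= AXI_SRAM_BYTES, "arena does not fit in D1 RAM");

/* Cache-line aligned so cache maintenance never touches a neighbour's data. */
static _Alignas(32) uint8_t solver_arena[SOLVER_RAM_BYTES] IN_D1_RAM;

/* Size of the arena in bytes. */
size_t solver_arena_size(void) {
    return sizeof solver_arena;
}

/* Base address of the arena, with a sanity check on the alignment. */
uint8_t *solver_arena_base(void) {
    assert(((uintptr_t)solver_arena % 32u) == 0u);
    return solver_arena;
}
```

Whether the solver memory is handed over as a buffer like this or the whole heap is moved into D1 RAM through the linker script is a project decision; the sketch only shows the placement mechanism.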
- Without any code optimizations (-O0) and no cache the solve time is 1.27 seconds.
- With code optimizations (-O3) and no cache the solve time is 463 milliseconds.
- With code optimizations (-O3) and with D-Cache enabled the solve time is 182 milliseconds.
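The post does not say how the solve times were measured. A common approach on a Cortex-M7 is to read the free-running 32-bit cycle counter (`DWT->CYCCNT` in CMSIS) before and after the solve and convert the difference to milliseconds. The sketch below shows only the conversion so it can be run anywhere; the function names are made up for illustration and the 480 MHz clock is the one quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define CPU_HZ 480000000u /* core clock from the post */

/* Cycles elapsed between two reads of a free-running 32-bit counter.
   Unsigned subtraction stays correct across a single wrap-around. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end) {
    return end - start;
}

/* Convert a cycle count to milliseconds for a given core clock. */
double cycles_to_ms(uint32_t cycles, uint32_t cpu_hz) {
    assert(cpu_hz != 0u);
    return (double)cycles * 1000.0 / (double)cpu_hz;
}
```

At 480 MHz a 32-bit counter wraps after about 8.9 seconds, so even the 1.27 second run fits in a single measurement. For example, `cycles_to_ms(87360000u, CPU_HZ)` corresponds to the 182 ms case.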
In my opinion, 182 ms on an embedded platform, compared to 19.20 ms on my x64 machine (roughly 9.5x slower), is not too bad.
And yes, I did verify the results against the same problem and solver compiled on my x64 machine, and they were identical.
It took quite some effort and tweaking, including some modifications to ACADOS to help with the 32-bit alignment issue (yes, I will be opening a PR as well).
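The details of the alignment change are not given here, so the following is only a generic illustration and not the actual ACADOS modification. On a 32-bit target pointers are 4 bytes wide, so a struct that happens to be a multiple of 8 bytes on x64 may only be a multiple of 4 bytes, and a double carved out of a raw buffer right after it can land on a 4-byte boundary. Rounding the cursor up before handing out doubles avoids that. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round ptr up to the next multiple of align (align must be a power of two). */
void *align_up(void *ptr, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t mask = (uintptr_t)align - 1u;
    return (void *)((p + mask) & ~mask);
}

/* Hypothetical helper: carve n doubles out of a raw byte buffer.
   The cursor is first moved to an 8-byte boundary, then advanced past the doubles. */
double *carve_doubles(char **cursor, size_t n) {
    double *out = (double *)align_up(*cursor, 8);
    *cursor = (char *)(out + n);
    return out;
}
```

With a cursor that starts 4 bytes into an 8-byte-aligned buffer, `carve_doubles` skips 4 bytes of padding and returns an address at offset 8.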
I will be uploading the project to my GitHub soon, so stay tuned.