Is it currently possible to use acados on microcontrollers with only single-precision FPUs? In particular, I would ideally want to run it on a bare-metal 32-bit ARM architecture. I have not been able to find any successful examples of this, but I also haven’t found anything saying it is fundamentally impossible.
From what I’ve found, it seems like it could be possible. There are several solvers that support single precision (e.g. qpOASES), and I would expect the QP solver to be the part most sensitive to numerical precision. For my application, I have a discrete-time dynamics model with analytical Jacobians, so numerical differentiation and discretization should not be an issue. I am willing to make some modifications to the source code if needed, and perhaps open a PR to contribute if I get it to work.
If this is possible, I will be using the C interface of acados to implement the problem and incorporate it with the rest of my code. If there are any particular settings that are needed to try this out, please let me know.
Unfortunately I don’t have any experience with such a project myself, but you might want to have a look at this example. Perhaps there are some useful hints.
Thank you for the reference. I’ve looked at this example before, and I can see it uses doubles. As far as I know, the STM32H7 comes with a double-precision FPU, so the example doesn’t directly address the single-precision feasibility question. That said, it does have a lot of useful hints for running acados on 32-bit ARM in general, which I appreciate.
Hi, in general the dependencies BLASFEO and HPIPM come with both double- and single-precision versions of each routine, but so far acados has been implemented only in double precision. There are some subtle changes that may need to be implemented, and if you have not been involved in the development of the acados core it may be hard to nail them down.
Just for better understanding, what is the reason for wanting to use acados in single precision?
In general, on 32-bit ARM architectures it should be possible to compile the code with soft-fp options, such that floating-point instructions are emulated in software if they are not available in hardware. Clearly this makes all computations much slower, but you haven’t mentioned speed requirements in your post.
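As a small aid for checking what the target actually provides in hardware, here is a minimal sketch. It assumes a compiler that follows the Arm C Language Extensions (ACLE), where the predefined macro `__ARM_FP` is a bitmask (0x04 for single precision, 0x08 for double precision). The function names `arm_fpu_has_hw_single` and `arm_fpu_has_hw_double` are made up for illustration and are not part of acados.

```c
/* Sketch: report which floating-point precisions the ARM target supports
 * in hardware, based on the ACLE macro __ARM_FP.
 * Assumption: the compiler defines __ARM_FP as specified by ACLE
 * (bit 2 = single precision, bit 3 = double precision).
 * On non-ARM compilers the macro is undefined and both helpers return 0,
 * which only means "unknown", not "no FPU". */
#if defined(__ARM_FP)
#define ARM_FPU_HW_SINGLE ((((__ARM_FP) & 0x04) != 0) ? 1 : 0)
#define ARM_FPU_HW_DOUBLE ((((__ARM_FP) & 0x08) != 0) ? 1 : 0)
#else
#define ARM_FPU_HW_SINGLE 0
#define ARM_FPU_HW_DOUBLE 0
#endif

/* Hypothetical helper: 1 if single-precision arithmetic runs in hardware. */
int arm_fpu_has_hw_single(void)
{
    return ARM_FPU_HW_SINGLE;
}

/* Hypothetical helper: 1 if double-precision arithmetic runs in hardware.
 * If this returns 0 on an ARM target, double operations are emulated. */
int arm_fpu_has_hw_double(void)
{
    return ARM_FPU_HW_DOUBLE;
}
```

On a target where only the single-precision helper returns 1, every `double` operation in acados would go through software emulation routines, which is the slow path described above.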
If this soft-fp works, you could use this to identify where the computational bottleneck is.
Likely, it would only be in the QP solver: at that point, you could keep acados in double precision and just modify the wrapper for your selected QP solver so that it calls the single-precision version of that solver for improved speed.
That would be a much easier project than porting the entire acados to single precision.
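To make the wrapper idea concrete, here is a minimal sketch of the pattern: the caller keeps its data in double precision, the wrapper down-converts to float, calls a single-precision backend, and up-converts the result. Everything here is hypothetical: `qp_solve_single` is a toy stand-in (a diagonal, unconstrained QP) and not the API of HPIPM, qpOASES, or acados, and `QP_MAX_N` is an assumed compile-time bound chosen to avoid dynamic allocation on bare metal.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound on the problem size, so buffers can live on the stack. */
#define QP_MAX_N 16

/* Hypothetical single-precision backend (toy stand-in for a real solver).
 * Solves min 0.5*x'Hx + g'x for diagonal, positive H: x_i = -g_i / h_i. */
static void qp_solve_single(size_t n, const float *h_diag,
                            const float *g, float *x)
{
    for (size_t i = 0; i < n; ++i) {
        assert(h_diag[i] > 0.0f);   /* toy backend needs a positive diagonal */
        x[i] = -g[i] / h_diag[i];
    }
}

/* Double-precision-facing wrapper: converts the QP data to float,
 * calls the single-precision backend, and converts the solution back.
 * Returns 0 on success, -1 if the problem exceeds QP_MAX_N. */
int qp_solve_wrapper(size_t n, const double *h_diag,
                     const double *g, double *x)
{
    float h_s[QP_MAX_N], g_s[QP_MAX_N], x_s[QP_MAX_N];

    if (n > QP_MAX_N) {
        return -1;
    }
    for (size_t i = 0; i < n; ++i) {   /* down-convert the inputs */
        h_s[i] = (float)h_diag[i];
        g_s[i] = (float)g[i];
    }
    qp_solve_single(n, h_s, g_s, x_s);
    for (size_t i = 0; i < n; ++i) {   /* up-convert the solution */
        x[i] = (double)x_s[i];
    }
    return 0;
}
```

In a real port, the conversion loops would cover the full QP data (Hessian blocks, dynamics matrices, bounds), and the conversion cost and the reduced accuracy of the returned step would both need to be checked against the outer double-precision iterations.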
Thank you for the detailed answer and suggestions. This is very helpful! I should have mentioned in my original post that I do have strict speed and memory requirements, which is why I want to avoid compiling with soft-fp.
I’ll definitely explore the approach with a modified QP solver wrapper and see if that gives me the performance I need.