I have some questions about using acados in combination with ARM Cortex R-series processors. I am not close to implementing anything yet; I am just looking ahead and curious:
- Is cross-compilation with target=generic expected to work for such systems now? In other threads I have come across open issues with 32-bit targets and a requirement for vectorized floating-point instructions. Do those still apply?
- What kind of performance increase do BLASFEO and HPMPC achieve with the ARM Cortex A-series specific implementations compared to the generic one? Would those implementations be easily transferable to R-series processors?
- I have also come across https://developer.arm.com/tools-and-software/server-and-hpc/compile/arm-compiler-for-linux/resources/tutorials/benchmarks . The benchmark curves are interesting in that they show significant speedups relative to BLAS for small matrices, just as BLASFEO does. The ARM Performance Libraries (ARM PL) speedup looks huge, though. Can anyone comment on how the two compare? Could acados and/or its solvers be compiled against the ARM Performance Libraries for BLAS operations?
Apologies for the many questions, but it’s all so fascinating.