Better control of optimization options for libraries (HPIPM, ACADOS and BLASFEO)

Hi all,

While trying to optimize solver performance on our target environment, we had to make small modifications to the HPIPM, BLASFEO and acados CMake files. Specifically, we removed the ‘-O2’ optimization flag so that we could use the following custom flags:

-DCMAKE_C_FLAGS="-march=skylake -O3"

Are there strong reasons to enforce -O2, and is there a good way for us to make the flags configurable (with a sensible default), not only in acados but also in BLASFEO and HPIPM?
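
To make the idea concrete, here is a rough, untested sketch of what a configurable flag with a sensible default could look like in a CMakeLists.txt. The variable name EXAMPLE_OPT_FLAGS is purely hypothetical and does not exist in acados, BLASFEO or HPIPM; the sketch also assumes CMake 3.9 or newer for separate_arguments(... NATIVE_COMMAND ...).

```cmake
# Hypothetical cache variable (not an existing acados/BLASFEO/HPIPM option).
# It defaults to -O2 but can be overridden on the command line.
set(EXAMPLE_OPT_FLAGS "-O2" CACHE STRING "Optimization flags for the C sources")

# Split the string into a list so that several flags
# (e.g. "-march=skylake -O3") are passed as separate arguments.
separate_arguments(EXAMPLE_OPT_FLAGS_LIST NATIVE_COMMAND "${EXAMPLE_OPT_FLAGS}")

# Apply the flags to all targets defined in this directory and below.
add_compile_options(${EXAMPLE_OPT_FLAGS_LIST})
```

With something like this in place, the default build would still use -O2, while a call such as cmake -DEXAMPLE_OPT_FLAGS="-march=skylake -O3" .. would replace it. Since cache variables are visible in subdirectories, the same value could presumably be picked up by subprojects added via add_subdirectory, provided they agree on the variable name.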

Hi,

I remember that the performance gained by using -O3 is marginal / negligible, which is why we use -O2.
For the generated code, we have the option ext_fun_compile_flags, where the default is "-O2", so that is easy to change.
I don’t know what the best way is to pass down C flags in a hierarchical CMake workflow.
But if you have a suggestion / PR, I think it would be welcome.

Best,
Jonathan

In acados, the computationally most intensive operations are either linear algebra operations or model functions; in HPIPM there are no model functions (the model is entirely matrix based), and linear algebra operations account for most of the computation time.

Jonathan already replied about the model external functions.

Regarding the linear algebra operations, these are implemented in BLASFEO, and for the most used targets the key computational kernels are entirely hand-coded in assembly, which is not further optimized by the compiler but simply assembled into object code.
The remaining C code has a mostly negligible effect on performance and, compared to -O3, the flag -O2 gives practically identical performance with significantly shorter compilation time.
Similarly, targeting a specific architecture with -march=skylake doesn’t seem to affect performance, and the flags enabling targeted instruction sets such as AVX are set explicitly, e.g. with -mavx.

At least these were our findings when investigating the matter a few years ago.
If in your experiments you find that things have changed in the meantime and that other compilation flags give noticeable speedups, please let us know so we can improve them.
