Derivatives of optimal solutions w.r.t. parameters

Hi all,

I’m wondering if there is any way to get the derivative of the optimal solution w.r.t. the parameters (preferably from the Python interface, but the C interface is also fine). The eval_param_sens function seems to be able to calculate the sensitivity w.r.t. the initial condition, but I’m wondering whether more general settings exist, e.g., when parameters appear in the cost function or the system dynamics.

Thanks a lot!

1 Like

Hi :wave:

we just merged a pull request implementing solution sensitivities with respect to parameters! Note that this feature is currently limited to external costs and discrete dynamics and does not cover parametric constraints.

I recommend you check the example here. Note that the solution sensitivities are only correct if an exact Hessian is used. Whenever you use any form of Hessian approximation or regularization (e.g. a Gauss-Newton Hessian) for solving your OCP, we recommend creating two solvers: one that solves the OCP with the approximate Hessian, and one with an exact Hessian that is only used to compute the solution sensitivities. This approach is also taken in the example.
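In its simplest form (a single exact-Hessian solver; this is only a sketch with the model, cost, and constraints omitted, using the option and method names that appear later in this thread, and the two-solver setup from the example works analogously), the relevant options and the sensitivity call look like this:

from acados_template import AcadosOcp, AcadosOcpSolver

ocp = AcadosOcp()
# ... define the model with parameters, external cost, discrete dynamics, constraints ...

# an exact Hessian and the sensitivity option are required for correct sensitivities
ocp.solver_options.hessian_approx = 'EXACT'
ocp.solver_options.with_solution_sens_wrt_params = True

solver = AcadosOcpSolver(ocp, json_file='acados_ocp.json')
u0 = solver.solve_for_x0(x0)  # solve for a given initial state x0

# sensitivities of the stage-0 solution w.r.t. the global parameters
sens_x, sens_u = solver.eval_solution_sensitivity(0, "params_global")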

Best, Katrin

3 Likes

Hi Katrin,

Glad to see this feature! I believe this would be super useful for applications combining MPC and learning, e.g. differentiable MPC.

Thanks for your work. I will have a look into that.

Best,
Fenglong

1 Like

Hi Katrin,

I’ve looked into the example and I’d like to thank you for adding this great feature. Now I’m wondering what the best approach would be to build a differentiable OCP layer for a neural network, for example an AcadosOcpLayer that can be integrated into PyTorch and trained on a GPU.

I think it’s possible to build such a layer by subclassing torch.autograd.Function and customizing its forward() and backward() functions with acados (forward just solves the OCP and returns the optimal inputs, backward evaluates the sensitivity of the optimal inputs w.r.t. the OCP parameters). But I’m wondering if this is a good solution for training on a GPU. As far as I know, acados is optimized for CPU operations, so I’m not sure how efficient it would be on a GPU. Another approach might be to run the AcadosOcpLayer on the CPU and keep transferring data between CPU and GPU, which I imagine could be slow.

I don’t have an answer to this question myself, so I would appreciate your thoughts and opinions.

Thanks and best,
Fenglong

Hi Fenglong,

I guess your proposed approach, with the acados layer running on the CPU and everything else on the GPU, is the only way to go at the moment.

One more important point to keep in mind is initialization/warm-starting. You might see a significant speed-up if you manage to keep your solver warm-started, i.e. the problem parameters and the initial state constraint should not change too much from one problem to the next. This might not be trivial to achieve within your training routine.
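To illustrate the difference (just a sketch, assuming solver is an AcadosOcpSolver with horizon N and dimensions nx, nu, and x0_next is the next initial state):

import numpy as np

# warm start: just solve again; the iterate left over from the previous solve
# is reused as the initial guess
u0 = solver.solve_for_x0(x0_next)

# cold start: reset the primal iterate to zero before solving
for stage in range(N):
    solver.set(stage, 'x', np.zeros(nx))
    solver.set(stage, 'u', np.zeros(nu))
solver.set(N, 'x', np.zeros(nx))
u0 = solver.solve_for_x0(x0_next)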

Let me know how it goes, very interested in combining optimal control and learning-based approaches!

1 Like

One more interesting feature for your AcadosOcpLayer might be the recently added batch solver; see this example.

It allows parallelizing the solves via OpenMP.

Thinking about the computations in the SQP loop, I actually don’t know whether most of the non-QP operations will map well to a GPU, since they probably boil down to evaluating nonlinear functions to form the matrices that are passed to the QP solver to compute the step.

What might help here is that the upcoming[1] OSQP 1.0 release will have a CUDA backend that can do all the QP computations on an NVIDIA GPU. So once we release that and acados is updated to use it, the QP part of acados can run on a GPU, but you would still have data moving between CPU and GPU in many places.

[1] Yes, I keep saying it is upcoming - but it is almost done :smiley:

2 Likes

Hi @FenglongSong,

we have been adapting the solution sensitivity capabilities of acados quite a bit recently; please check this PR and the corresponding examples.

Best, Katrin

1 Like

Hi @kaethe,

Thank you for the great library and the sensitivity features. I was wondering if it is possible to calculate the sensitivity of the MPC solution w.r.t. a time-varying input reference? I see that it is possible to compute the sensitivity w.r.t. a global (which I believe is time-invariant?) parameter like mass, or the sensitivity w.r.t. the initial state. What about time-varying parameters? My only thought is to do something similar to the example where you change the global parameter before every optimization, but I am unsure if this is efficient or the correct approach.

Also, I was wondering whether @FenglongSong managed to get the PyTorch layer working. I am hoping to do something similar.

Thank you!!

Best, Federico

Hi @federico.pizarro ,

Here is a piece of code I was using half a year ago. I remember getting some meaningful results with it, but I no longer work on this because I lost confidence in differentiable MPC.

Hope it is useful to you or other people.

import pdb

import numpy as np
import torch
from torch import nn

from acados_template import AcadosOcpSolver


class AcadosMpcLayer(nn.Module):
	r"""
	Differentiable optimization layer with acados.
	Solves an optimal control problem (OCP) using acados.

	The following links were useful for the implementation:
	- The code structure is inspired by https://github.com/cvxgrp/cvxpylayers/blob/master/cvxpylayers/torch/cvxpylayer.py
	- Instructions for extending a PyTorch function: https://pytorch.org/docs/stable/notes/extending.html#combining-forward-context
	"""

	def __init__(self, acados_ocp_solver: AcadosOcpSolver) -> None:
		super(AcadosMpcLayer, self).__init__()
		if acados_ocp_solver.acados_ocp.solver_options.hessian_approx != 'EXACT':
			raise RuntimeError("The exact hessian must be used.")
		if acados_ocp_solver.acados_ocp.solver_options.with_solution_sens_wrt_params is False:
			raise RuntimeError("Sensitivity evaluation must be enabled in acados")
		self.acados_ocp_solver = acados_ocp_solver
		self.nx = acados_ocp_solver.acados_ocp.model.x.shape[0]
		self.nu = acados_ocp_solver.acados_ocp.model.u.shape[0]
		self.np = acados_ocp_solver.acados_ocp.model.p.shape[0]

	def forward(self, x0: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
		# x0 should be of shape (batch_size, nx), params of shape (batch_size, np)
		f = _AcadosOcpLayerFunction(self.acados_ocp_solver)
		sol = f(x0, params)
		return sol


def _AcadosOcpLayerFunction(acados_ocp_solver: AcadosOcpSolver):
	r"""
	Factory for the autograd Function that implements the AcadosMpcLayer.
	:param acados_ocp_solver: solver used for all forward solves and sensitivity evaluations
	:return: the apply method of the generated torch.autograd.Function
	"""
	class _AcadosOcpLayer(torch.autograd.Function):
		@staticmethod
		def forward(x0, params):
			# x0 and params should be in batches
			# Up to now, we only consider setting the same p for all stages
			if not x0.shape[0] == params.shape[0]:
				raise ValueError('Batch size mismatch')

			batch_size = x0.shape[0]
			# TODO: pay attention to the data type
			u0 = torch.empty(batch_size, acados_ocp_solver.acados_ocp.dims.nu)
			for batch in range(batch_size):
				for stage in range(acados_ocp_solver.acados_ocp.dims.N):
					acados_ocp_solver.set(stage, 'p', params[batch, :].cpu().numpy())

				# RESET initial guesses (discard warm start)
				for stage in range(acados_ocp_solver.acados_ocp.dims.N):
					acados_ocp_solver.set(stage, 'x', np.zeros(acados_ocp_solver.acados_ocp.model.x.shape[0]))
					acados_ocp_solver.set(stage, 'u', np.zeros(acados_ocp_solver.acados_ocp.model.u.shape[0]))
				acados_ocp_solver.set(acados_ocp_solver.acados_ocp.dims.N, 'x', np.zeros(acados_ocp_solver.acados_ocp.model.x.shape[0]))

				u_opt = acados_ocp_solver.solve_for_x0(x0[batch, :].cpu().numpy(), fail_on_nonzero_status=False)
				if acados_ocp_solver.get_status() not in [0, 2]:
					# accept status 0 (success) and 2 (maximum iterations); stop in the debugger otherwise
					pdb.set_trace()
				u0[batch, :] = torch.tensor(u_opt)
			return u0

		@staticmethod
		def setup_context(ctx, inputs, output):
			x0, params = inputs
			# save the inputs so that backward() can re-solve the OCP per batch element
			ctx.save_for_backward(x0, params)
			ctx.batch_size = x0.shape[0]
			ctx.nu = acados_ocp_solver.acados_ocp.dims.nu
			ctx.nx = acados_ocp_solver.acados_ocp.dims.nx
			ctx.np = acados_ocp_solver.acados_ocp.dims.np

		@staticmethod
		def backward(ctx, grad_output):
			# The outputs of backward() must match the inputs of forward() in number and order.
			# Refer to: https://discuss.pytorch.org/t/a-question-about-autograd-function/201364
			x0, params = ctx.saved_tensors
			batch_size = ctx.batch_size
			grad_u0_x0, grad_u0_p = None, None

			if ctx.needs_input_grad[0]:
				grad_u0_x0 = torch.empty(batch_size, ctx.nx, dtype=grad_output.dtype)
			if ctx.needs_input_grad[1]:
				grad_u0_p = torch.empty(batch_size, ctx.np, dtype=grad_output.dtype)

			for batch in range(batch_size):
				# The solver only stores the most recent solution, so re-solve for this
				# batch element before evaluating the sensitivities.
				for stage in range(acados_ocp_solver.acados_ocp.dims.N):
					acados_ocp_solver.set(stage, 'p', params[batch, :].cpu().numpy())
				acados_ocp_solver.solve_for_x0(x0[batch, :].cpu().numpy(), fail_on_nonzero_status=False)

				if ctx.needs_input_grad[0]:
					# sensitivity of the first control w.r.t. the initial state, shape (nu, nx)
					_, dudx0 = acados_ocp_solver.eval_solution_sensitivity(0, "initial_state")
					grad_u0_x0[batch, :] = grad_output[batch] @ torch.tensor(dudx0, dtype=grad_output.dtype)
				if ctx.needs_input_grad[1]:
					# sensitivity of the first control w.r.t. the global parameters, shape (nu, np)
					_, dudp = acados_ocp_solver.eval_solution_sensitivity(0, "params_global")
					grad_u0_p[batch, :] = grad_output[batch] @ torch.tensor(dudp, dtype=grad_output.dtype)

			return grad_u0_x0, grad_u0_p
	
	return _AcadosOcpLayer.apply
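
And a minimal usage sketch (assuming ocp_solver is an AcadosOcpSolver created with hessian_approx = 'EXACT' and with_solution_sens_wrt_params = True, as checked in the constructor above; the data below is just dummy data):

layer = AcadosMpcLayer(ocp_solver)

x0 = torch.randn(4, layer.nx)                           # dummy batch of initial states
params = torch.randn(4, layer.np, requires_grad=True)   # dummy batch of parameters

u0 = layer(x0, params)    # forward: one OCP solve per batch element
loss = (u0 ** 2).sum()    # some downstream loss on the first control
loss.backward()           # backward: gradients via eval_solution_sensitivity
print(params.grad.shape)  # torch.Size([4, np])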

Best,
Fenglong

1 Like

Hi Federico, hi Fenglong,

we have been working on the solution sensitivities quite a bit over the last months. Please check these examples to see how to use them.

Besides, we have teamed up with the group of Sebastien Gros from NTNU and the group of Joschka Boedecker from the University of Freiburg to work on a Python package that allows you to tune a parametrized MPC formulation via RL (publication coming soon :rocket:). The package uses PyTorch and is built on top of acados. It is very much work in progress at the moment, but it includes a differentiable MPC layer, so maybe it is already helpful for you!

Regarding your question on how to compute sensitivities with respect to a time-varying reference: This is supported via multi-phase OCPs. In this case, each stage has to be a separate phase. The global parameters are then the concatenation of all global parameters used by the stages.

Best, Katrin

3 Likes

Thank you so much @FenglongSong, I will take a look!

Thank you @kaethe! I actually saw your paper [2501.15897] MPC4RL -- A Software Package for Reinforcement Learning based on Model Predictive Control, which is what led me to try acados rather than calculating the sensitivities by hand in CasADi. Looking forward to the final publication; I think it will be very cool! I will take a closer look at the examples and leap-c, and definitely look at the multi-phase stuff, but thanks to you both I actually managed to get the initial step working! Thank you so much!

2 Likes