Neural Network as Terminal Cost

Hi

I am working with acados in python and have implemented a neural network as an external terminal cost. The initial results look logical with better stabilisation with the neural network acting as a “infinite horizon” of sorts. However, I would like to better understand if my implementation is working correctly.

  1. To avoid deriving the Hessian of the neural network, I intend to exploit the Gauss-Newton approximation and hence the last layer of my neural network is defined as a V_\theta(x) = \frac{1}{2} \| f_\theta(x) \|^2 or 0.5 * L2 squared norm. And if I set these variables below, does that mean it will automatically implement the above?
ocp.solver_options.ext_cost_num_hess = True
ocp.solver_options.hessian_approx = 'GAUSS_NEWTON'
  1. If the above is true, my next question would be to ask if there are additional mathematical operations like scaling of the output within the network after the L2 squared norm, would the Gauss-Newton approximation still be valid?
  2. Is there a way to extract the individual cost(stage/terminal) instead of the combined value? Currently, I use the method below.
cost = self.ocp_solver.get_cost()

My code for setting up with the model is below.

def NNsetup(config, yref, collision_config=None):
    mpc_config = config["mpc"]

    Fmax = mpc_config["Fmax"]
    N_horizon = mpc_config["N_horizon"]
    RTI = mpc_config["use_RTI"]
    x0 = np.array(mpc_config["x0"])
    Tf = N_horizon * mpc_config["mpc_timestep"]  # Time horizon
    Q_mat = np.diag(mpc_config["Q_mat"]) # State cost weight matrix
    R_mat = np.diag(mpc_config["R_mat"]) # Input cost weight matrix

    # Create ocp object to formulate the OCP
    ocp = AcadosOcp()

    # Call model creation function
    model, robot_sys = export_ode_model(config)
    ocp.model = model

    # Extract state and input dimensions
    nx = model.x.rows() # Possible to extract the last predicted state here to insert into NN. maybe...
    nu = model.u.rows()
    ny = nx + nu
    ny_e = nx

    # set cost module
    ocp.cost.cost_type = 'NONLINEAR_LS'     # Stage cost
    ocp.cost.cost_type_e = 'EXTERNAL'   # Terminal cost

    ocp.cost.W = scipy.linalg.block_diag(Q_mat, R_mat)  # Stage cost includes both states and input penalty

    # === NN ===
    device = "cpu"
    checkpoint_path = "neural_network/output/2025-12-04_14-08-53_train_model/model_epoch_2252.pt"
    NNmodel = PendulumModelTruncated().to(device)
    state_dict = torch.load(checkpoint_path, map_location=device)
    NNmodel.load_state_dict(state_dict)
    NNmodel.eval()

    # Wrap for CasADi
    l4c_model = l4c.L4CasADi(NNmodel, device=device)

    # Evaluate NN symbolically
    y_sym = l4c_model(ca.transpose(model.x))
    ocp.model.cost_expr_ext_cost_e = y_sym
    ocp.solver_options.ext_cost_num_hess = True
    ocp.solver_options.model_external_shared_lib_dir = l4c_model.shared_lib_dir
    ocp.solver_options.model_external_shared_lib_name = l4c_model.name
    # === ====

    # Get collision constraints
    if mpc_config["IK_required"] and collision_config is not None:
        # Generate collision constraints
        # constraints = build_capsule_collision_constraints(robot_sys, 
        #                                                       collision_config["links"], 
        #                                                       collision_config["obstacles"], 
        #                                                       collision_config["collision_pairs"])

        # Add collision avoidance constraint between two capsules
        capsule_dist_sq = capsule_squared_distance_function()

        p1 = robot_sys.j_1
        p2 = robot_sys.attachment_site

        q1 = np.array([0.0, 0.7, 0.35])
        q2 = np.array([0.0, 0.7, 0.65])

        dist = capsule_dist_sq(p1, p2, q1, q2, 0.1, 0.05)

        ocp.model.con_h_expr = dist
        ocp.constraints.lh = np.array([0.05])       # epsilon (additional safety distance)
        ocp.constraints.uh = np.array([1e6])
 
        x0 = np.array(mpc_config["x0_q"])

        # Pad yref with input reference(u). Currently only position and velocity.
        yref_u = np.zeros((yref.shape[0], nu))
        yref = np.hstack((yref, yref_u))

        ocp.model.cost_y_expr = vertcat(model.x, model.u)   # Stage cost includes both states and input
        # ocp.model.cost_y_expr_e = y_sym                   # Terminal cost only inlcudes states
        ocp.cost.yref  = np.zeros((ny, ))                   # Set stage references to match first entry of yref for all states and inputs
        # ocp.cost.yref_e = np.zeros((ny_e, ))                # Set terminal reference to match first entry of yref for states only

    else:
        ocp.model.cost_y_expr = vertcat(model.x, model.u)   # Stage cost includes both states and input
        # ocp.model.cost_y_expr_e = y_sym                   # Terminal cost only inlcudes states
        ocp.cost.yref  = np.zeros((ny, ))                   # Set stage references to match first entry of yref for all states and inputs
        # ocp.cost.yref_e = np.zeros((ny_e, ))                # Set terminal reference to match first entry of yref for states only

    # Set input constraints
    ocp.constraints.lbu = -np.array(Fmax)
    ocp.constraints.ubu = np.array(Fmax)
    # Apply above to all inputs
    ocp.constraints.idxbu = np.arange(nu)

    # Set initial constraint
    ocp.constraints.x0 = x0

    # set prediction horizon
    ocp.solver_options.N_horizon = N_horizon
    ocp.solver_options.tf = Tf # Total predicton time

    ocp.solver_options.qp_solver = 'PARTIAL_CONDENSING_HPIPM' # FULL_CONDENSING_QPOASES
    ocp.solver_options.hessian_approx = 'GAUSS_NEWTON'
    ocp.solver_options.integrator_type = 'IRK'
    ocp.solver_options.sim_method_newton_iter = 10
    ocp.solver_options.regularize_method = mpc_config["regularize_method"] # For the Hessian
    ocp.solver_options.levenberg_marquardt = mpc_config["levenberg_marquardt"]
    ocp.solver_options.nlp_solver_warm_start_first_qp_from_nlp = mpc_config["nlp_solver_warm_start_first_qp_from_nlp"]
    ocp.solver_options.nlp_solver_warm_start_first_qp = mpc_config["nlp_solver_warm_start_first_qp"]
    ocp.solver_options.qp_solver_warm_start = mpc_config["qp_solver_warm_start"]
    # ocp.solver_options.adaptive_levenberg_marquardt_lam

    if RTI:
        ocp.solver_options.nlp_solver_type = 'SQP_RTI'
    else:
        ocp.solver_options.nlp_solver_type = 'SQP'
        ocp.solver_options.globalization = 'MERIT_BACKTRACKING' # turns on globalization
        ocp.solver_options.nlp_solver_max_iter = 150
    
    ocp.solver_options.qp_solver_cond_N = N_horizon

    # Create solver based on settings above
    solver_json = 'acados_ocp_' + model.name + '.json'
    acados_ocp_solver = AcadosOcpSolver(ocp, json_file = solver_json, verbose=False)

    return acados_ocp_solver, yref

Hi :waving_hand:

sounds like an interesting example! Your implementation has some flaws unfortunately.

There are two ways to implement a Gauss-Newton Hessian for your case:

  1. Change your neural network to output f_\theta(x). Then use the NONLINEAR_LS cost for the terminal stage with cost_y_expr_e set to f_\theta(x). With ocp.solver_options.hessian_approx = 'GAUSS_NEWTON', acados will then compute the GN Hessian approximation for you.
  2. Keep the neural network as it is but provide a custom Hessian expression via cost_expr_ext_cost_custom_hess_e which should then be a CasADi expression for the GN Hessian.

You never want to set ext_cost_num_hess = True. In this case, acados expects you to set numerical values for the (fixed) Hessian after solver creation.

Hope this is helpful,
Katrin

Hi Katrin,

Thank you for the clarification and advice. If I were to expand on your first solution which is more straightforward.

Let’s say I have an example terminal cost which I am trying to imitate through supervised learning/imitation learning. I will refer to the ground truth terminal cost as Creal, and its associated state x. Therefore, since acados will take the output of my network and perform 0.5 * squared L2 norm, I would then need to perform the inverse on Creal to √(2 Creal), such that the final terminal cost will be the same assuming the training is perfect? Am I correct to say this?

And code wise:

ocp.cost.cost_type_e = 'NONLINEAR_LS'   # Terminal cost
ocp.model.cost_y_expr_e = y_sym
ocp.solver_options.hessian_approx = 'GAUSS_NEWTON'

Thanks
Nico

Hi Katrin,

I have implemented the solution you suggested, but acados also requires that W_e and yref_e have matching dimensions.

If I understand correctly, since my neural network outputs a single scalar value, I should set:

\text{W_e} = 1

to maintain the same scale, and

y{\text{ref_e}} = 0

so that the terminal cost becomes:

\frac{1}{2} \, \big(f_\theta(x)\big)^2

Is this correct?

thanks
Nico

yes exactly