How to solve large-scale nonlinear optimization problems with Ceres?


I need to optimize a surface represented by a 2D grid of points so that the normal vectors of the surface align with provided target normal vectors. The grid size is likely to be between 201x201 and 1001x1001, which means the number of variables will be roughly 40,000 to 1,000,000, since I am only modifying the z-coordinates of the mesh points.

I am using the Ceres framework, as it is supposed to excel at large-scale nonlinear optimization problems. I already tried MATLAB's fmincon, but it uses an incredible amount of memory. I wrote an objective function that works for small meshes (it succeeds at 3x3 and 31x31). However, when I try to compile the code with a large mesh size (157x200), I get the error below. I have read that this is a limitation of Eigen, but when I tell Ceres to use LAPACK instead of Eigen, I still get the same error for large matrices. I tried these lines:

options.dense_linear_algebra_library_type = ceres::LAPACK;

options.linear_solver_type = ceres::DENSE_QR;

These tell the solver to use LAPACK and DENSE_QR, as the output using a 3x3 mesh shows:

Minimizer                        TRUST_REGION

Dense linear algebra library           LAPACK
Trust region strategy     LEVENBERG_MARQUARDT

                                    Given                     Used
Linear solver                        DENSE_QR                 DENSE_QR
Threads                                     1                        1
Linear solver threads                       1                        1

However, with the large mesh I still get the Eigen errors at compile time.

Anyway, I could really use some help with this. How can I get Ceres to optimize a large number of variables (> 30,000)? Thanks in advance.

Link to Ceres: http://ceres-solver.org

Link to Eigen: http://eigen.tuxfamily.org/dox/

Error:

In file included from /usr/include/eigen3/Eigen/Core:254:0,
             from /usr/local/include/ceres/jet.h:165,
             from /usr/local/include/ceres/internal/autodiff.h:145,
             from /usr/local/include/ceres/autodiff_cost_function.h:132,
             from /usr/local/include/ceres/ceres.h:37,
             from /home/ubuntu/code/surfaceopt/surfaceopt.cc:10:
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h: In instantiation of ‘Eigen::internal::plain_array<T, Size, MatrixOrArrayOptions, Alignment>::plain_array()     [with T = double; int Size = 31400; int MatrixOrArrayOptions = 2; int Alignment = 0]’:
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h:117:27:   required from     ‘Eigen::DenseStorage<T, Size, _Rows, _Cols, _Options>::DenseStorage() [with T = double; int Size = 31400; int _Rows = 31400; int _Cols = 1; int _Options = 2]’
/usr/include/eigen3/Eigen/src/Core/PlainObjectBase.h:421:55:   required from ‘Eigen::PlainObjectBase<Derived>::PlainObjectBase() [with Derived = Eigen::Matrix<double, 31400, 1, 2, 31400, 1>]’
/usr/include/eigen3/Eigen/src/Core/Matrix.h:203:41:   required from ‘Eigen::Matrix<_Scalar, _Rows, _Cols, _Options, _MaxRows, _MaxCols>::Matrix() [with _Scalar = double; int _Rows = 31400; int _Cols = 1; int _Options = 2; int _MaxRows = 31400; int _MaxCols = 1]’
/usr/local/include/ceres/jet.h:179:13:   required from ‘ceres::Jet<T, N>::Jet() [with T = double; int N = 31400]’
/usr/local/include/ceres/internal/fixed_array.h:138:10:   required from ‘ceres::internal::FixedArray<T, inline_elements>::FixedArray(ceres::internal::FixedArray<T, inline_elements>::size_type) [with T = ceres::Jet<double, 31400>; long int inline_elements = 0l; ceres::internal::FixedArray<T, inline_elements>::size_type = long unsigned int]’
/usr/local/include/ceres/internal/autodiff.h:233:70:   required from ‘static bool ceres::internal::AutoDiff<Functor, T, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9>::Differentiate(const Functor&, const T* const*, int, T*, T**) [with Functor = ComputeEint; T = double; int N0 = 31400; int N1 = 0; int N2 = 0; int N3 = 0; int N4 = 0; int N5 = 0; int N6 = 0; int N7 = 0; int N8 = 0; int N9 = 0]’
/usr/local/include/ceres/autodiff_cost_function.h:218:25:   required from ‘bool ceres::AutoDiffCostFunction<CostFunctor, kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9>::Evaluate(const double* const*, double*, double**) const [with CostFunctor = ComputeEint; int kNumResiduals = 1; int N0 = 31400; int N1 = 0; int N2 = 0; int N3 = 0; int N4 = 0; int N5 = 0; int N6 = 0; int N7 = 0; int N8 = 0; int N9 = 0]’
/home/ubuntu/code/surfaceopt/surfaceopt.cc:367:1:   required from here
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h:41:5: error: ‘OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG’ is not a member of ‘Eigen::internal::static_assertion<false>’
 EIGEN_STATIC_ASSERT(Size * sizeof(T) <= 128 * 128 * 8, OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG);
 ^
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h: In instantiation of ‘Eigen::internal::plain_array<T, Size, MatrixOrArrayOptions, 16>::plain_array() [with T = double; int Size = 31400; int MatrixOrArrayOptions = 1]’:
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h:120:59:   required from ‘Eigen::DenseStorage<T, Size, _Rows, _Cols, _Options>::DenseStorage(Eigen::DenseIndex, Eigen::DenseIndex, Eigen::DenseIndex) [with T = double; int Size = 31400; int _Rows = 1; int _Cols = 31400; int _Options = 1; Eigen::DenseIndex = long int]’
/usr/include/eigen3/Eigen/src/Core/PlainObjectBase.h:438:41:   required from ‘Eigen::PlainObjectBase<Derived>::PlainObjectBase(Eigen::PlainObjectBase<Derived>::Index, Eigen::PlainObjectBase<Derived>::Index, Eigen::PlainObjectBase<Derived>::Index) [with Derived = Eigen::Matrix<double, 1, 31400, 1, 1, 31400>; Eigen::PlainObjectBase<Derived>::Index = long int]’
/usr/include/eigen3/Eigen/src/Core/Matrix.h:273:76:   required from ‘Eigen::Matrix<_Scalar, _Rows, _Cols, _Options, _MaxRows, _MaxCols>::Matrix(const Eigen::MatrixBase<OtherDerived>&) [with OtherDerived = Eigen::Transpose<const Eigen::Matrix<double, 31400, 1, 2, 31400, 1> >; _Scalar = double; int _Rows = 1; int _Cols = 31400; int _Options = 1; int _MaxRows = 1; int _MaxCols = 31400]’
/usr/include/eigen3/Eigen/src/Core/DenseBase.h:367:62:   required from ‘Eigen::DenseBase<Derived>::EvalReturnType Eigen::DenseBase<Derived>::eval() const [with Derived = Eigen::Transpose<const Eigen::Matrix<double, 31400, 1, 2, 31400, 1> >; Eigen::DenseBase<Derived>::EvalReturnType = const Eigen::Matrix<double, 1, 31400, 1, 1, 31400>]’
/usr/include/eigen3/Eigen/src/Core/IO.h:244:69:   required from ‘std::ostream& Eigen::operator<<(std::ostream&, const Eigen::DenseBase<Derived>&) [with Derived = Eigen::Transpose<const Eigen::Matrix<double, 31400, 1, 2, 31400, 1> >; std::ostream = std::basic_ostream<char>]’
/usr/local/include/ceres/jet.h:632:35:   required from ‘std::ostream& ceres::operator<<(std::ostream&, const ceres::Jet<T, N>&) [with T = double; int N = 31400; std::ostream = std::basic_ostream<char>]’
/home/ubuntu/code/surfaceopt/surfaceopt.cc:103:50:   required from ‘bool ComputeEint::operator()(const T*, T*) const [with T = ceres::Jet<double, 31400>]’
/usr/local/include/ceres/internal/variadic_evaluate.h:175:26:   required from ‘static bool ceres::internal::VariadicEvaluate<Functor, T, N0, 0, 0, 0, 0, 0, 0, 0, 0, 0>::Call(const Functor&, const T* const*, T*) [with Functor = ComputeEint; T = ceres::Jet<double, 31400>; int N0 = 31400]’
/usr/local/include/ceres/internal/autodiff.h:283:45:   required from ‘static bool ceres::internal::AutoDiff<Functor, T, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9>::Differentiate(const Functor&, const T* const*, int, T*, T**) [with Functor = ComputeEint; T = double; int N0 = 31400; int N1 = 0; int N2 = 0; int N3 = 0; int N4 = 0; int N5 = 0; int N6 = 0; int N7 = 0; int N8 = 0; int N9 = 0]’
/usr/local/include/ceres/autodiff_cost_function.h:218:25:   required from ‘bool ceres::AutoDiffCostFunction<CostFunctor, kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9>::Evaluate(const double* const*, double*, double**) const [with CostFunctor = ComputeEint; int kNumResiduals = 1; int N0 = 31400; int N1 = 0; int N2 = 0; int N3 = 0; int N4 = 0; int N5 = 0; int N6 = 0; int N7 = 0; int N8 = 0; int N9 = 0]’
/home/ubuntu/code/surfaceopt/surfaceopt.cc:367:1:   required from here
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h:79:5: error: ‘OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG’ is not a member of ‘Eigen::internal::static_assertion<false>’
 EIGEN_STATIC_ASSERT(Size * sizeof(T) <= 128 * 128 * 8, OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG);
 ^
make[2]: *** [CMakeFiles/surfaceopt.dir/surfaceopt.cc.o] Error 1
make[1]: *** [CMakeFiles/surfaceopt.dir/all] Error 2
make: *** [all] Error 2

My code looks like this (abbreviated to take out irrelevant material):

#define TEXT true
#define VERBOSE false 
#define NV 31400
#define NF 62088
#define NX 157
#define NY 200
#define MAXNB 6

#include <math.h>
#include <ceres/ceres.h>
#include <ceres/rotation.h>
#include "glog/logging.h"
#include <iostream>
#include <fstream>
#include <iterator>
#include <algorithm>
#include <string>

using ceres::AutoDiffCostFunction;
using ceres::CostFunction;
using ceres::Problem;
using ceres::Solver;
using ceres::Solve;
using ceres::CrossProduct;
...

class ComputeEint {

private:
  double XY_ [NV][2];         // X and Y coords
  int C_ [NF][3];             // Connectivity list
  int AF_ [NV][MAXNB];        // List of adjacent faces to each vertex
  double Ntgt_ [NV][3];       // Target normal vectors
  int num_AF_ [NV];           // Number of adjacent faces to each vertex
public:

//Constructor
ComputeEint(double XY[][2], int C[][3], int AF[][MAXNB], double Ntgt[][3], int num_AF[NV]) {

std::copy(&XY[0][0], &XY[0][0]+NV*2,&XY_[0][0]);
...

template <typename T>
bool operator()(const T* const z, T* e) const {
  e[0] = T(0);
  ... 
  //Computes vertex normals for triangulated surface by averaging adjacent face normals
  ...
  e[0] = e[0]/T(NV);
  return true;
  }
};

int main(int argc, char** argv) {
  google::InitGoogleLogging(argv[0]);

  double Tp [NV][3];            //Points in mesh
  int    Tc [NF][3];            //Mesh connectivity list
  double Ntgt [NV][3];          //Target normals
  int    AF [NV][MAXNB];        //List of adjacent faces of each vertex
  int    num_AF [NV];           //Number of adjacent faces for each vertex

  int nx = NX;
  int ny = NY;

  //Read Tp, Tc, Ntgt, AF, num_AF from file
  ...
  // Set up XY for cost functor
  double XY [NV][2];
  double z [NV];
  //Copy first two columns of Tp into XY
  Problem problem;

  // Set up the only cost function (also known as residual). This uses
  // auto-differentiation to obtain the derivative (jacobian).
  CostFunction* cost_function =
  new AutoDiffCostFunction<ComputeEint, 1, NV>(new ComputeEint(XY, Tc, AF, Ntgt, num_AF));

  std::cout << "Created cost function" << "\n";
  problem.AddResidualBlock(cost_function, NULL, &z[0]);

  std::cout << "Added residual block" << "\n";

  // Run the solver!
  Solver::Options options;
  options.minimizer_progress_to_stdout = true;
  options.max_num_iterations = 50;
  options.function_tolerance = 1e-4;
  options.dense_linear_algebra_library_type = ceres::LAPACK;
  Solver::Summary summary;
  Solve(options, &problem, &summary);

  std::cout << summary.FullReport() << "\n";

  //Write output of optimization to file
  ...
  return 0;
}

There are 3 answers below.

Sameer Agarwal (accepted answer):

Two things

  1. You are using DENSE_QR as your linear solver, which results in a dense jacobian. This is a bad idea. Change your linear solver to SPARSE_NORMAL_CHOLESKY and you should be able to solve problems of this size quite easily.

You are going to need SuiteSparse/CXSparse if you are using version 1.9 or older. If you use the latest release candidate or the git version, you should be able to use Eigen to do the sparse linear algebra too.

  2. You are creating a single cost function for the whole problem, which means you are not exposing any sparsity to the solver. This is also what is causing the stack allocation problem, since the automatic differentiation machinery keeps its data on the stack.

Have a look at the example code that ships with Ceres, for example denoising.cc, which denoises an entire image and has a similar grid-like structure.

More generally, create a separate residual block for each vertex in your problem, as sketched below.
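
A minimal sketch of what that restructuring could look like, assuming each vertex normal is built from the centre height and its four grid neighbours (VertexNormalError and AddVertexResiduals are hypothetical names; the actual normal computation stays in the original ComputeEint code):

#include <ceres/ceres.h>

// Hypothetical per-vertex functor: it sees only the centre height and its four
// grid neighbours, each passed as a separate one-element parameter block, so
// the jacobian stays sparse.
struct VertexNormalError {
  explicit VertexNormalError(const double* ntgt) : ntgt_(ntgt) {}

  template <typename T>
  bool operator()(const T* const z_c, const T* const z_l, const T* const z_r,
                  const T* const z_d, const T* const z_u, T* residual) const {
    // Build the vertex normal from the centre and neighbour heights and
    // penalise its deviation from the target normal ntgt_.
    // (The actual normal computation is omitted here.)
    residual[0] = T(0);
    return true;
  }

  const double* ntgt_;  // target normal of this vertex
};

// Add one small residual block per interior vertex of an nx-by-ny grid.
void AddVertexResiduals(ceres::Problem* problem, double* z,
                        const double (*ntgt)[3], int nx, int ny) {
  for (int j = 1; j < ny - 1; ++j) {
    for (int i = 1; i < nx - 1; ++i) {
      const int v = j * nx + i;
      ceres::CostFunction* cost =
          new ceres::AutoDiffCostFunction<VertexNormalError, 1, 1, 1, 1, 1, 1>(
              new VertexNormalError(ntgt[v]));
      problem->AddResidualBlock(cost, NULL, &z[v], &z[v - 1], &z[v + 1],
                                &z[v - nx], &z[v + nx]);
    }
  }
}

// With this structure the sparse solver becomes usable:
//   options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;

Each residual then touches only five parameters, so the jacobian has a sparsity pattern that SPARSE_NORMAL_CHOLESKY can exploit.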

Eider:

This answer is based completely on my experience with C++ and Eigen (I don't know Ceres):

The options.* lines appear to control runtime behavior, but the error message is a compile-time error.

The most relevant part of the error that stands out to me is:

/usr/include/eigen3/Eigen/src/Core/DenseStorage.h:79:5: error: ‘OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG’ is not a member of ‘Eigen::internal::static_assertion’ EIGEN_STATIC_ASSERT(Size * sizeof(T) <= 128 * 128 * 8, OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG);

Eigen will often allocate fixed-size matrices on the stack, and I think that is what is happening here. Matrices of the size you are dealing with need to be allocated on the heap. To force Eigen to do this, try choosing dynamically sized matrices. After it compiles, you should be able to run it with or without Eigen by using the options that you found.
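
For illustration (this is plain Eigen, outside the poster's code), the distinction looks like this; the size 31400 comes from the error log above:

#include <Eigen/Core>

// A fixed-size vector of 31400 doubles would live on the stack; Eigen's static
// assert rejects it because 31400 * sizeof(double) exceeds 128 * 128 * 8 bytes.
// Eigen::Matrix<double, 31400, 1> v_fixed;  // fails to compile, as in the log above
Eigen::VectorXd v_dynamic(31400);            // dynamically sized, allocated on the heap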

The specific solution appears to be using DynamicAutoDiffCostFunction instead of AutoDiffCostFunction. A relevant snippet of the documentation:

AutoDiffCostFunction requires that the number of parameter blocks and their sizes be known at compile time. It also has an upper limit of 10 parameter blocks.

http://ceres-solver.org/modeling.html?highlight=dynamic#dynamicautodiffcostfunction
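
A rough sketch of that approach for this problem, following the documented DynamicAutoDiffCostFunction interface (ComputeEintDynamic is a hypothetical rename of ComputeEint, and the energy computation itself is omitted). Note that the accepted answer above argues against keeping this single-residual structure:

#include <ceres/dynamic_autodiff_cost_function.h>

// Hypothetical functor: the parameter-block sizes are no longer template
// arguments, so no fixed-size Jet of dimension NV is instantiated at compile time.
struct ComputeEintDynamic {
  template <typename T>
  bool operator()(T const* const* parameters, T* residual) const {
    const T* z = parameters[0];  // all NV heights in a single block
    residual[0] = T(0);
    // ... same energy computation as in ComputeEint ...
    return true;
  }
};

// In main(), replacing the AutoDiffCostFunction block:
//   ceres::DynamicAutoDiffCostFunction<ComputeEintDynamic, 4>* cost_function =
//       new ceres::DynamicAutoDiffCostFunction<ComputeEintDynamic, 4>(
//           new ComputeEintDynamic);
//   cost_function->AddParameterBlock(NV);  // block size chosen at runtime
//   cost_function->SetNumResiduals(1);
//   problem.AddResidualBlock(cost_function, NULL, &z[0]);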

Sameer Agarwal:

Peter, yes, technically you can use DynamicAutoDiffCostFunction, but it is going to be incredibly slow, and it is going to present the whole problem as a single residual to the solver, so there is no sparsity and the jacobian is severely rank deficient.

You should think about adding a residual per grid point, with each residual depending only on the neighboring nodes.

If you continue using one residual block for the whole cost, SuiteSparse or not, you are still going to be in trouble.