Benchmarking different studies against each other

Drivers:

BayesianOptimization, ActiveLearning, CMAES, DifferentialEvolution, ScipyMinimizer

Download script: benchmark.m

In some cases it is not clear which driver or driver configuration minimizes an objective function in the smallest number of iterations. If similar minimizations are performed on a regular basis, it can be worthwhile to find the best choice by benchmarking several studies with different drivers or configurations against each other.

As an example, the well-known 2D Rastrigin function is minimized on a circular domain:

\[ \begin{align}\begin{aligned}&\text{min.}\,& f(x_1,x_2) = 2\cdot10 + \sum_{i=1,2} \left(x_i^2 - 10\cos(2\pi x_i)\right)\\&\text{s.t.}\,& \sqrt{x_1^2 + x_2^2} \leq 1.5.\end{aligned}\end{align} \]
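For orientation, the objective can also be written down directly in MATLAB. Its global minimum f(0,0) = 0 lies well inside the circular domain; the following self-contained sketch (not part of the benchmark script) just illustrates the function values:

% 2D Rastrigin objective used in this example (global minimum f(0,0) = 0)
rastrigin = @(x1, x2) 2*10 + (x1.^2 - 10*cos(2*pi*x1)) ...
                           + (x2.^2 - 10*cos(2*pi*x2));
rastrigin(0, 0)  % 0, the global minimum inside the circular domain
rastrigin(1, 1)  % 2, close to one of the many local minima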

We compare the heuristic minimization methods CMAES and DifferentialEvolution, the multi-start local minimization method ScipyMinimizer, and Bayesian optimization. For the latter, we use the standard BayesianOptimization driver as well as the ActiveLearning driver, configured to learn the objective function with neural networks instead of a Gaussian process.

We assume that we can evaluate the function 2 times in parallel and that each evaluation takes 2 seconds. Hence, the driver has to compute new samples in less than a second to avoid significant overhead. This is an edge case for the Bayesian optimization approach, which typically has an overhead of more than a second; in such cases, the other methods with much faster sample computation times can be more appropriate. Nevertheless, for this multi-modal objective, Bayesian optimization still reaches the global minimum well before the other optimization methods. Gaussian process regression is computationally efficient for this small number of iterations, whereas the neural-network training of the ActiveLearning driver leads to a larger overhead.
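To make this timing argument concrete, a rough back-of-the-envelope estimate of the total wall-clock time can be computed. The simple additive cost model below (parallel evaluation batches plus sequential sample computation) is an illustrative assumption and not part of the benchmark script:

% Rough wall-clock estimate: evaluations run in parallel batches of
% num_parallel, while the driver's per-sample computation adds up on top.
t_eval       = 2;     % seconds per objective evaluation
num_parallel = 2;     % parallel evaluations assumed in the text above
num_iter     = 250;   % maximum number of evaluations
for t_sample = [0.1, 1.0]  % driver time to propose one sample (seconds)
    t_total = num_iter/num_parallel * t_eval + num_iter * t_sample;
    fprintf('%.1f s per sample -> about %.0f s in total\n', t_sample, t_total);
end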

jcm_optimizer_path = '<JCM_OPTIMIZER_PATH>';
addpath(fullfile(jcm_optimizer_path, 'interface', 'matlab'));

server = jcmoptimizer.Server();
client = jcmoptimizer.Client(server.host);

% Definition of the search domain
design_space = {...
   struct('name', 'x1', 'type', 'continuous', 'domain', [-1.5,1.5]), ...
   struct('name', 'x2', 'type', 'continuous', 'domain', [-1.5,1.5]), ...
};

% Definition of fixed environment parameter
environment = {...
   struct('name', 'radius', 'type', 'fixed', 'domain', 1.5) ...
};

% Definition of a constraint on the search domain
constraints = {...
   struct('name', 'circle', 'expression', 'sqrt(x1^2 + x2^2) <= radius')...
};

% Creation of studies to benchmark against each other
studies = struct();
drivers = {"BayesianOptimization", "ActiveLearning", "CMAES", ...
           "DifferentialEvolution", "ScipyMinimizer"};
for i = 1:length(drivers)
   studies.(drivers{i}) = client.create_study( ...
        'design_space', design_space, ...
        'environment', environment, ...
        'constraints', constraints, ...
        'driver', drivers{i}, ...
        'name', drivers{i}, ...
        'study_id', sprintf("benchmark_%s", drivers{i}), ...
        'open_browser', false ...
   );
end

% Configuration of all studies
config = {'num_parallel', 1, 'max_iter', 250};
min_val = 1e-3; % Stop a study when this value has been observed
for i = 1:length(drivers)
    driver = drivers{i};
    study = studies.(driver);
    if strcmp(driver, "ActiveLearning")
        study.configure( ...
           'surrogates', {struct('type', "NN")}, ...
           'objectives', {struct('type', "Minimizer", 'min_val', min_val)}, ...
           config{:});
    elseif strcmp(driver, "ScipyMinimizer")
        % For a more global search, 3 initial gradient-free Nelder-Mead runs are started
        study.configure( ...
           'method', "Nelder-Mead", 'num_initial', 3, ...
           'min_val', min_val, config{:});
    else
        study.configure('min_val', min_val, config{:});
    end
end
% Creation of a benchmark with 6 repetitions; all 5 studies are added
benchmark = client.create_benchmark('num_average', 6);
for i = 1:length(drivers)
   benchmark.add_study(studies.(drivers{i}));
end

% Run the benchmark - this will take a while
% (each study is evaluated with the evaluate function defined at the end of the script)
benchmark.set_evaluator(@evaluate);
benchmark.run();

% Plot cumulative-minimum convergence w.r.t. number of evaluations and time
fig = figure('Position', [0, 0, 1000, 500]);
subplot(1, 2, 1)
data = benchmark.get_data('x_type', "num_evaluations");

for i = 1:length(data.names)
  std_error = data.sdev(i,:) ./ sqrt(6); % standard error of the mean over 6 runs
  p = plot(data.X(i,:), data.Y(i,:), 'LineWidth', 2.0);
  hold on
  patch([data.X(i,:), fliplr(data.X(i,:))], ...
        [data.Y(i,:) - std_error, fliplr(data.Y(i,:) + std_error)], ...
        p(1).Color, 'FaceAlpha', 0.2, 'EdgeAlpha', 0.2);
end
grid()
xlabel("Number of Evaluations", 'FontSize', 12)
ylabel("Average Cumulative Minimum", 'FontSize', 12)
ylim([-0.4, 10])

subplot(1, 2, 2)
data = benchmark.get_data('x_type', "time");
plots = gobjects(1, length(data.names)); % preallocate graphics handles
for i = 1:length(data.names)
  std_error = data.sdev(i,:) ./ sqrt(6);
  p = plot(data.X(i,:), data.Y(i,:), 'LineWidth', 2.0, 'DisplayName', data.names{i});
  plots(i) = p;
  hold on
  patch([data.X(i,:), fliplr(data.X(i,:))], ...
        [data.Y(i,:) - std_error, fliplr(data.Y(i,:) + std_error)], ...
        p(1).Color, 'FaceAlpha', 0.2, 'EdgeAlpha', 0.2);
end
legend(plots, data.names{:});
grid()
xlabel("Time (sec)", 'FontSize', 12)
ylim([-0.4, 10])
saveas(fig, "benchmark.png")

% Evaluation of the black-box function for specified design parameters
% (local functions must be defined at the end of a MATLAB script)
function observation = evaluate(study, sample)
    pause(2); % make the objective expensive (2 seconds per evaluation)
    observation = study.new_observation();
    x1 = sample.x1;
    x2 = sample.x2;
    % 2D Rastrigin function value
    observation.add(10*2 ...
                    + (x1.^2-10*cos(2*pi*x1)) ...
                    + (x2.^2-10*cos(2*pi*x2)) ...
                   );
end
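Besides plotting, the data object returned by benchmark.get_data can be used for a short text summary. The following sketch relies only on the fields that are already accessed in the plotting code above (names, Y, sdev):

% Print the final averaged cumulative minimum per driver, together with the
% standard error of the mean over the 6 benchmark repetitions.
data = benchmark.get_data('x_type', "num_evaluations");
for i = 1:length(data.names)
    fprintf('%-22s  final average minimum: %8.4f  (std. error %.4f)\n', ...
            data.names{i}, data.Y(i, end), data.sdev(i, end) / sqrt(6));
end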
Benchmark results

Left: The cumulative minimum as a function of the number of evaluations of the objective function. The results are averaged over six independent runs of each driver; the shaded areas show the uncertainty of the mean value. The BayesianOptimization driver clearly performs best and finds the global minimum of zero after about 70 iterations, while the other methods have not converged to zero even after 250 iterations. Right: The convergence of the cumulative minimum as a function of the total optimization time looks slightly different. As expected, the ActiveLearning and BayesianOptimization drivers show an overhead for computing new samples. With an optimization budget of less than 50 seconds, the multi-start local ScipyMinimizer performs better. It can also be seen that the training of neural networks leads to a larger overhead than the Gaussian process regression used by the BayesianOptimization driver.