Solution of a least-squares problem using Bayesian optimization

Driver: BayesianLeastSquares

Download script: bayesian_least_squares.m

This study showcases the solution of a non-linear least-squares problem from the NIST Statistical Reference Datasets. As an example, the MGH17 problem is considered, which consists of fitting a vector-valued model \(\mathbf{f}(\mathbf{b}) \in \mathbb{R}^{33}\) with components

\[f_i(\mathbf{b}) = b_1 + b_2 \exp(-i \cdot b_4) + b_3 \exp(-i \cdot b_5),\, i = 0, \dots, 32\]

to a target vector with 33 entries. The certified best-fit values are

\[\begin{aligned}
b_1 &= 0.3754100521 \pm 2.0723153551 \cdot 10^{-3}\\
b_2 &= 1.9358469127 \pm 0.22031669222\\
b_3 &= -1.464687136 \pm 0.22175707739\\
b_4 &= 0.1286753464 \pm 4.4861358114 \cdot 10^{-3}\\
b_5 &= 0.2212269966 \pm 8.9471996575 \cdot 10^{-3}
\end{aligned}\]
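The driver minimizes the deviation between the model and the target vector in a least-squares sense. As a point of reference (a standard unweighted form, assuming unit measurement uncertainties for all entries), the chi-squared objective reported by the driver can be written as

\[\chi^2(\mathbf{b}) = \sum_{i=0}^{32} \left( f_i(\mathbf{b}) - t_i \right)^2,\]

where \(t_0, \dots, t_{32}\) are the entries of the target vector defined in the script below.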
jcm_optimizer_path = '<JCM_OPTIMIZER_PATH>';
addpath(fullfile(jcm_optimizer_path, 'interface', 'matlab'));

server = jcmoptimizer.Server();
client = jcmoptimizer.Client(server.port);

% Definition of the search domain
design_space = { ...
  struct('name', 'b1', 'type', 'continuous', 'domain', [0,10]), ...
  struct('name', 'b2', 'type', 'continuous', 'domain', [0.1,4]), ...
  struct('name', 'b3', 'type', 'continuous', 'domain', [-4,-0.1]), ...
  struct('name', 'b4', 'type', 'continuous', 'domain', [0.05,1]), ...
  struct('name', 'b5', 'type', 'continuous', 'domain', [0.05,1]) ...
};
constraints = { ...
    struct('name', 'test', 'expression', 'b2 + b3 <= 1.0') ...
};

% Creation of the study object with study_id 'bayesian_least_squares'
study = client.create_study( ...
    'design_space', design_space, ...
    'constraints', constraints, ...
    'driver', 'BayesianLeastSquares', ...
    'name', 'Solution of a least-squares problem using Bayesian optimization', ...
    'study_id', 'bayesian_least_squares');

% The vector-valued model function of the MGH17 problem
function val = model(x)
    s = 0:32;
    val = x(1) + x(2)*exp(-s.*x(4)) + x(3)*exp(-s.*x(5));
end

% Target vector of the MGH17 problem
target = [8.44E-01, 9.08E-01, 9.32E-01, 9.36E-01, 9.25E-01, ...
          9.08E-01, 8.81E-01, 8.50E-01, 8.18E-01, 7.84E-01, ...
          7.51E-01, 7.18E-01, 6.85E-01, 6.58E-01, 6.28E-01, ...
          6.03E-01, 5.80E-01, 5.58E-01, 5.38E-01, 5.22E-01, ...
          5.06E-01, 4.90E-01, 4.78E-01, 4.67E-01, 4.57E-01, ...
          4.48E-01, 4.38E-01, 4.31E-01, 4.24E-01, 4.20E-01, ...
          4.14E-01, 4.11E-01, 4.06E-01];
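% Optional sanity check (not part of the original script): the model
% evaluated at the certified best-fit values quoted above should
% approximately reproduce the target vector. The tolerance 1e-2 is an
% illustrative choice, not a certified residual bound.
b_certified = [0.3754100521, 1.9358469127, -1.464687136, ...
               0.1286753464, 0.2212269966];
assert(max(abs(model(b_certified) - target)) < 1e-2);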

study.configure( ...
    'target_vector', target, ...
    'max_iter', 120 ...
);

% Evaluation of the black-box function for specified design parameters
function observation = evaluate(study, sample)
    observation = study.new_observation();
    % Array of design values
    x = [sample.b1, sample.b2, sample.b3, sample.b4, sample.b5];
    observation.add(model(x));
end

% Run the minimization
study.set_evaluator(@evaluate);
study.run();

best_sample = study.driver.best_sample;
min_chisq = study.driver.min_objective;
uncertainties = study.driver.uncertainties;
fprintf("Reconstructed parameters with chi-squared value %e\n", min_chisq);
fns = fieldnames(best_sample);
for i = 1:length(fns)
    fprintf("  %s = %f +/- %f\n", fns{i}, ...
            best_sample.(fns{i}), uncertainties.(fns{i}));
end

% Before running a Markov-chain Monte-Carlo (MCMC) sampling, we converge the
% surrogate models by sampling around the minimum. To make the study more
% explorative, the scaling parameter is increased and the effective number of
% degrees of freedom is set to one.
study.configure( ...
    'scaling', 10.0, ...
    'effective_DOF', 1.0, ...
    'min_uncertainty', min_chisq*1e-8, ...
    'max_iter', 150, ...
    'min_val', 0.0 ...
);
while ~study.is_done
   sug = study.get_suggestion();
   obs = evaluate(study, sug.sample);
   study.add_observation(obs, sug.id);
end

% Run the MCMC sampling with 32 walkers
num_walkers = 32; max_iter = 10000;
mcmc_result = study.driver.run_mcmc( ...
    'rel_error', 0.01, ...
    'num_walkers', num_walkers, ...
    'max_iter', max_iter ...
);
% This Matlab code does not contain a detailed analysis of the MCMC sampling
% by corner plots. Please see the corresponding Python tutorial for a code
% example.
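A minimal post-processing sketch in Matlab is given below. It assumes, hypothetically, that mcmc_result exposes the drawn samples as a matrix with one column per parameter under the field name samples; the actual field names should be taken from the driver documentation.

samples = mcmc_result.samples;   % assumed field name; size num_samples x 5
param_means = mean(samples, 1);  % posterior mean of each parameter
param_stds = std(samples, 0, 1); % posterior standard deviation of each parameter
for k = 1:size(samples, 2)
    fprintf("  b%d: mean = %f, std = %f\n", k, param_means(k), param_stds(k));
end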
[Figure: MCMC sampling based on the analytic model] Markov-chain Monte-Carlo (MCMC) sampling of the probability density of the parameters \(b_1,\dots,b_5\) based on the analytic model function.

[Figure: MCMC sampling based on the surrogate] Markov-chain Monte-Carlo (MCMC) sampling of the probability density of the parameters \(b_1,\dots,b_5\) based on the trained surrogate of the study. A comparison between the analytic and the surrogate model functions shows good quantitative agreement.