ScipyLeastSquares

Contents

Purpose:

The purpose of the driver.

Tutorials:

Tutorials demonstrating the application of this driver.

Driver Interface:

Driver-specific methods of the Matlab interface.

Configuration:

Configuration of the driver.

Purpose

The driver solves the least-squares problem based on the scipy function scipy.optimize.least_squares. It is suited to searching locally for a solution of a least-squares problem, which consists of a target vector \(\mathbf{t} \in \mathbb{R}^K\) and a vectorial function \(\mathbf{f}_{\rm bb}(\mathbf{p}_{\rm design}) \in \mathbb{R}^K\). The goal is to find design parameters \(\mathbf{p}_{\rm design}\) that minimize the chi-squared deviation

\[\chi^2 = \sum_{i=1}^K \frac{\left(t_i - y_i\right)^2}{\eta_i^2}\]

between the K outputs \(y_i = f_i(\mathbf{p}_{\rm design})\) and the K targets \(t_i\), scaled by the target uncertainties \(\eta_i\).
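As a concrete check of this expression, the weighted chi-squared can be evaluated directly. The following NumPy sketch uses illustrative target, output, and uncertainty values; none of the variable names are part of the driver interface:

```python
import numpy as np

# Illustrative targets t_i, black-box outputs y_i, and uncertainties eta_i
t = np.array([1.0, 2.0, 3.0])
y = np.array([1.1, 1.8, 3.3])
eta = np.array([0.1, 0.2, 0.3])

# chi^2 = sum_i (t_i - y_i)^2 / eta_i^2
chi_sq = np.sum((t - y) ** 2 / eta ** 2)
print(chi_sq)  # approximately 3.0, since each term contributes about 1
```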

If the target uncertainties are not independent but correlated, it is also possible to specify the covariance matrix \(G\) of the targets. In this case, the chi-squared deviation is given as

\[\chi^2 = \sum_{i,j=1}^K (t_i - y_i) G_{i j}^{-1} (t_j - y_j).\]
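The quadratic form with the inverse covariance matrix can be evaluated as below; for a diagonal covariance \(G = {\rm diag}(\eta_1^2,\dots,\eta_K^2)\) it reduces to the uncorrelated expression above. The values are illustrative:

```python
import numpy as np

t = np.array([1.0, 2.0, 3.0])
y = np.array([1.1, 1.8, 3.3])
eta = np.array([0.1, 0.2, 0.3])

# Diagonal covariance G with entries eta_i^2 (no correlations)
G = np.diag(eta ** 2)
r = t - y

# chi^2 = r^T G^{-1} r (using a linear solve instead of an explicit inverse)
chi_sq = r @ np.linalg.solve(G, r)

# For diagonal G this matches the uncorrelated form sum_i r_i^2 / eta_i^2
assert np.isclose(chi_sq, np.sum(r ** 2 / eta ** 2))
```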

To search for the global minimum, the driver can run local minimizations starting from multiple initial points in parallel. A global minimization with this driver is advisable if the vectorial function is inexpensive to evaluate and its Jacobian is known.

Available optimization methods are the Trust Region Reflective algorithm (trf) and the dogleg algorithm with rectangular trust regions (dogbox), which are described in more detail in the reference of scipy.optimize.least_squares.
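Since the driver is based on scipy.optimize.least_squares, both methods can be tried directly in Python. This minimal sketch fits a one-parameter linear model with the trf method; the residual function and bounds are illustrative and independent of the driver interface:

```python
import numpy as np
from scipy.optimize import least_squares

# Illustrative targets t_i for a small model y_i(p) = p * i, i = 1, 2, 3
t = np.array([2.0, 4.0, 6.0])

def residuals(p):
    # Residual vector r_i(p) = t_i - y_i(p)
    y = p[0] * np.array([1.0, 2.0, 3.0])
    return t - y

# Trust Region Reflective ('trf') with box bounds on the parameter
result = least_squares(residuals, x0=[0.5], bounds=([0.0], [10.0]), method="trf")
print(result.x)  # close to [2.0], where all residuals vanish
```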

For expensive vectorial functions with evaluation times larger than a few seconds, the BayesianLeastSquares driver is recommended.

Tutorials

Driver Interface

The driver instance can be obtained by Study.driver.

class ScipyLeastSquares(host, verbose, study_id)

This class provides methods for retrieving information about the result of the least-squares minimization using scipy.optimize.least_squares.

best_sample

LeastSquaresDriver.best_sample()

Best sample with minimal chi-squared value found during the minimization. Example:

sample = study.driver.best_sample();
fn = fieldnames(sample);
for i=1:length(fn)
   fprintf("%s = %g\n", fn{i}, sample.(fn{i}) );
end
Return type:

struct

describe

Driver.describe()

Get description of all modules and their parameters that are used by the driver. Example:

description = driver.describe();
fprintf(description.members.surrogates{1});
Returns:

A nested struct with descriptions of submodules, each consisting of a name and a descriptive text. If an entry describes a module, it has an additional "members" entry with a struct describing its submodules and parameters.

Return type:

struct

get_state

Driver.get_state(path)

Get state of the driver. Example:

best_sample = driver.get_state("best_sample");
Parameters:

path (str) – A dot-separated path to a submodule or parameter. If omitted, the full state is returned.

Returns:

If path is omitted, a struct with information about the driver state.

Return type:

struct

Note

A description of the meaning of each entry in the state can be retrieved by describe().

historic_parameter_values

Driver.historic_parameter_values(path)

Get the values of an internal parameter for each iteration of the study. Example:

min_objective_values = driver.historic_parameter_values(...
    "acquisition_function.min_objective");
Parameters:

path (str) – A dot-separated path to the parameter.

Returns:

Array of parameter values

Return type:

array

Note

A description of the meaning of each parameter can be retrieved by describe().

min_objective

LeastSquaresDriver.min_objective()

Minimal chi-squared value found during the minimization. Example:

min_chi_sq = study.driver.min_objective();
Return type:

float

uncertainties

LeastSquaresDriver.uncertainties()

Uncertainties of the continuous parameters of the best_sample(). Example:

uncertainties = study.driver.uncertainties();
fn = fieldnames(uncertainties);
for i=1:length(fn)
    fprintf("uncertainty of %s: %g\n", fn{i}, uncertainties.(fn{i}) );
end
Return type:

struct

Configuration

The configuration parameters can be set by calling, e.g.

study.configure('example_parameter1',[1,2,3], 'example_parameter2',true);

This driver requires the definition of a target_vector \([t_1,\cdots, t_K]\) to find a least-squares solution that minimizes \(\chi^2 = \sum_{i=1}^K \left(t_i - y_i\right)^2\), where \(y_i = f_i(\mathbf{p}_{\rm design})\) is the \(i\)-th entry of the observed black-box function. To perform a weighted least-squares minimization, one can specify an uncertainty_vector or, more generally, a covariance_matrix between the \(K\) target-vector entries. The driver can run num_initial minimizations starting from multiple initial samples in parallel. Only continuous design parameters are minimized, while discrete and categorical parameters are fixed to the values of the initial samples.

max_iter (int)

Maximum number of evaluations of the studied system.

Default: Infinite number of evaluations.

max_time (float)

Maximum run time of the study in seconds. The time is counted from the moment the parameter is set or reset.

Default: inf

num_parallel (int)

Number of parallel evaluations of the studied system.

Default: 1

min_val (float)

The minimization is stopped when the chi-squared deviation \(\chi^2\) is below the specified minimum value.

Default: 0.0

min_step (float)

The minimization is stopped when the Euclidean distance between consecutive sampling points in the design parameter space is below the specified value.

Default: 0.0

target_vector (cell{float})

Vector of targets \(t_i\) for \(i=1,\dots, K\).

Default: {0.0}

uncertainty_vector (cell{float})

Vector of target uncertainties \(\eta_i\) such that \(\chi^2 = \sum_{i=1}^K \frac{(t_i - y_i)^2}{\eta_i^2}\).

Default: vector of ones.

covariance_matrix (cell{cell{float}})

Covariance matrix \(G\) such that \(\chi^2 = \sum_{i,j=1}^K (t_i - y_i) G_{i j}^{-1} (t_j - y_j)\).

Default: diagonal identity matrix.

num_initial (int)

Number of independent initial optimizers.

Default: 1

max_num_optimizers (int)

If an optimizer has converged, it is restarted at another position. If max_num_optimizers have converged, the optimization is stopped.

Default: Infinite number of optimizers.

initial_samples (cell{cell})

List of initial samples, each with the dimension of the design space. The role of the initial samples is twofold. First, they serve as initial guesses for the local minimizations. Second, if the design space contains discrete or categorical parameters, their values can be fixed for each minimizer through the corresponding initial sample. If num_initial > length(initial_samples), the remaining initial samples are chosen randomly.

Default: {}

sobol_sequence (bool)

If true, all initial samples are taken from a Sobol sequence. This typically improves the coverage of the parameter space.

Default: true
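For illustration, such a Sobol sequence can be generated with scipy.stats.qmc. This sketch only shows the kind of low-discrepancy sampling meant here, not the driver's internal implementation; the dimensions and bounds are illustrative:

```python
from scipy.stats import qmc

# 8 quasi-random samples in a 2-dimensional unit cube
sampler = qmc.Sobol(d=2, scramble=True, seed=0)
samples = sampler.random(n=8)
print(samples.shape)  # (8, 2)

# Scale to illustrative parameter bounds, e.g. x in [0, 1], y in [-5, 5]
scaled = qmc.scale(samples, l_bounds=[0.0, -5.0], u_bounds=[1.0, 5.0])
```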

method (str)

Algorithm to perform minimization.

trf:

Trust Region Reflective algorithm, particularly suitable for large sparse problems with bounds. Generally robust method.

dogbox:

Dogleg algorithm with rectangular trust regions; typical use case is small problems with bounds. Not recommended for problems with a rank-deficient Jacobian.

For more details on the algorithms, see the scipy documentation.

Default: "trf" Choices: "trf", "dogbox".

jac (bool)

If true, the Jacobian is used for optimization.

Default: false

Example

If set to true, the full Jacobian must be added to the observations. That is, for each continuous parameter one has to pass a list of \(K\) derivatives by calling:

observation.add('value',deriv_vec, 'derivative','param_name')

diff_step (cell{float})

Determines the relative step size for the finite difference approximation of the Jacobian. The actual step is computed as x * diff_step.

Default: list with all entries 1e-6.
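The relative-step rule can be sketched as a one-sided finite difference. The helper function and values below are hypothetical illustrations, not part of the driver API:

```python
import numpy as np

def fd_jacobian_entry(f, x, i, diff_step=1e-6):
    # Relative step: h = x_i * diff_step (fall back to the absolute step at x_i == 0)
    h = x[i] * diff_step if x[i] != 0 else diff_step
    x_step = x.copy()
    x_step[i] += h
    # One-sided finite difference df/dx_i
    return (f(x_step) - f(x)) / h

# Example: f(x) = x_0^2, so the derivative at x_0 = 3 is 6
f = lambda x: x[0] ** 2
d = fd_jacobian_entry(f, np.array([3.0]), i=0)
print(d)  # approximately 6.0
```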

ftol (float)

Tolerance for termination by the change of the cost function. The optimization process is stopped when dF < ftol * F, and there was an adequate agreement between a local quadratic model and the true model in the last step.

Default: 1e-08

xtol (float)

Tolerance for termination by the change of the independent variables. The optimization process is stopped when norm(dx) < xtol * (xtol + norm(x)).

Default: 1e-08

gtol (float)

Tolerance for termination by the norm of the gradient. The exact condition depends on the method used:

  • For ‘trf’ : norm(g_scaled, ord=np.inf) < gtol, where g_scaled is the value of the gradient scaled to account for the presence of the bounds.

  • For ‘dogbox’ : norm(g_free, ord=np.inf) < gtol, where g_free is the gradient with respect to the variables which are not in the optimal state on the boundary.

For more details see the scipy documentation.

Default: 1e-08