Negative log-probability variable

This variable describes the negative log-probability of model parameters \(\mathbf{p}\) given a set of \(K\) measurement targets \(t_i\), \(i=1,\ldots,K\).

We assume that the measurement process can be accurately modeled by a function \(\mathbf{f}(\mathbf{p}) \in \mathbb{R}^K\). That is, the vector of measurements \(\mathbf{t}\) is a random vector with

\[\mathbf{t} = \mathbf{f}(\mathbf{p}) + \mathbf{w}\]

with \(\mathbf{w} \sim \mathcal{N}(0, \mathbf{G})\), where \(\mathbf{G}\) is a covariance matrix of the measurement errors. In most cases the covariance matrix is diagonal (i.e. the measurement errors are uncorrelated) with diagonal entries \(G_{ii} = \eta_i^2\).

Sometimes the measurement noise is known. In general, however, one has to find a parameterized error model \(\eta_i(\mathbf{d})\) for the noise level itself, where \(\mathbf{d}\) denotes the error model parameters. A common choice is to assume that the error is composed of a background term \(b\) and a noise contribution that scales linearly with \(f_i(\mathbf{p})\), with proportionality factor \(a\):

\[\eta_i^2(a,b,\mathbf{p}) = b^2 + \left[a f_i(\mathbf{p})\right]^2\]
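The error model above can be evaluated directly. The following is a minimal sketch in plain Python; the function name `error_model_std` and the example numbers are illustrative assumptions and not part of the documented interface.

```python
import math


def error_model_std(f, a, b):
    """Return eta_i = sqrt(b^2 + (a * f_i)^2) for each model output f_i.

    f : list of model outputs f_i(p)
    a : relative noise factor (scales linearly with f_i)
    b : background noise level
    """
    return [math.sqrt(b ** 2 + (a * f_i) ** 2) for f_i in f]


# Illustrative values only: three model outputs, 10 % relative noise,
# and a background level of 0.4.
eta = error_model_std([0.0, 3.0, 10.0], a=0.1, b=0.4)
```

For a vanishing model output the error reduces to the background term \(b\), while for large outputs the relative contribution \(a f_i(\mathbf{p})\) dominates.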

Since every entry of the measurement vector follows a normal distribution \(t_i \sim \mathcal{N}(f_i(\mathbf{p}),\eta_i^2(\mathbf{d}))\), the joint likelihood of measuring the vector \(\mathbf{t}\) is given as

\[P(\mathbf{t} | \mathbf{p}, \mathbf{d}) = \prod_{i=1}^K \frac{1}{\sqrt{2\pi}\eta_i(\mathbf{d})}\exp\left[-\frac{1}{2}\left(\frac{t_i - f_i(\mathbf{p})}{\eta_i(\mathbf{d})}\right)^2\right].\]

Sometimes, non-uniform prior distributions for the design parameter vector \(P_\text{prior}(\mathbf{p})\) and the error model parameters \(P_\text{prior}(\mathbf{d})\) are available. The posterior distribution is then proportional to

\[P(\mathbf{p}, \mathbf{d} | \mathbf{t}) \propto P(\mathbf{t} | \mathbf{p}, \mathbf{d}) P_\text{prior}(\mathbf{p}) P_\text{prior}(\mathbf{d})\]

Warning

If a parameter distribution for the design space is defined, make sure that it has a non-vanishing probability density everywhere within the boundaries of the design space. Otherwise the negative log-probability can be infinite. In this case the sample computation is numerically unstable.

Altogether, finding the parameters with maximum posterior probability density is equivalent to minimizing the negative log-posterior, which up to an additive constant reads

\[\begin{split} -\log\left(P(\mathbf{p}, \mathbf{d}| \mathbf{t})\right) = & \frac{1}{2} K\log(2\pi) +\sum_{i=1}^K\log\left(\eta_i(\mathbf{d})\right) +\frac{1}{2}\sum_{i=1}^K \left( \frac{t_i - f_i(\mathbf{p})}{\eta_i(\mathbf{d})} \right)^2 \\ &-\log\left(P_\text{prior}(\mathbf{d})\right) -\log\left(P_\text{prior}(\mathbf{p})\right). \end{split}\]
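As a term-by-term illustration of the expression above, the following sketch evaluates it for given targets, model outputs, and uncertainties. It assumes uncorrelated errors (diagonal \(\mathbf{G}\)); the function name and the convention of passing the log-priors as plain numbers are assumptions made for this example.

```python
import math


def neg_log_posterior(t, f, eta, log_prior_p=0.0, log_prior_d=0.0):
    """Negative log-posterior for uncorrelated Gaussian measurement errors.

    t, f, eta   : lists of targets t_i, model outputs f_i(p), and errors eta_i(d)
    log_prior_p : log of the prior density of the design parameters p
    log_prior_d : log of the prior density of the error model parameters d
    """
    K = len(t)
    const = 0.5 * K * math.log(2.0 * math.pi)        # (1/2) K log(2 pi)
    log_norm = sum(math.log(e) for e in eta)         # sum_i log(eta_i)
    misfit = 0.5 * sum(                              # (1/2) sum_i residual^2
        ((t_i - f_i) / e_i) ** 2 for t_i, f_i, e_i in zip(t, f, eta)
    )
    return const + log_norm + misfit - log_prior_d - log_prior_p


# With vanishing residuals, unit errors, and flat priors only the
# constant term remains, which for K = 2 equals log(2 * pi).
value = neg_log_posterior(t=[1.0, 2.0], f=[1.0, 2.0], eta=[1.0, 1.0])
```

Flat (uniform) priors correspond to constant log-prior terms, which shift the value but do not change the location of the minimum.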

name (str)

The name of the variable under which it can be addressed by other variables or objectives. The name must be distinct from any surrogate name.

Default: 'v'

input (str)

The name of a surrogate model or a multi-output variable.

Default: This value has no default and must be provided

target_vector (list[float])

Vector of target values \(t_i\).

Default: Vector of zeros.

uncertainty_vector (list[float])

Vector of target uncertainties \(\eta_i\) such that \(\chi^2 = \sum_{i=1}^K \frac{(t_i - y_i)^2}{\eta_i^2}\).

Default: Vector of ones.

covariance_matrix (list[list[float]])

Covariance matrix \(G\).

Default: Identity matrix.

approximate (bool)

If true, the generalized chi-squared variable, in which the predictions of the \(K\) channels have different uncertainties, is approximated by a chi-squared variable with averaged uncertainties. This allows probability densities, and any acquisition function that is directly based on the variable, to be computed analytically.

Default: True

force_MC_integration (bool)

If true, the posterior of the variable is based on Monte-Carlo samples. This avoids having to determine a Gaussian distribution of function values, which can be numerically unstable if the inputs are strongly correlated.

Default: False

effective_DOF (float)

Number of effective degrees of freedom (DOF) of the chi-squared distributed random variable. This number roughly indicates how many output channels of the forward model are statistically independent.

Default: If not specified, the value is determined automatically.

Note

If approximate is false, this parameter has no effect.

effective_DOF_bounds (list[float])

The number of effective degrees of freedom (DOF) is determined by a maximum likelihood estimate within the given lower and upper bounds.

Default: [20.0,50.0]

Note

If approximate is false or the effective_DOF is set manually, this parameter has no effect.

error_model (dict)

Note

If an error model is specified, it overrides the settings for the target_vector or covariance_matrix of the measurement uncertainties. The error model parameters \(\mathbf{d}\) are then always fitted to maximize the posterior probability for any given design point \(\mathbf{p}\):

\[P(\mathbf{d} | \mathbf{p}, \mathbf{t}) \propto \prod_{i=1}^K \frac{1}{\sqrt{2\pi}\eta_i(\mathbf{d})}\exp\left[ -\frac{1}{2}\left( \frac{t_i - \mu_i(\mathbf{p})}{\eta_i(\mathbf{d})} \right)^2 \right] P_\text{prior}(\mathbf{d}).\]

Here, \(\mu_i(\mathbf{p})\) is the predicted mean of the input to the variable and \(P_\text{prior}(\mathbf{d})\) is the prior probability density for the error model parameters.

Default: {'expression': 'RSE*10^(log_err)', 'distributions': [{'type': 'normal', 'parameter': 'log_err', 'mean': 0, 'stddev': 1}], 'xtol': 1e-05} The parameter is optional and may also have the value None.

An analytic model for the error of the target vector entries. The model \(\eta_i({\rm RSE}, y_i, \epsilon_i)\) for each entry \(i=1,\dots,K\) of the target vector can depend on the residual standard error (RSE), the model prediction \(y_i\) and the corresponding initially assumed measurement error \(\epsilon_i\).

The RSE is defined as \(\sqrt{\chi_\text{min}^2/\mathrm{DOF}}\), where the number of degrees of freedom (DOF) is defined as the difference between the target-vector dimension \(K\) and the number of parameters to reconstruct (the dimension of the design space).
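The definition of the RSE can be sketched as follows. The example assumes that `y` holds the model predictions at the best fit, so that the computed \(\chi^2\) is indeed \(\chi_\text{min}^2\), and that `eps` holds the initially assumed measurement errors \(\epsilon_i\); the function name and arguments are hypothetical.

```python
import math


def residual_standard_error(t, y, eps, num_design_params):
    """RSE = sqrt(chi2_min / DOF) with DOF = K - (number of design parameters).

    t   : targets t_i
    y   : model predictions y_i at the best-fit parameters (assumption)
    eps : initially assumed measurement errors epsilon_i
    """
    chi2_min = sum(((t_i - y_i) / e_i) ** 2 for t_i, y_i, e_i in zip(t, y, eps))
    dof = len(t) - num_design_params
    if dof <= 0:
        raise ValueError("DOF must be positive: need more targets than parameters")
    return math.sqrt(chi2_min / dof)


# Five targets, one reconstructed parameter: chi2_min = 4 and DOF = 4.
rse = residual_standard_error(
    t=[0.0, 0.0, 0.0, 0.0, 0.0],
    y=[1.0, 1.0, 1.0, 1.0, 0.0],
    eps=[1.0, 1.0, 1.0, 1.0, 1.0],
    num_design_params=1,
)
```

An RSE close to one indicates that the residuals are on average as large as the assumed errors, while a larger value suggests that the assumed errors are too small.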

Moreover, \(M\) other random parameters \(\mathbf{d}\) can be defined in the parameter distribution and initial_parameters.

The log-probability density of the random parameters \(\mathbf{d}\) and the design space parameters \(\mathbf{p}\) is given, up to constant additive terms, by

\[\log[P(\mathbf{p}, \mathbf{d})] = -\frac{1}{2} \cdot \sum_i\left(\frac{(t_i-y_i(\mathbf{p}))^2}{\eta_i^2(\mathbf{d})} + \log[\eta_i^2(\mathbf{d})]\right) + \log[P_\text{prior}(\mathbf{p})] + \log[P_\text{prior}(\mathbf{d})].\]
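To illustrate this expression for the default error model 'RSE*10^(log_err)' with a normal prior (mean 0, stddev 1) on log_err, the sketch below evaluates the log-probability and fits log_err for fixed predictions by a simple grid scan. The grid scan is a stand-in chosen for brevity and is not claimed to be the optimization method used internally; all function names are hypothetical.

```python
import math


def log_probability(t, y, rse, log_err, log_prior_p=0.0):
    """Log-probability (up to additive constants) for eta_i = RSE * 10**log_err."""
    eta2 = (rse * 10.0 ** log_err) ** 2              # eta_i^2, identical for all i
    data_term = -0.5 * sum(
        (t_i - y_i) ** 2 / eta2 + math.log(eta2) for t_i, y_i in zip(t, y)
    )
    log_prior_d = -0.5 * log_err ** 2                # normal prior, mean 0, stddev 1
    return data_term + log_prior_p + log_prior_d


def fit_log_err(t, y, rse, lower=-3.0, upper=3.0, steps=6001):
    """Return the grid value of log_err that maximizes the log-probability."""
    grid = [lower + (upper - lower) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda d: log_probability(t, y, rse, d))


# Residuals ten times larger than the RSE: the fit inflates the error,
# that is, the fitted log_err is positive.
best = fit_log_err(t=[0.0] * 4, y=[10.0] * 4, rse=1.0)
```

The \(\log[\eta_i^2(\mathbf{d})]\) term penalizes large errors, so the fit cannot reduce the misfit term arbitrarily by inflating \(\eta_i\).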

See error_model configuration for details.
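For orientation, the parameters documented above could be collected in a configuration such as the following sketch. It uses only keys named in this section; the surrogate name 'forward_model', the numerical values, and the helper `check_config` are hypothetical, and how the dictionary is passed to the optimizer is not specified here.

```python
# Hypothetical configuration using only the parameter names documented above.
variable_config = {
    "name": "v",                            # must differ from any surrogate name
    "input": "forward_model",               # hypothetical surrogate model name
    "target_vector": [1.2, 0.8, 0.5],       # targets t_i (illustrative values)
    "uncertainty_vector": [0.1, 0.1, 0.05], # uncertainties eta_i (illustrative)
    "approximate": True,
    "force_MC_integration": False,
    "error_model": None,                    # no error model is fitted (assumption)
}


def check_config(config):
    """Basic consistency checks derived from the parameter descriptions."""
    if config["name"] == config["input"]:
        raise ValueError("name must be distinct from the surrogate name")
    if len(config["target_vector"]) != len(config["uncertainty_vector"]):
        raise ValueError("target_vector and uncertainty_vector must have equal length")
    if any(eta <= 0.0 for eta in config["uncertainty_vector"]):
        raise ValueError("uncertainties must be positive")
    return True


is_valid = check_config(variable_config)
```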