.. _NegLogProbVariable:

Negative log-probability variable
---------------------------------

This variable describes the negative log-probability of model parameters
:math:`\mathbf{p}` given a set of :math:`K` measurement targets :math:`t_i`,
:math:`i=1,\ldots,K`. We assume that the measurement process can be accurately
modeled by a function :math:`\mathbf{f}(\mathbf{p}) \in \mathbb{R}^K`. That is,
the vector of measurements :math:`\mathbf{t}` is a random vector with

.. math::

   \mathbf{t} = \mathbf{f}(\mathbf{p}) + \mathbf{w}

with :math:`\mathbf{w} \sim \mathcal{N}(0, \mathbf{G})`, where :math:`\mathbf{G}`
is a covariance matrix of the measurement errors. In most cases the covariance
matrix is diagonal (i.e. the measurement errors are uncorrelated) with diagonal
entries :math:`G_{ii} = \eta_i^2`.

Sometimes the measurement noise is known. In general, however, one has to find a
parameterized error model :math:`\eta_i(\mathbf{d})` for the variance itself. A
common choice is to assume that the error is composed of a background term
:math:`b` and a noise contribution that scales linearly with
:math:`f_i(\mathbf{p})`:

.. math::

   \eta_i^2(a,b,\mathbf{p}) = b^2 + \left[a f_i(\mathbf{p})\right]^2

Since every entry of the measurement vector follows a normal distribution
:math:`t_i \sim \mathcal{N}(f_i(\mathbf{p}), \eta_i^2(\mathbf{d}))`, the joint
*likelihood* of measuring the vector :math:`\mathbf{t}` is given as

.. math::

   P(\mathbf{t} | \mathbf{p}, \mathbf{d}) = \prod_{i=1}^K
   \frac{1}{\sqrt{2\pi}\eta_i(\mathbf{d})}
   \exp\left[-\frac{1}{2}\left(\frac{t_i - f_i(\mathbf{p})}{\eta_i(\mathbf{d})}\right)^2\right].

Sometimes, non-uniform *prior distributions* for the design parameter vector
:math:`P_\text{prior}(\mathbf{p})` and the error model parameters
:math:`P_\text{prior}(\mathbf{d})` are available. The *posterior distribution* is
then proportional to

.. math::

   P(\mathbf{p}, \mathbf{d} | \mathbf{t}) \propto
   P(\mathbf{t} | \mathbf{p}, \mathbf{d})
   P_\text{prior}(\mathbf{p}) P_\text{prior}(\mathbf{d})

.. warning:: If a :ref:`parameter distribution` for the design space is defined,
   make sure that it has a non-vanishing probability density within the
   boundaries of the design space. Otherwise the negative log-probability can be
   infinite, and the sample computation becomes numerically unstable.

Altogether, finding the parameters with maximum posterior probability density is
equivalent to minimizing the negative log-probability

.. math::

   \begin{split}
   -\log\left(P(\mathbf{p}, \mathbf{d} | \mathbf{t})\right) = &
   \frac{1}{2} K\log(2\pi)
   + \sum_{i=1}^K \log\left(\eta_i(\mathbf{d})\right)
   + \frac{1}{2}\sum_{i=1}^K
   \left(\frac{t_i - f_i(\mathbf{p})}{\eta_i(\mathbf{d})}\right)^2 \\
   & - \log\left(P_\text{prior}(\mathbf{d})\right)
   - \log\left(P_\text{prior}(\mathbf{p})\right).
   \end{split}

.. _ActiveLearning.NegLogProbability.name:

name (str)
""""""""""
The name of the variable under which it can be addressed by other variables or
objectives. The name must be distinct from any surrogate name.

Default: ``'v'``

.. _ActiveLearning.NegLogProbability.input:

input (str)
"""""""""""
The name of a surrogate model or a multi-output variable.

Default: This value has no default and must be provided.

.. _ActiveLearning.NegLogProbability.target_vector:

target_vector (list[float])
"""""""""""""""""""""""""""
Vector of target values :math:`t_i`.

Default: Vector of zeros.

.. _ActiveLearning.NegLogProbability.uncertainty_vector:

uncertainty_vector (list[float])
""""""""""""""""""""""""""""""""
Vector of target uncertainties :math:`\eta_i` such that
:math:`\chi^2 = \sum_{i=1}^K \frac{(t_i - y_i)^2}{\eta_i^2}`.

Default: Vector of ones.

.. _ActiveLearning.NegLogProbability.covariance_matrix:

covariance_matrix (list[list[float]])
"""""""""""""""""""""""""""""""""""""
Covariance matrix :math:`G`.
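The negative log-probability above is straightforward to evaluate numerically
for a diagonal covariance. The following sketch is purely illustrative and not
part of the package API; the function name and the summed ``log_prior`` argument
are hypothetical.

```python
import numpy as np

def neg_log_probability(t, f, eta, log_prior=0.0):
    """Negative log-probability for targets t, predictions f = f(p),
    uncertainties eta = eta_i(d), and an optional summed log-prior term
    log[P_prior(p)] + log[P_prior(d)] (zero for flat priors)."""
    t, f, eta = map(np.asarray, (t, f, eta))
    K = t.size
    return (0.5 * K * np.log(2 * np.pi)   # normalization constant
            + np.sum(np.log(eta))         # sum of log uncertainties
            + 0.5 * np.sum(((t - f) / eta) ** 2)  # chi-squared / 2
            - log_prior)

# Example: two channels with unit uncertainties and flat priors
nlp = neg_log_probability(t=[1.0, 2.0], f=[0.5, 2.5], eta=[1.0, 1.0])
```

For unit uncertainties the result reduces to :math:`K\log(\sqrt{2\pi}) + \chi^2/2`,
which makes the role of each term easy to check.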
Default: Identity matrix.

.. _ActiveLearning.NegLogProbability.approximate:

approximate (bool)
""""""""""""""""""
If true, the generalized chi-squared variable with a different prediction
uncertainty for each of the `K` channels is approximated by a chi-squared
variable with averaged uncertainties. This makes it possible to compute
probability densities, and any acquisition function that is directly based on
the variable, analytically.

Default: ``True``

.. _ActiveLearning.NegLogProbability.force_MC_integration:

force_MC_integration (bool)
"""""""""""""""""""""""""""
If true, the posterior of the variable is based on Monte-Carlo samples. This
avoids determining a Gaussian distribution of function values, which can be
numerically unstable if the correlation between the inputs is strong.

Default: ``False``

.. _ActiveLearning.NegLogProbability.effective_DOF:

effective_DOF (float)
"""""""""""""""""""""
Number of effective degrees of freedom (DOF) used for the stochastic variable of
the chi-squared distribution. This number roughly indicates how many output
channels of the forward model are statistically independent.

Default: If not specified, the value is determined automatically.

.. note:: If ``approximate`` is false, this parameter has no effect.

.. _ActiveLearning.NegLogProbability.effective_DOF_bounds:

effective_DOF_bounds (list[float])
""""""""""""""""""""""""""""""""""
The number of effective degrees of freedom (DOF) is determined by a maximum
likelihood estimate within the given lower and upper bounds.

Default: ``[20.0, 50.0]``

.. note:: If ``approximate`` is false or ``effective_DOF`` is set manually, this
   parameter has no effect.

.. _ActiveLearning.NegLogProbability.error_model:

error_model (dict)
""""""""""""""""""
.. note:: If an error model is specified, it overrides the settings for the
   :ref:`uncertainty_vector` or :ref:`covariance_matrix` of the measurement
   uncertainties.
The error model parameters :math:`\mathbf{d}` are then always fit to maximize
the posterior probability for any given design point :math:`\mathbf{p}`,

.. math::

   P(\mathbf{d} | \mathbf{p}, \mathbf{t}) \propto \prod_{i=1}^K
   \frac{1}{\sqrt{2\pi}\eta_i(\mathbf{d})}
   \exp\left[-\frac{1}{2}\left(\frac{t_i - \mu_i(\mathbf{p})}{\eta_i(\mathbf{d})}\right)^2\right]
   \cdot P_\text{prior}(\mathbf{d}).

Here, :math:`\mu_i(\mathbf{p})` is the predicted mean of the input to the
variable and :math:`P_\text{prior}(\mathbf{d})` is the prior probability density
for the error model parameters.

Default: ``{'expression': 'RSE*err0', 'distributions': [], 'xtol': 1e-05}``

The parameter is optional and may also have the value ``None``.

An analytic model for the error of the target vector entries. The model
:math:`\eta_i({\rm RSE}, y_i, \epsilon_i)` for each entry :math:`i=1,\dots,K` of
the target vector can depend on the residual standard error (RSE), the model
prediction :math:`y_i`, and the corresponding initially assumed measurement
error :math:`\epsilon_i`. The RSE is defined as
:math:`\sqrt{\chi_\text{min}^2/\mathrm{DOF}}`, where the number of degrees of
freedom (DOF) is the difference between the target-vector dimension :math:`K`
and the number of parameters to reconstruct (the dimension of the design space).

Moreover, :math:`M` other random parameters :math:`\mathbf{d}` can be defined in
the :ref:`parameter distribution` and :ref:`initial_parameters`. The
log-probability density of the random parameters :math:`\mathbf{d}` and the
design space parameters :math:`\mathbf{p}` is given as (up to constant additive
terms)

.. math::

   \log[P(\mathbf{p}, \mathbf{d})] = -\frac{1}{2}
   \sum_i\left(\frac{(t_i-y_i(\mathbf{p}))^2}{\eta_i^2(\mathbf{d})}
   + \log[\eta_i^2(\mathbf{d})]\right)
   + \log[P_\text{prior}(\mathbf{p})] + \log[P_\text{prior}(\mathbf{d})].

See :ref:`error_model configuration` for details.

.. toctree::
   :maxdepth: 100
   :hidden:

   ErrorModel
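To make the fitting step concrete, the sketch below maximizes the error-model
posterior :math:`P(\mathbf{d} | \mathbf{p}, \mathbf{t})` for the two-parameter
model :math:`\eta_i^2 = b^2 + (a\,\mu_i)^2` introduced above, using a simple
grid search and a flat prior. This is an assumption-laden illustration; the
package itself uses a tolerance-controlled optimizer (``xtol``), and the
function and grid parameters here are hypothetical.

```python
import numpy as np

def fit_error_model(t, mu, a_grid, b_grid):
    """Fit eta_i(a, b)^2 = b^2 + (a * mu_i)^2 by maximizing the posterior
    P(d | p, t) over a parameter grid, assuming a flat prior P_prior(d)."""
    t, mu = np.asarray(t), np.asarray(mu)
    best, best_nll = None, np.inf
    for a in a_grid:
        for b in b_grid:
            eta = np.sqrt(b ** 2 + (a * mu) ** 2)
            # negative log-likelihood, dropping the constant (K/2) log(2 pi)
            nll = np.sum(np.log(eta) + 0.5 * ((t - mu) / eta) ** 2)
            if nll < best_nll:
                best, best_nll = (a, b), nll
    return best

# Synthetic example: targets with roughly 10% relative noise and no background
rng = np.random.default_rng(0)
mu = np.linspace(1.0, 5.0, 200)          # predicted means mu_i(p)
t = mu + 0.1 * mu * rng.standard_normal(mu.size)
a, b = fit_error_model(t, mu,
                       a_grid=np.linspace(0.01, 0.5, 50),
                       b_grid=np.linspace(0.0, 0.5, 50))
```

For this synthetic data the recovered relative-noise amplitude ``a`` lands near
the true value of 0.1 and the background ``b`` near zero, which is the expected
behavior of the maximum-posterior fit.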