.. _OutlierDetector:


Outlier detector
----------------


Gaussian process regression can cope with input data that is subject to Gaussian noise. Sometimes, however, input data is corrupted and no longer follows the assumed probability distribution. Corrupted data can severely reduce the prediction accuracy of Gaussian processes. The outlier detector therefore assumes that the input data :math:`\mathbf{y}` follows a Student-t distribution rather than a Gaussian distribution. The Student-t distribution is parameterized by the number of degrees of freedom :math:`\text{dof}` and a scale parameter :math:`\sigma`.
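
For orientation, the following is the standard location-scale form of the Student-t density for a single observation :math:`y_i` centered at the function value :math:`f_i`, with :math:`\nu = \text{dof}`. It is given here as an assumption about the likelihood; the exact parameterization used internally may differ.

.. math::

   p(y_i \mid f_i) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}\,\sigma} \left(1 + \frac{(y_i - f_i)^2}{\nu\,\sigma^2}\right)^{-\frac{\nu + 1}{2}}

Compared to a Gaussian, this density decays only polynomially in the residual :math:`y_i - f_i`, so single observations far from :math:`f_i` have much less influence on the fit.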

The most probable vector of function values :math:`\hat{\mathbf{f}}` (i.e., the mode) given the observed data :math:`\mathbf{y}` is determined by an expectation-maximization (EM) algorithm. An observation :math:`y_i` is assumed to be an outlier if :math:`(\hat{f}_i - y_i)^2 > t\cdot \text{dof}\cdot \sigma^2`, where :math:`t` is the tolerance of the outlier detection.
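
The outlier criterion above can be sketched in a few lines of NumPy. This is only an illustration of the inequality: the function name ``flag_outliers`` and its arguments are hypothetical, and the mode :math:`\hat{\mathbf{f}}` is assumed to have been computed already (the EM step itself is not shown).

.. code-block:: python

   import numpy as np

   def flag_outliers(f_hat, y, dof, sigma, tolerance=1.0):
       """Return a boolean mask that is True where y_i is flagged as an outlier.

       Implements (f_hat_i - y_i)^2 > tolerance * dof * sigma^2 element-wise.
       """
       f_hat = np.asarray(f_hat, dtype=float)
       y = np.asarray(y, dtype=float)
       return (f_hat - y) ** 2 > tolerance * dof * sigma ** 2

   # Illustrative values only: the threshold is 1.0 * 4.0 * 0.5**2 = 1.0.
   f_hat = np.array([1.0, 2.0, 3.0])
   y = np.array([1.1, 2.0, 9.0])
   mask = flag_outliers(f_hat, y, dof=4.0, sigma=0.5)
   print(mask)  # only the third observation exceeds the threshold

A larger ``tolerance`` raises the threshold, so fewer observations are flagged.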

.. _ActiveLearning.GP.detector.detection_interval:


detection_interval (int)
""""""""""""""""""""""""
   Maximum number of observations of the objective after which outliers are identified.

   Default: ``5``

   .. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.prec:


prec (float)
""""""""""""
   Convergence criterion (relative precision) of the EM algorithm.

   Default: ``1e-08``

.. _ActiveLearning.GP.detector.max_iter:


max_iter (int)
""""""""""""""
   Maximum number of iterations of the EM algorithm.

   Default: ``100``

.. _ActiveLearning.GP.detector.tolerance:


tolerance (float)
"""""""""""""""""
   Tolerance :math:`t` of the outlier detection. 

   Default: ``1``

.. _ActiveLearning.GP.detector.joint_scale:


joint_scale (bool)
""""""""""""""""""
   If true, only one scale hyperparameter is used for all input dimensions. This generally lowers the computational effort of the hyperparameter optimization. However, this option should only be used if the observed data of all channels follows the same Student-t distribution.

   Default: ``false``

.. _ActiveLearning.GP.detector.optimization_step_min:


optimization_step_min (int)
"""""""""""""""""""""""""""
   Minimum number of observations of the objective before the hyperparameters are optimized.

   Default: Automatic choice according to number of dimensions.

   .. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.optimization_step_max:


optimization_step_max (int)
"""""""""""""""""""""""""""
   Maximum number of observations of the objective after which no more hyperparameter optimization is performed.

   Default: ``100``

   .. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.min_optimization_interval:


min_optimization_interval (int)
"""""""""""""""""""""""""""""""
   Minimum number of observations of the objective after which the hyperparameters are optimized.

   Default: ``2``

   .. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.max_optimization_interval:


max_optimization_interval (int)
"""""""""""""""""""""""""""""""
   Maximum number of observations of the objective after which the hyperparameters are optimized.

   Default: ``20``

   .. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.optimization_level:


optimization_level (float)
""""""""""""""""""""""""""
   Controls how often the hyperparameters are optimized. Small values (e.g. 0.01) lead to more frequent optimizations. Large values (e.g. 1) lead to less frequent optimizations.

   Default: ``0.2``

.. _ActiveLearning.GP.detector.num_samples_hyperparameters:


num_samples_hyperparameters (int)
"""""""""""""""""""""""""""""""""
   Number of local searches for optimal hyperparameters. 

   Default: Automatic choice ``min(15, max(5, 2 * num_dim))`` according to number of dimensions ``num_dim``.
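
   The default rule can be evaluated directly. The helper name ``default_num_samples`` below is hypothetical and only wraps the formula stated above.

   .. code-block:: python

      def default_num_samples(num_dim):
          """Evaluate min(15, max(5, 2 * num_dim)) for a given number of dimensions."""
          return min(15, max(5, 2 * num_dim))

      # The rule yields at least 5 and at most 15 local searches.
      samples = [default_num_samples(d) for d in (1, 4, 10)]
      print(samples)  # [5, 8, 15]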

.. _ActiveLearning.GP.detector.min_dist:


min_dist (float)
""""""""""""""""
   To speed up the hyperparameter optimization, the surrogate model can be sparsified such that data points with a distance (in terms of the length scales of the surrogate) below ``min_dist`` are neglected.

   Default: ``0.0``
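
   One possible reading of this sparsification is a greedy thinning in length-scale units, sketched below. This is an assumption for illustration, not the actual implementation: the function name ``sparsify``, the Euclidean distance, and the greedy visiting order are all hypothetical choices.

   .. code-block:: python

      import numpy as np

      def sparsify(points, length_scales, min_dist):
          """Return indices of points kept after greedy thinning.

          A point is dropped if its distance to an already kept point,
          measured in units of the length scales, is below min_dist.
          """
          scaled = np.asarray(points, dtype=float) / np.asarray(length_scales, dtype=float)
          kept = []
          for i, p in enumerate(scaled):
              if all(np.linalg.norm(p - scaled[j]) >= min_dist for j in kept):
                  kept.append(i)
          return kept

      points = [[0.0, 0.0], [0.05, 0.0], [1.0, 0.0]]
      kept = sparsify(points, length_scales=[1.0, 1.0], min_dist=0.1)
      print(kept)  # the second point lies within 0.1 of the first and is dropped

   With ``min_dist = 0.0`` no distance can be below the threshold, so every point is kept, which is consistent with the default disabling the sparsification.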