.. _OutlierDetector:

Outlier detector
----------------

Gaussian process regression assumes that uncertain input data has Gaussian-distributed errors. This works well for noisy measurements, but not for outliers caused by events outside the Gaussian noise model, such as failed measurements. Such outliers can strongly reduce the prediction accuracy of Gaussian processes.

The outlier detector uses a Student-t error model for the observed data :math:`\mathbf{y}` instead of a Gaussian error model. Since the Student-t distribution has heavier tails, it can model data sets that contain outliers. The Student-t distribution is parameterized by the degrees of freedom :math:`\text{dof}` and a scale parameter :math:`\sigma`.

The most probable vector of function values :math:`\hat{\mathbf{f}}` (i.e., the mode) given the observed data :math:`\mathbf{y}` is determined by an expectation-maximization (EM) algorithm. An observation :math:`y_i` is assumed to be an outlier if :math:`(\hat{f}_i - y_i)^2 > t\cdot \text{dof}\cdot \sigma^2`, where :math:`t` is the tolerance of the outlier detection.

.. _ActiveLearning.GP.detector.detection_interval:

detection_interval (int)
""""""""""""""""""""""""

Maximum number of observations of the objective after which outliers are identified.

Default: ``5``

.. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.prec:

prec (float)
""""""""""""

Relative precision convergence criterion of the EM algorithm.

Default: ``1e-08``

.. _ActiveLearning.GP.detector.max_iter:

max_iter (int)
""""""""""""""

Maximum number of iterations of the EM algorithm.

Default: ``100``

.. _ActiveLearning.GP.detector.tolerance:

tolerance (float)
"""""""""""""""""

Tolerance :math:`t` of the outlier detection.

Default: ``1``

.. _ActiveLearning.GP.detector.joint_scale:

joint_scale (bool)
""""""""""""""""""

If true, only one scale hyperparameter is used for all input dimensions.
This generally lowers the computational effort of the hyperparameter optimization. However, this option should only be used if the observed data of all channels follows the same Student-t distribution.

Default: ``False``

.. _ActiveLearning.GP.detector.optimization_step_min:

optimization_step_min (int)
"""""""""""""""""""""""""""

Minimum number of observations of the objective before the hyperparameters are optimized.

Default: Automatic choice according to the number of dimensions.

.. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.optimization_step_max:

optimization_step_max (int)
"""""""""""""""""""""""""""

Maximum number of observations of the objective after which no more hyperparameter optimization is performed.

Default: ``100``

.. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.min_optimization_interval:

min_optimization_interval (int)
"""""""""""""""""""""""""""""""

Minimum number of observations of the objective after which the hyperparameters are optimized.

Default: ``2``

.. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.max_optimization_interval:

max_optimization_interval (int)
"""""""""""""""""""""""""""""""

Maximum number of observations of the objective after which the hyperparameters are optimized.

Default: ``20``

.. note:: Each derivative observation counts as an independent observation.

.. _ActiveLearning.GP.detector.optimization_level:

optimization_level (float)
""""""""""""""""""""""""""

Controls how often the hyperparameters are optimized. Small values (e.g. 0.01) lead to more frequent optimizations. Large values (e.g. 1) lead to less frequent optimizations.

Default: ``0.2``

.. _ActiveLearning.GP.detector.num_samples_hyperparameters:

num_samples_hyperparameters (int)
"""""""""""""""""""""""""""""""""

Number of local searches for optimal hyperparameters.
Default: Automatic choice ``min(15, max(5, 2 * num_dim))`` according to the number of dimensions ``num_dim``.

.. _ActiveLearning.GP.detector.min_dist:

min_dist (float)
""""""""""""""""

To speed up the hyperparameter optimization, the surrogate model can be sparsified such that data points with a distance (in terms of the length scales of the surrogate) below ``min_dist`` are neglected.

Default: ``0.0``
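
As a rough illustration of two formulas stated in this section, the following sketch evaluates the outlier criterion :math:`(\hat{f}_i - y_i)^2 > t\cdot \text{dof}\cdot \sigma^2` and the automatic default of ``num_samples_hyperparameters``. The function names ``flag_outliers`` and ``default_num_samples_hyperparameters`` are hypothetical and not part of the library. The sketch also assumes that the mode :math:`\hat{\mathbf{f}}`, the degrees of freedom and the scale are already known, whereas the detector itself determines them by the EM algorithm and the hyperparameter optimization.

```python
def flag_outliers(f_hat, y, dof, sigma, tolerance=1.0):
    """Return one boolean per observation: True if it is flagged as an outlier.

    Applies the criterion (f_hat_i - y_i)^2 > t * dof * sigma^2 from the text.
    All arguments are assumed to be given; this is not the library's API.
    """
    threshold = tolerance * dof * sigma ** 2
    return [(f_i - y_i) ** 2 > threshold for f_i, y_i in zip(f_hat, y)]


def default_num_samples_hyperparameters(num_dim):
    """Automatic default stated in the text: min(15, max(5, 2 * num_dim))."""
    return min(15, max(5, 2 * num_dim))


# Hypothetical example values: the third observation deviates strongly.
f_hat = [1.0, 2.0, 3.0]   # assumed mode of the function values
y = [1.1, 2.0, 9.0]       # observed data
flags = flag_outliers(f_hat, y, dof=4.0, sigma=0.5, tolerance=1.0)
print(flags)
print(default_num_samples_hyperparameters(4))
```

With these example values the threshold is :math:`1 \cdot 4 \cdot 0.5^2 = 1`, so only the third observation, with a squared residual of 36, is flagged. A larger ``tolerance`` raises the threshold and flags fewer observations.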