.. _NeuralNetworkEnsemble:

Neural Network Ensemble
-----------------------

Ensembles of fully connected deep neural networks with an arbitrary number of hidden layers and neurons, different activation functions, and weight initialization schemes are used to estimate the uncertainty of the model. In this way we can approximate the Bayesian posterior and use it in the context of active learning, while also effectively reducing the chances of overfitting by averaging over the predictions of the ensemble members. Deep ensemble surrogates are expected to perform better than Gaussian processes in higher-dimensional parameter spaces due to the large number of degrees of freedom associated with parametric models. The training time of a deep ensemble surrogate scales linearly with the number of observations, in contrast to the quadratic or cubic complexity characteristic of Gaussian processes, which renders deep ensembles more suitable for large datasets.

.. _ActiveLearning.NN.name:

name (str)
""""""""""

The name of the surrogate model.

Default: ``'loss'``

.. _ActiveLearning.NN.output_dim:

output_dim (int)
""""""""""""""""

The output dimensionality, i.e. the number of neurons in the last layer of each ensemble member.

Default: ``1``

.. _ActiveLearning.NN.output_names:

output_names (list[str])
""""""""""""""""""""""""

Assigns names to each of the ``output_dim`` outputs of the surrogate. The length of ``output_names`` must be equal to ``output_dim``. By specifying output names, the outputs can be accessed in variables as ``{output_names[0]}, {output_names[1]}``, etc.

Default: By default, the variables can be accessed as ``{name}0, {name}1``, etc., where ``{name}`` is the name of the surrogate.

.. _ActiveLearning.NN.horizon:

horizon (int)
"""""""""""""

Specifies the number `N` of the latest observations that are used for surrogate model training. This decreases the computational effort of training at the cost of a decreased prediction accuracy.

Default: Infinite horizon.
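As an illustration of the naming convention above, the following sketch shows how the surrogate outputs might be addressed. The configuration dict layout and the helper function are hypothetical (the section does not show the surrounding API); only the keys ``name``, ``output_dim``, and ``output_names`` come from this section.

.. code-block:: python

    # Hypothetical sketch of the output-naming convention described above.
    # Only the keys 'name', 'output_dim', and 'output_names' come from
    # this section; the surrounding API is assumed.

    def output_variable_names(config):
        """Return the variable names under which the surrogate outputs appear."""
        names = config.get("output_names")
        if names is not None:
            # The length of output_names must equal output_dim.
            if len(names) != config["output_dim"]:
                raise ValueError("len(output_names) must equal output_dim")
            return list(names)
        # Default: outputs are accessed as '{name}0', '{name}1', etc.
        return [f"{config['name']}{i}" for i in range(config["output_dim"])]

    named = {"name": "loss", "output_dim": 2, "output_names": ["mean", "var"]}
    unnamed = {"name": "loss", "output_dim": 2}

    print(output_variable_names(named))    # ['mean', 'var']
    print(output_variable_names(unnamed))  # ['loss0', 'loss1']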
.. _ActiveLearning.NN.transformation:

transformation (str)
""""""""""""""""""""

Specifies the name of the transformation applied to the parameter space.

Default: No transformation.

.. _ActiveLearning.NN.num_NNs:

num_NNs (int)
"""""""""""""

Number of neural networks in the deep ensemble. The larger the number, the better the approximation of the underlying probability distribution related to the final objective.

Default: ``10``

.. _ActiveLearning.NN.hidden_layers_arch:

hidden_layers_arch (list[int])
""""""""""""""""""""""""""""""

The architecture of the hidden layers of each ensemble member. The number of neurons in each hidden layer is specified by the corresponding list entry. The architecture has to be the same for all ensemble members.

Default: ``[200,200]``

.. _ActiveLearning.NN.activation_fun:

activation_fun (str)
""""""""""""""""""""

Nonlinear activation for all neurons in the ensemble members, except for the neurons in the last layer, which by default have no activation, since we are concerned with regression problems.

Default: ``'sine'``

Choices: ``'sine'``, ``'tanh'``, ``'sigmoid'``, ``'relu'``.

.. _ActiveLearning.NN.weight_prior:

weight_prior (str)
""""""""""""""""""

Choice of the prior on the ensemble member weights.

Default: ``'Normal'``

Choices: ``'Normal'``, ``'Uniform'``.

.. _ActiveLearning.NN.seed:

seed (bool)
"""""""""""

The seed flag is used to enforce the same initial weight distribution over the whole deep ensemble. Used for the purpose of reproducibility.

Default: ``False``

.. _ActiveLearning.NN.trainer:

trainer (dict)
""""""""""""""

The training of the network can either be performed on all available data points (``full_data_trainer``) or on a subset of points, while the best configuration is determined based on a set of validation points (``validation_trainer``).
Default: ``{'type': 'full_data_trainer', 'num_epochs': 800, 'num_expel_NNs': 0, 'learning_rate': 0.006, 'scale_grad_loss': 0.5, 'optimizer': 'Adam', 'weight_decay': 0.0, 'loss_function': 'MSE', 'batch_size_train': None, 'save_history_path': None, 'num_epochs_warm': 200, 'learning_rate_warm': 0.004, 'warm_start': False, 'shrink': 0.9, 'perturb': 0.1, 'cold_start_interval': 20}``

The dict entry ``'type'`` specifies the type of the module. The remaining entries specify its properties.

**Trainer module** (type ``'full_data_trainer'``): In this module, the ensemble is trained on the whole available dataset. This is convenient for active learning purposes, since the points are expensive to obtain and the datasets are usually very small. The ensemble weights yielding the smallest loss are saved, which helps reduce oversampling of the existing training points. This can be further refined by expelling the worst-performing ensemble members according to the training loss via the ``num_expel_NNs`` parameter. In the context of active learning, the training of the ensemble can be significantly accelerated in each iteration by invoking the warm starting process. See :ref:`full_data_trainer configuration ` for details.

**Trainer and validation module** (type ``'validation_trainer'``): In this module, the fixed dataset provided (usually by ``study.add_many``) is split into training and testing subsets according to the ``split_ratio`` parameter. The ensemble is trained on the training subset by minimizing the corresponding loss, while the generalization performance is assessed according to the loss induced on the testing subset. The ensemble weights yielding the smallest loss on the testing subset are saved, effectively reducing the chances of overfitting. The worst-performing ensemble members according to the loss on the testing subset can be expelled via the ``num_expel_NNs`` parameter, delivering a reduced ensemble with better generalization qualities.
See :ref:`validation_trainer configuration ` for details.

.. toctree::
   :maxdepth: 100
   :hidden:

   FullDataTrainer
   ValidationTrainer
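To make the two trainer variants concrete, here is a hedged configuration sketch. The ``full_data_trainer`` entries repeat the documented defaults above; for ``validation_trainer``, only ``type`` and ``split_ratio`` are named in this section, so the remaining entries and all values shown are illustrative assumptions, not documented defaults.

.. code-block:: python

    # Sketch of the two trainer configurations described above.
    # full_data_trainer: values repeat the documented defaults.
    # validation_trainer: values (except 'type') are illustrative assumptions.

    # Train on all points; keep the weights with the smallest training loss.
    full_data_trainer = {
        'type': 'full_data_trainer',
        'num_epochs': 800,
        'num_expel_NNs': 0,          # >0 expels the worst members by training loss
        'learning_rate': 0.006,
        'optimizer': 'Adam',
        'loss_function': 'MSE',
        'warm_start': False,         # True accelerates active-learning iterations
        'num_epochs_warm': 200,
        'learning_rate_warm': 0.004,
        'shrink': 0.9,
        'perturb': 0.1,
        'cold_start_interval': 20,
    }

    # Split a fixed dataset; keep the weights with the smallest test loss.
    validation_trainer = {
        'type': 'validation_trainer',
        'split_ratio': 0.8,          # illustrative: 80% training / 20% testing
        'num_epochs': 800,           # assumed by analogy with full_data_trainer
        'num_expel_NNs': 2,          # drop the two worst members by test loss
    }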