.. _sensitivity_analysis:

Variance-based sensitivity analysis for parameter reconstruction
==========================================================================================

:Driver: :ref:`ActiveLearning`

**Download script**: :download:`sensitivity_analysis.m`

An important question in the context of parameter reconstruction is how to set up an experimental measurement such that it provides enough meaningful information for an accurate reconstruction of the parameters of interest.

As a specific example, the `MGH17 problem `_ from the `NIST statistical reference datasets `_ is considered. It consists of fitting a vectorial model :math:`\mathbf{f}(\mathbf{b}) \in \mathbb{R}^{33}` with

.. math::

    f_i(\mathbf{b}) = b_1 + b_2 \exp(-i \cdot b_4) + b_3 \exp(-i \cdot b_5), \quad i = 0, \dots, 32

to a measurement vector with 33 entries.

If a specific measurement :math:`f_k(\mathbf{b})` depends sensitively on a parameter :math:`b_j`, it enables a more accurate reconstruction of :math:`b_j` than a measurement that hardly depends on the parameter. A way to quantify this sensitivity is the `variance-based sensitivity analysis`, also referred to as `Sobol' method`. The first-order Sobol' coefficients :math:`S_{jk}` with :math:`j=1,\dots,5` and :math:`k=1,\dots,33` measure the fraction of the variance of :math:`f_k(\mathbf{b})` that is caused by varying a specific parameter :math:`b_j`, averaged over the values of all other parameters. In our blog entry on `sensitivity analysis `_ we show that the relative variance

.. math::

    c_{jk} = \frac{\mathrm{Var}[f_k] S_{jk}}{\mathrm{Var}[b_j]}

is a measure of how well a parameter :math:`b_j` can be reconstructed from the measurement channel :math:`k` on average (independent of a specific vector of measurements). The variance of a uniformly distributed parameter :math:`b_j \in [b_j^{\rm lower}, b_j^{\rm upper}]` is given as :math:`(b_j^{\rm upper} - b_j^{\rm lower})^2/12`.
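
To make these definitions concrete, and assuming independently distributed parameters, the first-order Sobol' coefficient can be written in its standard form, from which the relative variance follows directly:

.. math::

    S_{jk} = \frac{\mathrm{Var}_{b_j}\left(\mathrm{E}\left[f_k \mid b_j\right]\right)}{\mathrm{Var}[f_k]} \quad\Rightarrow\quad c_{jk} = \frac{\mathrm{Var}_{b_j}\left(\mathrm{E}\left[f_k \mid b_j\right]\right)}{\mathrm{Var}[b_j]}

Here, :math:`\mathrm{E}[f_k \mid b_j]` denotes the mean of :math:`f_k` over all other parameters at a fixed value of :math:`b_j`. As a short worked example, the model above is linear in :math:`b_1` with slope one, such that :math:`\mathrm{E}[f_k \mid b_1] = b_1 + \mathrm{const}` and :math:`c_{1k} = 1` for every channel. Likewise, the channel with index :math:`i` is linear in :math:`b_2` with slope :math:`\exp(-i \cdot b_4)`, which gives :math:`c_{2k} = \left(\mathrm{E}[\exp(-i \cdot b_4)]\right)^2`. For positive :math:`b_4` this value decays with increasing :math:`i`.
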
The model variance :math:`{\rm Var}[f_k]` and the Sobol' indices :math:`S_{jk}` can be determined using `Monte Carlo sampling `_. If the numerical model of the measurement is expensive to evaluate, it is computationally more efficient to train a surrogate model of :math:`\mathbf{f}(\mathbf{b})` and to perform the Monte Carlo sampling on the trained model instead. Here, we train a Gaussian process by iteratively sampling the model at the positions where the uncertainty of the Gaussian process prediction is largest.

.. literalinclude:: ./sensitivity_analysis.m
    :language: matlab
    :linenos:

.. figure:: images/sensitivity_analysis/variance_sobol.svg
    :alt: Sensitivity analysis for parameter reconstruction

    Sensitivity analysis for parameter reconstruction

The upper graph shows the first-order Sobol' coefficients for each parameter :math:`b_j` and each channel :math:`k`. Clearly, the largest variance of :math:`\mathbf{f}(\mathbf{b})` stems from variations of :math:`b_1`. Note that, due to the symmetric definition of :math:`b_2, b_3` and :math:`b_4, b_5`, respectively, some lines lie on top of each other.

The lower graph shows the relative variance :math:`c_{jk}` for each parameter :math:`b_j` and each channel :math:`k`, which reflects the amount of information available for the reconstruction of each parameter. Summed over all channels :math:`k`, the information is largest for the parameters :math:`b_1, b_4` and :math:`b_5`, which can therefore be reconstructed most accurately. The parameters :math:`b_2` and :math:`b_3` are harder to reconstruct. Most of the information is contained in the channels with small channel numbers.

These results agree well with the certified reconstruction uncertainties :math:`\Delta_1=2.07 \cdot 10^{-3}`, :math:`\Delta_2=0.220`, :math:`\Delta_3=0.222`, :math:`\Delta_4=4.49\cdot 10^{-3}`, :math:`\Delta_5=8.95 \cdot 10^{-3}` for a specific MGH17 measurement vector.
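
The Monte Carlo estimate of :math:`S_{jk}` and :math:`c_{jk}` can also be sketched directly on the analytic model, without a surrogate. The following is a minimal sketch and not the content of ``sensitivity_analysis.m``. It assumes hypothetical parameter bounds ``lower`` and ``upper`` chosen for illustration only, uses a Saltelli-type pick-and-freeze estimator for the first-order coefficients, and relies on implicit expansion (MATLAB R2016b or later).

.. code-block:: matlab

    % Minimal Monte Carlo sketch (no surrogate model) of S_jk and c_jk.
    % ASSUMPTION: the bounds below are hypothetical, for illustration only.
    rng(0);                                  % reproducible samples
    lower = [0, 0, 0, 0.1, 0.1];             % hypothetical lower bounds of b_1..b_5
    upper = [1, 1, 1, 1.0, 1.0];             % hypothetical upper bounds of b_1..b_5
    N = 2^14;                                % number of Monte Carlo samples
    d = numel(lower);                        % number of parameters
    idx = 0:32;                              % index i of the 33 channels

    % Model f_i(b): each row of b is one sample, each output column one channel
    model = @(b) b(:,1) + b(:,2).*exp(-b(:,4)*idx) + b(:,3).*exp(-b(:,5)*idx);

    % Two independent sets of uniformly distributed samples
    A = lower + (upper - lower).*rand(N, d);
    B = lower + (upper - lower).*rand(N, d);
    fA = model(A);
    fB = model(B);
    varF = var([fA; fB], 0, 1);              % Var[f_k] for each channel (1 x 33)

    S = zeros(d, numel(idx));                % first-order Sobol' coefficients S_jk
    for j = 1:d
        ABj = A;
        ABj(:, j) = B(:, j);                 % A with column j taken from B
        % Estimate Var(E[f_k | b_j]) and normalize by Var[f_k]
        S(j, :) = mean(fB.*(model(ABj) - fA), 1)./varF;
    end

    varB = (upper - lower).^2/12;            % variance of the uniform parameters
    c = S.*varF./varB.';                     % relative variance c_jk (5 x 33)

Row ``j`` of ``S`` and ``c`` belongs to parameter :math:`b_j`, and column ``k`` to the channel with index :math:`i = k - 1`. Since :math:`f_0` does not depend on :math:`b_4` and :math:`b_5`, the entries ``S(4:5, 1)` vanish. All other entries carry Monte Carlo noise that decreases with growing ``N``, so individual estimates can be slightly negative.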