Variance-based sensitivity analysis for parameter reconstruction
- Driver: ActiveLearning
- Download script:
An important question in the context of parameter reconstruction is how to set up an experimental measurement such that it provides enough meaningful information for an accurate reconstruction of the parameters of interest. As a specific example, the MGH17 problem from the NIST statistical reference datasets is considered. It consists of fitting a vectorial model \(\mathbf{f}(\mathbf{b}) \in \mathbb{R}^{33}\) with entries
\[f_k(\mathbf{b}) = b_1 + b_2 e^{-s_k b_4} + b_3 e^{-s_k b_5}, \qquad s_k = k-1,\ k=1,\dots,33,\]
to a measurement vector with 33 entries.
If a specific measurement \(f_k(\mathbf{b})\) depends sensitively on a parameter \(b_j\), it enables a more accurate reconstruction of \(b_j\) than a measurement that hardly depends on the parameter. A way to quantify this sensitivity is variance-based sensitivity analysis, also referred to as the Sobol’ method. The first-order Sobol’ indices \(S_{jk}\) with \(j=1,\dots,5\) and \(k=1,\dots,33\) measure the fraction of the variance of \(f_k(\mathbf{b})\) that is caused by varying the parameter \(b_j\) alone, averaged over the values of all other parameters.
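In standard notation (added here for completeness, not part of the original example), the first-order Sobol’ index is the variance of the conditional expectation of the model output, normalized by the total output variance:

```latex
S_{jk} = \frac{{\rm Var}_{b_j}\!\left[\,\mathbb{E}_{\mathbf{b}_{\sim j}}\!\left[f_k(\mathbf{b}) \mid b_j\right]\right]}{{\rm Var}[f_k]},
```

where \(\mathbf{b}_{\sim j}\) denotes all parameters except \(b_j\).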
In our blog entry on sensitivity analysis we show that the relative variance
\[c_{jk} = \frac{S_{jk}\,{\rm Var}[f_k]}{{\rm Var}[b_j]}\]
is a measure of how well a parameter \(b_j\) can be reconstructed from measurement channel \(k\) on average (independent of a specific vector of measurements). The variance of a uniformly distributed parameter \(b_j \in [b_j^{\rm lower}, b_j^{\rm upper}]\) is given as \({\rm Var}[b_j] = (b_j^{\rm upper} - b_j^{\rm lower})^2/12\).
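As a quick sanity check of the \((b_j^{\rm upper} - b_j^{\rm lower})^2/12\) formula, the analytic value can be compared with a Monte Carlo estimate (an illustrative sketch using NumPy; the domain \((0.1, 4)\) is that of the parameter \(b_2\) below, and the sample size is an arbitrary choice):

```python
import numpy as np

# Analytic variance of a uniform distribution on [lower, upper]:
# Var = (upper - lower)**2 / 12.  Example for b2 with domain (0.1, 4).
lower, upper = 0.1, 4.0
var_analytic = (upper - lower) ** 2 / 12

# Monte Carlo cross-check with a fixed seed for reproducibility
rng = np.random.default_rng(0)
var_mc = rng.uniform(lower, upper, size=1_000_000).var()
```
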
The model variance \({\rm Var}[f_k]\) and the Sobol’ indices \(S_{jk}\) can be determined using Monte Carlo sampling. If the numerical model of the measurement is expensive, it is numerically more efficient to train a surrogate model of \(\mathbf{f}(\mathbf{b})\) and do the Monte Carlo sampling on the trained model instead. Here, we train a Gaussian process by iteratively sampling the model at positions of the largest uncertainty of the Gaussian process prediction.
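Since the MGH17 model itself is cheap to evaluate, the Monte Carlo estimation of \({\rm Var}[f_k]\) and \(S_{jk}\) can be sketched directly on the model, without a surrogate. The following is an illustrative pick-and-freeze (Saltelli-type) estimator; the function names and sample sizes are our own choices and not part of the example below:

```python
import numpy as np

def mgh17(b: np.ndarray) -> np.ndarray:
    """Vectorial MGH17 model; each row of b is one parameter sample,
    the output has 33 channels (s = 0, ..., 32)."""
    s = np.arange(33)
    return (b[:, [0]]
            + b[:, [1]] * np.exp(-np.outer(b[:, 3], s))
            + b[:, [2]] * np.exp(-np.outer(b[:, 4], s)))

def first_order_sobol(model, bounds, n=4096, seed=0):
    """Pick-and-freeze Monte Carlo estimate of the first-order Sobol'
    indices S[j, k] and the output variances Var[f_k] for independent,
    uniformly distributed parameters."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    d = len(bounds)
    A = lo + (hi - lo) * rng.random((n, d))
    B = lo + (hi - lo) * rng.random((n, d))
    fA, fB = model(A), model(B)
    var = np.vstack([fA, fB]).var(axis=0)
    fB0 = fB - fB.mean(axis=0)  # centering reduces estimator variance
    S = np.empty((d, fA.shape[1]))
    for j in range(d):
        AB = A.copy()
        AB[:, j] = B[:, j]  # freeze parameter j at its B values
        # Saltelli-type estimator of Var[E[f_k | b_j]] / Var[f_k]
        S[j] = (fB0 * (model(AB) - fA)).mean(axis=0) / var
    return S, var

bounds = [(0, 10), (0.1, 4), (-4, -0.1), (0.05, 1), (0.05, 1)]
S, var = first_order_sobol(mgh17, bounds)
```

For an expensive measurement model this direct approach is impractical, which is why the script below samples a trained Gaussian process surrogate instead.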
import sys, os
import numpy as np
import time
import torch
import matplotlib.pyplot as plt


jcm_optimizer_path = r"<JCM_OPTIMIZER_PATH>"
sys.path.insert(0, os.path.join(jcm_optimizer_path, "interface", "python"))
from jcmoptimizer import Server, Client, Study, Observation

server = Server()
client = Client(server.host)

# Definition of the search domain
design_space = [
    {'name': 'b1', 'type': 'continuous', 'domain': (0, 10)},
    {'name': 'b2', 'type': 'continuous', 'domain': (0.1, 4)},
    {'name': 'b3', 'type': 'continuous', 'domain': (-4, -0.1)},
    {'name': 'b4', 'type': 'continuous', 'domain': (0.05, 1)},
    {'name': 'b5', 'type': 'continuous', 'domain': (0.05, 1)},
]

# Creation of the study object with study_id 'sensitivity_analysis'
study = client.create_study(
    design_space=design_space,
    driver="ActiveLearning",
    name="Variance-based sensitivity analysis for parameter reconstruction",
    study_id="sensitivity_analysis",
)

# The vectorial model function of the MGH17 problem
def model(x: torch.Tensor) -> torch.Tensor:
    s = torch.arange(33)
    return x[0] + x[1]*torch.exp(-s*x[3]) + x[2]*torch.exp(-s*x[4])

study.configure(
    max_iter=100,
    surrogates=[
        # A multi-output Gaussian process that learns the dependence of
        # the model on the design parameters.
        dict(type="GP", name="model_vector", output_dim=33,
             correlate_outputs=False)
    ],
    variables=[
        # The mean of the model vector.
        dict(type="LinearCombination", name="model_average",
             inputs=["model_vector"])
    ],
    objectives=[
        # The objective is to sample the model at positions of maximal
        # uncertainty of the model average.
        dict(type="Explorer", variable="model_average",
             penalize_boundaries=True, min_uncertainty=1e-3)
    ],
)

# Evaluation of the black-box function for specified design parameters
def evaluate(study: Study, b1: float, b2: float,
             b3: float, b4: float, b5: float) -> Observation:
    time.sleep(2)  # make the evaluation artificially expensive
    observation = study.new_observation()
    # tensor of design values to reconstruct
    x = torch.tensor([b1, b2, b3, b4, b5])
    observation.add(model(x).tolist())
    return observation

# Run the study
study.set_evaluator(evaluate)
study.run()
sobol_indices = study.driver.get_sobol_indices(
    object_type="surrogate",
    name="model_vector",
    max_uncertainty=0.001,
)
variances = torch.tensor(sobol_indices["variance"])
sobol_values = torch.tensor(sobol_indices["first_order"])

fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True,
                               figsize=(10, 5))

for idx, info in enumerate(design_space):
    ax1.plot(sobol_values[idx], ".-", label=info['name'])

    # variance of the uniform distribution of the parameter
    var_p = (info['domain'][1] - info['domain'][0])**2/12
    scaled_var = variances*sobol_values[idx]/var_p
    ax2.plot(scaled_var, ".-", label=info['name'])

ax1.set_ylabel("First-order Sobol' index")
ax1.legend()
ax1.grid()

ax2.set_xlabel("Model vector index")
ax2.set_ylabel("Scaled variance")
ax2.grid()
plt.savefig("variance_sobol.svg", transparent=True)
Sensitivity analysis for parameter reconstruction
The upper graph shows the first-order Sobol’ coefficients for each parameter \(b_j\) and each channel \(k\). Clearly, the largest variance of \(\mathbf{f}(\mathbf{b})\) stems from variations of \(b_1\). Note that, due to the symmetric definition of \(b_2, b_3\) and \(b_4, b_5\), respectively, some lines lie on top of each other.
The lower graph shows the relative variance \(c_{jk}\) for each parameter \(b_j\) and each channel \(k\), reflecting the amount of information available for the reconstruction of each parameter. Summing this information over \(k\), the parameters \(b_1, b_4\) and \(b_5\) can be reconstructed most accurately, while \(b_2\) and \(b_3\) are harder to reconstruct. Most of the information is contained in the channels with small index \(k\).
These results agree well with the certified reconstruction uncertainties \(\Delta_1=2.07 \cdot 10^{-3}\), \(\Delta_2=0.220\), \(\Delta_3=0.222\), \(\Delta_4=4.49\cdot 10^{-3}\), \(\Delta_5=8.95 \cdot 10^{-3}\) for a specific MGH17 measurement vector.