.. _physics_informed_bayesian_optimization:

Physics-informed Bayesian optimization
==========================================================================================

:Driver: :ref:`ActiveLearning`
:Download script: :download:`physics_informed_bayesian_optimization.py`

In this tutorial we discuss physics-informed Bayesian optimization (PIBO) [SEK2025]_.
In contrast to standard Bayesian optimization (BO), the surrogate model is not trained
on scalar objective values :math:`f(\mathbf{x}) \in \mathbb{R}` but on the
*physical response* :math:`\mathbf{r}(\mathbf{x}) \in \mathbb{R}^K` of a system
(simulation or experiment). We assume that obtaining :math:`\mathbf{r}(\mathbf{x}^*)`
for some :math:`\mathbf{x}^*` is expensive, whereas the mapping
:math:`f(\mathbf{x}) = \Phi(\mathbf{r}(\mathbf{x}))` is an analytic function that can be
evaluated efficiently.

The mapping :math:`\Phi: \mathbb{R}^K \rightarrow \mathbb{R}` condenses :math:`K` values
into a single number and therefore intuitively entails a loss of information. If
:math:`\Phi` is moreover a non-linear function, it adds complexity of its own, so it can be
easier to learn the behaviour of the channels :math:`\mathbf{r}(\mathbf{x})` than to learn
the behaviour of :math:`f(\mathbf{x})` directly.

As an example we consider a spectral filter consisting of alternating dielectric layers
:math:`S` made of silicon dioxide (SiO\ :sub:`2`) and metallic layers :math:`A` made of
silver (Ag). The device has the general structure :math:`S(AS)^N` and is surrounded by
air (refractive index :math:`n=1`). We consider :math:`N=4` layer pairs, which leads to a
:math:`D = 2N + 1 = 9`-dimensional optimization problem. The layer thicknesses are varied
with two goals: the transmittance :math:`t(\lambda)` in a short-wavelength regime between
:math:`\lambda_{\rm short} = 800` nm and :math:`\lambda_{\rm gap} = 1300` nm should be
uniform and at least 40% on average, and the transmittance in a long-wavelength regime
between :math:`\lambda_{\rm gap}` and :math:`\lambda_{\rm long}=1800` nm should be minimal:

.. math::

   \text{minimize}\,\, \text{STD}_{\rm short}[t] + \mathbb{E}_{\rm long}[t] &= \sqrt{ \int_{\lambda_{\rm short}}^{\lambda_{\rm gap}} t^2(\lambda){\rm d}\lambda - \left( \int_{\lambda_{\rm short}}^{\lambda_{\rm gap}} t(\lambda){\rm d}\lambda \right)^2 } + \int_{\lambda_{\rm gap}}^{\lambda_{\rm long}} t(\lambda){\rm d}\lambda

   \text{such that}\,\, \mathbb{E}_{\rm short}[t] &= \int_{\lambda_{\rm short}}^{\lambda_{\rm gap}} t(\lambda){\rm d}\lambda > 0.4\,.

Here, every integral is understood as an average over the respective wavelength interval,
i.e. it is normalized by the interval width.

Luckily, the transmission of light through the material stack can be simulated efficiently
using the transfer-matrix method. We employ the Python package ``tmm-fast``, which also
allows computing derivatives with respect to the :math:`D=9` thicknesses using
backpropagation. We learn :math:`K=50` samples
:math:`\mathbf{t} = [\mathbf{t}_{\rm short}, \mathbf{t}_{\rm long}] \in \mathbb{R}^K`
of the transmission spectrum :math:`t(\lambda)` between :math:`\lambda_{\rm short}` and
:math:`\lambda_{\rm long}` using a *multi-output* GP.

To showcase the advantage of PIBO, we compare its convergence behaviour to that of a
global heuristic optimization method, differential evolution (DE). Like other heuristic
methods, DE cannot make use of derivative information. Access to derivatives alone would
explain a speedup by a factor of :math:`\leq D + 1 = 10`, since 10 function evaluations
are required to estimate the gradient of the function by finite differences. However, we
observe a speedup factor of more than 100. The reason is that even after the very first
evaluation, PIBO has already learned a local linear model of the transmission spectrum and
hence a **local quadratic model** of the loss function. After some more iterations it has
a very good global model of the loss function and can identify local minima very
efficiently.

.. [SEK2025] Sekulic, I., *et al.* "Physics-informed Bayesian optimization of expensive-to-evaluate black-box functions". *Mach. Learn.: Sci. Technol.* **6** 040503 (2025)

.. literalinclude:: ./physics_informed_bayesian_optimization.py
   :language: python
   :linenos:

.. figure:: images/physics_informed_bayesian_optimization/optimized_filter.svg
   :alt: Optimized filter

   **Top:** Spectral response of the optimized filter design.
   **Bottom:** Geometry of the optimized filter design.

.. figure:: images/physics_informed_bayesian_optimization/comparison_PIBO_DE.svg
   :alt: Comparison between physics-informed BO and differential evolution

   Comparison of the convergence behaviour of physics-informed Bayesian optimization
   (PIBO), which uses gradient information, and the heuristic global optimization method
   differential evolution (DE), which cannot exploit gradient information. PIBO converges
   to the optimal design **two orders of magnitude** faster than DE.
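
The mapping :math:`\Phi` from the sampled transmission spectrum to the scalar objective and
the constraint value can be sketched in a few lines of NumPy. This is only a minimal
illustration and not taken from the tutorial script: the function name
``filter_objective``, the argument ``n_short`` and the even 25/25 split of the
:math:`K=50` samples are assumptions, and the integrals are approximated by sample means,
which presumes uniformly spaced wavelength samples.

```python
import numpy as np


def filter_objective(t, n_short):
    """Map sampled transmittances to (loss, constraint value).

    Hypothetical helper: `t` holds the K samples [t_short, t_long] and
    `n_short` is the (assumed) number of samples in the short-wavelength regime.
    """
    t = np.asarray(t, dtype=float)
    t_short, t_long = t[:n_short], t[n_short:]

    # Averages over uniformly spaced samples approximate the normalized integrals.
    mean_short = t_short.mean()
    # Standard deviation as sqrt(E[t^2] - E[t]^2); clip tiny negative round-off.
    variance_short = max(np.mean(t_short**2) - mean_short**2, 0.0)
    std_short = np.sqrt(variance_short)

    loss = std_short + t_long.mean()
    return loss, mean_short


# Toy spectrum (made-up numbers): flat 50% transmittance at short wavelengths,
# 10% at long wavelengths.
t = np.concatenate([np.full(25, 0.5), np.full(25, 0.1)])
loss, mean_short = filter_objective(t, n_short=25)
constraint_satisfied = mean_short > 0.4
```

Because this mapping is cheap and analytic, it can be applied to every prediction of the
multi-output surrogate, while the expensive simulation is only needed to produce the
samples :math:`\mathbf{t}` themselves.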
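
The bound of :math:`D + 1 = 10` evaluations per gradient estimate can be made concrete
with a forward-difference scheme. The following sketch is illustrative only: the function
``forward_difference_gradient``, the quadratic ``toy_loss`` and the step size ``h`` are
hypothetical stand-ins and are unrelated to the filter simulation.

```python
import numpy as np


def forward_difference_gradient(f, x, h=1e-6):
    """Estimate the gradient of f at x from one base evaluation plus D shifted ones."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)  # 1 evaluation at the base point
    grad = np.empty_like(x)
    for i in range(x.size):  # D further evaluations, one per coordinate
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f0) / h
    return grad


n_calls = [0]  # mutable counter for the number of function evaluations


def toy_loss(x):
    """Cheap stand-in for an expensive objective (assumption for illustration)."""
    n_calls[0] += 1
    return float(np.sum(x**2))


x0 = np.linspace(0.1, 0.9, 9)  # D = 9 parameters, as in the filter example
grad = forward_difference_gradient(toy_loss, x0)
total_calls = n_calls[0]  # D + 1 evaluations for a single gradient estimate
```

A method that receives the exact gradient together with each evaluation saves these
:math:`D` additional evaluations, which is where the factor of at most :math:`D + 1`
comes from.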