.. _changing_environment:

Optimal control of a system in a changing environment
==========================================================================================

:Driver: :ref:`ActiveLearning`
:Download script: :download:`changing_environment.py`

The target of the study is to show how to control a system that depends on
environment parameters such as temperature or humidity. While the environment
parameters can be measured, their influence on the system's performance is
often unknown.

As an example objective, the 2d `Rastrigin function
<https://en.wikipedia.org/wiki/Rastrigin_function>`_

.. math:: r(x_1, x_2) = 10\cdot 2 + x_1^2 + x_2^2 - 10\cos(2\pi x_1) - 10\cos(2\pi x_2)

is considered. The environment parameter :math:`\phi` acts as an additional
phase offset to the first cosine function in the objective function

.. math:: r(x_1, x_2, \phi) = 10\cdot 2 + x_1^2 + x_2^2 - 10\cos(2\pi x_1 + \phi) - 10\cos(2\pi x_2)

The phase shall slowly vary over time as

.. math:: \phi(t) = 2\pi\sin\left(\frac{t}{3\,{\rm min}}\right).

Please note that this specific time-dependent behaviour is not exploited and
is assumed to be unknown.

Before being able to control the system in an optimal way depending on the
environment, one has to learn, for many environment values, where the global
minimum is located. To this end, a standard Bayesian optimization is performed
for 500 iterations that explores the parameter space.

In a second phase, the target is to evaluate the system in an optimal way,
i.e. an exploration of the parameter space is not desired. This behaviour is
mainly enforced by choosing a small :ref:`scaling` value.

The control phase could have an arbitrary number of iterations, and it would
be problematic to add all new observations to the study. On the one hand, this
slows down the computation time of a suggestion. Since the environment value
changes during the computation, this can lead to less optimal evaluation
points.
On the other hand, adding more and more data points close to each other leads
to an ill-conditioned Gaussian process surrogate. To avoid these drawbacks,
data points are not added in the control phase if the study predicts a value
with very small uncertainty, which means that the observation would not add
significant information.

.. literalinclude:: ./changing_environment.py
   :language: python
   :linenos:

.. figure:: images/changing_environment/training_and_control.svg
   :alt: training and control

   **Left:** During the initial training phase in the first 500 iterations,
   the parameter space is explored, leading to both small and large objective
   values. In the control phase, only small objective values are observed.
   **Right:** The observed values (blue dots) agree well with the lowest
   achievable values (green line). Most of the deviations are due to the time
   offset between the request of a new suggestion for a given environment
   value :math:`\phi` and the actual evaluation of the Rastrigin function
   about a second later. To see this, the values that would have been observed
   at the time of request are shown as orange dots.
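For reference, the time-dependent objective defined above can be written as a
short Python function. This is a minimal sketch; the function and argument
names are illustrative and not taken from ``changing_environment.py``:

.. code-block:: python

   import numpy as np

   def rastrigin(x1, x2, phi):
       """2d Rastrigin function with a phase offset phi on the first cosine."""
       return (10 * 2 + x1**2 + x2**2
               - 10 * np.cos(2 * np.pi * x1 + phi)
               - 10 * np.cos(2 * np.pi * x2))

   def phase(t_seconds):
       """Slowly varying environment parameter phi(t); 3 min = 180 s."""
       return 2 * np.pi * np.sin(t_seconds / 180.0)

For :math:`\phi = 0` the global minimum sits at the origin with
:math:`r(0, 0, 0) = 0`; a nonzero phase shifts the first cosine and thereby
moves the location of the global minimum along :math:`x_1`.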
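The gating rule of the control phase — only add an observation if the
surrogate is still uncertain at that point — can be illustrated with a tiny
one-dimensional Gaussian process built from scratch. This is a sketch of the
idea only: the kernel, its length scale, the noise level, and the threshold
``tol`` are assumptions, and the actual study object uses its own surrogate
internally.

.. code-block:: python

   import numpy as np

   def rbf(a, b, length=0.5):
       """Squared-exponential kernel matrix between 1d point sets a and b."""
       return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

   def posterior_std(x_train, x_query, noise=1e-6, length=0.5):
       """Predictive standard deviation of a zero-mean GP at x_query."""
       K = rbf(x_train, x_train, length) + noise * np.eye(len(x_train))
       k = rbf(x_train, x_query, length)
       K_inv = np.linalg.inv(K)
       # diag(k^T K^{-1} k) gives the variance reduction at each query point
       var = rbf(x_query, x_query, length).diagonal() \
           - np.einsum('ij,ik,kj->j', k, K_inv, k)
       return np.sqrt(np.maximum(var, 0.0))

   def should_add(x_train, x_new, tol=1e-3):
       """Add the new observation only if the prediction is still uncertain."""
       return posterior_std(x_train, np.atleast_1d(x_new))[0] > tol

At an already-observed point the predictive standard deviation is close to
zero, so the observation is skipped; between observations the uncertainty is
large and the point would be added.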