SARSA controller setup

Now that we have figured out the environment, it is important to understand how to apply RL concepts to it. RL works in continuous as well as in discrete-time state-space environments. MLE+ dictates the conditions in which we will operate: we will construct a discrete state space by varying three parameters: the temperature set point, the actual temperature of a zone, and the electric heater load.

I did it in a straightforward way. I took a vector of possible values for each variable (e.g. the heater can have loads between 0 and 12000 KW) and made combinations of all three vectors to define all the states within which our building model can operate (more importantly, we try not to define states that are not possible for a real building system). Along with creating the states, the code creates a Q matrix for three actions (increase, decrease, and don't change the temperature set point) using the reward function described below:

As you may notice, the reward function appreciates...
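To make the state-space construction above more concrete, here is a minimal sketch in Python. It is not the code used with MLE+. The grid ranges and step sizes, the `is_feasible` rule, and the names `setpoints`, `zone_temps`, `heater_loads` and `sarsa_update` are all assumptions made for illustration; only the 0 to 12000 heater range and the three actions come from the text. The reward is passed in as a plain number, because the reward function itself is not reproduced here. The update shown is the standard SARSA rule.

```python
import itertools

# Hypothetical grids: the post does not give the real ranges or step sizes.
setpoints = [18.0 + i for i in range(9)]            # set point, deg C (assumed)
zone_temps = [15.0 + i for i in range(16)]          # zone temperature, deg C (assumed)
heater_loads = [float(x) for x in range(0, 12001, 2000)]  # 0..12000, as in the text

ACTIONS = ("increase", "decrease", "hold")  # actions on the temperature set point


def is_feasible(setpoint, zone_temp, load):
    """Assumed example rule for dropping states a real building cannot reach."""
    # Heater at full load while the zone is already far above the set point.
    return not (load == heater_loads[-1] and zone_temp > setpoint + 5.0)


# All combinations of the three vectors, minus the infeasible ones.
states = [s for s in itertools.product(setpoints, zone_temps, heater_loads)
          if is_feasible(*s)]
state_index = {s: i for i, s in enumerate(states)}

# Q matrix: one row per state, one column per action, initialised to zero.
Q = [[0.0] * len(ACTIONS) for _ in states]


def sarsa_update(Q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """Standard on-policy SARSA update; alpha and gamma are assumed values."""
    td_target = reward + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q
```

Keeping the feasibility filter as a separate function makes it easy to swap in rules that match the actual building model, and `state_index` maps an observed (set point, zone temperature, load) triple to its row in `Q`.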