Abstract: Indoor environments are affected by many time-varying parameters with large uncertainty, and existing control equipment cannot adapt its operating power to indoor conditions, which wastes a great deal of energy. To address this problem, prioritized experience replay (PER) is integrated into the deep deterministic policy gradient (DDPG) algorithm, and the resulting DDPG-PER algorithm is used to optimize equipment power for indoor air quality (IAQ) and thermal comfort. Experiments show that, by combining multiple time-varying parameters, the proposed DDPG-PER algorithm keeps the indoor environment within the required range under different outdoor air quality conditions in both winter and summer. Moreover, compared with a fixed-air-volume control system, it reduces energy cost by 13.30% and saves about 2,000 RMB in electricity costs over a whole year, which is valuable for China's "carbon-neutral" strategy and the development of green, low-carbon buildings.