REFERENCES
1. Deng, X.; Zhang, Y.; Jiang, Y.; Zhang, Y.; Qi, H. A novel operation method for renewable building by combining distributed DC energy system and deep reinforcement learning. Appl. Energy 2024, 353, 122188.
2. Mehigan, L.; Deane, J.; Gallachóir, B.; Bertsch, V. A review of the role of distributed generation (DG) in future electricity systems. Energy 2018, 163, 822-836.
3. Liu, R.; He, G.; Su, Y.; Yang, Y.; Ding, D. Solar energy for low carbon buildings: choice of systems for minimal installation area, cost, and environmental impact. City Built Environ. 2023, 1, 16.
4. Shi, L.; Lao, W.; Wu, F.; Lee, K. Y.; Li, Y.; Lin, K. DDPG-based load frequency control for power systems with renewable energy by DFIM pumped storage hydro unit. Renew. Energy 2023, 218, 119274.
5. Wang, X.; Kang, X.; An, J.; Chen, H.; Yan, D. Reinforcement learning approach for optimal control of ice-based thermal energy storage (TES) systems in commercial buildings. Energy Build. 2023, 301, 113696.
6. Yu, D.; Zhou, X.; Qi, H.; Qian, F. Low-carbon city planning based on collaborative analysis of supply and demand scenarios. City Built Environ. 2023, 1, 7.
7. Ghersi, D. E.; Amoura, M.; Loubar, K.; Desideri, U.; Tazerout, M. Multi-objective optimization of CCHP system with hybrid chiller under new electric load following operation strategy. Energy 2021, 219, 119574.
8. Lu, Y.; Wang, S.; Sun, Y.; Yan, C. Optimal scheduling of buildings with energy generation and thermal energy storage under dynamic electricity pricing using mixed-integer nonlinear programming. Appl. Energy 2015, 147, 49-58.
9. Xiao, Y.; Sun, W.; Sun, L. Dynamic programming based economic day-ahead scheduling of integrated tri-generation energy system with hybrid energy storage. J. Energy Storage 2021, 44, 103395.
10. Wang, J.; Jing, Y.; Zhang, C. Optimization of capacity and operation for CCHP system by genetic algorithm. Appl. Energy 2010, 87, 1325-1335.
11. Dai, Y.; Zeng, Y. Optimization of CCHP integrated with multiple load, replenished energy, and hybrid storage in different operation modes. Energy 2022, 260, 125129.
12. Wang, Y.; Qin, Y.; Ma, Z.; Wang, Y.; Li, Y. Operation optimisation of integrated energy systems based on cooperative game with hydrogen energy storage systems. Int. J. Hydrogen Energy 2023, 48, 37335-37354.
13. Xu, Y.; Yan, C.; Liu, H.; Wang, J.; Yang, Z.; Jiang, Y. Smart energy systems: A critical review on design and operation optimization. Sustain. Cities Soc. 2020, 62, 102369.
14. Wang, Z.; Xiao, F.; Ran, Y.; Li, Y.; Xu, Y. Scalable energy management approach of residential hybrid energy system using multi-agent deep reinforcement learning. Appl. Energy 2024, 367, 123414.
15. Ren, K.; Liu, J.; Wu, Z.; Liu, X.; Nie, Y.; Xu, H. A data-driven DRL-based home energy management system optimization framework considering uncertain household parameters. Appl. Energy 2024, 355, 122258.
16. Salari, A.; Ahmadi, S. E.; Marzband, M.; Zeinali, M. Fuzzy Q-learning-based approach for real-time energy management of home microgrids using cooperative multi-agent system. Sustain. Cities Soc. 2023, 95, 104528.
17. He, H.; Meng, X.; Wang, Y.; et al. Deep reinforcement learning based energy management strategies for electrified vehicles: Recent advances and perspectives. Renew. Sustain. Energy Rev. 2024, 192, 114248.
18. Ming, F.; Gao, F.; Liu, K.; Li, X. A constrained DRL-based bi-level coordinated method for large-scale EVs charging. Appl. Energy 2023, 331, 120381.
19. Xiao, H.; Fu, L.; Shang, C.; Bao, X.; Xu, X.; Guo, W. Ship energy scheduling with DQN-CE algorithm combining bi-directional LSTM and attention mechanism. Appl. Energy 2023, 347, 121378.
20. Tian, W.; Fu, G.; Xin, K.; Zhang, Z.; Liao, Z. Improving the interpretability of deep reinforcement learning in urban drainage system operation. Water Res. 2024, 249, 120912.
21. Li, H.; Yang, Y.; Liu, Y.; Pei, W. Federated dueling DQN based microgrid energy management strategy in edge-cloud computing environment. Sustain. Energy Grids Netw. 2024, 38, 101329.
22. Zheng, L.; Wu, H.; Guo, S.; Sun, X. Real-time dispatch of an integrated energy system based on multi-stage reinforcement learning with an improved action-choosing strategy. Energy 2023, 277, 127636.
23. Demir, S.; Kok, K.; Paterakis, N. G. Statistical arbitrage trading across electricity markets using advantage actor–critic methods. Sustain. Energy Grids Netw. 2023, 34, 101023.
24. Nakabi, T. A.; Toivanen, P. Deep reinforcement learning for energy management in a microgrid with flexible demand. Sustain. Energy Grids Netw. 2021, 25, 100413.
25. Yang, T.; Zhao, L.; Li, W.; Zomaya, A. Y. Dynamic energy dispatch strategy for integrated energy system based on improved deep reinforcement learning. Energy 2021, 235, 121377.
26. Brandi, S.; Gallo, A.; Capozzoli, A. A predictive and adaptive control strategy to optimize the management of integrated energy systems in buildings. Energy Rep. 2022, 8, 1550-1567.
27. Zhang, B.; Hu, W.; Cao, D.; Huang, Q.; Chen, Z.; Blaabjerg, F. Deep reinforcement learning–based approach for optimizing energy conversion in integrated electrical and heating system with renewable energy. Energy Convers. Manage. 2019, 202, 112199.
28. Lingmin, C.; Jiekang, W.; Huiling, T.; Feng, J.; Yanan, W. A Q-learning based optimization method of energy management for peak load control of residential areas with CCHP systems. Electr. Power Syst. Res. 2023, 214, 108895.
29. Ding, H.; Xu, Y.; Chew Si Hao, B.; Li, Q.; Lentzakis, A. A safe reinforcement learning approach for multi-energy management of smart home. Electr. Power Syst. Res. 2022, 210, 108120.
30. Jiang, W.; Liu, Y.; Fang, G.; Ding, Z. Research on short-term optimal scheduling of hydro-wind-solar multi-energy power system based on deep reinforcement learning. J. Clean. Prod. 2023, 385, 135704.
31. Zhou, Y.; Huang, Y.; Mao, X.; Kang, Z.; Huang, X.; Xuan, D. Research on energy management strategy of fuel cell hybrid power via an improved TD3 deep reinforcement learning. Energy 2024, 293, 130564.
32. Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-Baselines3: reliable reinforcement learning implementations. J. Mach. Learn. Res. 2021, 22, 1-8. https://github.com/DLR-RM/stable-baselines3 (accessed 2026-03-30).
33. Brockman, G.; Cheung, V.; Pettersson, L.; et al. OpenAI Gym. arXiv 2016, arXiv:1606.01540. https://arxiv.org/abs/1606.01540 (accessed 2026-03-30).
34. PyTorch: Tensors and dynamic neural networks in Python with strong GPU acceleration. 2016. https://pytorch.org/ (accessed 2026-03-30).
35. State Grid Home Page. https://www.95598.cn/osgweb/index (accessed 2026-03-30).
36. Shanghai Municipal Development & Reform Commission Home Page. https://fgw.sh.gov.cn/ (accessed 2026-03-30).


