REFERENCES
1. Boubertakh H, Tadjine M, Glorennec PY, Labiod S. Tuning fuzzy PD and PI controllers using reinforcement learning. ISA Transactions 2010;49:543-51.
3. Borase RP, Maghade D, Sondkar S, Pawar S. A review of PID control, tuning methods and applications. International Journal of Dynamics and Control 2020:1-10.
4. Lee D, Lee SJ, Yim SC. Reinforcement learning-based adaptive PID controller for DPS. Ocean Engineering 2020;216:108053.
5. Zhong J, Li Y. Toward human-in-the-loop PID control based on CACLA reinforcement learning. In: International Conference on Intelligent Robotics and Applications. Springer; 2019. pp. 605-13.
6. Guan Z, Yamamoto T. Design of a reinforcement learning PID controller. IEEJ Transactions on Electrical and Electronic Engineering 2021;16:1354-60.
7. Teoh E, Yee Y. Implementation of adaptive controllers using digital signal processor chips. In: Intelligent Tuning and Adaptive Control. Elsevier; 1991. pp. 109-13.
8. Bucz Š, Kozáková A. Advanced methods of PID controller tuning for specified performance. PID Control for Industrial Processes 2018:73-119.
9. Seborg DE, Edgar TF, Mellichamp DA, Doyle III FJ. Process dynamics and control. John Wiley & Sons; 2016.
10. Abushawish A, Hamadeh M, Nassif AB. PID controller gains tuning using metaheuristic optimization methods: A survey. Journal of Huaqiao University (Natural Science) 2020;14:87-95.
11. Chang WD, Hwang RC, Hsieh JG. A multivariable on-line adaptive PID controller using auto-tuning neurons. Engineering Applications of Artificial Intelligence 2003;16:57-63.
12. Yu D, Chang T, Yu D. A stable self-learning PID control for multivariable time varying systems. Control Engineering Practice 2007;15:1577-87.
13. Zhou K, Zhen L. Optimal design of PID parameters by evolution algorithm. Journal of Huaqiao University (Natural Science) 2005;26:85-88.
14. Chen J, Huang TC. Applying neural networks to on-line updated PID controllers for nonlinear process control. Journal of Process Control 2004;14:211-30.
15. Hou Z, Chi R, Gao H. An overview of dynamic-linearization-based data-driven control and applications. IEEE Transactions on Industrial Electronics 2016;64:4076-90.
16. Wang XS, Cheng YH, Wei S. A proposal of adaptive PID controller based on reinforcement learning. Journal of China University of Mining and Technology 2007;17:40-44.
18. Shin J, Badgwell TA, Liu KH, Lee JH. Reinforcement learning: Overview of recent progress and implications for process control. Computers & Chemical Engineering 2019;127:282-94.
20. Rummery GA, Niranjan M. On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166. Cambridge University Engineering Department; 1994.
21. Sutton RS, McAllester DA, Singh SP, Mansour Y. Policy gradient methods for reinforcement learning with function approximation. In: Advances in Neural Information Processing Systems; 2000. pp. 1057-63.
22. Silver D, Lever G, Heess N, Degris T, Wierstra D, et al. Deterministic policy gradient algorithms. In: International Conference on Machine Learning. PMLR; 2014. pp. 387-95.
23. Lawrence NP, Stewart GE, Loewen PD, Forbes MG, Backström JU, et al. Reinforcement learning based design of linear fixed structure controllers. IFAC-PapersOnLine 2020;53:230-35.
24. Qin Y, Zhang W, Shi J, Liu J. Improve PID controller through reinforcement learning. In: 2018 IEEE CSAA Guidance, Navigation and Control Conference (CGNCC). IEEE; 2018. pp. 1-6.
25. Sedighizadeh M, Rezazadeh A. Adaptive PID controller based on reinforcement learning for wind turbine control. In: Proceedings of World Academy of Science, Engineering and Technology. vol. 27. Citeseer; 2008. pp. 257-62.
26. Boubertakh H, Glorennec PY. Optimization of a fuzzy PI controller using reinforcement learning. In: 2006 2nd International Conference on Information & Communication Technologies. vol. 1. IEEE; 2006. pp. 1657-62.
27. El Hakim A, Hindersah H, Rijanto E. Application of reinforcement learning on self-tuning PID controller for soccer robot multi-agent system. In: 2013 Joint International Conference on Rural Information & Communication Technology and Electric-vehicle Technology (rICT & ICeV-T). IEEE; 2013. pp. 1-6.
28. Park D, Yu H, Xuan-Mung N, Lee J, Hong SK. Multicopter PID attitude controller gain auto-tuning through reinforcement learning neural networks. In: Proceedings of the 2019 2nd International Conference on Control and Robot Technology; 2019. pp. 80-84.
29. Lawrence NP, Stewart GE, Loewen PD, Forbes MG, Backström JU, et al. Optimal PID and antiwindup control design as a reinforcement learning problem. IFAC-PapersOnLine 2020;53:236-41.
30. Omisore OM, Akinyemi T, Duan W, Du W, Wang L. A novel sample-efficient deep reinforcement learning with episodic policy transfer for PID-based control in cardiac catheterization robots. arXiv preprint arXiv:2110.14941, 2021.
31. Åström KJ, Rundqwist L. Integrator windup and how to avoid it. In: 1989 American Control Conference. IEEE; 1989. pp. 1693-98.
32. Spielberg S, Tulsyan A, Lawrence NP, Loewen PD, Gopaluni RB. Deep reinforcement learning for process control: A primer for beginners. arXiv preprint arXiv:2004.05490, 2020.
34. Degris T, White M, Sutton RS. Off-policy actor-critic. arXiv preprint arXiv:1205.4839, 2012.
35. Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, et al. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
36. Malmborg J, Bernhardsson B, Åström KJ. A stabilizing switching scheme for multi controller systems. IFAC Proceedings Volumes 1996;29:2627-32.
37. Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference On Machine Learning. PMLR; 2015. pp. 448-56.
38. Ba JL, Kiros JR, Hinton GE. Layer normalization. arXiv preprint arXiv:1607.06450, 2016.
39. Liu Y, Halev A, Liu X. Policy learning with constraints in model-free reinforcement learning: A survey. In: Proceedings of the 30th International Joint Conference on Artificial Intelligence; 2021.
40. Chow Y, Ghavamzadeh M, Janson L, Pavone M. Risk-constrained reinforcement learning with percentile risk criteria. The Journal of Machine Learning Research 2017;18:6070-120.
41. Bohez S, Abdolmaleki A, Neunert M, Buchli J, Heess N, et al. Value constrained model-free continuous control. arXiv preprint arXiv:1902.04623, 2019.
42. Stewart GE. A pragmatic approach to robust gain scheduling. IFAC Proceedings Volumes 2012;45:355-62.
44. Panda RC, Yu CC, Huang HP. PID tuning rules for SOPDT systems: Review and some new results. ISA Transactions 2004;43:283-95.
45. Rivera DE, Morari M, Skogestad S. Internal model control: PID controller design. Industrial & Engineering Chemistry Process Design and Development 1986;25:252-65.
46. Lee Y, Park S, Lee M, Brosilow C. PID controller tuning for desired closed-loop responses for SI/SO systems. AIChE Journal 1998;44:106-15.