REFERENCES
1. Hong Y, Hou B, Jiang H, Zhang J. Machine learning and artificial neural network accelerated computational discoveries in materials science. WIREs Comput Mol Sci 2020;10:e1450.
2. Sparks TD, Kauwe SK, Parry ME, Tehrani AM, Brgoch J. Machine learning for structural materials. Annu Rev Mater Res 2020;50:27-48.
3. Hwang J, Rao RR, Giordano L, Katayama Y, Yu Y, Shao-Horn Y. Perovskites in catalysis and electrocatalysis. Science 2017;358:751-6.
4. Butler KT, Davies DW, Cartwright H, et al. Machine learning for molecular and materials science. Nature 2018;559:547-55.
5. Juan Y, Dai Y, Yang Y, Zhang J. Accelerating materials discovery using machine learning. J Mater Sci Technol 2021;79:178-90.
6. Fischer CC, Tibbetts KJ, Morgan D, Ceder G. Predicting crystal structure by merging data mining with quantum mechanics. Nat Mater 2006;5:641-6.
7. Raccuglia P, Elbert KC, Adler PD, et al. Machine-learning-assisted materials discovery using failed experiments. Nature 2016;533:73-6.
8. Sanchez-Lengeling B, Aspuru-Guzik A. Inverse molecular design using machine learning: generative models for matter engineering. Science 2018;361:360-5.
9. Ong SP. Accelerating materials science with high-throughput computations and machine learning. Comput Mater Sci 2019;161:143-50.
10. Balachandran PV. Machine learning guided design of functional materials with targeted properties. Comput Mater Sci 2019;164:82-90.
11. Peña MA, Fierro JL. Chemical structures and performance of perovskite oxides. Chem Rev 2001;101:1981-2017.
12. Lino A, Rocha Á, Sizo A. Virtual teaching and learning environments: automatic evaluation with symbolic regression. J Intell Fuzzy Syst 2016;31:2061-72.
13. Yuan S, Jiao Z, Quddus N, Kwon JS, Mashuga CV. Developing quantitative structure-property relationship models to predict the upper flammability limit using machine learning. Ind Eng Chem Res 2019;58:3531-7.
14. Ouyang R, Curtarolo S, Ahmetcik E, Scheffler M, Ghiringhelli LM. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Phys Rev Materials 2018;2:083802.
15. Zhang Y, Xu X. Machine learning lattice constants for cubic perovskite compounds. ChemistrySelect 2020;5:9999-10009.
16. de Franca FO, de Lima MZ. Interaction-transformation symbolic regression with extreme learning machine. Neurocomputing 2021;423:609-19.
17. Li Z, Xu Q, Sun Q, et al. Stability engineering of halide perovskite via machine learning. arXiv preprint arXiv:1803.06042; 2018.
18. Li W, Jacobs R, Morgan D. Predicting the thermodynamic stability of perovskite oxides using machine learning models. Comput Mater Sci 2018;150:454-63.
19. Deng Q, Lin B. Exploring structure-composition relationships of cubic perovskite oxides via extreme feature engineering and automated machine learning. Mater Today Commun 2021;28:102590.
20. Gardner S, Golovidov O, Griffin J, et al. Constrained multi-objective optimization for automated machine learning. 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA) Washington, DC, USA. IEEE; 2019. pp. 364-73.
21. Masrom S, Mohd T, Jamil NS, et al. Automated machine learning based on genetic programming: a case study on a real house pricing dataset. 2019 1st International Conference on Artificial Intelligence and Data Sciences (AiDAS) Ipoh, Malaysia. IEEE; 2019. pp. 48-52.
22. Chauhan K, Jani S, Thakkar D, et al. Automated machine learning: the new wave of machine learning. 2020 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA) Bangalore, India. IEEE; 2020. pp. 205-12.
23. Ge P. Analysis on approaches and structures of automated machine learning frameworks. 2020 International Conference on Communications, Information System and Computer Engineering (CISCE) Kuala Lumpur, Malaysia. IEEE; 2020. pp. 474-7.
24. Han J, Park KS, Lee KM. An automated machine learning platform for non-experts. Proceedings of the International Conference on Research in Adaptive and Convergent Systems. Association for Computing Machinery, New York, NY, USA; 2020. pp. 84-6.
25. Umamahesan A, Babu DMI. From zero to AI hero with automated machine learning. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Association for Computing Machinery, New York, NY, USA; 2020. p. 3495.
26. Waring J, Lindvall C, Umeton R. Automated machine learning: review of the state-of-the-art and opportunities for healthcare. Artif Intell Med 2020;104:101822.
27. Zeineddine H, Braendle U, Farah A. Enhancing prediction of student success: automated machine learning approach. Comput Electr Eng 2021;89:106903.
28. Sun Y, Yang G. Feature engineering for search advertising recognition. 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC) Chengdu, China. IEEE; 2019. pp. 1859-64.
29. Li Z, Ma X, Xin H. Feature engineering of machine-learning chemisorption models for catalyst design. Catal Today 2017;280:232-8.
30. Emery AA, Wolverton C. High-throughput DFT calculations of formation energy, stability and oxygen vacancy formation energy of ABO3 perovskites. Sci Data 2017;4:170153.
31. Kotsiantis SB, Kanellopoulos D, Pintelas PE. Data preprocessing for supervised learning. Int J Comput Sci 2006;1:111-7.
32. Liu N, Gao G, Liu G. Data preprocessing based on partially supervised learning. Proceedings of the 6th International Conference on Information Engineering for Mechanics and Materials. Atlantis Press; 2016. pp. 678-83.
33. Zainuddin Z, Lim EA. A comparative study of missing value estimation methods: which method performs better? 2008 International Conference on Electronic Design Penang, Malaysia. IEEE; 2008. pp. 1-5.
35. Li J, Cheng K, Wang S, et al. Feature selection: a data perspective. ACM Comput Surv 2018;50:1-45.
36. Mangal A, Holm EA. A comparative study of feature selection methods for stress hotspot classification in materials. Integr Mater Manuf Innov 2018;7:87-95.
37. Zhou H, Deng Z, Xia Y, Fu M. A new sampling method in particle filter based on Pearson correlation coefficient. Neurocomputing 2016;216:208-15.
38. Hauke J, Kossowski T. Comparison of values of Pearson’s and Spearman’s correlation coefficients on the same sets of data. Quaestiones Geographicae 2011;30:87-93.
39. Khurana U, Samulowitz H, Turaga D. Feature engineering for predictive modeling using reinforcement learning. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence; 2018. pp. 3407-14.
40. Heaton J. An empirical analysis of feature engineering for predictive modeling. SoutheastCon 2016, Norfolk, VA, USA. IEEE; 2016. pp. 1-6.
41. Zheng A, Casari A. Feature engineering for machine learning: principles and techniques for data scientists. O'Reilly Media, Inc.; 2018.
42. Dai D, Xu T, Wei X, et al. Using machine learning and feature engineering to characterize limited material datasets of high-entropy alloys. Comput Mater Sci 2020;175:109618.
43. Nargesian F, Samulowitz H, Khurana U, et al. Learning feature engineering for classification. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI); 2017. pp. 2529-35.
44. Hou J, Pelillo M. A simple feature combination method based on dominant sets. Pattern Recognit 2013;46:3129-39.
45. Ghiringhelli LM, Vybiral J, Levchenko SV, Draxl C, Scheffler M. Big data of materials science: critical role of the descriptor. Phys Rev Lett 2015;114:105503.
46. Fox J. Regression diagnostics: an introduction. Sage Publications; 2019.
47. Montgomery DC, Peck EA, Vining GG. Introduction to linear regression analysis. John Wiley & Sons; 2021.
48. Weisberg S. Applied linear regression. John Wiley & Sons; 2005.
49. Dai D, Liu Q, Hu R, et al. Method construction of structure-property relationships from data by machine learning assisted mining for materials design applications. Mater Des 2020;196:109194.
50. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825-30.
51. Abraham A, Pedregosa F, Eickenberg M, et al. Machine learning for neuroimaging with scikit-learn. Front Neuroinform 2014;8:14.
52. Jović A, Brkić K, Bogunović N. A review of feature selection methods with applications. 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) Opatija, Croatia. IEEE; 2015. pp. 1200-5.
53. Remeseiro B, Bolon-Canedo V. A review of feature selection methods in medical applications. Comput Biol Med 2019;112:103375.
55. Chandrashekar G, Sahin F. A survey on feature selection methods. Comput Electr Eng 2014;40:16-28.
56. Rückstieß T, Osendorfer C, van der Smagt P. Sequential feature selection for classification. In: Wang D, Reynolds M, editors. AI 2011: advances in artificial intelligence. Berlin: Springer Berlin Heidelberg; 2011. pp. 132-41.
57. Lee CY, Chen BS. Mutually-exclusive-and-collectively-exhaustive feature selection scheme. Appl Soft Comput 2018;68:961-71.
58. Su R, Liu X, Wei L. MinE-RFE: determine the optimal subset from RFE by minimizing the subset-accuracy-defined energy. Brief Bioinform 2020;21:687-98.
59. Yang F, Wang D, Xu F, Huang Z, Tsui K. Lifespan prediction of lithium-ion batteries based on various extracted features and gradient boosting regression tree model. J Power Sources 2020;476:228654.
61. Schapire RE. The boosting approach to machine learning: an overview. In: Denison DD, Hansen MH, Holmes CC, Mallick B, Yu B, editors. Nonlinear estimation and classification. New York: Springer; 2003. pp. 149-71.
62. Oza NC. Online bagging and boosting. 2005 IEEE International Conference on Systems, Man and Cybernetics Waikoloa, HI, USA. IEEE; 2005. pp. 2340-5.
63. Elith J, Leathwick JR, Hastie T. A working guide to boosted regression trees. J Anim Ecol 2008;77:802-13.
65. Abe GL, Sasaki JI, Katata C, et al. Fabrication of novel poly(lactic acid/caprolactone) bilayer membrane for GBR application. Dent Mater 2020;36:626-34.
66. Sharafati A, Asadollah SBHS, Hosseinzadeh M. The potential of new ensemble machine learning models for effluent quality parameters prediction and related uncertainty. Process Saf Environ Prot 2020;140:68-78.
67. Friedman JH. Stochastic gradient boosting. Comput Stat Data Anal 2002;38:367-78.
69. Lu S, Zhou Q, Guo Y, Zhang Y, Wu Y, Wang J. Coupling a crystal graph multilayer descriptor to active learning for rapid discovery of 2D ferromagnetic semiconductors/half-metals/metals. Adv Mater 2020;32:e2002658.
70. Bartel CJ, Sutton C, Goldsmith BR, et al. New tolerance factor to predict the stability of perovskite oxides and halides. Sci Adv 2019;5:eaav0693.