# Domain knowledge-guided interpretive machine learning: formula discovery for the oxidation behavior of ferritic-martensitic steels in supercritical water

*J Mater Inf* 2022;2:4.

## Abstract

A general formula with high generalization and accurate prediction power is highly desirable for science, technology and engineering. In addition to human beings, artificial intelligence algorithms show great promise for the discovery of formulas. In this study, we propose a domain knowledge-guided interpretive machine learning strategy and demonstrate it by studying the oxidation behavior of ferritic-martensitic steels in supercritical water. The oxidation Cr equivalent is, for the first time, proposed in the present work to represent all contributions of alloying elements to oxidation, derived by our domain knowledge and interpretive machine learning algorithms. An open-source tree classifier for linear regression algorithm is also, for the first time, developed to materialize the formula with collected data. This algorithm effectively captures the linear correlation between compositions, testing environments and oxidation behaviors from the data. The sure independence screening and sparsifying operator algorithm finally assembles the information derived from the tree classifier for linear regression algorithm, resulting in a general formula. The general formula with the determined parameters has the power to predict, quantitatively and accurately, the oxidation behavior of ferritic-martensitic steels with multiple alloying elements exposed to various supercritical water environments, thereby providing guidance for the design of anti-oxidation steels and hence promoting the development of power plants with improved safety. The present work demonstrates the power of domain knowledge-guided interpretive machine learning with respect to the data-driven discovery of physics-informed formulas and the acceleration of materials informatics development.

## Keywords

interpretive, oxidation Cr equivalent, tree classifier for linear regression (TCLR), general formula

## INTRODUCTION

The rapid development of materials informatics ^{[1-5]}, artificial intelligence (AI) and machine learning (ML) techniques has led to a new paradigm of data-driven discovery of novel materials, state-of-the-art experimental and computational methods and scientific laws and formulas. The number of publications on materials informatics has increased exponentially in the past decade and materials informatics has achieved great success in many areas ^{[6-10]}. For example, Xue *et al*. ^{[11]} proposed an adaptive design iteration strategy by tightly coupling ML with experiments, which sequentially identifies the next experiments by using efficient global optimization to balance the trade-off between exploitation and exploration. This adaptive design strategy, also known as active learning, starts from an initial dataset of 22 alloys, runs nine feedback loops in the search space of *et al*. ^{[12]} used a convolutional neural network-based ML model to automatically obtain the key features for the accurate prediction of catalytic properties, such as adsorption energies. The ML model yields a DOSnet, which has the capacity to provide physically meaningful predictions and insights by predicting responses to external perturbations to the electronic structure without additional calculations.

Attia *et al*. ^{[13]} developed an ML methodology to efficiently optimize a parameter space specifying the current and voltage profiles of six-step, 10-min fast-charging protocols for maximizing battery cycle life. They trained an elastic net ML model to predict battery charging/discharging life using data only from the first few cycles and employed a Bayesian optimization algorithm to reduce the number of experiments by balancing exploration and exploitation to efficiently probe the parameter space of charging protocols. With such an approach, they identified and validated high-cycle-life charging protocols among 224 candidates in 16 days. Saito *et al*. ^{[14]} applied a U-Net, a convolutional encoder-decoder network, to segment optical microscopy images and identify the thickness of atomic layer flakes, achieving a success rate of 70–80% in distinguishing monolayer and bilayer MoS_{2} and graphene.

ML is achieving remarkable success in materials science and engineering ^{[15, 16]} and will achieve even greater success if it can become more transparent and interpretive. Theoretically, AI and ML are based on statistics without utilizing any other scientific laws, principles and (physical) equations and most AI and ML algorithms perform as "black-box" systems ^{[17-23]}. Considerable efforts, such as physics-informed neural networks ^{[24]}, symbolic regression and Shapley additive explanations (SHAP) ^{[25]}, are being carried out to enhance the interpretability of ML models. Obviously, significant further endeavors are required to make ML models interpretive. The strategy proposed in the present work, i.e., domain knowledge-guided interpretive ML, might pave the way for the discovery of mathematical formulas.

In the present work, we propose a domain knowledge-guided interpretive ML strategy to make ML models interpretable and more physically meaningful, and apply this strategy to the data-driven discovery of formulas regarding the oxidation of ferritic-martensitic (FM) steels in supercritical water (SCW). Although the use of SCW in power plants can achieve enhanced thermal efficiency with simplified plant design and improved safety, it requires highly oxidation-resistant materials because SCW is a strong oxidant ^{[26]} beyond the supercritical point (at 374 ℃ and 22.1 MPa). FM steels are some of the most promising structural materials for use in SCW-cooled power plants, owing to their high elevated-temperature strength, high creep resistance, high thermal conductivity, low swelling behavior under irradiation, low thermal expansion coefficient, and low susceptibility to stress oxidation cracking up to 600 ℃ ^{[27, 28]}.

The oxidation behavior of FM steels in SCW environments has been investigated extensively through experimental approaches ^{[29-34]}. The current understanding of the corrosion occurring in high-temperature water environments is associated with the chemistry and physics of the water (density and dielectric constant of the medium). In high-temperature water with a low density/dielectric constant (^{[30]}. The oxidation behavior of FM alloys in SCW depends on the alloy composition and oxidation environment. Ampornrat and Was ^{[29]} experimentally investigated the corrosion behavior of three FM alloys (T91, HCM12 A and HT-9) in SCW at temperatures ranging from 400 to 600 ℃ with dissolved oxygen concentrations of ^{[35]}. Dong *et al*. ^{[36]} reported that with increasing Mn content, the oxide scale becomes discontinuous and thicker, and thus Mn might be harmful to the oxidation of FM steels in SCW.

Significant progress has been achieved in the investigation and understanding of FM steel oxidation in various SCW environments ^{[29, 30, 37-39]}, as evidenced by the Arrhenius equation of ^{[29]}, where

Oxidation is clearly a thermally activated process, and the associated thermodynamics and kinetics are greatly dependent on the material compositions and environmental variables. In experimental investigations, individual researchers adjust only one or a few experimental conditions and the obtained result and formula are valid only for the FM steels and SCW environments and periods tested. To the best of our knowledge, no generalized formula has been established for the description and/or prediction of the oxidation of FM steels with any given alloying elements exposed to various SCW environmental conditions. The present work adopts domain knowledge-guided interpretive ML to discover a generalized formula for the oxidation of FM steels in SCW, which will promote the development of green and safe power plants. In addition to exposure time, the investigated FM steels cover 11 alloying elements, and the studied SCW environments include temperature, dissolved oxygen concentration (DOC) and pressure.
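To illustrate why a thermally activated kinetic law is convenient to mine in logarithmic space, the sketch below assumes the generic textbook form w = A·exp(−Q/RT)·tⁿ, which is not the generalized formula derived in this work. Taking logarithms gives ln w = ln A − Q/(RT) + n·ln t, so the activation energy Q and the time exponent n become coefficients of a linear regression. The function name, the synthetic data and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

R = 8.314e-3  # gas constant in kJ/(mol K)

def fit_arrhenius_power_law(T, t, w):
    """Least-squares fit of ln w = ln A - Q/(R T) + n ln t.

    T: absolute temperature (K), t: exposure time (h), w: weight gain.
    Returns (A, Q, n) with Q in kJ/mol.
    """
    T, t, w = (np.asarray(a, dtype=float) for a in (T, t, w))
    # Design matrix: intercept, -1/(RT) and ln t.
    X = np.column_stack([np.ones_like(T), -1.0 / (R * T), np.log(t)])
    coef, *_ = np.linalg.lstsq(X, np.log(w), rcond=None)
    ln_A, Q, n = coef
    return float(np.exp(ln_A)), float(Q), float(n)

# Synthetic, noise-free data generated from assumed parameters.
rng = np.random.default_rng(0)
T = rng.uniform(673.0, 873.0, size=60)   # K
t = rng.uniform(100.0, 1000.0, size=60)  # h
A_true, Q_true, n_true = 5.0e6, 150.0, 0.4
w = A_true * np.exp(-Q_true / (R * T)) * t ** n_true

A_fit, Q_fit, n_fit = fit_arrhenius_power_law(T, t, w)
print(f"Q = {Q_fit:.3f} kJ/mol, n = {n_fit:.3f}")
```

In this simplified form Q and n are constants. A formula valid across alloys and environments has to let such coefficients depend on composition and testing conditions, which is the harder problem addressed here.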

Our domain knowledge of oxidation suggests a dimensionless Arrhenius equation of ^{[25]}, extreme gradient boosting (Xgboost) ^{[40]} and the sure independence screening and sparsifying operator (SISSO) ^{[41]}, and more significantly, a newly developed classifier model, the tree classifier for linear regression (TCLR) ^{[42]}, to discover a generalized formula from data for the oxidation of FM steels in SCW. Recently, the SHAP algorithm has been widely used to calculate quantitatively the contribution of each feature to a particular task ^{[5, 21]}. Xiong *et al*. ^{[5]} found that critical SHAP values exist in some features when studying the hardness and ultimate tensile strength of complex concentrated alloys (CCAs) by ML. The critical feature value separates the SHAP values into positive and negative regions. This means that the feature values in the positive/negative SHAP value region improve/impair the mechanical properties of CCAs, thereby providing a straightforward assessment of the design of CCAs with high hardness and strength. Obviously, the application of SHAP, including pure and interaction SHAP values, will further promote the development of materials informatics.

The generalized formula established in this study accurately predicts the oxidation behavior of experimental FM steels with different alloying elements in various SCW testing conditions. Figure 1 outlines the domain knowledge-guided interpretive ML strategy, where the hub is the domain knowledge suggested Arrhenius equation. The feature importance of SHAP ^{[41]} assembles the information from TCLR into a generalized formula [Figure 1D]. The Arrhenius equation is the starting point of the domain knowledge-guided interpretive ML strategy, which is a prior suggested based on our knowledge. The Arrhenius equation is also the posterior after the mining and evaluation of the ML algorithms with the experimental data, and thus transfers to a generalized formula.

Figure 1. Domain knowledge-guided interpretive ML strategy. A: Feature selection with

## METHODS

### Dataset

A total of 184 oxidation data of FM steels in SCW are collected from the literature and given in the online Supplementary Information. Every datum in the FM steel oxidation (FMO) dataset includes the oxidation-induced weight gain in units of mg/dm

Fifteen features of FM steel oxidation data

| Category | Feature name | Description |
| --- | --- | --- |
| Alloying elements | Cr | Chromium (wt.%) |
| | Si | Silicon (wt.%) |
| | Mn | Manganese (wt.%) |
| | C | Carbon (wt.%) |
| | Ni | Nickel (wt.%) |
| | Mo | Molybdenum (wt.%) |
| | Nb | Niobium (wt.%) |
| | W | Tungsten (wt.%) |
| | V | Vanadium (wt.%) |
| | P | Phosphorus (wt.%) |
| | Cu | Copper (wt.%) |
| Testing conditions | T | Absolute temperature (K) |
| | Pressure | SCW pressure (MPa) |
| | t | Exposure time (h) |
| | DOC | Dissolved oxygen concentration (ppb) |

### Xgboost and SHAP values

Xgboost ^{[40]} is a powerful tree-based boosting ensemble algorithm. The present work employs the Xgboost algorithm to regress the oxidation data of FM steels in SCW, and the values of the hyperparameters involved are optimized by cross-validation and/or a grid search in the hyperparameter space with the open-source Python library scikit-learn ^{[43]}. Table S1 in Section 2.1 of the Supplementary Information lists all optimized values of the hyperparameters.
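A grid search with cross-validation of this kind can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the configuration used in this work: scikit-learn's `GradientBoostingRegressor` stands in for Xgboost so that the example has a single dependency, the data are synthetic with the same shape as the FMO dataset (184 samples, 15 features), and the hyperparameter grid is hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the dataset: 184 samples, 15 features.
rng = np.random.default_rng(0)
X = rng.uniform(size=(184, 15))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=184)

# Hypothetical grid; the optimized values of this work are in Table S1.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}

# Ten-fold cross-validation scores every grid point.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=cv,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

best_params = search.best_params_
best_cv_mse = -search.best_score_  # scikit-learn maximizes the negated MSE
print(best_params, best_cv_mse)
```

Because `xgboost.XGBRegressor` follows the scikit-learn estimator interface, it could be passed to `GridSearchCV` in place of the stand-in regressor.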

The SHAP algorithm is developed based on game theory ^{[25]}. A SHAP value, whether positive or negative, reflects the contribution of a feature to a predicted response in one datum, and the predicted response is given by an ML model. In the present work, SHAP values are calculated with Xgboost models. If there are

where

where the SHAP value

### Integration of SHAP values with domain knowledge

The game theory-based SHAP value is an additive feature attribution method, where the output is a sum of contributions of each input feature ^{[44]}. If the contributions of variables to a function are not additive in the original variable space, but additive in a mapped space, the SHAP value will be calculated in the mapped space. For the oxidation of FM steels in SCW, there are 15 features and each feature contributes to the oxidation weight gain

It might be inaccurate to calculate the reasonable SHAP values in the original space. Based on the domain knowledge of oxidation, we take a dimensionless Arrhenius equation of

### Feature ranking, selection and data screening

An Xgboost model is first developed with all features via ten-fold cross-validation (10-CV) and is used to calculate all SHAP values

where

The data screening is carried out by evaluating the errors

## RESULTS AND DISCUSSION

### Feature selection and data screening

A total of 184 data on FM steel oxidation in SCW are collected from the literature and provided in the Supplementary Information. Fifteen features are employed here and categorized into two groups, namely, alloying elements and testing conditions. The feature analysis is carried out within each of the groups. The SHAP values

Figure 2. SHAP analysis of features. A-D: SHAP values of testing conditions, i.e., temperature, time, DOC and pressure. E-I: SHAP values of alloy compositions, i.e., V, Si, Cr, Ni and Mn. J: Feature importance ranking by

As expected, Figure 2A shows that the lower the value of

The SHAP value of

It is somewhat surprising that vanadium plays the most important role among the studied 11 alloying elements in the oxidation of FM steels in SCW, as shown in Figure 2E, where the SHAP value of ^{[45]} and thus alloying Ni with a content higher than 0.13 wt.% into FM steels enhances the oxidation resistance of FM steels in SCW, as shown in Figure 2H. Manganese might be detrimental to the oxidation of FM steels in SCW when its content is high. Figure 2I indicates that the SHAP value of

The feature importance defined in the SHAP method (see Methods),

The feature selection is then conducted by the sequential backward selector wrapped with Xgboost and 10-CV, which yields the three testing features of

### Pure SHAP values and oxidation Cr equivalent

The SHAP value of each feature is decomposed into its pure SHAP value and the interaction SHAP values, as stated in Eq. (1b). Figure 3A-H shows the pure SHAP values of the selected eight features and there are 178 pure SHAP values in each figure. A comparison of Figure 3A-H with the corresponding Figure 2A-C and Figure 2E-I indicates that, for a given feature value, the scatter of the pure SHAP values is much smaller than that of the SHAP values. This is an expected result because a pure SHAP value eliminates all interaction SHAP values from its parent SHAP value. If pure SHAP values are calculated from a perfect model of a single function of variables, the pure SHAP value of a feature will correspond to the feature value one-to-one, i.e., for one feature value, there is only one pure SHAP value.
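The one-to-one correspondence can be checked on a toy model with exact, brute-force Shapley values. The sketch below assumes the Shapley interaction index of the tree-ensemble SHAP formulation ^{[44]} and a zero baseline for "absent" features. The model `2*x0 + x1 + x1*x2` and all function names are hypothetical, and the exhaustive enumeration over coalitions is only practical for a handful of features, unlike the Tree-Explainer used for real Xgboost models.

```python
from itertools import combinations
from math import factorial

def model(x):
    """Toy model: additive terms in x0 and x1 plus an x1*x2 interaction."""
    return 2.0 * x[0] + x[1] + x[1] * x[2]

def coalition_value(x, baseline, subset):
    """Model output when only the features in `subset` take their real values."""
    z = [x[k] if k in subset else baseline[k] for k in range(len(x))]
    return model(z)

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    for r in range(len(items) + 1):
        for s in combinations(items, r):
            yield frozenset(s)

def shap_value(x, baseline, i):
    """Exact Shapley value of feature i."""
    m = len(x)
    others = [k for k in range(m) if k != i]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(m - len(s) - 1) / factorial(m)
        total += weight * (coalition_value(x, baseline, s | {i})
                           - coalition_value(x, baseline, s))
    return total

def interaction_value(x, baseline, i, j):
    """Shapley interaction value between features i and j (i != j)."""
    m = len(x)
    others = [k for k in range(m) if k not in (i, j)]
    total = 0.0
    for s in subsets(others):
        weight = (factorial(len(s)) * factorial(m - len(s) - 2)
                  / (2 * factorial(m - 1)))
        delta = (coalition_value(x, baseline, s | {i, j})
                 - coalition_value(x, baseline, s | {i})
                 - coalition_value(x, baseline, s | {j})
                 + coalition_value(x, baseline, s))
        total += weight * delta
    return total

def pure_value(x, baseline, i):
    """Pure (main-effect) value: SHAP value minus all interaction values."""
    inter = sum(interaction_value(x, baseline, i, j)
                for j in range(len(x)) if j != i)
    return shap_value(x, baseline, i) - inter

baseline = (0.0, 0.0, 0.0)
point_a = (1.5, 2.0, 3.0)
point_b = (0.5, 2.0, -1.0)   # same x1 as point_a, different x0 and x2
for point in (point_a, point_b):
    print(shap_value(point, baseline, 1), pure_value(point, baseline, 1))
```

Both points share the same value of `x1`, yet its SHAP value differs between them because of the interaction with `x2`, whereas its pure value is identical. This mirrors the reduced scatter seen when moving from SHAP values to pure SHAP values.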

Figure 3. Pure SHAP value analysis of important features. A-C: Pure SHAP values of testing conditions, i.e., temperature, time and DOC. D-H: Pure SHAP values of alloy compositions, i.e., V, Ni, Cr, Si and Mn.

Figure 3A shows almost ideal pure SHAP values of feature

There are a few reasons causing multiple pure SHAP values at a given feature. The first reason might be experimental errors, which measure the degree of the experimental scattering of repeated tests. The second reason might be attributed to the Xgboost model, which approximately estimates the response from the input feature data rather than a perfect function. The third reason might be the method used to calculate the SHAP values from a tree-based algorithm (Tree-Explainer model). Figure 3C shows that the pure SHAP values of the DOC feature can be expressed by a logarithm function of

From the pure SHAP value of an individual feature, we defined the joint SHAP value of two features as

which measures the joint contribution of two features. In general,

where

Figure 4. Joint SHAP value of two features and the derived oxidation Cr equivalent concentration. A-D: Joint SHAP value of two features, i.e., Cr with Si, Mn, Ni and V, respectively. E: Predicted values of Xgboost model versus experimental values with the transferred four features. F: Pure SHAP analysis of oxidation Cr equivalent concentration feature.

Hereafter, we use one feature of the oxidation Cr equivalent concentration to replace the five element features. Thus, the total number of features is reduced to four, one chemical composition feature and three testing condition features. With the four features, the Xgboost model is retrained with 10-CV. The predictions on the 178 data are plotted in Figure 4E, showing a perfect fitting with

### Activation energy and time exponents

The oxidation mechanism of FM steels in SCW is embodied in the activation energy and time exponent ^{[29, 46]}, which are the coefficients of

The TCLR tree must be pruned, otherwise, many leaves contain only two data per leaf, which destroys the model generalization considerably. A threshold of data number might also be introduced to prune the tree, i.e., the amount of data in a leaf should not be smaller than a pre-set threshold (default minsize

The entire feature space is estimated from the regions of the four features, i.e., *vs.*

Figure 5. Data located on the leaf of TCLR. (A), (B) One passed leaf and one failed leaf on TCLR of activation energy. (C), (D) One passed leaf and one failed leaf on TCLR of time exponents.

Figure 5A shows one passed leaf for

The TCLR yields the values of activation energy Q and time exponent

The SISSO algorithm ^{[41]}, with minimization of mean absolute percentage error (MAPE, see Supplementary Information) is carried out to find analytic expressions of activation energy Q and time exponent

Note that

### Oxidation kinetic equations

As mentioned above, the pure SHAP values of the DOC feature suggest that the oxidation kinetic in logarithmic space yield in the form of

To have an analytic expression of

Putting all the analytic expressions together gives

The analytic formula of Eq. (5d) has strong predictive power, as shown in Figure 6, with a fitting performance of

Figure 6. Predicted values of Eq. 5(d) versus experimentally measured values. Each dot represents an FM sample, and the dots on the dotted line indicate the equation predicted values are consistent with experimental observations. The light blue region covers

## CONCLUDING REMARKS

In this study, we develop a domain knowledge-guided interpretive ML strategy and demonstrate it by the discovery of the generalized formula for FM steel oxidation in SCW. The domain knowledge suggests the generalized Arrhenius oxidation formula of

The oxidation chromium equivalent concentration

The developed TCLR algorithm is scientifically significant to materials informatics because it captures linear relationships between tasks and features. It is expected that when an original feature space is mapped to a high-dimensional space, the TCLR algorithm is able to capture linear relationships in the high-dimensional space. More efforts are needed to further develop the TCLR algorithm.
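The idea of partitioning data so that each partition shows a strong linear relationship can be conveyed with a single-split sketch. It is a simplified illustration loosely in the spirit of a tree classifier for linear regression, not the TCLR implementation released with this work. The splitting criterion (size-weighted absolute Pearson correlation), the function names, the `min_size` value and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def abs_pearson(x, y):
    """Absolute Pearson correlation; 0 if either variable is constant."""
    if np.std(x) == 0 or np.std(y) == 0:
        return 0.0
    return abs(float(np.corrcoef(x, y)[0, 1]))

def best_linear_split(split_feature, x, y, min_size=5):
    """Find the threshold on `split_feature` that maximizes the
    size-weighted |r| between x and y in the two child nodes.

    Returns (threshold, score); threshold is None if no split beats
    the correlation of the unsplit data.
    """
    order = np.argsort(split_feature)
    s, x, y = split_feature[order], x[order], y[order]
    n = len(y)
    best_score, best_threshold = abs_pearson(x, y), None
    # Each child must keep at least `min_size` data (a simple pruning rule).
    for k in range(min_size, n - min_size + 1):
        if s[k - 1] == s[k]:
            continue  # cannot separate identical feature values
        score = (k * abs_pearson(x[:k], y[:k])
                 + (n - k) * abs_pearson(x[k:], y[k:])) / n
        if score > best_score:
            best_score = score
            best_threshold = 0.5 * (s[k - 1] + s[k])
    return best_threshold, best_score

# Synthetic data: the slope of y versus x changes when s crosses 0.5.
rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 40)
x = rng.uniform(0.0, 1.0, size=40)
y = np.where(s < 0.5, 2.0 * x, 1.0 - 3.0 * x)

threshold, score = best_linear_split(s, x, y)
print(threshold, score)
```

Applying such a split recursively, and fitting a linear regression in every leaf that passes a correlation criterion, yields a slope per region of feature space, which is the kind of information a symbolic regression step can then assemble into an analytic expression.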

The generalized Arrhenius oxidation formula has very high prediction accuracy with a Pearson correlation coefficient

## DECLARATIONS

### Authors' contributions

Performed the research, analysed data, wrote the programs, and drafted the manuscript: Cao B

Collected the data: Yang S, Sun A

Supervised the project: Zhang TY, Dong Z

Studied and revised the manuscript more on the oxidation mechanism: Dong Z

Designed the study, performed the research, analysed data, revised, and finalized the manuscript: Zhang TY

Discussed the results: Cao B, Yang S, Sun A, Dong Z, Zhang TY

### Availability of data and materials

All experimental data collected in the study are contained in the Supplementary Information (FM steel oxidation dataset) and are also available at: https://github.com/Bin-Cao/TCLRmodel. The ML methodology described in the present work was implemented in Python. The source code of the programs and algorithms is available at: https://github.com/Bin-Cao/TCLRmodel.

### Financial support and sponsorship

This work was sponsored by the National Key Research and Development Program of China (No. 2018YFB0704400), Key Program of Science and Technology of Yunnan Province (No. 202002AB080001-2), Key Research Project of Zhejiang Laboratory (No. 2021PE0AC02), and Shanghai Pujiang Program (Grant No. 20PJ1403700).

### Conflicts of interest

We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

### Ethical approval and consent to participate

Not applicable.

### Consent for publication

Not applicable.

### Copyright

© The Author(s) 2022.

## REFERENCES

1. Wei QH, Xiong J, Sun S, Zhang T-Y. Multi-objective machine learning of four mechanical properties of steels. *Sci Sin -Tech* 2021;51:722-36.

2. Xiong J, Zhang T-Y, Shi S. Machine learning of mechanical properties of steels. *Sci China Technol Sci* 2020;63:1247-55.

3. Leitherer A, Ziletti A, Ghiringhelli LM. Robust recognition and exploratory analysis of crystal structures via Bayesian deep learning. *Nat Commun* 2021;12:6234.

4. Sun S, Ouyang R, Zhang B, Zhang T-Y. Data-driven discovery of formulas by symbolic regression. *MRS Bull* 2019;44:559-64.

5. Xiong J, Shi S, Zhang T-Y. Machine learning of phases and mechanical properties in complex concentrated alloys. *Journal of Materials Science & Technology* 2021;87:133-42.

6. Xie SR, Quan Y, Hire AC, et al. Machine learning of superconducting critical temperature from Eliashberg theory. *npj Comput Mater* 2022;8.

7. Levämäki H, Tasnádi F, Sangiovanni DG, Johnson LJS, Armiento R, Abrikosov IA. Predicting elastic properties of hard-coating alloys using ab-initio and machine learning methods. *npj Comput Mater* 2022;8.

8. Roy Chowdhury P, Ruan X. Unexpected thermal conductivity enhancement in aperiodic superlattices discovered using active machine learning. *npj Comput Mater* 2022;8.

9. Zhu YQ, Xu T, Wei Q, et al. Linear-superelastic Ti-Nb nanocomposite alloys with ultralow modulus via high-throughput phase-field design and machine learning. *npj Comput Mater* 2021;7.

10. Wang JH, Jia J, Sun S, Zhang T-Y. Statistical learning of small data with domain knowledge-sample size-and pre-notch length- dependent strength of concrete. *Engineering Fracture Mechanics* 2022;259:108160.

11. Xue D, Balachandran PV, Hogden J, Theiler J, Xue D, Lookman T. Accelerated search for materials with targeted properties by adaptive design. *Nat Commun* 2016;7:11241.

12. Fung V, Hu G, Ganesh P, Sumpter BG. Machine learned features from density of states for accurate adsorption energy prediction. *Nat Commun* 2021;12:88.

13. Attia PM, Grover A, Jin N, et al. Closed-loop optimization of fast-charging protocols for batteries with machine learning. *Nature* 2020;578:397-402.

14. Saito Y, Shin K, Terayama K, et al. Deep-learning-based quality filtering of mechanically exfoliated 2D crystals. *npj Comput Mater* 2019;5.

15. Li X, Zhao J, Cong J, et al. Machine learning guided automatic recognition of crystal boundaries in bainitic/martensitic alloy and relationship between boundary types and ductile-to-brittle transition behavior. *Journal of Materials Science & Technology* 2021;84:49-58.

16. Dai F, Wen B, Sun Y, Xiang H, Zhou Y. Theoretical prediction on thermal and mechanical properties of high entropy (Zr0.2Hf0.2Ti0.2Nb0.2Ta0.2)C by deep learning potential. *Journal of Materials Science & Technology* 2020;43:168-74.

18. Lookman T, Balachandran PV, Xue D, Yuan R. Active learning in materials science with emphasis on adaptive sampling using uncertainties for targeted design. *npj Comput Mater* 2019;5.

19. Wen C, Zhang Y, Wang C, et al. Machine learning assisted design of high entropy alloys with desired property. *Acta Materialia* 2019;170:109-17.

20. Balachandran PV, Kowalski B, Sehirlioglu A, Lookman T. Experimental search for high-temperature ferroelectric perovskites guided by two-step machine learning. *Nat Commun* 2018;9:1668.

21. Yan L, Diao Y, Lang Z, Gao K. Corrosion rate prediction and influencing factors evaluation of low-alloy steels in marine atmosphere using machine learning approach. *Sci Technol Adv Mater* 2020;21:359-70.

22. Jablonka KM, Jothiappan GM, Wang S, Smit B, Yoo B. Bias free multiobjective active learning for materials design and discovery. *Nat Commun* 2021;12:2312.

23. Garrido Torres JA, Gharakhanyan V, Artrith N, Eegholm TH, Urban A. Augmenting zero-Kelvin quantum mechanics with machine learning for the prediction of chemical reactions at high temperatures. *Nat Commun* 2021;12:7012.

24. Lu L, Meng X, Mao Z, Karniadakis GE. DeepXDE: A deep learning library for solving differential equations. *SIAM Rev* 2021;63:208-28.

25. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems; 2017. pp. 4768-77. Available from: https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html [Last accessed on 20 Apr 2022].

26. Schulenberg T, Leung LK, Oka Y. Review of R & D for supercritical water cooled reactors. *Progress in Nuclear Energy* 2014;77:282-99.

27. Zhong X, Wu X, Han E. Effects of exposure temperature and time on corrosion behavior of a ferritic-martensitic steel P92 in aerated supercritical water. *Corrosion Science* 2015;90:511-21.

28. Klueh R, Nelson A. Ferritic/martensitic steels for next-generation reactors. *Journal of Nuclear Materials* 2007;371:37-52.

29. Ampornrat P, Was GS. Oxidation of ferritic-martensitic alloys T91, HCM12A and HT-9 in supercritical water. *Journal of Nuclear Materials* 2007;371:1-17.

30. Li Y, Xu T, Wang S, et al. Modelling and Analysis of the Corrosion Characteristics of Ferritic-Martensitic Steels in Supercritical Water. *Materials (Basel)* 2019;12:409.

31. Tan L, Ren X, Allen T. Corrosion behavior of 9-12% Cr ferritic-martensitic steels in supercritical water. *Corrosion Science* 2010;52:1520-8.

32. Li H, Cao Q, Zhu Z. High temperature oxidation behavior of ferritic steel in supercritical water at 550-700 ℃. *Materials at High Temperatures* 2018;36:111-6.

33. Zhu Z, Xu H, Jiang D, Mao X, Zhang N. Influence of temperature on the oxidation behaviour of a ferritic-martensitic steel in supercritical water. *Corrosion Science* 2016;113:172-9.

34. Li Y, Wang S, Sun P, et al. Investigation on early formation and evolution of oxide scales on ferritic-martensitic steels in supercritical water. *Corrosion Science* 2018;135:136-46.

35. Liu Z. Corrosion behavior of designed ferritic-martensitic steels in supercritical water. Canada: ProQuest Dissertations Publishing; 2013.

36. Dong Z, Li M, Behnamian Y, et al. Effects of Si, Mn on the corrosion behavior of ferritic-martensitic steels in supercritical water (SCW) environments. *Corrosion Science* 2020;166:108432.

37. Sun L, Yan W. Estimation of oxidation kinetics and oxide scale void position of ferritic-martensitic steels in supercritical water. *Advances in Materials Science and Engineering* 2017;2017:1-12.

38. Bischoff J, Motta AT. Oxidation behavior of ferritic-martensitic and ODS steels in supercritical water. *Journal of Nuclear Materials* 2012;424:261-76.

39. Zhang N, Xu H, Li B, Bai Y, Liu D. Influence of the dissolved oxygen content on corrosion of the ferritic-martensitic steel P92 in supercritical water. *Corrosion Science* 2012;56:123-8.

40. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016.

41. Ouyang R, Curtarolo S, Ahmetcik E, Scheffler M, Ghiringhelli LM. SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. *Phys Rev Materials* 2018;2.

42. Zhang TY, Cao B, Zhang SY, Sun S. Tree-classifier for linear regression software [No. 2021SR1951267], 2021. Available from: https://register.ccopyright.com.cn/ [Last accessed on 20 Apr 2022].

43. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. *J Mach Learn Res* 2011;12:2825-30. Available from: https://www.jmlr.org/papers/volume12/pedregosa11a/pedregosa11a.pdf?ref=https://githubhelp.com [Last accessed on 20 Apr 2022].

44. Lundberg SM, Erion GG, Lee SI. Consistent individualized feature attribution for tree ensembles. *arXiv* 2018; arXiv:1802.03888.

45. Uusitalo M, Vuoristo P, Mäntylä T. High temperature corrosion of coatings and boiler steels below chlorine-containing salt deposits. *Corrosion Science* 2004;46:1311-31.

## Cite This Article

**OAE Style**

Cao B, Yang S, Sun A, Dong Z, Zhang TY. Domain knowledge-guided interpretive machine learning: formula discovery for the oxidation behavior of ferritic-martensitic steels in supercritical water. *J Mater Inf* 2022;2:4. http://dx.doi.org/10.20517/jmi.2022.04

**AMA Style**

Cao B, Yang S, Sun A, Dong Z, Zhang TY. Domain knowledge-guided interpretive machine learning: formula discovery for the oxidation behavior of ferritic-martensitic steels in supercritical water. *Journal of Materials Informatics*. 2022; 2(2): 4. http://dx.doi.org/10.20517/jmi.2022.04

**Chicago/Turabian Style**

Cao, Bin, Shuang Yang, Ankang Sun, Ziqiang Dong, Tong-Yi Zhang. 2022. "Domain knowledge-guided interpretive machine learning: formula discovery for the oxidation behavior of ferritic-martensitic steels in supercritical water" *Journal of Materials Informatics*. 2, no.2: 4. http://dx.doi.org/10.20517/jmi.2022.04

**ACS Style**

Cao, B.; Yang S.; Sun A.; Dong Z.; Zhang T.Y. Domain knowledge-guided interpretive machine learning: formula discovery for the oxidation behavior of ferritic-martensitic steels in supercritical water. *J. Mater. Inf.* **2022**, *2*, 4. http://dx.doi.org/10.20517/jmi.2022.04

## About This Article

### Copyright

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
