Open Access  |  Research Article
J Mater Inf 2023;3:23. 10.20517/jmi.2023.31 © The Author(s) 2023.

Physics infused machine learning force fields for 2D materials monolayers

Views: 234 |  Downloads: 20 |  Cited:  0

State Key Laboratory for Mechanical Behavior of Materials, Xi’an Jiaotong University, Xi’an 710049, Shaanxi, China.

*Correspondence to: Prof. Hongxiang Zong, State Key Laboratory for Mechanical Behavior of Materials, Xi’an Jiaotong University, No.28 Xianning West Road, Xi’an 710049, Shaanxi, China. E-mail:

© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (, which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


Large-scale atomistic simulations of two-dimensional (2D) materials rely on highly accurate and efficient force fields. Here, we present a physics-infused machine learning framework that enables the efficient development and interpretability of interatomic interaction models for 2D materials. By considering the characteristics of chemical bonds and structural topology, we have devised a set of efficient descriptors. This enables accurate force field training using a small dataset. The machine learning force fields show great success in describing the phase transformation and domain switching behaviors of monolayer Group IV monochalcogenides, e.g., GeSe and PbTe. Notably, this type of force field can be readily extended to other non-transition 2D systems, such as hexagonal boron nitride (hBN), leveraging their structural similarity. Our work provides a straightforward but accurate extension of simulation time and length scales for 2D materials.


2D materials, mechanical properties, machine learning force fields, structural evolution


Two-dimensional (2D) materials have attracted tremendous research interest in the past decades because of their various physical properties[1-7]. As an important member, 2D ferroic materials show great potential applications in emerging 2D sensors and actuators, and their functionalities are attributed to reversible domain switching and phase transformation triggered by external fields[8-11]. However, the kinetics of ferroic-order change in 2D ferroic materials remain largely elusive, making them hard to control in devices. To address this issue, one needs not only preparation of high-quality 2D material samples but also characterization of high temporal and spatial resolution, both of which are challenging for experimental studies[8,12].

Atomic simulations are alternative effective methods. However, there is a trade-off between the accuracy and efficiency of calculations. On one hand, ab initio molecular dynamics (AIMD) simulations are ideal tools for accurately studying the dynamic behaviors of 2D ferroic materials in response to external stimuli[13-16]; however, the limited size of supercells and simulation time makes AIMD simulations unable to capture the rare events that are critical for ferroic phase transformation processes, e.g., the nucleation of new domain/phase. On the other hand, molecular dynamics (MD) provides a more efficient approach to studying the kinetics of 2D ferroic materials. Based on a semiempirical force field, MD simulations can reproduce the microstructural evolution within 2D materials at the time scale of several nanoseconds. Nevertheless, there remains a lack of high-fidelity force fields for ferroic 2D systems. For example, there is still not a satisfactory force field that can capture the strongly coupled ferroelastic-ferroelectric orders in monolayer Group IV monochalcogenides[16].

Recently, machine learning (ML) force fields emerged as a new type of force field that can capture the interatomic interactions with quantum mechanical precision[17-20], opening new ways for MD simulations of 2D ferroic materials. Starting from high throughput Density functional theory (DFT) calculations, ML algorithms can help map the local atomistic structures into the corresponding energy and force. Consequently, the force fields generated via ML methods share the accuracy of DFT calculations and the efficiency of semiempirical force fields simultaneously. Nevertheless, most ML force fields, so far, are developed based on the strategy of “data-driven”, i.e., directly learned from massive DFT data of various atomic configurations[18]. Preparing such a training dataset is computationally demanding. More importantly, such data-driven force fields usually lack physics information and interpretability and are quite different from classical force fields that depend on the physical and chemical models chosen to describe interatomic bonding.

Here, we present a physics-infused ML framework that enables the efficient development and interpretability of interatomic interaction models for 2D materials. By integrating the physical model of interatomic bonding into the Gaussian approximation potential, we successfully construct a series of ML force fields for monolayer GeSe and PbTe. For each case, only less than ~10,000 DFT is required to achieve a highly accurate force field, which can well capture the ferroic phase transformation and domain switching processes. Strikingly, we find the force field model can be easily transferred to other non-phase transition 2D materials, e.g., hexagonal boron nitride (hBN), which have similar topologies and chemical bonding characters with GeSe and PbTe. Finally, we show that the transferred model successfully describes the fracture behavior of monolayer hBN.


There are two basic assumptions to construct a ML force field[17,21] in order to map the structure and its corresponding energy. First, atomistic potential energy relates only to its local configuration, and second, the total energy of a system equals the sum of single atomistic energies. In general, there are four important steps, namely the generation of a database by DFT calculations, featurization, ML regression, and benchmarks, to train a ML force field [Figure 1]. They will be discussed in detail in the following sections.

Physics infused machine learning force fields for 2D materials monolayers

Figure 1. Strategy to generate a ML force field. DFT: Density functional theory; MD: molecular dynamics.

Generation of the training database

AIMD simulations were performed at varying temperatures with different strains for monolayer GeSe and PbTe [Table 1]. The high-temperature configurations are used for sampling structures that are outside of local equilibrium states, while the medium- and low-temperature configurations are important to capture the potential energy surface near the equilibrium state. Besides, DFT references related to phase transformation and ferroelastic domain switching are also used to expand the training database. Here, we select 101 configurations during the ferroelastic domain switching for the GeSe dataset and 404 configurations of stress-induced ferroic phase transition for PbTe.

Table 1

Generation of DFT datasets of monolayer GeSe and PbTe

Materials Selected temperatures Strain range Additional configurations Total number
GeSe 30 K, 50 K, 100 K, 300 K, 500 K, 800 K -10%~10%
(uniaxial and biaxial)
101 configurations of the pathway of ferroelastic transition 11,893
PbTe 50 K, 100 K, 300 K, 500 K, 800 K -10%~10%
(uniaxial and biaxial)
404 configurations of the pathway of phase change 10,148

All the referenced configurations are calculated based on first-principles DFT[22,23] via the Vienna ab initio simulation package (VASP)[24,25] by using a Perdew-Burke-Ernzerhof (PBE)[26] form of exchange-correlation function with the generalized gradient approximation (GGA)[27,28]. The models of GeSe and PbTe contain 64 atoms. For a given temperature and strain state, we performed an AIMD simulation over 6,000 steps with a timestep of 2 fs. The plane wave-basis cutoff energy is 300 eV for both GeSe and PbTe. A dense k-point mesh of 3 × 3 × 1 is used to determine accurate forces and energies.


Coding atomic configurations into feature space needs to satisfy the requirements of continuity and symmetry[29]. Besides, bonding information between different elemental pairs should also be considered independently. For a binary system, we considered three kinds of pairwise interactions and eight kinds of three-body interactions [Table 2].

Table 2

Interaction groups in monolayer GeSe and PbTe

Materials Pairwise interaction Three-body interaction
GeSe Ge-Ge Ge-Se Se-Se Ge-Ge-Ge Ge-Se-Ge Ge-Ge-Se Ge-Se-Se Se-Ge-Ge Se-Se-Ge Se-Ge-Se Se-Se-Se
PbTe Pb-Pb Pb-Te Te-Te Pb-Pb-Pb Pb-Te-Pb Pb-Pb-Te Pb-Te-Te Te-Pb-Pb Te-Te-Pb Te-Pb-Te Te-Te-Te

Pairwise interactions are related to the nature of covalent bonds; i.e., covalent bonding is a short-range interaction, and the strength of bonding decreases sharply above a critical bond length. Pairwise interactions relate to the bond length between two atoms, i and j. It can be easily captured by the Fourier series of a step function. Therefore, after distinguishing the interacting atomic types, pairwise descriptors can be expressed as:

$$ V_{i}^{2 b}=\sum_{j \neq i} \cos \left(k \cdot r_{i j}\right) e^{-\frac{r_{i j}}{\eta}} \cdot f_{c}\left(r_{i j}\right) $$

where rij is the bond length between atoms i and j. fc(rij) = 0.5[1 + cos(πrij/Rc)] is a damping function for atoms within the cutoff distance Rc, and it is zero elsewhere. k and η are related to the damping of an exponential function. Eight k and η values [Table 3] are used to construct the descriptor vector for the pairwise descriptor; that is, $$ V_{i}^{2 b}=\left\{v_{i}^{2 b}\left(k_{0}, \eta_{0}\right), \ldots, v_{i}^{2 b}\left(k_{7}, \eta_{7}\right)\right\} $$.

Table 3

Adjustable parameters for the machine learning model in this work

Number 0 1 2 3 4 5 6 7
η 1.0 1.329 1.766 2.347 3.120 4.146 5.510 7.333
k 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
η3 1.0 1.766 3.120 5.510 - - - -
f(cosθjim) cosθjim3 cosθjim2 - 4cosθjim 1 - 4/3cosθjim2

Three-body interactions relate to the bond angle and bond length among atoms i, j, and k. Descriptors can be expressed as:

$$ V_{i}^{3 b}=\sum_{j, m \neq i} f\left(\cos \theta_{j i m}\right) \cdot e^{-\left(\frac{r_{i j}^{2}+r_{i m}^{2}}{\eta_{3}^{2}}\right)} \cdot f_{c}\left(r_{i j}\right) \cdot f_{c}\left(r_{i m}\right) $$

where rij and rik represent the bond length from atom j and m to atom i, respectively. We use four different η3 [Table 3]. θjik is the angle between atoms i, j, and m centered on atom i. We choose three different forms of f(cosθjik) in this work [Table 3]. These parameters should be not far from the cutoff distance. A set of too large or too small η and k can greatly reduce the model accuracy. As shown in Supplementary Table 1, an order of magnitude greater or smaller than our parameters will lead to a poor fitting performance.

Kernel ridge regression

The ML force field is based on the formulation by Botu and Ramprasad[18]. Kernel ridge regression (KRR)[30,31], one of the Gaussian process-type ML approach[32,33], maps the structural descriptors and the total energy. A linear kernel function, K(Vinput, Vref), is used to measure the similarity between the input (Vinput) and the referenced structural descriptors (Vinput). We can express the target energy E as:

$$ E=\sum_{t} \omega_{t} K\left(V_{\text {input }}, V_{r e f}^{t}\right)+\text { const } $$

where ωt denotes the weighting coefficient during fitting. t labels each referenced atomic environment and $$ V_{r e f}^{t} $$ is its corresponding descriptor. All the descriptors are normalized to [0, 1] before the training is produced. The ML fitting is performed by an open-source Python code named scikit-learn[34]. In our work, we do not include atomic forces during the training process. However, the atomic forces are used in the model validation. The root mean square errors (RMSE) of atomic forces for monolayer GeSe, PbTe, and hBN are 0.09, 0.06, and 0.10 eV/ Å, respectively. The small errors justify the accuracy of our force fields. More importantly, for a kernel-function-based ML potential model, incorporating atomic forces into the training dataset could potentially lead to overfitting, resulting in poor generalization ability.


To evaluate the quality of the ML interatomic force fields for monolayers GeSe and PbTe, we first check the errors, e.g., RMSE and mean absolute error (MAE) in potential energy [Figure 2]. Predicted potential energy by ML interatomic force fields agrees well with the referenced DFT calculations. Small errors (< 2 meV/atom in RMSE) and large R2 (> 0.99) indicate a good performance of our force fields.

Physics infused machine learning force fields for 2D materials monolayers

Figure 2. Comparison of the predicted atomistic energy and the true values in monolayer (A) GeSe and (B) PbTe. The training dataset contains 11,893 and 10,148 configurations for GeSe and PbTe, respectively. DFT: Density functional theory; MAE: mean absolute error; RMSE: root mean square errors.

Then, we checked the quality of the predicted atomistic force. Atomistic force can be easily calculated by differentiating the expression of total energy E (Equation 3) due to the selection of linear kernel function. The RMSE and MAE of atomic force for monolayer GeSe and PbTe are shown in Figure 3. The small value of RMSE (< 0.1 eV/Å) and the R2 indicate the good agreement in force prediction. Thus, we believe that the ML force fields can well describe microstructural evolution at finite temperatures and external straining.

Physics infused machine learning force fields for 2D materials monolayers

Figure 3. Comparison of the predicted atomistic force and the DFT counterpart of monolayer (A) GeSe and (B) PbTe. The testing dataset contains ~0.86 million and ~1.5 million points for GeSe and PbTe, respectively. We calculate the RMSE, MAE, and R2 of the ML model for the testing data, and only 10% of the data are shown in the plots. DFT: Density functional theory; MAE: mean absolute error; ML: machine learning; RMSE: root mean square errors.

Accurate prediction in energy and force results in the reliable prediction of phonons [Figure 4] and the elastic constant [Table 4]. Figure 4 shows a reasonable agreement in the phonon spectrum between ML prediction and DFT calculations, especially for the acoustic branches. All the above benchmarks indicate that the ML force fields have a good predicting ability for monolayer GeSe and PbTe. We do more critical benchmarks by comparing the defect formation energy from our ML force fields with DFT calculations [Supplementary Table 2]. Actually, these two force fields can also describe the phase transformation/domain switching behaviors, which will be shown in Section 3.

Physics infused machine learning force fields for 2D materials monolayers

Figure 4. Comparison of the phonons between the predicted and the DFT calculations for monolayer (A) GeSe and (B) PbTe. DFT: Density functional theory; ML: machine learning.

Table 4

Comparison between the ML force fields and DFT Calculations on elastic constants

Materials Predicted DFT
GeSe C11 (N/m) 48.49 47.08
C22 (N/m) 11.72 18.01
C12 (N/m) 12.66 18.37
PbTe C11 (N/m) 41.36 40.24
C22 (N/m) 40.67 40.24
C12 (N/m) 23.82 21.07

Details of MD simulation

Stress-induced ferroic domain switching in monolayer GeSe and stress-induced ferroic phase transition in monolayer PbTe are performed by the learned ML force fields. The technical details of these MD simulations are given here.

A periodic boundary condition is applied along the x and y directions. All models are relaxed at 50 K in an isothermal-isobaric ensemble for 40 pico-seconds before any perturbations are applied to it. Averaged stress along both x and y directions remains zero after the relaxation. An isothermal-isobaric ensemble is used with the help of the Nosé-Hoover thermostat[35,36] and the Parrinello-Rahman barostat[37]. Tensile deformation of GeSe and PbTe along the x direction was applied with a strain rate of 5×108 per second at 50 K. The corresponding tensile stress was calculated by the total force divided by the effective thickness. The effective thicknesses for GeSe and PbTe are 9.61 and 9.41 Å, respectively. All MD simulations are performed by using the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) code[38]. Atomic configurations are visualized by AtomEye[39].


We then apply the ML force fields to different cases, i.e., temperature/stress-induced ferroic phase transformation/domain switching in monolayer GeSe and PbTe.

Temperature-induced ferroic phase transformation in monolayer GeSe

Figure 5 shows the potential energy as a function of temperature upon a heating-cooling cycle in a monolayer GeSe. The discontinuity in potential energy upon heating (cooling) indicates the phase transformation, in which the long-range spontaneous ferroic orders (strain or polarization) disappear (appear). The insets in Figure 5 schematically illustrate the typical domain structure in our heating-cooling cycle. Starting from a single-domain ferroelastic phase with a spontaneous strain along the y direction, the model transforms into a para-phase, where the long-range spontaneous strain and polarization disappear when the temperature is higher than ~310 K. The para-phase GeSe will transform back into the ferroic one upon cooling, while the transition temperature is ~260 K. After cooling down, the monolayer GeSe shows a multi-domain structure of the low temperature phase. All the detailed microstructure evolution can be accessed from our previous works[40,41].

Physics infused machine learning force fields for 2D materials monolayers

Figure 5. Temperature-induced phase transformation in monolayer GeSe.

Stress-induced ferroic domain switching in monolayer GeSe

The switching between different domain variants in monolayer GeSe can also be reproduced by MD simulations of its response to external stress or electrical fields. Figure 6 shows a tensile deformation-induced ferroic domain switching in monolayer GeSe at 50 K. Figure 6A shows the evolution of tensile stress as a function of tensile strain (ε) in monolayer GeSe at 50 K. Starting from an undeformed structure, the sample undergoes an elastic deformation and then yields at ε = 4% with a yielding strength of ~1.2 GPa. After yielding, the stress drops down to ~0.3 GPa. Then, it increases linearly again until a second drop occurs at ε = 10.0%. The stress drops down to 0.7 GPa at ε = 11.8%, which is followed by a third linear increase at ε = 14.0%.

Physics infused machine learning force fields for 2D materials monolayers

Figure 6. Stress-induced domain switching in monolayer GeSe. (A) Tensile stress evolution as a function of tensile strain; (B)-(G) Typical atomistic configurations of GeSe upon tensile-induced domain switching. Here, we use different colors to indicate the ferroic orders, where the magenta means domains with (ηy, -Py), cyan means domains with (ηx, -Px), and red means domains with (ηy, Py).

Figure 6B-G shows the typical atomistic configurations upon tension, where we use an arrow, which starts from a Ge atom and ends in an adjacent Se atom, to represent the Ge-Se pairs. We colored the arrows based on their directions, which could help to identify different domains. Figure 6B shows the undeformed monolayer GeSe, in which all the ferroic orders are along the same direction (the spontaneous strain η along y direction and the spontaneous polarization P is along -y direction, marked as (ηy, -Py) and magenta arrows). At the yielding point (ε = 4.0%), a new domain with ferroic order (ηx, -Px) nucleates (cyan arrows in Figure 6C) and then grows up fast [Figure 6D]. A stripped pattern forms in the sample and keeps stable until ε = 10.0%, when some secondary domains (ηx, Px) nucleate (red regions in Figure 6E) with the secondary drop of the tensile stress [Figure 6A]. However, in this stage, the secondary domains are not stable. Instead of the secondary domains, the primary domain (ηx, -Px) grows up at the end of this stage (ε = 11.8%) [Figure 6F]. The final drop of the tensile stress at a highly deformed state (ε = 14.0%) is induced by the re-appearance of secondary domains [Figure 6G].

The 2D domain switching is also realized by a motion of the domain boundary and kinks inside the boundary. Therefore, the 2D domain switch can be resolved into one-dimensional manners, e.g., the domain boundary motion, and zero-dimension manners, e.g., the kinks inside the domain boundary. Hierarchical domain structure forms with a high domain boundary density, which shows the great potential of 2D ferroics in domain boundary engineering.

Tensile deformation induced ferroic phase transformation in PbTe

Figure 7A shows the strain-stress curve of monolayer PbTe upon loading and unloading at 50 K. Based on the structure evolution upon loading and unloading, we divide the entire process into six ranges labeled as R1~R6. There are three typical phase structures named para-phase, dual-phases, and ferro-phase [Figure 7B-D] that appeared or disappeared in the loading-unloading loop. The para-phase is the equilibrium state for monolayer PbTe, where the centers of Pb atoms and Te atoms coincide [Figure 7B]. Thus, there are no spontaneous ferroic orders. However, the ferroic phase appears under some specific external fields, e.g., tensile stress. Spontaneous strain and polarization appear in the ferroic phase [Figure 7D]. Dual phases are an intermediated state during the phase transformation [Figure 7C].

Physics infused machine learning force fields for 2D materials monolayers

Figure 7. Stress-induced phase transformation in monolayer PbTe. (A) Tensile stress-strain curve upon loading and unloading; (B)-(D) Typical phase structures of PbTe upon loading and unloading.

Starting from a perfect structure, the monolayer PbTe undergoes an elastic deformation in R1 (0.0 < ε < 0.07), in which the structure remains in a para-phase [Figure 7B]. Yielding occurs at ε = 0.07, with a yielding strength of ~1.5 GPa. The yielding is induced by the nucleation of the ferroic phase. The ferroic phase grows up while the para-phase shrinks via the motion of phase boundaries in R2 (0.07 < ε < 0.16). The tensile stress remains in a platform with large fluctuations, which is induced by the pinning and depinning of the phase boundary. This process is similar to the pinning-depinning of dislocation motion in metals. More details can be seen in Supplementary Figure 1 and Supplementary Movie 1. All the para-phase is transformed into the ferroic phase at the end of R2. The elastic deformation of the ferroic phase in R3 (0.16 < ε < 0.25) induces a further increase of the tensile stress [Figure 7A].

Upon loading in R4 (0.25 > ε > 0.12), the elastic energy stored in the ferroic phase is first released, while the tensile stress decreased from 1.80 to < 1.0 GPa. Then, the reverse phase transformation occurs and remains in R5 (0.12 > ε > 0.05), after which the system recovers back to the para-phase. Further unloading releases the elastic energy of para-phase and induces a full recovery in R6 (0.05 > ε > 0.0). The monolayer PbTe sustains a recoverable strain as large as 0.25 with low hysteresis.

Physics-infused descriptor designing for monolayer GeSe and PbTe

Phase transformation and domain switching in 2D ferroic materials are accompanied by the repetition of bonding and debonding. Different from that in metallic materials, covalent bonding is a short-range interaction, i.e., there exists a sudden drop (increment) of bond strength or bonding energy as a function of distance (red line in Figure 8A). Here, we select an oscillated attenuation function to describe the pairwise interaction [Figure 8B] rather than a Gaussian function[17-19], which is widely used in fingerprinting pairwise interaction of metals. As shown in Equation 1, η controls the decay rate, and k affects the oscillation period of $$ V_{i}^{2 b} $$. Here, we choose eight different (η, k) pairs [Table 3] to represent the structural characters, and their combinations make the descriptors suitable for describing the covalent bonding process.

Physics infused machine learning force fields for 2D materials monolayers

Figure 8. Oscillated attenuation features to describe the covalent bonding or debonding. (A) Schematic for the differences between covalent and metallic bonds; (B) Evolution of $$ V_{i}^{2 b} $$ with bond length rij.

As illustrated in Figure 8A, the interactions between two atoms are rather weak once the bond breaks. Often, the effective distance of covalent bonding interactions is slightly larger than the equilibrium bond length of the first neighbors (a0). This suggests a much smaller cutoff for covalent systems than metals. Figure 9 shows the evolution of the RMSE and R2 as functions of cutoffs in monolayer GeSe and PbTe. By normalizing Rc by a0, a similar trend appears. Once the cutoff is larger than the second nearest neighbor (e.g., > 1.5a0), the RMSE drops into a value smaller than 2 meV/atom, and the R2 rises towards a value of 1. This encourages us to select cutoff values tailored to specific materials.

Physics infused machine learning force fields for 2D materials monolayers

Figure 9. Fitting quality evolution as a function of Rc.

Structural similarity of monolayer GeSe and PbTe

Besides the similar bonding information in monolayer GeSe and PbTe, their structural similarity also holds on for the good performance of the descriptors. As shown in Figure 10, monolayer GeSe and PbTe have the same structural topology. Taking GeSe as an example, Ge and Se atoms occupy different sites with a zigzag structure but different planes [Figure 10A]. We use four dihedral angles, i.e., θ1, θ2, θ3, and θ4, to describe the relationship between these atomic planes. The unit cell of monolayer GeSe can be seen in Figure 10B. All the dihedral angles are within the range of between 90° and 180°, leading to the eccentricity between Ge atoms and Se atoms, which is the origin of spontaneous polarization. Changing all the dihedral angles to 90° gives rise to the typical structure of PbTe without spontaneous polarization [Figure 10C]. At the other extreme, i.e., all four dihedral angles are equal to 180°, all atoms lie within the same plane [Figure 10D]. It is the situation of monolayer hBN. Therefore, we infer that the descriptors should also work for developing the force field of hBN, the validation of which will be presented in the subsequent sections.

Physics infused machine learning force fields for 2D materials monolayers

Figure 10. Structure similarity of monolayer GeSe, PbTe, and hBN. (A) Atomic structure of monolayer GeSe; (B) Basic structure of monolayer GeSe with all dihedral angles in the range between 90° and 180°; (C) Basic structure of monolayer PbTe with all dihedral angles equal to 90°; (D) Basic structure of monolayer hBN with all dihedral angles equal to 180°. hBN: Hexagonal boron nitride; 2D: two-dimensional.

Structural similarity can also be manifested through the distribution of bond length in GeSe, PbTe, and hBN.Figure 11 shows the distribution of normalized distance between atomic pairs in monolayer GeSe and PbTe. Considering the atomic types, we use X to represent Ge/Pb/B atoms and Y to represent Se/Te/N atoms. The distances between atomic pairs are normalized by their respective nearest bond lengths a0. These distributions in GeSe, PbTe, and hBN share the same shape, although the monolayer GeSe has the widest spectrum (cyan points). The more scattered distribution of bond length in GeSe originates from the slightly shifted 90° bond angle.

Physics infused machine learning force fields for 2D materials monolayers

Figure 11. Distribution of bond length in monolayer GeSe, PbTe, and hBN. hBN: Hexagonal boron nitride.

ML force field describing the fracture of hBN

The descriptors were then transferred into the force field model of monolayer hBN. As expected, the performance of ML force fields is good. The RMSEs in energy and force prediction are only 0.84 meV/atom and 0.1 eV/Å, respectively [Figure 12A and B]. The phonon spectrum in Figure 12C also shows great agreement with the DFT data. With this force field, we run MD simulations to study the fracture behavior of monolayer hBN.

Physics infused machine learning force fields for 2D materials monolayers

Figure 12. Benchmarks of ML force fields for monolayer hBN in the aspect of (A) energy; (B) atomistic force; and (C) phonon spectrum. DFT: Density functional theory; hBN: hexagonal boron nitride; MAE: mean absolute error; RMSE: root mean square errors; 2D: two-dimensional.

Firstly, a molecular static simulation was applied to the tension deformation via energy minimization. We use the same model for the DFT calculations, which contains 60 atoms with a dimension of 1.25 nm × 1.30 nm. The model is stretched along the armchair [Figure 13A] and zigzag directions [Figure 13B], respectively. The good agreement in the stress-strain curve for the ML prediction and the DFT calculations indicates the high reliability of our physics-infused ML models.

Physics infused machine learning force fields for 2D materials monolayers

Figure 13. Tensile deformation induced fracture in monolayer hBN. (A) and (B) Comparisons of the stress-strain curve upon static tensile loading along the armchair and zigzag direction, respectively; (C) and (D) Size effects on mechanical response along the armchair and zigzag direction, respectively. DFT: Density functional theory; hBN: hexagonal boron nitride; ML: machine learning.

Figure 13C and D shows the tensile strain-stress curves along the armchair and zigzag directions at 50 K, respectively. In each direction, we studied the tensile response of different monolayer hBN supercells with periodic boundary conditions. The size of the supercells varies from 1.25 nm × 1.30 nm to 12.5 nm × 13.0 nm. We find that in the near-equilibrium state, there is almost no size effect. The elastic modulus is ~710 GPa in the armchair direction while ~798 GPa in the zigzag direction. However, the fracture strain and strength exhibit a size-dependent behavior. We find the mechanical response converges when the sample size is larger than 5.2 nm × 5.0 nm. The converged fracture strain along the armchair direction is ~0.18 with a tensile strength of 81 GPa, while ~0.21 with a tensile strength of 96 GPa when loading along the zigzag direction. All these results show good agreement with previous simulations and experiments[42,43].


In summary, we show a physics-infused ML force field with a small training dataset to describe the microstructural evolution of monolayer 2D materials. The developed force field is suitable for describing the atomic process of phase transformation, domain switching, and fracture, which are all related to a unified bonding/debonding process. The high transferability of the force fields originates from the strategy of physics-infused feature engineering, which integrated the nature of covalent bonds and structure topology. These force fields provide an efficient path for understanding the dynamics of 2D materials up to nanoseconds.


Authors’ contributions

Designed this project: Zong H

Finished all the calculations: Yang Y, Xu B

All the authors contributed to the writing of this paper.

Availability of data and materials

Not applicable.

Financial support and sponsorship

This work was supported by the National Key Research and Development Program of China (No. 2022YFB3707600), the National Natural Science Foundation of China (12104355, 51320105014, 51871177, and 51931004), and China Postdoctoral Science Foundation (No. 2022M722508, No. 2020M673385).

Conflicts of interest

All authors declared that there are no conflicts of interest.

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.


© The Author(s) 2023.

Supplementary Materials


1. Saito Y, Nojima T, Iwasa Y. Highly crystalline 2D superconductors. Nat Rev Mater 2017;2:16094.

2. Wang QH, Kalantar-Zadeh K, Kis A, Coleman JN, Strano MS. Electronics and optoelectronics of two-dimensional transition metal dichalcogenides. Nat Nanotechnol 2012;7:699-712.

3. Balandin AA, Ghosh S, Bao W, et al. Superior thermal conductivity of single-layer graphene. Nano Lett 2008;8:902-7.

4. Zhao LD, Lo SH, Zhang Y, et al. Ultralow thermal conductivity and high thermoelectric figure of merit in SnSe crystals. Nature 2014;508:373-7.

5. Prashantha Kumar HG, Anthony Xavior M. Graphene reinforced metal matrix composite (GRMMC): a review. Procedia Eng 2014;97:1033-40.

6. Lee C, Li Q, Kalb W, et al. Frictional characteristics of atomically thin sheets. Science 2010;328:76-80.

7. Li S, Li Q, Carpick RW, et al. The evolving quality of frictional contact with graphene. Nature 2016;539:541-5.

8. Chang K, Liu J, Lin H, et al. Discovery of robust in-plane ferroelectricity in atomic-thick SnTe. Science 2016;353:274-8.

9. Fei Z, Zhao W, Palomaki TA, et al. Ferroelectric switching of a two-dimensional metal. Nature 2018;560:336-9.

10. Guan Z, Hu H, Shen X, et al. Recent progress in two-dimensional ferroelectric materials. Adv Elect Mater 2020;6:1900818.

11. Li W, Qian X, Li J. Phase transitions in 2D materials. Nat Rev Mater 2021;6:829-46.

12. Bao Y, Song P, Liu Y, et al. Gate-tunable in-plane ferroelectricity in few-layer SnS. Nano Lett 2019;19:5109-17.

13. Fei R, Li W, Li J, Yang L. Giant piezoelectricity of monolayer Group IV monochalcogenides: SnSe, SnS, GeSe, and GeS. Appl Phys Lett 2015;107:173104.

14. Fei R, Kang W, Yang L. Ferroelectricity and phase transitions in monolayer group-IV monochalcogenides. Phys Rev Lett 2016;117:097601.

15. Mehboudi M, Fregoso BM, Yang Y, et al. Structural phase transition and material properties of few-layer monochalcogenides. Phys Rev Lett 2016;117:246802.

16. Wang H, Qian X. Two-dimensional multiferroics in monolayer group IV monochalcogenides. 2D Mater 2017;4:015042.

17. Botu V, Batra R, Chapman J, Ramprasad R. Machine learning force fields: construction, validation, and outlook. J Phys Chem C 2017;121:511-22.

18. Botu V, Ramprasad R. Adaptive machine learning framework to accelerate ab initio molecular dynamics. Int J Quantum Chem 2015;115:1074-83.

19. Zong H, Pilania G, Ding X, Ackland GJ, Lookman T. Developing an interatomic potential for martensitic phase transformations in zirconium by machine learning. npj Comput Mater 2018;4:48.

20. Yang Y, Zhao L, Han CX, et al. Taking materials dynamics to new extremes using machine learning interatomic potentials. J Mater Inf 2021;1:10.

21. Botu V, Ramprasad R. Learning scheme to predict atomic forces and accelerate materials simulations. Phys Rev B 2015;92:094306.

22. Hohenberg P, Kohn W. Inhomogeneous electron gas. Phys Rev 1964;136:B864-71.

23. Kohn W, Sham LJ. Self-consistent equations including exchange and correlation effects. Phys Rev 1965;140:A1133-8.

24. Kresse G, Furthmüller J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput Mater Sci 1996;6:15-50.

25. Kresse G, Hafner J. Ab initio molecular dynamics for liquid metals. Phys Rev B Condens Matter 1993;47:558-61.

26. Perdew JP, Burke K, Ernzerhof M. Generalized gradient approximation made simple. Phys Rev Lett 1996;77:3865-8.

27. Becke AD. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys Rev A 1988;38:3098-100.

28. Langreth DC, Mehl MJ. Beyond the local-density approximation in calculations of ground-state electronic properties. Phys Rev B 1983;28:1809-34.

29. Huan TD, Batra R, Chapman J, Krishnan S, Chen L, Ramprasad R. A universal strategy for the creation of machine learning-based atomistic force fields. npj Comput Mater 2017;3:37.

30. Hofmann T, Schölkopf B, Smola AJ. Kernel methods in machine learning. Ann Statist 2008;36:1171-220.

31. Müller KR, Mika S, Rätsch G, Tsuda K, Schölkopf B. An introduction to kernel-based learning algorithms. IEEE Trans Neural Netw 2001;12:181-201.

32. Deringer VL, Csányi G. Machine learning based interatomic potential for amorphous carbon. Phys Rev B 2017;95:094203.

33. Seko A, Takahashi A, Tanaka I. First-principles interatomic potentials for ten elemental metals via compressed sensing. Phys Rev B 2015;92:054113.

34. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in python. J Mach Learn Res 2011;12:2825-30. Available from:[Last accessed on 9 Nov 2023]

35. Nosé S. A unified formulation of the constant temperature molecular dynamics methods. J Chem Phys 1984;81:511-9.

36. Hoover WG. Canonical dynamics: equilibrium phase-space distributions. Phys Rev A Gen Phys 1985;31:1695-7.

37. Parrinello M, Rahman A. Polymorphic transitions in single crystals: a new molecular dynamics method. J Appl Phys 1981;52:7182-90.

38. Plimpton S. Fast parallel algorithms for short-range molecular dynamics. J Comput Phys 1995;117:1-19.

39. Li J. AtomEye: an efficient atomistic configuration viewer. Modelling Simul Mater Sci Eng 2003;11:173.

40. Yang Y, Zong H, Ding X, Sun J. Size-dependent ferroic phase transformations in GeSe nanoribbons. Appl Phys Lett 2022;121:122903.

41. Yang Y, Zong H, Sun J, Ding X. Rippling ferroic phase transition and domain switching in 2D materials. Adv Mater 2021;33:2103469.

42. Falin A, Cai Q, Santos EJG, et al. Mechanical properties of atomically thin boron nitride and the role of interlayer interactions. Nat Commun 2017;8:15815.

43. Ahmed T, Zhang Z, Mcdermitt C, Hossain ZM. Strength and toughness anisotropy in hexagonal boron nitride: an atomistic picture. J Appl Phys 2018;124:185108.

Cite This Article

OAE Style

Yang Y, Xu B, Zong H. Physics infused machine learning force fields for 2D materials monolayers. J Mater Inf 2023;3:23.

AMA Style

Yang Y, Xu B, Zong H. Physics infused machine learning force fields for 2D materials monolayers. Journal of Materials Informatics. 2023; 3(4): 23.

Chicago/Turabian Style

Yang, Yang, Bo Xu, Hongxiang Zong. 2023. "Physics infused machine learning force fields for 2D materials monolayers" Journal of Materials Informatics. 3, no.4: 23.

ACS Style

Yang, Y.; Xu B.; Zong H. Physics infused machine learning force fields for 2D materials monolayers. J. Mater. Inf. 2023, 3, 23.



Open Access Review
Characterizing the wetting behavior of 2D materials: a review
Available online: 13 Sep 2023
Open Access Research Article
Composition- and temperature-dependence of β to ω phase transformation in Ti-Nb alloys
Available online: 5 Jul 2023
Open Access Review
Data-driven design of eutectic high entropy alloys
Available online: 27 Apr 2023
Open Access Research Article
Development of an accurate “composition-process-properties” dataset for SLMed Al-Si-(Mg) alloys and its application in alloy design
Available online: 27 Mar 2023
Open Access Research Article
Machine learning-accelerated first-principles predictions of the stability and mechanical properties of L12-strengthened cobalt-based superalloys
Available online: 19 Sep 2022
Open Access Research Article
Investigation of dual atom doped single-layer MoS2 for electrochemical reduction of carbon dioxide by first-principle calculations and machine-learning
Available online: 20 Nov 2023
Open Access Research Article
Regulating the electrocatalytic performance for nitrogen reduction reaction by tuning the N contents in Fe3@NxC20-x (x = 0~4): a DFT exploration
Available online: 2 Nov 2023
Open Access Review
Recent advances in the interface structure prediction for heteromaterial systems
Available online: 17 Oct 2023
Open Access Research Article
Understanding oxidation resistance of Pt-based alloys through computations of Ellingham diagrams with experimental verifications
Available online: 11 Oct 2023
Open Access Review
Characterizing the wetting behavior of 2D materials: a review
Available online: 13 Sep 2023


Comments must be written in English. Spam, offensive content, impersonation, and private information will not be permitted. If any comment is reported and identified as inappropriate content by OAE staff, the comment will be removed without notice. If you have any queries or need any help, please contact us at

Cite This Article 5 clicks
Commentary 0 comments
Like This Article 1 likes
Share This Article
Scan the QR code for reading!
See Updates
Hot Topics
machine learning |
Journal of Materials Informatics
ISSN 2770-372X (Online)
Follow Us


All published articles are preserved here permanently:


All published articles are preserved here permanently: