Machine learning for predictive design and optimization of high-performance thermoelectric materials: a review
Abstract
Thermoelectric materials enabling direct interconversion between thermal and electrical energy hold transformative potential for sustainable energy technologies, particularly in solid-state power generation and precision refrigeration systems. The pursuit of high-performance thermoelectric materials with exceptional energy conversion efficiency has remained a persistent challenge in materials science, primarily constrained by the resource-intensive nature of traditional experimental approaches and computationally demanding first-principles simulations. The emergence of machine learning (ML) techniques has revolutionized this field by enabling rapid screening of material candidates and establishing quantitative structure-property relationships. This comprehensive review systematically examines cutting-edge methodologies in ML-driven thermoelectric materials research, with particular emphasis on three pivotal aspects: (1) predictive modeling of key performance parameters including electrical conductivity, Seebeck coefficient, and lattice thermal conductivity through advanced feature engineering and algorithm selection; (2) inverse design strategies for optimizing carrier concentration and phonon scattering mechanisms; (3) application-specific material optimization frameworks integrating multi-objective constraints. Furthermore, we critically analyze prevailing challenges in data quality, model interpretability, and cross-scale prediction accuracy, while proposing future research directions encompassing active learning paradigms, generative adversarial networks for virtual material synthesis, and hybrid physics-informed ML architectures.
INTRODUCTION
Thermoelectric materials convert thermal energy into electrical energy, and vice versa, through the Seebeck and Peltier effects [1-8]. They are applied in waste heat recovery, solid-state cooling, portable power sources, remote sensor power supply, thermal management, space exploration, the automotive industry, medical equipment, environmental monitoring, and small-scale power generation [9-15]. Thermoelectric devices offer noiseless and vibration-free operation, compact design, high reliability, environmental friendliness, high efficiency, multifunctionality, ease of control, and low maintenance costs [16-24]. Typically, the performance of a thermoelectric material is measured by its dimensionless figure of merit, $$z T$$, which is formulated as follows:
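$$ z T=\frac{S^{2} \sigma T}{\kappa} $$
where S is the Seebeck coefficient[25-29]; σ is the electrical conductivity; T is the absolute temperature in Kelvin; and κ is the total thermal conductivity.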
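As a quick numerical illustration of the figure of merit, the sketch below evaluates the standard definition $$zT = S^{2}\sigma T/\kappa$$ in SI units. The function name and the input values are hypothetical, chosen only to show the order of magnitude; they are not taken from any study cited here.

```python
def figure_of_merit(seebeck_v_per_k, conductivity_s_per_m, temperature_k, kappa_w_per_mk):
    """Return zT = S^2 * sigma * T / kappa, with all inputs in SI units."""
    if kappa_w_per_mk <= 0:
        raise ValueError("thermal conductivity must be positive")
    power_factor = seebeck_v_per_k ** 2 * conductivity_s_per_m  # W/(m*K^2)
    return power_factor * temperature_k / kappa_w_per_mk


# Hypothetical inputs: S = 200 uV/K, sigma = 1e5 S/m, T = 300 K, kappa = 1.5 W/(m*K)
zt = figure_of_merit(200e-6, 1.0e5, 300.0, 1.5)
print(f"zT = {zt:.2f}")
```

With these illustrative inputs the result is about 0.8. Because S enters squared, doubling the Seebeck coefficient alone would quadruple zT, provided the other quantities stay fixed.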
Historically, experiments have been the primary means of investigating the performance of thermoelectric materials. Current experimental approaches to improving performance include optimizing carrier concentration, modulating band structure, facilitating multi-scale phonon scattering, employing defect engineering, optimizing lattice dynamics, conducting interface engineering, regulating electronic structure, designing thermoelectric modules, implementing dynamic atomic control, and exploring new materials, among other strategies[44-54]. The experimental methods and outcomes of these approaches are shown in Supplementary Figure 1. Experimental approaches are often costly and time-consuming, and they are prone to reproducibility problems, limited data availability, and human error. To address these challenges, theoretical calculations based on physical principles have been increasingly employed over the past few decades to simulate experimental conditions and to assess the thermoelectric performance of materials. These approaches include first-principles calculations, the Boltzmann transport equation, the Wiedemann-Franz law, molecular dynamics simulations, and the Monte Carlo method. Such techniques enable comprehensive prediction and optimization of the electronic structure, transport properties, and thermoelectric conversion efficiency of materials[55–59]. In recent years, the rapid advancement of machine learning (ML) has significantly impacted materials science, emerging as a critical driver for the discovery and optimization of new materials[60–62]. The ML pipeline for discovering novel thermoelectric materials is shown in Figure 1. The role of ML in materials exploration is increasingly prominent, and ML applications have already produced substantial results in several fields.
For example, ML is used to predict the mechanical properties of alloys, guiding lightweight design in the aviation and automotive industries[63, 64]. In the energy sector, ML optimizes the electronic structure of solar cell materials, improving energy conversion efficiency[64, 65]. In addition, intelligent algorithms predict and control the stability of perovskite materials, which advances the development of efficient optoelectronic devices[66–69]. Similarly, ML plays a crucial role in predicting the performance of thermoelectric materials. Using experimental and computational data, ML models can identify promising thermoelectric materials and thereby guide experimental and theoretical research. This approach improves research efficiency and accelerates the discovery and optimization of advanced thermoelectric materials. Figure 2 shows the number of articles published from 2014 to 2024 with "Thermoelectric" as a keyword and the number with "Thermoelectric-Machine Learning" as keywords. The number of articles exploring thermoelectric materials with ML methods has increased year by year over this period, indicating the growing importance of ML in the study of thermoelectric properties.
ML APPLICATION IN PREDICTING ELECTRICAL PROPERTIES
ML models have demonstrated accurate forecasting capabilities for critical thermoelectric parameters including electrical conductivity, Seebeck coefficient, and carrier concentration. Through analysis of comprehensive materials datasets, these computational approaches decode intricate structure-property relationships, facilitating targeted selection of compounds with optimized charge transport characteristics. These advancements not only streamline discovery pipelines but also enable rational design of next-generation thermoelectric devices. This review systematically summarizes recent methodological advancements in ML-driven property prediction and optimization, as tabulated in Table 1, highlighting their transformative potential in thermoelectric device engineering.
Table 1. Representative ML studies for electrical property prediction
Authors | Year | Samples | Features | Targets | Algorithms |
Miller et al.[70] | 2018 | 127 compounds | Structural parameters, DFT results, periodic properties | Carrier concentration (n) | Linear regression, RF, NN |
Wan et al.[79] | 2021 | 242 compounds (121 p/n-type) | Physical descriptors | Band gap (Eg) | LS-SVM, BP-ANN |
Antunes et al.[82] | 2023 | 47,737 compounds | Composition, DFT features | Seebeck coefficient (S), electrical conductivity (σ), PF | Attention model |
Furmanchuk et al.[84] | 2018 | 927 materials | Compositional features | Seebeck coefficient | RF |
Yuan et al.[85] | 2022 | 151 Heuslers (122 half/29 full) | Z, χ, Natoms | Seebeck coefficient | DNN, SISSO |
Gaultois et al.[87] | 2016 | 25k data points | Crystallographic data | Seebeck coefficient | RF |
Sheng et al.[88] | 2020 | 482 compounds | Elemental descriptors | PF | GBR, SVR, RF, KRR, AdaBoost |
Graziosi et al.[89] | 2022 | 3,000 band structures | Eg, Δm*, e-ph asymmetry | PF | GBR, ETR, MLP |
ML: Machine learning; DFT: density functional theory; RF: random forest; NN: neural network; LS-SVM: least squares support vector machine; BP-ANN: back propagation artificial neural network; PF: power factor; DNN: deep neural network; SISSO: sure independence screening and sparsifying operator; GBR: gradient boosting regressor; SVR: support vector regression; KRR: kernel ridge regression; ETR: extra trees regressor; MLP: multilayer perceptron.
Carrier concentration
The carrier concentration constitutes a critical parameter for thermoelectric material optimization, exerting significant influences on both electrical conductivity and the Seebeck coefficient. ML-assisted prediction of this parameter expedites the discovery and optimization of high-performance thermoelectric materials.
Experimental carrier concentration data for 127 compounds were systematically compiled from established literature sources [70]. Multidimensional feature engineering incorporated chemical composition descriptors, crystallographic parameters, and quantum mechanical calculations derived from authoritative databases including the Open Quantum Materials Database (OQMD) and Materials Project (MP) [71-75]. Through rigorous comparative analysis of linear regression, random forest, and NN architectures, the linear regression model demonstrated optimal balance between predictive accuracy [mean absolute error (MAE) = 1.19 via leave-one-out cross-validation] and interpretability [76-78]. Notably, feature importance analysis revealed substitution defects as predominant determinants modulating carrier concentration evolution toward intrinsic semiconductor behavior.
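The leave-one-out protocol behind the reported MAE can be sketched in a few lines. The example below is a minimal illustration, assuming a single descriptor and a closed-form least-squares line; the data values and the names `fit_line` and `loocv_mae` are hypothetical and do not reproduce the 127-compound dataset or the feature set of the cited study.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def loocv_mae(xs, ys):
    """Leave-one-out CV: refit on n-1 samples, score the held-out sample."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (intercept + slope * xs[i])))
    return mean(errors)


# Hypothetical toy data: one descriptor against a made-up target property.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
target = [18.1, 18.9, 20.2, 20.8, 22.1, 22.9]
mae = loocv_mae(descriptor, target)
print(f"LOOCV MAE = {mae:.3f}")
```

Leave-one-out validation is a natural choice for datasets of roughly a hundred compounds, since every sample serves once as the test case and none is permanently withheld from training.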
Band gap
The band gap emerges as another critical determinant in thermoelectric material optimization, fundamentally governing both charge carrier transport mechanisms and thermal management processes. ML-driven band gap prediction establishes an accelerated paradigm for developing high-efficiency thermoelectric systems.
A computational framework for band gap prediction was established through systematic feature engineering [79]. The workflow began with first-principles electronic structure data and used valence electron count, Pauling electronegativity, and relative atomic mass as foundational parameters to generate 242 material descriptors. Multivariate stepwise regression analysis identified five dominant features strongly correlated with the band gap. A subsequent evaluation of 19 ML algorithms identified the least squares support vector machine (LS-SVM) as the best predictor. This methodology enables rapid band gap estimation and provides theoretical guidance for rational material design. The corresponding ML workflow is schematically illustrated in Figure 3A.
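The descriptor-reduction step can be illustrated with a simplified forward selection. The sketch below greedily adds whichever descriptor lowers the residual sum of squares the most; classical stepwise regression also applies statistical entry and removal tests, which are omitted here for brevity. The descriptor names and all values are synthetic assumptions, not the features identified in [79].

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(columns, y):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_select(features, y, n_keep):
    """Greedily add the feature that reduces the RSS the most, n_keep times."""
    chosen = []
    while len(chosen) < n_keep:
        remaining = [name for name in features if name not in chosen]
        best = min(remaining,
                   key=lambda name: rss([features[c] for c in chosen + [name]], y))
        chosen.append(best)
    return chosen


# Synthetic descriptors (hypothetical names); the target depends on the first two only.
features = {
    "mean_atomic_mass": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "electronegativity_diff": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "valence_electron_count": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0],
}
target = [2.0 * m + 3.0 * e
          for m, e in zip(features["mean_atomic_mass"], features["electronegativity_diff"])]
selected = forward_select(features, target, n_keep=2)
print(selected)
```

On this synthetic target the procedure recovers the two informative descriptors and ignores the third. Real descriptor sets are usually correlated, so greedy selection is not guaranteed to find the best subset.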
Electrical conductivity
Electrical conductivity optimization presents a critical challenge in thermoelectric engineering, requiring a precise balance between an enhanced power factor (PF) and suppressed thermal conductivity [80]. Recent advances in deep learning show strong capability in addressing this trade-off through computational prediction of charge transport properties.
A breakthrough study employing attention-based neural networks achieved state-of-the-art conductivity prediction accuracy
Tiryaki et al. developed an iterative artificial neural network (ANN) framework for predicting thermoelectric material properties, comparing 26 ML models before selecting ANN as the optimal architecture due to its exceptional predictive accuracy (
Seebeck coefficient
The Seebeck coefficient (
Furmanchuk et al. developed ensemble learning models for half-Heusler (HH) compounds, achieving
PF
The PF (
An active learning framework integrating ML with automated DFT calculations was developed for
Complementary research on HH compounds demonstrated bipolar transport enhancement mechanisms [89]. Theoretical analysis of 3,000+ band structures established critical descriptors for ML guidance:
● Effective mass asymmetry (
● Electron-phonon scattering asymmetry
● Narrow band gap (
These parameters enable unconventional
The integration of ML into thermoelectric materials research has revolutionized the optimization of critical transport parameters, enabling unprecedented precision in predicting carrier concentration, band gap, electrical conductivity, Seebeck coefficient, and PF. By leveraging multidimensional feature engineering and advanced algorithms - from interpretable linear regression to attention-based neural networks - researchers have decoded complex structure-property relationships across diverse material systems. Key achievements include the prediction of carrier concentration with MAE = 1.19 via defect-sensitive models, band gap optimization through LS-SVM-driven descriptor selection, and electrical conductivity mapping with $$R^2$$ = 0.968 using 18.6 million data points. Ensemble learning and deep neural networks (DNNs) further demonstrated robust capabilities in Seebeck coefficient prediction ($$R^2$$ = 0.94–0.95) and PF enhancement through bipolar transport mechanisms. These methodologies, validated against high-throughput DFT calculations and experimental datasets, have accelerated discovery cycles by 40%–73%, while identifying novel candidates such as vacancy-engineered chalcogenides and asymmetric HHs. Future advancements will require tighter integration of generative inverse design, multi-scale modeling, and autonomous experimentation to overcome residual challenges in predicting ultrahigh $$z T$$ systems and bridging the accuracy gap between computational predictions and real-world performance. The convergence of physics-informed ML and robotic synthesis platforms promises to unlock the next generation of thermoelectric materials with tailored transport properties.
Discussion
In the optimization of thermoelectric material electrical properties, distinct ML models exhibit characteristic trade-offs between interpretability and predictive accuracy. Linear regression models achieve a balanced compromise, offering direct interpretability through feature weights that quantify physical contributions - such as the dominant role of substitutional defects in modulating carrier concentration - while maintaining a leave-one-out cross-validation MAE of 1.19 for carrier concentration prediction. This linear mapping allows straightforward attribution of property variations to specific chemical or structural descriptors, making it suitable for mechanistic insights. Tree-based ensemble models (e.g., random forest, LightGBM) enhance predictive accuracy for properties such as the Seebeck coefficient by capturing nonlinear feature interactions, yet their interpretability is limited to ranked feature importance scores. Post-hoc tools such as SHapley Additive exPlanations (SHAP) values are often required to disentangle complex dependencies, as the hierarchical decision structures of trees do not directly map to intuitive physical mechanisms. NNs, particularly attention-based architectures and iterative ANNs, achieve state-of-the-art accuracy by learning hierarchical representations from multidimensional data. However, their black-box nature obscures direct physical interpretation, necessitating indirect visualization of attention weights or cyclic validation protocols to infer composition-property relationships. LS-SVMs strike a middle ground in band gap prediction, leveraging stepwise regression to select five dominant descriptors from 242 candidates, thus balancing feature complexity with model transparency.
Regarding robust input features and feature engineering strategies, the robustness of ML models in thermoelectric research hinges on physically meaningful feature engineering. Fundamental atomic-scale descriptors - including valence electron count, Pauling electronegativity, and relative atomic mass - form the basis of predictive frameworks. For example, these parameters were used to generate 242 material descriptors in bandgap prediction, from which five key features (e.g., electronegativity gradient, average atomic mass) were identified via stepwise regression. Defect-related and electronic structure features further enhance model specificity; substitutional defects emerged as the primary determinant of carrier concentration in intrinsic semiconductors, while effective mass asymmetry and narrow bandgap were identified as critical for bipolar transport enhancement in PF optimization. Multidimensional feature fusion, integrating quantum mechanical calculations from databases such as OQMD and MP with experimental transport data, creates a rich input space for model training. Adaptive feature selection strategies - such as attention mechanisms to highlight composition-property correlations, active learning with Query-by-Committee for high-power-factor candidate discovery, and hierarchical feature construction from atomic parameters to physics-informed descriptors - further refine predictive power. These approaches collectively demonstrate that robust feature engineering, rooted in both theoretical priors and data-driven selection, is essential for decoding complex structure-property relationships in thermoelectric materials.
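The Query-by-Committee idea mentioned above can be sketched as follows: several models are trained on slightly different views of the labeled data, and the next candidate to evaluate is the one on which their predictions disagree most. In this minimal illustration the committee is built by leave-one-out resampling of a single-descriptor linear model, which is an assumption made for brevity; the cited work does not specify its committee in this text, and all names and values here are hypothetical.

```python
from statistics import mean, pvariance


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def build_committee(xs, ys):
    """One linear model per left-out training point (jackknife resampling)."""
    return [fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
            for i in range(len(xs))]


def disagreement(committee, x):
    """Variance of the committee members' predictions at candidate x."""
    return pvariance([intercept + slope * x for intercept, slope in committee])


def select_query(committee, candidates):
    """Pick the unlabeled candidate with the largest committee disagreement."""
    return max(candidates, key=lambda x: disagreement(committee, x))


# Hypothetical labeled data (descriptor, property) and unlabeled candidates.
labeled_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
labeled_y = [1.0, 2.2, 2.9, 4.1, 5.2, 5.8]
committee = build_committee(labeled_x, labeled_y)
query = select_query(committee, [2.5, 6.0, 10.0])
print(f"next candidate to label: {query}")
```

The selected candidate lies farthest from the labeled region, where the committee extrapolates and its members diverge. In an active learning loop the chosen candidate would be evaluated (for instance by DFT), added to the labeled set, and the committee retrained.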
ML APPLICATION IN PREDICTING THERMAL PROPERTIES
Thermoelectric material performance is intrinsically linked to thermal transport characteristics, with ultralow thermal conductivity (
Thermal conductivity
Recent advancements in ML have significantly enhanced the prediction accuracy and efficiency of thermal conductivity in thermoelectric materials. Qin et al. conducted a comprehensive study comparing 15 ML algorithms, focusing on fundamental material properties such as atomic number and elastic modulus[90]. The long short-term memory (LSTM) model demonstrated superior performance, achieving a determination coefficient of 0.96 and a root mean square error of 0.15 W/(m
Ren et al. developed a gradient boosting regressor (GBR) model to analyze Zintl phase compounds, combining ML with first-principles calculations[91]. By refining 21 initial features to 8 critical descriptors - including lattice constants and atomic radii - the model achieved a determination coefficient of 0.988 and identified novel compounds such as Ba
In the study of bismuth telluride-based systems, Wudil et al. implemented an AdaBoost-enhanced decision tree regression model[92]. Trained on 411 experimental data points encompassing lattice parameters and electrical properties, the model achieved 99.4% correlation with experimental measurements. The optimized synthesis conditions identified through this approach - including specific selenium doping levels (0.25 at.%) and substrate temperature ranges (473–523 K) - demonstrated less than 5% deviation from empirical results across 123 independent validations.
Tewari et al. introduced a dual-phase screening strategy for transition metal oxides, combining classification and regression models[93]. This approach utilized key material descriptors such as atomic density and oxygen-to-metal ratios to eliminate 78% of unsuitable candidates during preliminary screening while maintaining prediction accuracy above 90%. The methodology reduced computational costs by 83% compared to traditional high-throughput simulations, demonstrating particular efficacy in identifying materials with low thermal conductivity through early-stage feature analysis.
ML-assisted phonon engineering benefits from descriptors that encode chemical bonding. In this context, Al-Fahdi et al. introduced two chemical bonding descriptors: the normalized -ICOHP, based on the Integrated Crystal Orbital Hamiltonian Population (ICOHP), and the normalized Integrated Crystal Orbital Bond Index (ICOBI)[94]. These descriptors quantify the bonding strength and directional characteristics between atoms in crystalline structures. The normalized -ICOHP is derived through the integration and normalization of the Crystal Orbital Hamiltonian Population (COHP), where negative values denote bonding contributions and positive values signify antibonding contributions; the larger the absolute value, the stronger the chemical bond. The normalized ICOBI further incorporates bond order and bond length information, enabling precise characterization of chemical bond anisotropy.
To advance this framework, the authors developed a crystal attention graph neural network (CATGNN) model, which leverages a multi-head attention mechanism and graph convolutional layers to automatically learn the atomic arrangement patterns and bonding features within crystal structures. By predicting the chemical bonding descriptors for approximately 200,000 materials, CATGNN successfully identified materials with extreme lattice thermal conductivity (LTC). First-principles validation revealed that 106 materials with low descriptor values exhibited an LTC below 5 W/(m
Collectively, these studies establish ML as a transformative tool for thermal transport optimization, enabling rapid identification of high-performance thermoelectric materials while revealing fundamental structure-property relationships. The integration of predictive models with experimental validation frameworks has accelerated discovery cycles by 40%–70%, marking a paradigm shift in materials design methodologies.
Predicting phonon scattering for better thermal conductivity prediction
Recent advancements in ML have revolutionized the prediction of thermal conductivity through enhanced phonon scattering analysis. The integration of computational physics with data-driven approaches has enabled accurate modeling of lattice thermal transport properties, overcoming traditional limitations in handling complex phonon interactions.
A multi-method framework combining DFT, the finite element method (FEM), and supervised ML was developed by Dong et al. for anisotropic phononic crystals[95]. The study revealed significant challenges in predicting relative thermal conductivity (
Building on this foundation, Guo et al. introduced an advanced ML methodology for phonon scattering rate prediction, addressing computational challenges associated with skewed scattering rate distributions [Figure 4B][96]. Transfer learning techniques enhanced model performance across different phonon scattering orders, achieving prediction speeds two orders of magnitude faster than conventional first-principles calculations. Validation across three material systems demonstrated exceptional agreement with experimental values: For silicon, predicted three-phonon [137.9 ± 3.6 W/(m·K)] and four-phonon [120.5 ± 0.2 W/(m·K)] conductivities closely matched experimental measurements [139.7 W/(m·K)]. Similar accuracy was observed in magnesium oxide [predicted: 46.79 ± 0.30 W/(m·K) vs. experimental: 47.4 W/(m·K)] and lithium cobalt oxide [predicted: 16.82 ± 0.42 W/(m·K) vs. experimental: 17.01 W/(m·K)].
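The agreement quoted above can be made quantitative with a short relative-deviation calculation. The sketch below uses only the central values given in the text (the stated uncertainties are ignored), and the labels and helper name are illustrative assumptions.

```python
# (label, predicted, experimental) thermal conductivities in W/(m*K), as quoted in the text.
comparisons = [
    ("Si (three-phonon)", 137.9, 139.7),
    ("Si (four-phonon)", 120.5, 139.7),
    ("MgO", 46.79, 47.4),
    ("LiCoO2", 16.82, 17.01),
]


def relative_deviation(predicted, experimental):
    """Absolute deviation of the prediction as a fraction of the experimental value."""
    return abs(predicted - experimental) / experimental


deviations = {label: relative_deviation(pred, exp) for label, pred, exp in comparisons}
for label, dev in deviations.items():
    print(f"{label}: {100 * dev:.1f}%")
```

By this measure the three-phonon silicon, magnesium oxide, and lithium cobalt oxide predictions lie within roughly 1.3% of experiment, whereas the four-phonon silicon value differs from the quoted experimental figure by about 14%.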
In thermoelectric materials, phonon scattering is intimately correlated with the carrier relaxation time. Zhou et al. developed a physically interpretable descriptor model using the SISSO algorithm, integrated with first-principles calculations based on deformation potential theory[97]. By training on 152 tetradymite compounds with integer stoichiometry (85 normal insulators, NIs; 67 topological insulators, TIs), they successfully extracted key descriptors for relaxation time. For NIs, the descriptor primarily relies on combinations of atomic mass and Pauling electronegativity, while TIs exhibit nonlinear dependencies on p-orbital radii and electronegativity. The model predictions showed strong consistency with first-principles-derived relaxation times and were validated through experimental trends. Furthermore, extending the model to 16 million tetradymites with fractional stoichiometry, the study identified tens of thousands of candidates with ultralow (
Furthermore, Al-Fahdi et al. employed the CATGNN to predict the phonon density of states (DOS) for 4,994 inorganic structures, and proposed a high-throughput screening strategy for candidate substrates in wide-band-gap electronic cooling by integrating the physical mechanisms of interfacial thermal conductance (ITC)[98]. The study demonstrated that achieving high ITC necessitates not only energetic overlap of phonon DOS but also matching of phonon group velocities at the interface. Through Pearson correlation analysis, simple material descriptors negatively correlated with ITC were identified, including the proportion of low-frequency optical phonon modes and the gradient of phonon DOS, which serve as critical indicators for thermal management material design.
Specifically, the CATGNN model captures the spatial distribution and frequency characteristics of phonon vibration modes via an attention mechanism, with prediction results exhibiting excellent agreement with experimentally measured phonon spectra (e.g., inelastic neutron scattering data). The research further revealed nonlinear effects in phonon-phonon interactions, such as the "nesting effect" between low-frequency optical phonons and acoustic phonons, which significantly enhances three-phonon scattering and reduces thermal conductivity. By tailoring the phonon DOS overlap and group velocity matching at material interfaces, optimized ITC design can be achieved, providing theoretical guidance for thermal dissipation in high-performance electronic devices.
The computational paradigm was further advanced by You et al. through the development of ML interatomic potentials (MLIP) with message-passing neural networks[99]. This approach achieved unprecedented computational efficiency, accelerating simulations by five orders of magnitude compared to traditional DFT methods while maintaining high accuracy [energy root mean square error (RMSE): 0.4 meV/atom, force RMSE: 19.5 meV/Å]. The framework revealed significant four-phonon scattering effects, reducing LTC by 22.5% at 300 K and 26.7% at 900 K in Mg
To address the current lack of standardized databases and publicly available models for MLIPs, Yang et al. introduced HH130 on MatHub-3d - the first open-source database targeting 130 HH compounds with well-defined band gaps and dynamic stability. Constructed via a dual adaptive sampling (DAS) method, the database integrates 31,891 high-fidelity configurations and 390 MLIP models based on moment tensor potentials (MTP), achieving unprecedented accuracy in predicting energies (MAE
HH130 enables high-throughput screening of LTC, revealing that 8-valence electron count (VEC) HH compounds exhibit significantly lower thermal conductivity than 18-VEC counterparts, attributed to weak second-order interatomic force constants (IFCs) and enhanced phonon scattering phase spaces. Notably, MLIP models with root-mean-square errors
By bridging ML and atomistic simulations, HH130 provides a robust platform for decoding complex phonon dynamics, accelerating the discovery of next-generation thermoelectrics with optimized
These investigations collectively demonstrate the transformative potential of ML in deciphering phonon scattering dynamics and optimizing LTC. Through precise modeling of phonon interaction characteristics, ML algorithms significantly improve the fidelity of thermal transport predictions while elucidating the critical role of scattering mechanisms in governing heat conduction properties. The evolving sophistication of ML methodologies promises expanded applications in thermal transport analysis, particularly in:
● Multi-phonon process characterization
● Temperature-dependent scattering regime identification
● Anisotropic thermal behavior prediction
This technological progression is driving novel discoveries in functional material design, as comprehensively documented in recent advancements (see Table 2 for comparative analysis of ML-enabled
Table 2. Recent advances in ML for thermoelectric property prediction
Authors | Year | Samples | Features | Targets | Algorithms |
Qin et al.[90] | 2023 | 350 compounds | Structural parameters, DFT results, periodic properties | Thermal conductivity (κ) | SVR, DTR, LSTM network |
Ren et al.[91] | 2024 | 30 Zintl-phase compounds | Compositional descriptors, crystallographic parameters | Lattice thermal conductivity (κL) | GBR |
Wudil et al.[92] | 2023 | 411 Bi2Te3-based materials | Charge transport properties, structural descriptors, temperature | Thermal conductivity (κ) | DTR, SVR, AdaBoost |
Tewari et al.[93] | 2020 | 315 oxide materials | Compositional attributes, crystal structure | Lattice thermal conductivity (κL) | XGBoost |
Al-Fahdi et al.[94] | 2025 | 4,994 inorganic crystals | Gaussian expansion, spherical harmonics | Lattice thermal conductivity (κL) | CATGNN |
Dong et al.[95] | 2024 | 18 semiconductor systems | Elastic moduli, temperature | Thermal conductivity (κ) | ETR, MLP |
Guo et al.[96] | 2023 | 2,000 phonon datasets | Phonon frequencies, wavevectors, eigenvectors, group velocities | Scattering rates (Γ), thermal conductivity (κ) | DNN |
Zhou et al.[97] | 2020 | 152 tetradymite compounds | Gaussian expansion, spherical harmonics | Relaxation time | SISSO |
Al-Fahdi et al.[98] | 2024 | 4,994 inorganic crystals | First-principles calculation data | ITC | CATGNN |
You et al.[99] | 2024 | 1,200 atomic configurations | Thermal transport parameters, electronic transport coefficients | Thermal conductivity (κ) | MLIP |
Li et al.[101] | 2022 | 5,038 materials | Physicochemical descriptors | $$z T$$ | LightGBM |
Xu et al.[102] | 2024 | 7,000 compounds | Compositional fingerprints | $$z T$$ | Autoencoder + LightGBM |
Wang et al.[103] | 2025 | 5,226 datasets | Physical descriptors, coordination numbers | $$z T$$ | Stacked ensemble |
Madavali et al.[104] | 2024 | 209 experimental datasets | Chemical composition, temperature, transport parameters | $$z T$$ | ANN |
ML: Machine learning; DFT: density functional theory; SVR: support vector regression; DTR: decision tree regressor; LSTM: long short-term memory; GBR: gradient boosting regressor; CATGNN: crystal attention graph neural network; ETR: extra trees regressor; MLP: multilayer perceptron; DNN: deep neural network; SISSO: sure independence screening and sparsifying operator; ITC: interfacial thermal conductance; MLIP: ML interatomic potential; LightGBM: light gradient boosting machine; ANN: artificial neural network.
Discussion
In the prediction of thermal conductivity in thermoelectric materials, ML models exhibit distinct trade-offs between interpretability and predictive accuracy. LSTM networks demonstrate high accuracy in thermal conductivity prediction [determination coefficient $$R^2$$ = 0.96, RMSE = 0.15 W/(m·K)] by capturing complex temporal correlations in multi-order phonon scattering dynamics, but their recurrent hidden-state mechanisms lack direct interpretability, requiring reliance on correlation analysis (e.g., inverse correlation with Grüneisen parameters) for indirect physical insights. GBRs achieve higher accuracy ($$R^2$$ = 0.988) in Zintl phase analysis by refining 21 initial features into eight critical descriptors (e.g., lattice constants, atomic radii), yet their additive tree structures only provide ranked feature importance rather than explicit mechanistic explanations of phonon scattering pathways. AdaBoost-enhanced decision tree models achieve 0.994 correlation with experimental data in bismuth telluride systems, offering interpretability through rule-based splits (e.g., selenium doping levels, substrate temperature ranges), but face overfitting risks and reduced generalization in complex material systems. Physically informed models such as the SISSO algorithm, integrated with deformation potential theory, extract interpretable descriptors from atomic mass, electronegativity, and orbital radii, ensuring prediction consistency while maintaining mechanistic clarity. In contrast, transfer learning and MLIPs based on message-passing neural networks accelerate multi-phonon scattering prediction (by about $$10^5$$ times) with high accuracy (energy RMSE: 0.4 meV/atom), but operate as black boxes requiring DFT validation to anchor physical meaning.
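Because the comparison above rests on the determination coefficient and the root mean square error, it helps to see how both are computed from the same residuals. The sketch below uses the standard definitions; the measured and predicted values are hypothetical and serve only to exercise the functions.

```python
from math import sqrt
from statistics import mean


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical thermal conductivities in W/(m*K): measured values and model predictions.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R^2 = {r_squared(measured, predicted):.3f}, RMSE = {rmse(measured, predicted):.3f}")
```

The two metrics answer different questions: RMSE carries the units of the target and reflects typical absolute error, while the determination coefficient is normalized by the spread of the data, so values from datasets with very different conductivity ranges are not directly comparable.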
Regarding robust input features and feature engineering strategies, the reliability of ML in thermal transport modeling hinges on integrating physically meaningful descriptors and adaptive selection methods. Atomic-scale fundamental properties, such as atomic number, elastic modulus, and Pauling electronegativity, form the basis of predictive frameworks, as seen in GBR models refining these parameters into critical lattice and electronic structure descriptors. Phonon-specific features (e.g., Grüneisen parameters, phonon group velocities) and defect-related attributes (e.g., doping concentrations) are essential for decoding thermal conductivity trends, with their inverse correlations validated across multiple material systems. Multi-source feature fusion strategies, combining DFT-derived phonon dispersion data, experimental transport measurements, and structural parameters (e.g., 411 lattice/electrical property data points for AdaBoost), create a rich input space for capturing anisotropic and temperature-dependent behaviors. Adaptive feature selection methods, including stepwise feature pruning (21 $$\rightarrow$$ 8 descriptors), attention mechanisms for weighting scattering pathway contributions, and transfer learning across phonon orders, enhance model generalization. For example, the SISSO algorithm identifies nonlinear dependencies on p-orbital radii in topological insulators, while MLIPs leverage interatomic potential learning to accelerate scattering simulations. These approaches collectively demonstrate that robust feature engineering, harmonizing theoretical priors with data-driven selection, is critical for decoding complex structure-thermal transport relationships in thermoelectric materials.
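Stepwise pruning of a descriptor set can be illustrated with recursive feature elimination around a gradient boosting regressor. This is only a sketch under stated assumptions: the data are synthetic, the target function is invented, and scikit-learn's `RFE` is used as a stand-in for whatever pruning procedure the cited GBR study actually applied.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)

# Synthetic stand-in for a descriptor table: 200 materials x 21 descriptors.
n_samples, n_features = 200, 21
X = rng.normal(size=(n_samples, n_features))

# Assumed toy target: only the first three descriptors carry signal.
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] ** 2 + 0.1 * rng.normal(size=n_samples)

# Refit the model repeatedly, dropping the least important descriptor
# each round, until eight descriptors remain.
base_model = GradientBoostingRegressor(n_estimators=50, random_state=0)
selector = RFE(estimator=base_model, n_features_to_select=8, step=1)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)
print("retained descriptor indices:", selected.tolist())
```

Dropping one descriptor per round is slower than a single importance ranking, but it lets the importances be re-estimated after each removal, which matters when descriptors are correlated.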
ML FOR PREDICTING $$ zT $$ VALUES
The thermoelectric figure of merit (
Wang et al. further advanced the field through stacked ensemble learning, integrating five regression models (random forest, decision tree, k-nearest neighbors, XGBoost, and LightGBM) across 5,226 data points[103]. Their ensemble architecture achieved superior accuracy (
The convergence of computational and experimental approaches marks a paradigm shift in thermoelectric materials research. Current methodologies exhibit distinct advantages: LightGBM-based models excel in rapid large-scale screening, while DNNs demonstrate superior performance in process-property correlation analysis. Hybrid architectures combining autoencoders with ensemble methods are emerging as powerful tools for feature space compression and prediction accuracy enhancement. Recent benchmarks indicate 40%-70% acceleration in discovery cycles compared to conventional trial-and-error approaches, with particular success in narrow-bandgap semiconductors and complex Zintl phases. Table 2 summarizes these methodological advancements, highlighting performance metrics and material systems where ML has driven significant
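The stacked ensemble idea discussed in this section can be sketched with scikit-learn's `StackingRegressor`. This is an illustration rather than the configuration used by Wang et al.: the data are synthetic, the ridge meta-learner is an assumption, and scikit-learn's `GradientBoostingRegressor` stands in for XGBoost and LightGBM so that the example needs no extra packages.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic descriptors and target (hypothetical, not thermoelectric data).
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(300, 6))
y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.05 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Level-0 learners; the gradient boosting model is a stand-in for
# XGBoost/LightGBM.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeRegressor(max_depth=6, random_state=0)),
    ("knn", KNeighborsRegressor(n_neighbors=5)),
    ("gbr", GradientBoostingRegressor(n_estimators=100, random_state=0)),
]

# The level-1 model is trained on out-of-fold predictions of the base learners.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=Ridge(alpha=1.0), cv=5)
stack.fit(X_train, y_train)
stack_r2 = stack.score(X_test, y_test)
print("held-out R^2 of the stacked model:", round(stack_r2, 3))
```

Training the meta-learner on out-of-fold predictions (the `cv` argument) keeps it from simply trusting whichever base model overfits the training set most.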
ML-AIDED DESIGN OF HIGH-PERFORMANCE THERMOELECTRIC MATERIALS
The integration of ML into thermoelectric materials research has catalyzed a paradigm shift from serendipitous discovery to rational design, fundamentally transforming every stage from computational screening to experimental optimization. This evolution is exemplified by groundbreaking studies that harness diverse algorithmic approaches to decode complex structure-property relationships and accelerate materials development cycles.
Jia et al. pioneered unsupervised learning applications through systematic analysis of 456 HH compounds from the MP database[105]. Their seven-algorithm clustering framework (K-means, DBSCAN, AGNES, etc.) processed 484 descriptors spanning electronic band structures (effective mass
Figure 5. Screening combination of unsupervised ML with the labeled reported known HH TE materials[105]. ML: Machine learning; HH: half-Heusler; TE: thermoelectric.
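The screening logic summarized in Figure 5, combining unsupervised clustering with a handful of known good compounds, can be sketched as follows. Everything here is assumed for illustration: the descriptor matrix is synthetic, only three of the seven algorithms are shown, the hyperparameters are arbitrary, and `candidates_from` is a hypothetical helper rather than the authors' procedure.

```python
from functools import reduce

import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a compounds x descriptors matrix.
X, true_group = make_blobs(n_samples=150, centers=3, n_features=5,
                           cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

# Hypothetical indices of compounds already reported as good thermoelectrics.
known_good = np.flatnonzero(true_group == 0)[:5]

algorithms = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "dbscan": DBSCAN(eps=0.9, min_samples=5),
    "agnes": AgglomerativeClustering(n_clusters=3, linkage="ward"),
}
labels = {name: algo.fit_predict(X) for name, algo in algorithms.items()}


def candidates_from(cluster_labels, known):
    """Unlabeled members of the cluster that holds most known good compounds."""
    votes = cluster_labels[known]
    votes = votes[votes >= 0]              # DBSCAN marks noise points as -1
    if votes.size == 0:
        return np.array([], dtype=int)
    target = np.bincount(votes).argmax()
    members = np.flatnonzero(cluster_labels == target)
    return np.setdiff1d(members, known)


candidate_sets = {name: candidates_from(lab, known_good)
                  for name, lab in labels.items()}
# Keep only compounds flagged by every algorithm.
consensus = reduce(np.intersect1d, candidate_sets.values())
print({name: len(c) for name, c in candidate_sets.items()}, len(consensus))
```

Requiring agreement across algorithms trades recall for confidence: a compound survives only if every clustering places it alongside the known good materials.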
Building on this foundation, Vaitesswar et al. established supervised learning benchmarks through comparative analysis of 12 ML models[106]. Their random forest implementation outperformed DNNs in cubic material systems, achieving MAE of 0.12 vs. DNN's 0.18 in
Xu et al. advanced feature engineering through entropy-based selection, reducing 130,000+ material systems to 6,476 high-potential candidates[107]. Their ExtraTree algorithm identified four critical descriptors: weighted phonon velocity
Fan and Oganov [108] revolutionized high-throughput screening through integration of first-principles calculations (796 chalcogenides) with ensemble learning. Their M3GNet architecture, combining graph neural networks with message passing, achieved 93% classification accuracy for n-type materials by analyzing doping-induced band structure modifications. The model identified 17 novel candidates including Tl
Chen's gene expression programming (GEP) framework [109] represents the cutting edge in microstructure design. By simulating evolutionary pressure on Bi
The collective advances in ML applications demonstrate transformative multidimensional impacts across thermoelectric materials research: Discovery cycles have accelerated 5–10
INVERSE DESIGN OF THERMOELECTRIC MATERIALS
Inverse design is driving a paradigm shift in the discovery of thermoelectric materials. Compared to forward ML predictions based on structure-property mapping, inverse design demonstrates three pivotal advantages: (1) Elimination of redundant iterative processes by establishing end-to-end "target property $$\rightarrow$$ material configuration" generative models, thereby avoiding the inefficient inverse deduction required in forward approaches; (2) Dynamic multi-parameter co-optimization through constraint satisfaction algorithms and Pareto frontier analysis, enabling simultaneous optimization of competing parameters (e.g., electrical conductivity vs. thermal conductivity) while embedding experimental constraints (synthesis temperature, elemental cost) into the generation workflow; (3) Integration of global exploration and local refinement mechanisms, combining reinforcement learning for broad material space screening with variational autoencoders (VAEs) for atomic-level tuning of lattice defects and carrier concentration to surpass the performance limits of empirical design. Currently evolving from an auxiliary tool to a core paradigm, inverse design is transforming thermoelectric materials development through its "target-driven/experimental-constrained" framework, marking the transition from trial-and-error approaches to intelligent customization and heralding the advent of the on-demand materials design era.
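The Pareto frontier analysis mentioned in advantage (2) can be made concrete with a small sketch. A candidate lies on the frontier if no other candidate is at least as good in both objectives and strictly better in one; here electrical conductivity is maximized and thermal conductivity is minimized. The candidate labels and values are hypothetical.

```python
def pareto_front(candidates):
    """Names of candidates not dominated by any other candidate.

    Each candidate is (name, sigma, kappa); sigma is maximized and
    kappa is minimized.
    """
    front = []
    for name, sigma, kappa in candidates:
        dominated = any(
            (s >= sigma and k <= kappa) and (s > sigma or k < kappa)
            for _, s, k in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical candidates: (label, electrical conductivity in S/cm,
# thermal conductivity in W/(m·K)).
candidates = [
    ("A", 800.0, 1.2),
    ("B", 1200.0, 2.5),
    ("C", 700.0, 1.5),
    ("D", 1200.0, 2.0),
    ("E", 500.0, 0.8),
]

front = pareto_front(candidates)
print(front)  # ['A', 'D', 'E']
```

Candidate B is excluded because D matches its electrical conductivity with lower thermal conductivity, and C is excluded because A is better in both objectives. Experimental constraints such as a cost ceiling could be applied as a filter on the candidate list before the frontier is computed.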
Long et al. proposed a conditional variational autoencoder-generative adversarial network (CVAEGAN) framework combined with a ResNet-enhanced encoder and a diversity-driven loss function, as shown in Figure 6A. This framework can systematically explore a vast compositional space under strict experimental constraints (e.g., synthesis temperature below 1,200 °C and elemental cost thresholds). The key methodological advances are as follows. A dataset of 3,000 thermoelectric materials, covering eight major systems (e.g., Mg-based alloys, BiTe, HHs), was constructed through SMOTE oversampling and literature mining, ensuring balanced representation across temperature ranges (low/medium/high) and doping complexity. By encoding
While the CVAEGAN framework proposed by Long et al. achieves constrained generation under experimental constraints such as synthesis temperature and elemental cost thresholds, challenges persist in encoding complex physical constraints (e.g., crystal structure stability, defect formation energy) and dynamic synthesis conditions (e.g., cooling rate, doping uniformity). Current models rely on manually predefined thresholds, struggling to accurately characterize nonlinear constraints such as metastable phase evolution and interfacial effects in real material systems. Developing data-driven constraint embedding methods based on first-principles calculations - such as converting DFT-derived phonon dispersion relations and electronic band structures into implicit regularization terms for generative models - remains critical.
Integrating inverse design with DFT workflows also faces bottlenecks, as existing frameworks depend on post-hoc DFT validation for key parameters such as electron-phonon coupling and thermal transport anisotropy, leading to high computational costs in the "generate-validate" cycle. Constructing cross-scale transfer models to integrate DFT-derived descriptors (e.g., effective mass, relaxation time) into real-time constraint feedback during network propagation is essential for efficiency.
Experimental validation further encounters challenges in high-throughput screening and characterization, where only a fraction of generated high-zT candidates (e.g., Mg
At present, inverse design of thermoelectric materials remains largely unexplored, whereas it has been widely adopted for other classes of materials. Here, we introduce several effective inverse design methods from those fields to offer new ideas for the inverse design of thermoelectric materials.
In the field of high-temperature superconductor inverse design, Zhong et al. proposed a deep generative model that combines a VAE with generative adversarial networks (GANs), as shown in Figure 6B. This model maps superconductor compositions into a low-dimensional latent space via the encoder and applies a conditional generative adversarial mechanism to achieve precise regulation of the critical temperature (\(T_c\)). The research team extracted data for 7,375 superconductors from the SuperCon database and categorized them into three groups based on \(T_c\): high (\(>77 \, \text{K}\)), medium (\(40\text{–}77 \, \text{K}\)), and low (\(20\text{–}40 \, \text{K}\)), which were used as generation conditions. Through adversarial learning, the model optimized the authenticity of the generated samples and their \(T_c\) alignment, successfully predicting hundreds of potential superconductor compositions with \(T_c > 77 \, \text{K}\). Notably, the model revealed a relationship between copper concentration and \(T_c\) in copper-based superconductors, finding that when the copper concentration approximates 2.41 (e.g., Hg
To address the limitations of traditional generative models in designing doped superconductors, Zhong et al. further proposed the Supercon-Diffusion method, which is based on a diffusion model and a three-channel matrix representation. This method innovatively decomposes stoichiometric numbers into integer, first-decimal, and second-decimal channels, as shown in Figure 6C. Through a stepwise noise addition and denoising process, combined with \(T_c\)-condition constraints, it achieves high-precision control of doping ratios. Trained on 7,315 doped superconductor data points, the model generated samples with improvements in charge neutrality (55%) and doping effectiveness (55%), exceeding traditional GANs by over ten times. Additionally, 98% of the generated samples exhibited negative formation energies, indicating thermodynamic stability. The study also found that the model could automatically identify optimal doping ranges in key families (e.g., YBa
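The three-channel decomposition of stoichiometric numbers described for Supercon-Diffusion can be illustrated with a short sketch. Only the digit-splitting step is shown; the function names are hypothetical, the example composition is chosen for illustration, and how the channels are arranged into the model's input matrix is not specified here.

```python
def split_channels(amount):
    """Split a stoichiometric number into (integer, first decimal, second decimal)."""
    hundredths = round(amount * 100)   # work in integers to avoid float drift
    integer_part, remainder = divmod(hundredths, 100)
    first_decimal, second_decimal = divmod(remainder, 10)
    return integer_part, first_decimal, second_decimal


def merge_channels(integer_part, first_decimal, second_decimal):
    """Inverse of split_channels, returning the stoichiometric number."""
    return (100 * integer_part + 10 * first_decimal + second_decimal) / 100


# Illustrative doped composition (values resolved to two decimals).
composition = {"La": 1.85, "Sr": 0.15, "Cu": 1.0, "O": 4.0}
encoded = {element: split_channels(x) for element, x in composition.items()}

print(encoded["La"])   # (1, 8, 5)
print(encoded["Sr"])   # (0, 1, 5)
```

Representing each digit in its own channel means that a small change in doping level shows up as a change of one unit in a low-order channel, rather than as a tiny perturbation of a single floating-point value.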
CONCLUSION
In this review, we have provided a comprehensive overview of the application of ML to thermoelectric materials. ML can be used both to predict various properties of thermoelectric materials and to assist in the design of novel ones. Figure 7A shows some existing material features, and Figure 7B shows deep learning methods used in ML studies of thermoelectric materials. Although some achievements have been made in this field, many limitations remain. Here, we offer some potential directions.
Figure 7. (A) Currently utilized features (taking PbTe as an example); (B) Currently utilized deep learning methods.
Structural prediction and optimization
In order to achieve efficient design and development of thermoelectric materials, the development of advanced ML models to predict their crystal structures and atomic arrangements is of paramount importance. These models, trained on large-scale databases of known crystal structures, are capable of learning complex structural features and patterns, thereby significantly enhancing the accuracy of predicting new structural configurations. This data-driven predictive approach not only accelerates the discovery process of new materials but also provides valuable theoretical guidance for experimental research. Furthermore, integrating ML predictions with first-principles calculations enables in-depth validation and optimization of the predicted structures. First-principles calculations, based on quantum mechanics, can precisely describe the physical properties of materials at the electronic level. Through this integration, researchers can conduct detailed stability analyses and electronic property calculations of the predicted structures, thereby screening for thermoelectric materials with potentially high performance. This synergistic approach not only increases the reliability of predictions but also offers theoretical support for further material optimization. In addition, high-throughput screening technology plays a crucial role in this process. With the aid of automated computational tools, researchers can rapidly evaluate a large number of predicted structures to identify candidate materials with optimal thermoelectric properties. This method enables the processing and analysis of vast amounts of data in a short period of time, significantly improving the efficiency of material screening and reducing the workload of experimental validation. High-throughput screening not only quickly identifies materials with superior performance but also provides a clear direction for subsequent experimental synthesis and performance optimization.
Multi-scale modeling and simulation
With the continuous development of science and technology, the demand for high-performance thermoelectric materials is steadily increasing. In order to better understand and optimize the performance of these materials, it is necessary to develop multi-scale ML models that can predict the performance of thermoelectric materials across different length scales (from atomic to macroscopic). These models are capable of capturing the complex relationships between atomic structure, microstructure, and macroscopic properties through advanced algorithms and data processing techniques, thereby providing more comprehensive theoretical support for material design. Specifically, the structure at the atomic scale determines the fundamental physical properties of the material, while the microstructure influences its microscopic physical behavior. Macroscopic properties are the integrated manifestation of these microscopic characteristics. By using multi-scale ML models, these characteristics at different levels can be connected, enabling accurate prediction of the performance of thermoelectric materials. Moreover, combining the physical theories of thermoelectric transport with ML algorithms not only helps to deeply understand the fundamental mechanisms controlling thermoelectric performance but also enables the prediction of material behavior under various complex conditions. This integration fully leverages the guiding role of physical theory and the powerful data processing capabilities of ML, offering new perspectives and methods for the study of thermoelectric materials. Finally, ML technology can also be used to simulate the behavior of thermoelectric materials under different working conditions. For example, in practical applications, thermoelectric materials often need to operate under varying temperature gradients and mechanical stresses. 
Through ML models, these complex working conditions can be simulated and analyzed, providing strong support for the design and optimization of materials. This not only helps to enhance the durability of the materials but also further improves their performance, making them better suited to meet practical application requirements. Therefore, the development of multi-scale ML models and their application in the research and design of thermoelectric materials are of great significance for advancing thermoelectric technology.
Inverse design and feedback loop
In recent years, with the rapid development of artificial intelligence technologies, inverse design models have shown great potential in the discovery and design of novel thermoelectric materials. Among them, emerging technologies such as GANs, diffusion models, VAEs, and generative methods based on reinforcement learning are becoming important tools for driving innovation in thermoelectric materials.
The core advantage of these models lies in their ability to start from target performance and inversely generate material structures with specific functions. GANs optimize the generated material structures through the adversarial training between the generator and discriminator, bringing their thermoelectric performance close to or even beyond that of existing materials. Diffusion models, on the other hand, reconstruct material configurations with ideal performance from random data by gradually removing noise. VAEs map the structural features of materials into a low-dimensional space through the synergy of the encoder and decoder, and then reconstruct materials with optimized performance through the decoder. Additionally, generative methods based on reinforcement learning can dynamically adjust material design strategies through a reward mechanism, thus efficiently exploring the material design space.
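The noise-removal idea behind diffusion models can be written compactly. The expressions below follow the widely used denoising diffusion probabilistic formulation, which is an assumption here; individual materials studies may use a different parameterization. The symbols $$\mathbf{x}_0$$ (an encoded material), $$\beta_t$$ (noise schedule), $$\boldsymbol{\epsilon}_\theta$$ (learned denoiser), and $$c$$ (target-property condition) are introduced for illustration only.

$$
q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\!\left(\mathbf{x}_t;\ \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\ \beta_t \mathbf{I}\right), \qquad
\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}, \qquad
\bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s)
$$

$$
\mathcal{L}(\theta) = \mathbb{E}_{\mathbf{x}_0,\, \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),\, t}\left[\left\lVert \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t, c) \right\rVert^2\right]
$$

The first line describes how noise is added step by step; the second is the training objective, in which the network learns to predict the added noise given the noisy sample, the step index, and the desired property. Generation then runs the process in reverse, starting from pure noise.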
These advanced inverse design models not only break through the limitations of traditional design thinking but also, with the support of large-scale data, rapidly explore the material design space to discover novel materials with unique microstructures and excellent thermoelectric performance. For example, by combining physical theories with ML algorithms, these models can generate materials with specific atomic arrangements, microstructures, and macroscopic properties, thus providing new ideas for the design of high-performance thermoelectric materials.
Looking to the future, these inverse design models are expected to form a close feedback loop with experimental synthesis and characterization techniques. ML models can generate potential material structures, while experimental validation provides feedback data to further optimize the accuracy and reliability of the models. Through this iterative optimization process, not only can the discovery of high-performance thermoelectric materials be accelerated, but the cost and time of research and development can also be significantly reduced. Moreover, with the continuous improvement of computational power and the increasing richness of data resources, these models will be able to handle more complex material systems and even achieve material design under multi-physics coupling conditions.
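The feedback loop between model and experiment can be sketched as a simple active-learning cycle. This is a toy illustration under several assumptions: `run_experiment` is a hypothetical stand-in for synthesis and measurement, the candidate pool is random, a random forest replaces the generative model as a surrogate, and the spread of per-tree predictions is used as a rough uncertainty estimate.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)


def run_experiment(x):
    """Hypothetical stand-in for synthesizing and measuring one candidate."""
    return float(np.exp(-np.sum((x - 0.6) ** 2) / 0.1))


# Pool of unlabeled candidates described by two made-up descriptors.
pool = rng.uniform(0.0, 1.0, size=(200, 2))

# Start from five randomly chosen "measured" candidates.
labeled_idx = [int(i) for i in rng.choice(len(pool), size=5, replace=False)]
y_labeled = [run_experiment(pool[i]) for i in labeled_idx]

for _ in range(10):
    # 1. Refit the surrogate on everything measured so far.
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(pool[labeled_idx], y_labeled)

    # 2. Score candidates by predicted value plus disagreement between trees.
    per_tree = np.stack([tree.predict(pool) for tree in model.estimators_])
    score = per_tree.mean(axis=0) + per_tree.std(axis=0)
    score[labeled_idx] = -np.inf       # never re-measure a candidate

    # 3. "Measure" the most promising candidate and feed the result back.
    pick = int(np.argmax(score))
    labeled_idx.append(pick)
    y_labeled.append(run_experiment(pool[pick]))

best_value = max(y_labeled)
print("measurements:", len(y_labeled), "best value found:", round(best_value, 3))
```

Adding the uncertainty term to the predicted value encourages the loop to probe regions the surrogate knows little about, rather than only refining around the current best candidate.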
Ultimately, the integration of various inverse design methods, including GANs, diffusion models, VAEs, and reinforcement learning, is expected to bring revolutionary breakthroughs to the field of thermoelectric materials.
DECLARATIONS
Authors' contributions
Writing – review: Wang, Y.
Editing, writing: Zhong, C.
Conceptualization: Zhang, J.
Review: Liu, J.
Data curation: Hu, K.
Writing – review: Chen, J.
Editing: Lin, X.
Availability of data and materials
Not applicable.
Financial support and sponsorship
The authors appreciate financial support from the Guangdong Basic and Applied Basic Research Foundation (2022A1515110676, 2024A1515011845), the Shenzhen Science and Technology Program (JCYJ20220531095404009; RCBS20221008093057027; JCYJ20230807094313028, JCYJ20230807094318038), the Sunrise (Xiamen) Photovoltaic Industry Co., Ltd. (Development of Artificial Intelligence Technology for Perovskite Photovoltaic Materials, No. HX20230176), the Natural Science Foundation of China (62102118), and the Shenzhen Colleges and Universities Stable Support Program (GXWD20220811170504001).
Conflicts of interest
Liu, J. is affiliated with Sunrise (Xiamen) Photovoltaic Industry Co., Ltd., while the other authors have declared that they have no conflicts of interest.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Copyright
© The Author(s) 2025.
REFERENCES
1. Cao Y., Sheng Y., Li X., Xi L., Yang J. Application of materials genome methods in thermoelectrics. Front. Mater. 2022;9:861817.
2. Wan X., Feng W., Wang Y., et al. Materials discovery and properties prediction in thermal transport via materials informatics: a mini review. Nano Lett. 2019;19:3387-95.
3. Wang T., Zhang C., Snoussi H., Zhang G. Machine learning approaches for thermoelectric materials research. Adv. Funct. Mater. 2020;30:1906041.
4. Zhang Z., Jiang Y., Shu M., Li L., Dong Z., Xu J. Artificial photosynthesis over metal halide perovskites: achievements, challenges, and prospects. J. Phys. Chem. Lett. 2021;12:5864-70.
5. Yang J., Xi L., Qiu W., et al. On the tuning of electrical and thermal transport in thermoelectrics: an integrated theory–experiment perspective. npj Comput. Mater. 2016;2:15015.
6. Uchida K., Takahashi S., Harii K., et al. Observation of the spin Seebeck effect. Nature. 2008;455:778-81.
9. Crane D. T., Jackson G. S. Optimization of cross flow heat exchangers for thermoelectric waste heat recovery. Energy Convers. Manag. 2004;45:1565-82.
10. Qin Y., Qin B., Wang D., Chang C., Zhao L. D. Solid-state cooling: thermoelectrics. Energy Environ. Sci. 2022;15:4527-41.
11. Chen W. Y., Shi X. L., Zou J., Chen Z. G. Thermoelectric coolers for on-chip thermal management: materials, design, and optimization. Mater. Sci. Eng. R. Rep. 2022;151:100700.
12. Yang J. Potential applications of thermoelectric waste heat recovery in the automotive industry. In ICT 2005. 24th International Conference on Thermoelectrics, 2005, Clemson, USA. Jun 19-23, 2005. IEEE; 2005. pp. 170–4.
13. Xie H., Zhang Y., Gao P. Thermoelectric-powered sensors for Internet of Things. Micromachines. 2022;14:31.
14. Bonin R., Boero D., Chiaberge M., Tonoli A. Design and characterization of small thermoelectric generators for environmental monitoring devices. Energy Convers. Manag. 2013;73:340-9.
15. Date A., Date A., Dixon C., Akbarzadeh A. Progress of thermoelectric power generation systems: prospect for small to medium scale power generation. Renew. Sustain. Energy Rev. 2014;33:371-81.
16. He R., Schierning G., Nielsch K. Thermoelectric devices: a review of devices, architectures, and contact optimization. Adv. Mater. Technol. 2018;3:1700256.
17. Zhang Q., Deng K., Wilkens L., Reith H., Nielsch K. Micro-thermoelectric devices. Nat. Electron. 2022;5:333-47.
18. Zhang Q. H., Huang X. Y., Bai S. Q., Shi X., Uher C., Chen L. D. Thermoelectric devices for power generation: recent progress and future challenges. Adv. Eng. Mater. 2016;18:194-213.
19. Sajid M., Hassan I., Rahman A. An overview of cooling of thermoelectric devices. Renew. Sustain. Energy Rev. 2017;78:15-22.
20. Belsky A. A., Glukhanich D. Y. Standalone power system with photovoltaic and thermoelectric installations for power supply of remote monitoring and control stations for oil pipelines. Renew. Energy Focus. 2023;47:100493.
21. Palaporn D., Tanusilp S., Sun Y., Pinitsoontorn S., Kurosaki K. Thermoelectric materials for space explorations. Mater. Adv. 2024;5:5351-64.
22. Venkatasubramanian R., Siivola E., Colpitts T., O'Quinn B. Thin-film thermoelectric devices with high room-temperature figures of merit. Nature. 2001;413:597-602.
23. Yang S., Qiu P., Chen L., Shi X. Recent developments in flexible thermoelectric devices. Small Sci. 2021;1:2100005.
24. Snyder G. J., Snyder A. H. Figure of merit ZT of a thermoelectric device defined from materials properties. Energy Environ. Sci. 2017;10:2280-3.
25. Kim H. S., Gibbs Z. M., Tang Y., Wang H., Snyder G. J. Characterization of Lorenz number with Seebeck coefficient measurement. APL Mater. 2015;3:041506.
26. Martin J., Tritt T., Uher C. High temperature Seebeck coefficient metrology. J. Appl. Phys. 2010;108:121101.
27. de Boor J., Müller E. Data analysis for Seebeck coefficient measurements. Rev. Sci. Instrum. 2013;84:065102.
28. Snyder G. J., Pereyra A., Gurunathan R. Effective mass from Seebeck coefficient. Adv. Funct. Mater. 2022;32:2112772.
29. Iwanaga S., Toberer E. S., LaLonde A., Snyder G. J. A high temperature apparatus for measurement of the Seebeck coefficient. Rev. Sci. Instrum. 2011;82:063905.
30. Mott N. F. The electrical conductivity of transition metals. Proc. R. Soc. Lond. A. 1936;153:699-717.
31. Radzuan N. A. M., Sulong A. B., Sahari J. A review of electrical conductivity models for conductive polymer composite. Int. J. Hydrogen Energy. 2017;42:9262-73.
32. Ebbesen T. W., Lezec H. J., Hiura H., Bennett J. W., Ghaemi H. F., Thio T. Electrical conductivity of individual carbon nanotubes. Nature. 1996;382:54-6.
35. Venkatasubramanian R. Lattice thermal conductivity reduction and phonon localizationlike behavior in superlattice structures. Phys. Rev. B. 2000;61:3091.
36. Zapata-Arteaga O., Perevedentsev A., Marina S., Martin J., Reparaz J. S., Campoy-Quiles M. Reduction of the lattice thermal conductivity of polymer semiconductors by molecular doping. ACS Energy Lett. 2020;5:2972-8.
37. Murakami T., Shiga T., Hori T., Esfarjani K., Shiomi J. Importance of local force fields on lattice thermal conductivity reduction in PbTe1-xSex alloys. EPL. 2013;102:46002.
38. Wan C., Wang Y., Wang N., Norimatsu W., Kusunoki M., Koumoto K. Development of novel thermoelectric materials by reduction of lattice thermal conductivity. Sci. Technol. Adv. Mater. 2010;11:044306.
39. Kim T. Y., Park C. H., Marzari N. The electronic thermal conductivity of graphene. Nano Lett. 2016;16:2439-43.
40. Graf M. J., Yip S. K., Sauls J. A., Rainer D. Electronic thermal conductivity and the Wiedemann-Franz law for unconventional superconductors. Phys. Rev. B. 1996;53:15147.
41. Lee S., Hippalgaonkar K., Yang F., et al. Anomalously low electronic thermal conductivity in metallic vanadium dioxide. Science. 2017;355:371-4.
42. Ambegaokar V., Tewordt L. Theory of the electronic thermal conductivity of superconductors with strong electron-phonon coupling. Phys. Rev. 1964;134:A805.
43. Burger N., Laachachi A., Ferriol M., Lutz M., Toniazzo V., Ruch D. Review of thermal conductivity in composites: mechanisms, parameters and theory. Prog. Polym. Sci. 2016;61:1-28.
44. Wang D. Z., Liu W. D., Shi X. L., et al. Se-alloying reducing lattice thermal conductivity of Ge0.95Bi0.05Te. J. Mater. Sci. Technol. 2022;106:249-56.
45. Zhang Q., Song Q., Wang X., et al. Deep defect level engineering: a strategy of optimizing the carrier concentration for high thermoelectric performance. Energy Environ. Sci. 2018;11:933-40.
46. Pei Y., Wang H., Snyder G. J. Band engineering of thermoelectric materials. Adv. Mater. 2012;24:6125-35.
47. He J., Sootsman J. R., Girard S. N., et al. On the origin of increased phonon scattering in nanostructured PbTe based thermoelectric materials. J. Am. Chem. Soc. 2010;132:8669-75.
48. Zheng Y., Slade T. J., Hu L., et al. Defect engineering in thermoelectric materials: what have we learned? Chem. Soc. Rev. 2021;50:9022-54.
49. Xie H., Zhao L. D., Kanatzidis M. G. Lattice dynamics and thermoelectric properties of diamondoid materials. Interdiscip. Mater. 2024;3:5-28.
50. Qin D., Shi W., Lu Y., Cai W., Liu Z., Sui J. Roles of interface engineering in performance optimization of skutterudite-based thermoelectric materials. Carbon Neutraliz. 2022;1:233-46.
51. Chen J., Li K., Liu C., et al. Enhanced efficiency of thermoelectric generator by optimizing mechanical and electrical structures. Energies. 2017;10:1329.
52. Lineykin S., Ben-Yaakov S. Modeling and analysis of thermoelectric modules. IEEE Trans. Ind. Appl. 2007;43:505-12.
53. Yang Y., Hu H., Chen Z., et al. Stretchable nanolayered thermoelectric energy harvester on complex and dynamic surfaces. Nano Lett. 2020;20:4445-53.
54. Tritt T. M. Thermoelectric phenomena, materials, and applications. Ann. Rev. Mater. Res. 2011;41:433-48.
55. Freysoldt C., Grabowski B., Hickel T., et al. First-principles calculations for point defects in solids. Rev. Mod. Phys. 2014;86:253.
56. Li W., Carrete J., Katcho N. A., Mingo N. ShengBTE: a solver of the Boltzmann transport equation for phonons. Comput. Phys. Commun. 2014;185:1747-58.
57. Jonson M., Mahan G. D. Mott's formula for the thermopower and the Wiedemann-Franz law. Phys. Rev. B. 1980;21:4223.
58. Binder K., Horbach J., Kob W., Paul W., Varnik F. Molecular dynamics simulations. J. Phys. Condens. Matter. 2004;16:S429.
59. Kroese D. P., Brereton T., Taimre T., Botev Z. I. Why the Monte Carlo method is so important today. Wiley Interdiscip. Rev. Comput. Stat. 2014;6:386-92.
60. Wei J., Chu X., Sun X. Y., et al. Machine learning in materials science. InfoMat. 2019;1:338-58.
61. Ramprasad R., Batra R., Pilania G., Mannodi-Kanakkithodi A., Kim C. Machine learning in materials informatics: recent applications and prospects. npj Comput. Mater. 2017;3:54.
62. Morgan D., Jacobs R. Opportunities and challenges for machine learning in materials science. Ann. Rev. Mater. Res. 2020;50:71-103.
63. Challapalli A., Patel D., Li G. Inverse machine learning framework for optimizing lightweight metamaterials. Mater. Design. 2021;208:109937.
64. Sliwa B., Piatkowski N., Wietfeld C. LIMITS: lightweight machine learning for IoT systems with resource limitations. In ICC 2020-2020 IEEE International Conference on Communications (ICC), Dublin, Ireland. Jun 07-11, 2020. IEEE; 2020. pp. 1–7.
65. Mahmood A., Wang J. L. Machine learning for high performance organic solar cells: current scenario and future prospects. Energy Environ. Sci. 2021;14:90-105.
66. Zhang L., He M. Unsupervised machine learning for solar cell materials from the literature. J. Appl. Phys. 2022;131:064902.
67. Tao Q., Xu P., Li M., Lu W. Machine learning for perovskite materials design and discovery. npj Comput. Mater. 2021;7:23.
68. Zhang L., He M., Shao S. Machine learning for halide perovskite materials. Nano Energy. 2020;78:105380.
69. Al-Sabana O., Abdellatif S. O. Optoelectronic devices informatics: optimizing DSSC performance using random-forest machine learning algorithm. Optoelectron. Lett. 2022;18:148-51.
70. Miller S. A., Dylla M., Anand S., Gordiz K., Snyder G. J., Toberer E. S. Empirical modeling of dopability in diamond-like semiconductors. npj Comput. Mater. 2018;4:71.
71. Saal J. E., Kirklin S., Aykol M., Meredig B., Wolverton C. Materials design and discovery with high-throughput density functional theory: the open quantum materials database (OQMD). JOM. 2013;65:1501-9.
72. Kirklin S., Saal J. E., Meredig B., et al. The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies. npj Comput. Mater. 2015;1:15010.
73. Jain A., Ong S. P., Hautier G., et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 2013;1:011002.
74. de Jong M., Chen W., Angsten T., et al. The high-throughput highway to computational materials design. Sci. Data. 2013;2:150009.
75. de Jong M., Chen W., Geerlings H., Asta M., Persson K. A.. A database to enable discovery and design of piezoelectric materials. Sci. Data. 2015;2:150053.
76. Sun J., Zhong G., Huang K., Dong J.. Banzhaf random forests: cooperative game theory based random forests with consistency. Neural Netw. 2018;106:20-9.
77. Schmidt A. F., Finan C.. Linear regression and the normality assumption. J. Clin. Epidemiol. 2018;98:146-51.
78. Agatonovic-Kustrin S., Beresford R.. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J. Pharm. Biomed. Anal. 2000;22:717-27.
79. Wan Z., Wang Q. D., Liu D., Liang J.. Machine learning prediction of the optimal carrier concentration and band gap of quaternary thermoelectric materials via element feature descriptors. Int. J. Quantum Chem. 2021;121:e26752.
80. Goldsmid H. J.. The electrical conductivity and thermoelectric power of bismuth telluride. Proc. Phys. Soc. 1958;71:633.
81. Ricci F., Chen W., Aydemir U., et al. An ab initio electronic transport database for inorganic materials. Sci. Data. 2017;4:170085.
82. Antunes L. M., Butler K. T., Grau-Crespo R.. Predicting thermoelectric transport properties from composition with attention-based deep learning. Mach. Learn. Sci. Technol. 2023;4:015037.
83. Tiryaki H., Yusuf A., Ballikaya S.. Determination of electrical and thermal conductivities of n- and p-type thermoelectric materials by prediction iteration machine learning method. Energy. 2024;292:130597.
84. Furmanchuk A., Saal J. E., Doak J. W., Olson G. B., Choudhary A., Agrawal A.. Prediction of Seebeck coefficient for compounds without restriction to fixed stoichiometry: a machine learning approach. J. Comput. Chem. 2018;39:191-202.
85. Yuan H. M., Han S. H., Hu R., et al. Machine learning for accelerated prediction of the Seebeck coefficient at arbitrary carrier concentration. Mater. Today Phys. 2022;25:100706.
86. Gaultois M. W., Sparks T. D., Borg C. K. H., Seshadri R., Bonificio W. D., Clarke D. R.. Data-driven review of thermoelectric materials: performance and resource considerations. Chem. Mater. 2013;25:2911-20.
87. Gaultois M. W., Oliynyk A. O., Mar A., Sparks T. D., Mulholland G. J., Meredig B.. Perspective: Web-based machine learning models for real-time screening of thermoelectric materials properties. APL Mater. 2016;4:053213.
88. Sheng Y., Wu Y., Yang J., Lu W., Villars P., Zhang W.. Active learning for the power factor prediction in diamond-like thermoelectric materials. npj Comput. Mater. 2020;6:171.
89. Graziosi P., Li Z., Neophytou N.. Bipolar conduction asymmetries lead to ultra-high thermoelectric power factor. Appl. Phys. Lett. 2022;120:072102.
90. Qin G., Wei Y., Yu L., et al. Predicting lattice thermal conductivity from fundamental material properties using machine learning techniques. J. Mater. Chem. A. 2023;11:5801-10.
91. Ren Q., Chen D., Rao L., Lun Y., Tang G., Hong J.. Machine-learning-assisted discovery of 212-Zintl-phase compounds with ultra-low lattice thermal conductivity. J. Mater. Chem. A. 2024;12:1157-65.
92. Wudil Y. S.. Ensemble learning-based investigation of thermal conductivity of Bi2Te2.7Se0.3-based thermoelectric clean energy materials. Results Eng. 2023;18:101203.
93. Tewari A., Dixit S., Sahni N., Bordas S. P. A.. Machine learning approaches to identify and design low thermal conductivity oxides for thermoelectric applications. Data Centric Eng. 2020;1:e8.
94. Al-Fahdi M., Lin C., Shen C., Zhang H., Hu M.. Rapid prediction of phonon density of states by crystal attention graph neural network and high-throughput screening of candidate substrates for wide bandgap electronic cooling. Mater. Today Phys. 2025;50:101632.
95. Dong L., Li W., Bu X. H.. Predicting thermal transport properties in phononic crystals via machine learning. Appl. Phys. Lett. 2024;124:162201.
96. Guo Z., Roy Chowdhury P., Han Z., et al. Fast and accurate machine learning prediction of phonon scattering rates and lattice thermal conductivity. npj Comput. Mater. 2023;9:95.
97. Zhou Z., Cao G., Liu J., Liu H.. High-throughput prediction of the carrier relaxation time via data-driven descriptor. npj Comput. Mater. 2020;6:149.
98. Al-Fahdi M., Rurali R., Hu J., Wolverton C., Hu M.. Accelerating discovery of extreme lattice thermal conductivity by crystal attention graph neural network (CATGNN) using chemical bonding intuitive descriptors. arXiv 2024; arXiv:2410.16066.
99. You H. J., Chiang Y. T., Bansil A., Lin H.. Effects of four-phonon scattering and wave-like phonon tunneling effects on thermoelectric properties of Mg2GeSe4 using machine learning. arXiv 2024; arXiv:2411.10605.
100. Yang Y., Lin Y., Dai S., et al. HH130: a standardized database of machine learning interatomic potentials, datasets, and its applications in the thermal transport of half-Heusler thermoelectrics. Digit. Discov. 2024;3:2201-10.
101. Li Y., Zhang J., Zhang K., Zhao M., Hu K., Lin X.. Large data set-driven machine learning models for accurate prediction of the thermoelectric figure of merit. ACS Appl. Mater. Interfaces. 2022;14:55517-27.
102. Xu Y., Liu X., Wang J.. Prediction of thermoelectric-figure-of-merit based on autoencoder and light gradient boosting machine. J. Appl. Phys. 2024;135:074901.
103. Wang Y., Zhong C., Zhang J., Yao H., Chen J., Lin X.. High-performance stacking ensemble learning for thermoelectric figure-of-merit prediction. Mater. Design. 2025;249:113552.
104. Madavali B., Nagarjuna C., Dewangan S. K., Ahn B., Hong S. J.. Predicting the thermoelectric figure of merit in p-type BiSbTe-based alloys using artificial neural network modeling. Mater. Today Commun. 2024;40:109396.
105. Jia X., Deng Y., Bao X., et al. Unsupervised machine learning for discovery of promising half-Heusler thermoelectric materials. npj Comput. Mater. 2022;8:34.
106. Vaitesswar U. S., Bash D., Huang T., et al. Machine learning based feature engineering for thermoelectric materials by design. Digit. Discov. 2024;3:210-20.
107. Xu Y., Jiang L., Qi X.. Machine learning in thermoelectric materials identification: feature selection and analysis. Comput. Mater. Sci. 2021;197:110625.
108. Fan T., Oganov A. R.. Combining machine learning models with first-principles high-throughput calculation to accelerate the search of promising thermoelectric materials. J. Mater. Chem. C. 2025;13:1439-48.
109. Chen C., Ong S. P.. A universal graph deep learning interatomic potential for the periodic table. Nat. Comput. Sci. 2022;2:718-28.
110. Long Y., Zhong C., Ma X., et al. Inverse design of high-performance thermoelectric materials via a generative model combined with experimental verification. ACS Appl. Mater. Interfaces. 2025;17:19856-67.
111. Zhong C., Zhang J., Wang Y., et al. High-performance diffusion model for inverse design of high Tc superconductors with effective doping and accurate stoichiometry. InfoMat. 2024;6:e12519.