REFERENCES
1. Jumper, J.; Evans, R.; Pritzel, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583-9.
2. Camps-Valls, G.; Fernández-Torres, M. Á.; Cohrs, K. H.; et al. Artificial intelligence for modeling and understanding extreme weather and climate events. Nat. Commun. 2025, 16, 1919.
3. Zhang, D.; Li, H. Digital catalysis platform (DigCat): a gateway to big data and AI-powered innovations in catalysis. ChemRxiv 2024. Available online: https://doi.org/10.26434/chemrxiv-2024-9lpb9 (accessed 10 December 2025).
4. Jia, X.; Zhou, Z.; Liu, F.; et al. Closed-loop framework for discovering stable and low-cost bifunctional metal oxide catalysts for efficient electrocatalytic water splitting in acid. J. Am. Chem. Soc. 2025, 147, 22642-54.
5. Zhang, D.; Li, H. The hidden engine of AI in electrocatalysis: databases and knowledge graphs at work. Molecular Chemistry & Engineering 2025, 1, 100003.
6. Zhang, D.; She, F.; Chen, J.; Wei, L.; Li, H. Why do weak-binding M-N-C single-atom catalysts possess anomalously high oxygen reduction activity? J. Am. Chem. Soc. 2025, 147, 6076-86.
7. Yang, F.; Campos Dos Santos, E.; Jia, X.; et al. A dynamic database of solid-state electrolyte (DDSE) picturing all-solid-state batteries. Nano. Mater. Sci. 2024, 6, 256-62.
8. Yang, F.; Sato, R.; Cheng, E. J.; et al. Data-driven viewpoint for developing next-generation Mg-ion solid-state electrolytes. J. Electrochem. 2024, 30, 2415001.
9. Wang, Q.; Yang, F.; Wang, Y.; et al. Unraveling the complexity of divalent hydride electrolytes in solid-state batteries via a data-driven framework with large language model. Angew. Chem. Int. Ed. Engl. 2025, 64, e202506573.
10. Zhang, D.; Jia, X.; Hung, T. B.; et al. “DIVE” into hydrogen storage materials discovery with AI agents. arXiv 2025, arXiv:2508.13251. Available online: https://doi.org/10.48550/arXiv.2508.13251 (accessed 10 December 2025).
11. Li, C.; Yang, W.; Liu, H.; et al. Picturing the gap between the performance and US-DOE’s hydrogen storage target: a data-driven model for MgH2 dehydrogenation. Angew. Chem. Int. Ed. Engl. 2024, 63, e202320151.
12. Swanson, K.; Wu, W.; Bulaong, N. L.; Pak, J. E.; Zou, J. The virtual lab of AI agents designs new SARS-CoV-2 nanobodies. Nature 2025, 646, 716-23.
13. Zhang, K.; Qi, B.; Zhou, B. Towards building specialized generalist AI with system 1 and system 2 fusion. arXiv 2024, arXiv:2407.08642. Available online: https://doi.org/10.48550/arXiv.2407.08642 (accessed 10 December 2025).
14. Cui, Z.; Li, N.; Zhou, H. A large-scale replication of scenario-based experiments in psychology and management using large language models. Nat. Comput. Sci. 2025, 5, 627-34.
15. Krishnan, A.; Anahtar, M. N.; Valeri, J. A.; et al. A generative deep learning approach to de novo antibiotic design. Cell 2025, 188, 5962-5979.e22.
16. Li, M.; Song, K.; He, J.; et al. Electron-density-informed effective and reliable de novo molecular design and optimization with ED2Mol. Nat. Mach. Intell. 2025, 7, 1355-68.
17. Pacesa, M.; Nickel, L.; Schellhaas, C.; et al. One-shot design of functional protein binders with BindCraft. Nature 2025, 646, 483-92.
18. Ruffolo, J. A.; Nayfach, S.; Gallagher, J.; et al. Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences. Nature 2025, 645, 518-25.
19. Szymanski, N. J.; Rendy, B.; Fei, Y.; et al. An autonomous laboratory for the accelerated synthesis of novel materials. Nature 2023, 624, 86-91.
20. Antunes, L. M.; Butler, K. T.; Grau-Crespo, R. Crystal structure generation with autoregressive large language modeling. Nat. Commun. 2024, 15, 10570.
21. Wang, J.; Qin, R.; Wang, M.; et al. Token-Mol 1.0: tokenized drug design with large language models. Nat. Commun. 2025, 16, 4416.
22. Liu, N.; Jafarzadeh, S.; Lattimer, B. Y.; Ni, S.; Lua, J.; Yu, Y. Harnessing large language models for data-scarce learning of polymer properties. Nat. Comput. Sci. 2025, 5, 245-54.
23. Ding, K.; Yu, J.; Huang, J.; Yang, Y.; Zhang, Q.; Chen, H. SciToolAgent: a knowledge-graph-driven scientific agent for multitool integration. Nat. Comput. Sci. 2025, 5, 962-72.
24. Hu, M.; Ma, C.; Li, W.; et al. A survey of scientific large language models: from data foundations to agent frontiers. arXiv 2025, arXiv:2508.21148. Available online: https://doi.org/10.48550/arXiv.2508.21148 (accessed 10 December 2025).
25. Ghosh, S.; Brodnik, N.; Frey, C.; et al. Toward reliable ad-hoc scientific information extraction: a case study on two materials datasets. In 62nd Annual Meeting of the Association for Computational Linguistics, Findings of the Association for Computational Linguistics: ACL 2024, Bangkok, Thailand and virtual meeting, August 11-16, 2024; Association for Computational Linguistics: Stroudsburg, USA, 2024; pp 15109-23.
26. Daraqel, B.; Owayda, A.; Khan, H.; Koletsi, D.; Mheissen, S. Artificial intelligence as a tool for data extraction is not fully reliable compared to manual data extraction. J. Dent. 2025, 160, 105846.
27. Whang, S. E.; Roh, Y.; Song, H.; Lee, J. Data collection and quality challenges in deep learning: a data-centric AI perspective. The VLDB Journal 2023, 32, 791-813.
28. Wang, F. Y.; Miao, Q. H. Novel paradigm for AI-driven scientific research: from AI4S to intelligent science. Bull. Chin. Acad. Sci. 2023, 38, 536-40.
29. Messeri, L.; Crockett, M. J. Artificial intelligence and illusions of understanding in scientific research. Nature 2024, 627, 49-58.
30. Wang, H.; Fu, T.; Du, Y.; et al. Scientific discovery in the age of artificial intelligence. Nature 2023, 620, 47-60.
31. Jones, N. AI hallucinations can't be stopped - but these techniques can limit their damage. Nature 2025, 637, 778-80.
32. Watson, J. L.; Juergens, D.; Bennett, N. R.; et al. De novo design of protein structure and function with RFdiffusion. Nature 2023, 620, 1089-100.
33. Ingraham, J. B.; Baranov, M.; Costello, Z.; et al. Illuminating protein space with a programmable generative model. Nature 2023, 623, 1070-8.
34. Szymanski, N. J.; Bartel, C. J.; Zeng, Y.; Diallo, M.; Kim, H.; Ceder, G. Adaptively driven X-ray diffraction guided by machine learning for autonomous phase identification. NPJ. Comput. Mater. 2023, 9, 31.
35. Wang, J.; Wang, K.; Yu, Y.; et al. Self-improving generative foundation model for synthetic medical image generation and clinical applications. Nat. Med. 2025, 31, 609-17.
36. Xu, Y.; Liu, X.; Cao, X.; et al. Artificial intelligence: a powerful paradigm for scientific research. Innovation. (Camb). 2021, 2, 100179.
37. Liang, W.; Tadesse, G. A.; Ho, D.; et al. Advances, challenges and opportunities in creating data for trustworthy AI. Nat. Mach. Intell. 2022, 4, 669-77.


