
Farthest point sampling in property designated chemical feature space as an effective strategy for enhancing the machine learning model performance for small scale chemical dataset

Figure 6. t-SNE visualizations of (A) the entire dataset and (B) the subset sampled by FPS, showing how the distribution changes after sampling; (D) and (E) show the corresponding results for RS. Compared with RS, FPS yields a more dispersed distribution and thus greater diversity in the sampled feature space. (C) k-means clustering of the full dataset, with the number of clusters determined by the elbow method. (F) Comparison of the cluster frequency distributions obtained with RS and FPS. t-SNE: t-distributed Stochastic Neighbor Embedding; FPS: farthest point sampling; RS: random sampling.
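
The sampling strategy named in the caption can be illustrated with a minimal sketch of greedy farthest point sampling. This is not the authors' implementation: the Euclidean metric, the random choice of the first point, and the names `farthest_point_sampling` and `min_pairwise_distance` are assumptions made for illustration only.

```python
import numpy as np


def farthest_point_sampling(X, n_samples, seed=0):
    """Greedy FPS sketch: return indices of n_samples rows of X.

    Assumptions (not from the source): Euclidean distance and a
    randomly chosen starting point controlled by `seed`.
    """
    X = np.asarray(X, dtype=float)
    n_points = X.shape[0]
    if not 1 <= n_samples <= n_points:
        raise ValueError("n_samples must be between 1 and the number of points")

    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(n_points))]

    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(X - X[selected[0]], axis=1)

    for _ in range(n_samples - 1):
        # Pick the point farthest from the current selection.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        # Update nearest-selected distances with the new point.
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))

    return np.array(selected)


def min_pairwise_distance(X):
    """Smallest Euclidean distance between any two distinct rows of X."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return d.min()
```

Applied to a feature matrix, the returned indices select a subset whose points are spread across the feature space. Comparing `min_pairwise_distance` on this subset against a random subset of the same size is one simple way to quantify the greater dispersion that the caption attributes to FPS.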

Journal of Materials Informatics
ISSN 2770-372X (Online)

Portico

All published articles are preserved here permanently:

https://www.portico.org/publishers/oae/
