Title: Machine Learning Defect Properties of Semiconductors
Abstract:
Defects and impurities in semiconductors heavily influence their performance in optoelectronic applications. Quick predictions of defect properties are desired for technologically important semiconductors, but they are complicated by the difficulty of assigning measured levels to specific defects and by the expense of large-supercell first-principles computations that involve charge corrections and advanced functionals [1]. We address this issue by combining high-throughput density functional theory (HT-DFT) with machine learning (ML) to develop predictive models for defect formation energies (DFE) and charge transition levels (CTLs) of native defects and functional impurities in Group IV, III-V, and II-VI zinc blende (ZB) semiconductors. By sampling dozens of metastable polymorphs for each defect configuration across thousands of distinct DFT computations, we generate one of the largest known computational defect datasets, containing many types of vacancies, self-interstitials, anti-site substitutions, impurity interstitials and substitutions, and defect complexes [2,3].
Two distinct types of ML methods are applied: (a) random forest, Gaussian process, and neural network regression models based on manual descriptors encoding the defect atom’s elemental properties, coordination environment, and “unit cell” defect data [2,3], and (b) crystal graph-based neural networks (GNNs) trained using entire defective structures as input [4]. For the latter, we use three established GNN techniques: Crystal Graph Convolutional Neural Network (CGCNN) [5], Materials Graph Network (MEGNet) [6], and Atomistic Line Graph Neural Network (ALIGNN) [7]. Root-mean-square errors (RMSE) in predicting DFE are as high as 1 eV with the descriptor-based models, while ALIGNN yields errors of ~0.3 eV or less, which corresponds to a prediction accuracy of about 98% relative to the range of values within the dataset and improves significantly on the state of the art.
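As a rough illustration of how an RMSE can be expressed as an accuracy relative to the range of the data, the sketch below assumes the accuracy is defined as 1 - RMSE / (max - min) of the reference values. That definition is an assumption made here for illustration and is not stated in the abstract; the function names and the toy numbers are hypothetical and are not taken from the dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def range_normalized_accuracy(y_true, y_pred):
    """ASSUMED definition: 1 - RMSE / (max - min) of the reference values."""
    spread = max(y_true) - min(y_true)
    if spread == 0:
        raise ValueError("reference values have zero range")
    return 1.0 - rmse(y_true, y_pred) / spread


# Toy, made-up DFE values in eV (NOT from the actual dataset).
dfe_reference = [0.0, 5.0, 10.0, 15.0]
dfe_predicted = [0.3, 4.7, 10.3, 14.7]

toy_rmse = rmse(dfe_reference, dfe_predicted)
toy_accuracy = range_normalized_accuracy(dfe_reference, dfe_predicted)
```

Under this assumed definition, an RMSE of about 0.3 eV over a 15 eV spread gives an accuracy of about 0.98; a different normalization would give a different figure.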
While the first set of models yields only optimized energies and is based on a smaller dataset of ~1,500 points, the GNN models are trained on >15,000 data points and can be applied to predict accurate unoptimized, partially optimized, or fully optimized DFE values corresponding to any defective structure. The best ML-based DFE predictions for defects in multiple charge states are used to predict the relevant CTLs, with good accuracy compared to DFT and better accuracy than ML models trained directly on the CTL data. Ultimately, the best models are applied to screen hundreds of thousands of hypothetical single defects/dopants and defect complexes to find stable defects that may or may not create energy levels within the band gap and affect the semiconductor’s performance in optoelectronic devices. We also demonstrate that GNN models can be used as an effective surrogate for DFT computations to obtain low-energy defective structures for any semiconductor-defect combination, which is very promising for screening over large chemical spaces without the need for expensive DFT.
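The step from charge-dependent DFE values to CTLs can be sketched as follows. This is a minimal illustration that assumes the standard linear dependence of the formation energy on the Fermi level, E_f(q, E_F) = E_f(q, 0) + q E_F, with E_F measured from the valence band maximum [1]; the function names and the example energies are hypothetical and do not come from the models or dataset described here.

```python
from itertools import combinations


def charge_transition_level(e_form_q1, q1, e_form_q2, q2):
    """Fermi level (relative to the VBM) where charge states q1 and q2
    have equal formation energy.

    Both formation energies are assumed to be evaluated at E_F = 0.
    Solving E1 + q1*E_F = E2 + q2*E_F gives E_F = (E1 - E2) / (q2 - q1).
    """
    if q1 == q2:
        raise ValueError("charge states must differ")
    return (e_form_q1 - e_form_q2) / (q2 - q1)


def all_transition_levels(dfe_by_charge):
    """Return {(q1, q2): level} for every pair of charge states with q1 > q2."""
    charges = sorted(dfe_by_charge, reverse=True)
    return {
        (q1, q2): charge_transition_level(
            dfe_by_charge[q1], q1, dfe_by_charge[q2], q2
        )
        for q1, q2 in combinations(charges, 2)
    }


# Hypothetical predicted DFE values (eV) at E_F = 0 for one defect.
predicted_dfe = {+1: 1.0, 0: 1.5, -1: 2.6}
levels = all_transition_levels(predicted_dfe)
# levels[(1, 0)] is the (+1/0) level, levels[(0, -1)] the (0/-1) level.
```

A level is relevant to device performance only if it falls inside the band gap, so in practice each computed value would be compared against the host semiconductor's band edges.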
REFERENCES
[1] Freysoldt, C. et al. First-principles calculations for point defects in solids. Rev Mod Phys 86, 253–305 (2014).
[2] Mannodi-Kanakkithodi, A. et al. Machine-learned impurity level prediction for semiconductors: the example of Cd-based chalcogenides. NPJ Comput Mater 6, 39 (2020).
[3] Mannodi-Kanakkithodi, A. et al. Universal machine learning framework for defect predictions in zinc blende semiconductors. Patterns 3, 100450 (2022).
[4] Rahman, H., Gollapalli, P., Manganaris, P. & Mannodi-Kanakkithodi, A. Accelerating Defect Predictions using Graph-based Neural Networks. In Prep. (2023).
[5] Xie, T. & Grossman, J. C. Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties. Phys Rev Lett 120, 145301 (2018).
[6] Chen, C., Ye, W., Zuo, Y., Zheng, C. & Ong, S. P. Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals. Chem Mater 31, 3564–3572 (2019).
[7] Choudhary, K. & DeCost, B. Atomistic Line Graph Neural Network for improved materials property predictions. NPJ Comput Mater 7, 185 (2021).