Room temperature ionic liquids viscosity prediction from deep-learning models
1Department of Physics, Michigan Technological University, Houghton, MI 49931, USA.
2Department of Computer Science, San José State University, San José, CA 95192-0249, USA.
3Department of Physics and Astronomy, California State University Northridge, Northridge, CA 91330, USA.
*Correspondence to: Prof./Dr. Kah Chun Lau, Department of Physics and Astronomy, California State University Northridge, Northridge, CA 91330, USA. E-mail:
Ionic liquids (ILs) are a class of solvents with great potential in design-synthesis. They are promising electrolyte candidates in energy storage applications, especially in rechargeable batteries. However, in practice, their usage remains limited due to their unfavorably high viscosity (η) at ambient conditions. To optimize the design synthesis of ILs, a systematic fundamental study of their structure-property relationship is necessary. In this study, we employed a deep-learning (DL) model to predict the room-temperature viscosity of a wide range of ILs that consist of various cationic and anionic families. Based on this DL model, accurate prediction of IL viscosity can be realized, reaching an R2 score of 0.99 with a root mean square error of ~45 mPa·s. To further help identify low- and high-η ILs, a low/high-η binary classification model with an overall accuracy of 93% for test prediction is obtained based on the DL model. From the structure-property relationship analysis governed by the top-ranked molecular descriptors of this model, a list of very low-η ILs (i.e., η < 30 mPa·s) that could be potentially useful in battery electrolytes is identified. The findings of the DL model suggest that, to achieve low η, grafting IL cations into smaller sizes (e.g., smaller head rings) with short alkyl chains and reducing ionization potentials/energies will help. Meanwhile, for the same cations, reducing the sizes, chain lengths, and hydrogen bonds of the anions might further reduce the viscosity. Thus, with a careful selection and molecular grafting of anionic and cationic species in ILs, we believe fine-tuning IL viscosities can be achieved through the proper design synthesis of functional groups in ILs.
In current modern society, increasing energy demand is inevitable and causes a lot of unwanted environmental problems due to the strong dependence on fossil fuels used to generate electricity in various applications. In particular, the consequences of climate change can be dramatic. To be better tailored to this challenge, utilizing and diversifying renewable energy resources is the key to sustainable growth of current society. To achieve this goal, wide applications of electrochemical energy storage technologies (e.g., batteries) are one of the promising strategies because of their several attractive features, including excellent round-trip efficiency, high power flexibility in various grid applications, durable cycle life, etc.[1,2]. Meanwhile, to provide intermittent renewable energy resources (e.g., solar, wind) into the grid and various applications, rechargeable batteries are known to be a practical solution to energy storage technology. They are energy-saving and environmentally friendly and can be utilized on a large scale, supplying the world with clean and sustainable electricity.
Due to their high volumetric energy density[3,4], lithium (Li)-based batteries have been widely used in many portable electronic devices since the 1990s and are now powering many battery electric vehicles. To improve their performance and capacity, there is tremendous progress in the exploration and design-development of new electrode materials in Li-based batteries, especially the cathodes, including Li-O2 and Li-S batteries[4-6]. However, one of the limiting factors remains to be the electrolytes, which inherently govern the current or power density, cycling performance, electrochemical stability, and safety[6-13]. For a working electrolyte, it has to be both a good ionic conductor and an electronic insulator; therefore, ionic liquids (ILs) are one of the promising candidates for electrolytes[7-13].
For rechargeable batteries, the crucial parameters that determine their IL-based electrolytes include the viscosity (η), ionic conductivity, electrochemical stability, and safety[7-15]. Among these key parameters, ionic conductivity is one of the important performance metrics for these novel liquids in energy storage applications, which is usually determined by the viscosity. In general, low η is strongly correlated with high ionic conductivity[14,15] and is an important feature critically needed in battery applications. The viscosity determines the resistance to flow in ILs and is generally governed by several factors: (1) size; (2) shape; and (3) ionic interactions among the constituent anions and cations. These interactions are determined by electrostatic forces, van der Waals forces, and hydrogen bonds, which depend on the molecular structures of ILs[16-20]. In addition, it was also found that the fluidity of ILs is complicated. Besides the motion of free cations or anions, the influences of ion pairs, clusters, or aggregates are significant in ILs[21,22]. Although these factors are known, finding optimum and practical design principles to perfect the related physicochemical properties (e.g., viscosity) of ILs is not trivial.
It is known that ILs are a specific class of molecular electrolytes characterized by the absence of co-solvent in solution due to their unique interplay between electrostatic and van der Waals interactions. However, understanding their unique viscosity features can be challenging. In particular, an accurate measurement of viscosity is never a trivial task. Although viscometers have proven to be an effective means of determining the IL viscosity over a wide range of temperatures and pressures, measuring an extensive selection of ILs at various thermodynamic conditions can be extremely challenging. Meanwhile, to better understand their fundamental properties, advanced molecular dynamics (MD) simulations based on polarizable force fields can describe thermodynamics, structural correlations, ion dynamics, and collective dynamics, including viscosity prediction[17,25-29], with great accuracy. However, their huge computational cost generally limits their use to specific, problem-focused studies. According to Katritzky et al., about 10^18 combinations of cations and anions could be used to form ILs. The large diversity of IL species and their physicochemical properties makes a systematic detailed investigation of these IL compounds exceptionally difficult for both experimental and theoretical studies.
To address this challenge, it is important to develop robust computational tools to benefit the experimental design and synthesis of new ILs with desirable structural properties, such as ILs with practically low η and high ionic conductivity within a room-temperature range. To explore and predict the viscosities of large varieties of ILs, a comprehensive study based on advanced atomistic or molecular simulation can be a challenge and might not be feasible in practice[25-29]. To overcome this challenge, a systematic high-throughput screening supported by a detailed study of a large number of reported IL datasets, with a specific focus on viscosity analysis, has been proposed in recent years[31-38]. As an affordable and predictive way to estimate the viscosity of various ILs, statistical models following the Arrhenius, Litovitz, Andrade, and Vogel-Fulcher-Tammann (VFT) equations using quantitative structure-property relationships (QSPR) are among the practical methods[33-35]. In particular, traditional machine-learning (ML) methods, such as support vector machine (SVM) and least-squares SVM (LSSVM) approaches, are found to be very useful[34,38] when combined with structural data of ILs through the group contribution (GC) theory. In terms of accuracy and prediction capacity, this approach is found to be comparable to classical QSPR methods[33-35].
However, due to the ever-increasing data on ILs in reported literature, a well-timed strategy is to utilize advanced ML methods for a systematic study of various types of IL properties. To accommodate the huge datasets from reported literature, the contemporary deep-learning (DL) models[39-42] are known to outperform traditional statistical methods or ML models because of their capacity to process a huge number of feature properties from big data and intelligent big data analysis in materials science for design and discovery[43,44]. Thus, the development of robust simulation methods that integrate high-throughput screening of huge feature properties of ILs using state-of-the-art data-mining approaches and DL models can be extremely valuable in their fundamental studies. This unique strategy will help us to significantly speed up the exploration and discovery of new ILs from currently known data, complementary to more specific case studies using advanced atomistic simulations. With this as a motivation, we propose to adopt a combined data mining method, chemoinformatic approach, and DL models to high-throughput screen and predict the viscosities for a large variety of ILs, with the hope to benefit the design and development of electrolytes in energy storage applications.
Dataset extraction and preparation
In this work, all the viscosity datasets we used were gathered from the Ionic Liquids Database - ILThermo (v2.0.)[45,46], which is a comprehensive database of thermophysical and thermodynamic properties of ILs in the field. According to the latest update (by 28th Dec 2022), the ILThermo contains 2,732 types of ILs and includes 5,177 compounds, with a total of 870,304 datapoints collected from 4,230 published works of literature. For pure ILs, the database has 2,332 IL systems with 145,602 datapoints related to thermodynamic, thermochemical, and transport properties. For this study, the room-temperature viscosity collected dataset contains 922 types of ILs and includes a wide range of IL families. For this wide range of IL candidates, predicting room-temperature viscosities accurately and understanding their useful structure-property relationship without depending on costly computational resources are important in practice. The details of data extraction, conversion, and post-processing of the ILThermo dataset can be found in our recent work. To account for an extensive description of molecular structure features of each individual IL in the dataset, all the 5,272 molecular descriptors based on chemoinformatic QSPR approach[48,49], e.g., constitutional indices, topological indices, connectivity indices, walk and path counts, etc., were generated based on Dragon7 software.
Prior to the ML study, it is important to remove unnecessary data. During this process, all the molecular descriptor columns with low variance, missing values, or no entries were removed. All the molecular descriptors and viscosity values were normalized using the StandardScaler class from scikit-learn (scikit-learn 1.2.0). To mitigate overfitting, the dimension of the original molecular descriptor matrix was further reduced using the Pearson correlation. This helps to identify important molecular descriptors that exhibit a statistically significant correlation with the viscosity of ILs. To achieve this goal, the molecular descriptors with low correlations (< 0.20) and high correlations (> 0.90) were excluded from the dataset. Through this process, a set of important molecular structure features consisting of 179 molecular descriptors was identified. After the correlation feature selection and normalization of these molecular descriptors, the dataset was randomly split into training and testing datasets with a ratio of 80/20 (i.e., 80% for training and 20% for testing) for the evaluation of our ML models. To improve the accuracy (i.e., R2) of our ML models in regression analysis, we removed some of the outliers (i.e., 13 in total). These outliers are not restricted to particular cationic or anionic species. They are mostly ILs with high η (η ~ 150-2,030 mPa·s) and might not be very useful in battery applications. Generally, all these outliers are large molecules with large molecular weights.
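The filtering, normalization, and splitting steps described above can be sketched in a few lines. This is only an illustrative outline under stated assumptions, not the procedure actually used in this work: it uses NumPy rather than scikit-learn, the function names (`select_descriptors`, `standardize`, `split_80_20`) and the variance tolerance are hypothetical, and the 0.20/0.90 thresholds are applied to the absolute Pearson correlation between each descriptor and the viscosity, which is one possible reading of the text (the upper threshold could alternatively refer to redundancy among descriptors).

```python
import numpy as np

def select_descriptors(X, y, var_tol=1e-8, r_low=0.20, r_high=0.90):
    """Return the column indices kept after the variance and Pearson filters."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = []
    for j in range(X.shape[1]):
        col = X[:, j]
        # Drop columns with missing values or (near-)zero variance.
        if np.isnan(col).any() or col.var() <= var_tol:
            continue
        # Absolute Pearson correlation between descriptor j and viscosity.
        r = abs(np.corrcoef(col, y)[0, 1])
        # Assumed reading: keep only 0.20 <= |r| <= 0.90.
        if r_low <= r <= r_high:
            keep.append(j)
    return keep

def standardize(A):
    """Zero-mean, unit-variance scaling per column (population std)."""
    A = np.asarray(A, dtype=float)
    return (A - A.mean(axis=0)) / A.std(axis=0)

def split_80_20(n_samples, seed=0):
    """Random 80/20 split of sample indices into training and testing sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    cut = int(round(0.8 * n_samples))
    return idx[:cut], idx[cut:]
```

With the 909 ILs that remain after removing the 13 outliers from the 922 collected ILs, `split_80_20(909)` yields 727 training and 182 testing samples. Because the Pearson correlation is invariant to scaling, the order of the standardization and correlation-filter steps does not change which descriptors are selected.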
Machine-learning model: deep-learning
In this work, we have considered two types of DL models, i.e., deep neural networks (DNN) and convolutional neural networks (CNN), which are based on the algorithms implemented in TensorFlow (i.e., TensorFlow 2), a popular ML framework that provides a high-level python API to construct and train DL models. DL is a subfield of ML and is an artificial neural network (ANN) that is essentially represented by multiple layers of neural networks. While a neural network with a single layer can still make approximate predictions, additional hidden layers can help to optimize and refine for accurate prediction[39-41]. These neural networks attempt to mimic how the human brain processes information progressively with higher-level features from large data and are able to develop a hierarchy of learning processes based on a set of algorithms defined within each layer.
For a DL model, a multi-layer feedforward neural network is constructed with multiple hidden layers, of which each layer contains predefined numbers of neurons to capture the non-linear relationship between the input features (e.g., molecular descriptors of ILs) and the output viscosities. The network is trained using stochastic gradient descent with backpropagation to minimize the mean squared error between the predicted and actual viscosity values. Figure 1 shows a general structure of our DNN and CNN models, which consists of multiple layers, with each layer containing a different number of neurons. As shown in Figure 1A, our DNN model consists of one input layer (179 neurons), 1st hidden layer (128 neurons), 2nd hidden layer (64 neurons), 3rd hidden layer (32 neurons), 4th hidden layer (16 neurons), and output layer (1 neuron). In contrast, our CNN model [Figure 1B] is a neural network with basic building blocks/layers consisting of tensors except for the output layer. Specifically, Figure 1B features one input layer represented by 1D convolutional layer with tensor (179, 1), 1st hidden layer based on 1D convolutional layer with tensor (177, 32), 2nd hidden layer as flattened layer (5,664 neurons), 3rd hidden layer (180 neurons), 4th hidden layer (128 neurons), 5th hidden layer (64 neurons), 6th hidden layer (32 neurons), 7th hidden layer
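The layer sizes quoted above can be cross-checked with a little arithmetic. The sketch below is not the code used in this work. It assumes fully connected layers with bias terms and, for the convolutional layer, a kernel size of 3 with unit stride and no padding; that kernel size is inferred only from the reported shapes (179 to 177) and is not stated in the text. The function names are illustrative.

```python
def dense_parameter_count(layer_sizes):
    """Trainable parameters of a fully connected stack (weights plus biases)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def conv1d_output_length(n_in, kernel_size, stride=1):
    """Output length of a 1D convolution without padding."""
    return (n_in - kernel_size) // stride + 1

# DNN of Figure 1A: 179 inputs, hidden layers of 128/64/32/16 neurons, 1 output.
dnn_layers = [179, 128, 64, 32, 16, 1]
dnn_parameters = dense_parameter_count(dnn_layers)

# CNN of Figure 1B: 179 descriptors passed through 32 filters.
conv_length = conv1d_output_length(179, kernel_size=3)  # assumed kernel size
flattened_size = conv_length * 32                        # 32 filters, flattened
```

Under these assumptions the convolution output has length 177 and the flattened layer has 177 × 32 = 5,664 neurons, matching the tensor shapes reported for the CNN.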
RESULTS AND DISCUSSION
Data consolidation and evaluation
In this work, we are only interested in the viscosity of pure ILs that are close to the room temperature (i.e.,
In this dataset, a wide diversity of structures and functional groups in ILs with various combinations of cation (e.g., imidazolium, ammonium, pyridinium, phosphonium, pyrrolidinium, etc.) and anion (e.g., bistriflimide (NTF2) derivatives, sulfonate, phosphate, hexafluorophosphate, borate, sulfate, acetate, dicyanamide, triazolide, etc.) families can be found, as highlighted in Figure 2. From this large dataset (922 types of ILs), a wide distribution of measured room-temperature η can be found, with viscosities varying from 2 to 97,000 mPa·s depending on the combination of anionic and cationic families. Although the imidazolium-based cation is the dominant cationic family in this dataset [Figure 2], it covers a wide variety of ILs. The 320 types of ILs consisting of imidazolium-related cations exhibit a wide range of viscosity values, i.e., spanning from 1-ethyl-1H-imidazolium acetate to 1-(2-cyanoethyl)-3-(phenylmethyl)-1H-imidazolium chloride with η ~ 4-69,000 mPa·s. Meanwhile, only 77 of these 320 imidazolium-based ILs have a viscosity < 50 mPa·s, as shown in Figure 2. Similarly, a wide range of η ~ 12-20,100 mPa·s is also found among the dominant NTF2 anionic species
Viscosity prediction from deep-learning (DL) models
To measure the accuracy of the predictions from the DL models (i.e., DNN and CNN) described in Section "Machine-learning model: deep-learning", three metrics, namely the squared correlation coefficient (R2), the root mean square error (RMSE), and the mean absolute error (MAE), were used to assess the performance of the DL models on the test dataset. Based on the selected 179 molecular descriptors for each IL (Section "Dataset extraction and preparation"), a summary of the performance of the DL models on test dataset prediction can be found in Figure 3 and Table 1. As shown in Figure 3, a linear regression with a high R2 value [Table 1] can be found. Consistent with the high R2 value, the predicted viscosity values from both DNN and CNN models are scattered closely along the diagonal line, with only small deviations from the experimentally reported viscosity values at room temperature.
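For reference, the three metrics can be computed as in the minimal sketch below. It assumes that R2 is evaluated as the coefficient of determination, 1 − SS_res/SS_tot, which is how it is commonly computed for regression models; the text does not state which definition was used, and the squared Pearson correlation coefficient would give a slightly different number in general. The function name is illustrative.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired measured and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination (assumed)
    rmse = math.sqrt(ss_res / n)        # same unit as the viscosity (mPa·s)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae
```

For example, measured values of 10, 20, 30, and 40 mPa·s with predictions of 12, 18, 33, and 39 mPa·s give R2 = 0.964, RMSE ≈ 2.12 mPa·s, and MAE = 2.0 mPa·s. Because RMSE and MAE carry the unit of the target, values obtained on differently scaled targets are not directly comparable.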
Figure 3. The DL model predicted viscosity values vs. experimentally measured viscosity values (both in mPa·s unit) in the test dataset, with (A) representing the DNN model and (B) representing the CNN model separately. The dataset used consists of 909 types of ILs with 179 molecular descriptors for each IL.
Table 1. Comparison of the performance metrics for test predictions of IL viscosities obtained in this work with reported literature. ANN is a feedforward artificial neural network, and GC/FFANN-LSSVM is a combined two-layer feedforward artificial neural network and least-squares support vector machine based on a group contribution scheme for ILs. MTL-TransCNN is a multi-task learning model based on transformer convolutional neural networks using QSPR models. GC/SVM is a support vector machine model using a newly improved scheme of group contributions of ILs
|Reference||ML model||Number of distinct ILs||T (K)||R2||RMSE||MAE|
|This work||DNN||909||298 ± 5||0.9869||63.78||41.68|
|This work||CNN||909||298 ± 5||0.9980||45.27||30.42|
|Beckner et al., 2018||ANN||723||273.15-373.15||0.9290a||0.6856||N/A|
|Paduszyński et al., 2019||GC/FFANN-LSSVM||1,596||290-410||0.9120b||203.43||N/A|
|Baskin et al., 2022||MTL-TransCNN||988||298||0.690c||0.40||0.28|
|Baskin et al., 2022||MTL-TransCNN||988||288-343||0.674d||0.375||0.265|
|Boualem et al., 2022||GC/SVM||1,654||253-571||0.9859e||57.92||N/A|
As shown in Table 1, we found that the accuracy of viscosity prediction is, in general, dictated by the sample size of distinct ILs (n), ML models, and the representation of structure features or molecular descriptors of ILs. Compared to the reported literature[34,36-38], the test R2 score and RMSE for the prediction of viscosity around room temperature (T ~ 298 K) reported in this work are found to be outstanding. Among the DL models we considered, the performance metrics obtained from CNN models are found to be among the best in reported literature, especially for the prediction of room-temperature viscosities (i.e., R2 ~ 0.9980, RMSE ~ 45.27, MAE ~ 30.42) [Table 1]. For the DNN model, the prediction accuracy is also found to be excellent, i.e., R2 ~ 0.9869, RMSE ~ 63.78, MAE ~ 41.68 [Table 1], compared to the reported literature, especially for the room-temperature viscosity prediction.
To further examine the influences of molecular descriptors on the performance of DL models, the top important molecular descriptors for these models were computed, and the top 20 most important molecular descriptors can be found in Supplementary Table 1. Based on Pearson correlation, these 20 descriptors are found to have important impacts on the room-temperature viscosity of ILs obtained from DNN and CNN models. For DNN models, the performance of metrics is found to be nearly similar (i.e., R2 ~ 0.988,
Prior to our study, a systematic study of IL viscosity prediction across a wide range of IL families based on the ILThermo dataset[45,46] has been reported by Beckner et al. Based on the dataset, consisting of 723 distinct ILs for temperature range 273.15-373.15 K, pressure range of 60-160 kPa, and η range of 3.5-993 mPa·s, a reasonably good model of viscosity prediction can be obtained based on traditional ML models. According to Beckner et al., high accuracy of viscosity prediction (R2 ~ 0.93, RMSE ~ 0.69 mPa·s) [Table 1] can be achieved based on feedforward ANN (FFANN). For that work, the result is based on 11 molecular descriptors selected by the least absolute shrinkage and selection operator (LASSO) from 633 molecular structure descriptors of ILs using the QSPR model.
The ability to predict the viscosities of a wide range of ILs accurately using the QSPR model has inspired further studies of physicochemical properties of other ILs in combination with diverse ML methods and molecular representations derived from QSPR models. To achieve this goal, a large-scale benchmark study of QSPR models combining several ML methods (e.g., random forest regression, RFR; extreme gradient boosting, XGBoost; TransCNN, etc.) with different types of molecular representations to predict several key physical properties of ILs (i.e., electrical conductance, density, refractive index, melting point, viscosity, and surface tension) was reported by Baskin et al. recently. As shown by Baskin et al., it is possible to predict N different properties of ILs at the same condition simultaneously or the same property under N different conditions (e.g., temperature) simultaneously based on multi-task learning (MTL) models. From this MTL model, it was found that the accuracy, unfortunately, is low for viscosity prediction (R2 = 0.69, RMSE = 0.40, MAE = 0.28) [Table 1] when making a prediction for five physical properties (i.e., electrical conductance, density, surface tension, viscosity, and refractive index) at T = 298 K during the test prediction. Whereas for MTL models in making a simultaneous prediction for viscosities at different temperatures within the range of T = 288-343 K, the performance metrics were found to be nearly similar (i.e., R2 = 0.674, RMSE = 0.375, MAE = 0.265) [Table 1] during the cross-validation test prediction. Thus, this suggests that while more robust than a single-task QSPR model that only manages to predict one physical property at room temperature, how to systematically improve the accuracy of an MTL model capable of providing simultaneous predictions of several room-temperature properties or predicting a property at several different temperatures remains a challenge.
In addition to QSPR models adopted by Beckner et al., one of the best approaches reported in the literature was based on the GC method developed by Paduszyński et al.[33,34,36]. In the GC method, the complex functional groups used to describe a diverse family of ILs are represented by fragmenting their cations and anions into a set of predefined molecular groups or fragments. Each group has a direct influence on the property value as a function of temperature (e.g., η(T) = η0f(T), where η0 denotes the viscosity at a reference temperature, T0 = 298.15 K, and f(T) describes the temperature dependence), and the resulting representation can be combined with ML algorithms (e.g., FFANN; stepwise multiple linear regression; LSSVM) for a systematic study. Based on an extensive IL dataset, the FFANN is found to be the best ML model to predict the viscosity at room temperature (η0) with test accuracy of
To improve upon the GC method inspired by Paduszyński et al.[33,55-57], a newly improved GC fragmentation scheme has been proposed by Boualem et al. recently. This new GC scheme is capable of defining a large variety of ILs and is able to differentiate among the isomers. According to this scheme, the molecular structure of each IL can be considered as a collection of three separate groups, i.e., cationic, anionic, and substituent groups. The cationic and anionic groups are the constituent groups, representing charged components/fragments of ions, whereas the substituent groups are those representing neutral components/fragments of various types of side chains[33,38,58]. With this new approach, it is possible to overcome the limitation related to structural representation that is not sufficiently described in a conventional GC scheme[33,34]. With this new fragmentation scheme combined with SVM, the overall accuracy of the test prediction of η(T) reported by Boualem et al. was impressive, i.e., R2 ~ 0.9859 with RMSE ~ 57.92 mPa·s, despite being the largest dataset considered to date [Table 1]. Thus, this suggests that besides the ML model itself, a systematic classification and robust identification of molecular structural features of ILs are also critical factors in improving the accuracy of the model.
Low- and high-viscosity binary classification prediction from deep-learning model
In addition to the room-temperature viscosity prediction (Section "Viscosity prediction from deep-learning (DL) models"), it is also necessary to investigate the important structure-property relationship that determines the low- or high-η of various ILs at room temperature based on a robust binary classification of ML models. It is noteworthy that ILs with extremely low η (< 2-3 mPa·s) at room temperature, comparable to traditional solvents (e.g., water, ethanol), are very rare in the ILThermo (v2.0.) database[45,46], i.e., only six candidates (i.e., from 2-methylpyridinium acetate to N-methyl-2-oxopyrrolidinium butanoate) out of 922 types of ILs are found in this work. For the ILs with η below ~10 mPa·s at room temperature, only about 19 types of ILs can be found (e.g., 2-methylpyridinium acetate, trihexylammonium hexanoate, 1-ethyl-3-methyl-1H-imidazolium tricyanomethanide, etc.). Such a limited dataset makes ML model prediction very difficult, especially for the extremely low-η ILs (< 2-3 mPa·s or ≤ 10 mPa·s). According to a recent comprehensive review of room-temperature viscosities of ILs by Jiang et al., typical room-temperature low-to-medium viscosities of ILs were commonly found below 100 mPa·s, whereas for high-η, the η is generally larger than 100 mPa·s. In addition, it is also found that the most commonly used low-η ILs in metal-ion batteries at room temperature are within the range of η ~ 19-156 mPa·s. With this as a reference[18,59], all the ILs studied in this work can be classified into two groups, i.e., low-η ILs
Figure 4. (A) The schematic plot of the room-temperature viscosity distribution (in log(mPa·s) scale) that classifies the low-η
To further identify the hidden correlation of the structure-property relationship of these low-η and high-η ILs, a DL model based on a DNN model is constructed [Supplementary Figure 2], which is similar to the DNN model discussed in Section "Machine-learning model: deep-learning". Based on the 179 selected molecular descriptors (M = 179 in Table 2), similar to our aforementioned viscosity prediction model (Section "Viscosity prediction from deep-learning (DL) models"), the overall accuracy of the low/high-η binary classification reaches 93% for test prediction, even with a skewed dataset (i.e., ~58.6% high-η ILs relative to ~41.4% low-η ILs). To better evaluate the performance of this DNN model in low/high-η binary classification, the positive precision (positive predictive value, PPV) (i.e., the accuracy of positive prediction/high-η prediction), negative precision (negative predictive value, NPV) (i.e., the accuracy of negative prediction/low-η prediction), recall for sensitivity of positive class (i.e., true positive rate, TPR), and recall for specificity of negative class (i.e., true negative rate, TNR) are used. These metrics are defined in the following equations:
The DNN model performance metrics for test predictions of low/high-η binary classification for 922 types of ILs obtained in this work with different numbers (M) of molecular descriptors for each IL. In this table, the negative case is low-η, while the positive case is high-η. The dataset of low/high-η binary classification for 922 types of ILs with the top 20 molecular descriptors can be found in File2.csv in the Supplementary Material
|M = 179||precision||recall||M = 20||precision||recall|
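positive precision (PPV) = $\dfrac{TP}{TP + FP}$
negative precision (NPV) = $\dfrac{TN}{TN + FN}$
recall (sensitivity, TPR) = $\dfrac{TP}{TP + FN}$
recall (specificity, TNR) = $\dfrac{TN}{TN + FP}$
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.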
Based on the computed confusion matrix [Supplementary Figure 3], all the values of TP, TN, FP, and FN can be obtained. Based on the test sample (i.e., 20% of the dataset), the overall performance metrics of the DNN model [Supplementary Figure 2] in low/high-η binary classification are shown in Table 2. Despite the skewed dataset, balanced negative (NPV ~ 91%) and positive (PPV ~ 95%) precisions were found [Table 2] with our DNN model. For positive (i.e., high-η) prediction, the TPR of 92% is slightly lower than the TNR (i.e., 94%), even though the positive class (high-η) has a larger sample size. To further examine the influence of the number of molecular descriptors on the DNN model performance in low/high-η binary classification, the top-ranking important molecular descriptors were computed. From the feature filtering and ranking scores [Figure 5] of the DNN models, the correlation among the top 20 molecular descriptors that are important in low/high-η binary classification of ILs was obtained using Pearson correlation. Based on the DNN model with only the top 20 molecular descriptors [Figure 5], the overall accuracy of the low/high-η binary classification is 90% for test prediction, slightly lower than the 93% test accuracy obtained with the 179 selected descriptors. As shown in Table 2, the performance metrics of the DNN model with M = 20 (i.e., PPV = 89%, NPV = 92%, TPR = 94%, TNR = 86%) are lower in PPV and TNR than those obtained with M = 179 (95% and 94%), although NPV and TPR are marginally higher (compared with 91% and 92%). This suggests that the top 20 important molecular descriptors might not be sufficient to achieve the high accuracy anticipated in low/high-η binary classification prediction according to our current DNN model.
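As an illustration of how these quantities follow from the confusion matrix, the sketch below counts TP, TN, FP, and FN from binary labels and evaluates the four metrics together with the overall accuracy. Following the convention used here, label 1 denotes the positive (high-η) class and label 0 the negative (low-η) class. The function name and the toy labels in the usage note are hypothetical and are not data from this work.

```python
def binary_classification_metrics(y_true, y_pred):
    """PPV, NPV, TPR, TNR, and accuracy for labels 1 (high-η) and 0 (low-η)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "PPV": tp / (tp + fp),       # positive precision
        "NPV": tn / (tn + fn),       # negative precision
        "TPR": tp / (tp + fn),       # recall (sensitivity)
        "TNR": tn / (tn + fp),       # recall (specificity)
        "accuracy": (tp + tn) / len(pairs),
    }
```

For the toy labels `y_true = [1, 1, 1, 0, 0, 1, 0, 1]` and `y_pred = [1, 1, 0, 0, 1, 1, 0, 1]`, the counts are TP = 4, TN = 2, FP = 1, and FN = 1, giving PPV = TPR = 0.8, NPV = TNR = 2/3, and an accuracy of 0.75. The sketch does not guard against empty classes, where a denominator would be zero.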
Figure 5. A Pearson correlation matrix comparing the viscosity with the top 20 important molecular descriptors for low/high-η binary classification of ILs based on DNN models. The numbers shown in the Pearson correlation matrix are the feature correlation coefficients among different molecular descriptors. A brief description of these molecular descriptors can be found in Supplementary Table 2.
Important molecular descriptors that determine viscosities of ILs
According to the DNN model, the top 20 important molecular descriptors that determine the low/high-η of ILs are highlighted in both Figure 5 and Supplementary Table 2. From Supplementary Tables 1 and 2, some common important molecular descriptors of ILs can be found in the findings of viscosity prediction
To improve our basic understanding on how these top ranked molecular descriptors
Figure 6. The distribution of the IL viscosity that distinguishes the low-η (represented as 0) and high-η (represented as 1) using the combination of two important molecular descriptors selected from the Pearson correlation matrix: (A) CATS2D_03_DA vs. MPC10; (B) P_VSA_ppp_L vs. ATSC7i; (C) CATS2D_03_DA vs. piPC10; and (D) ATSC7i vs. ATS8m. The yellow region highlights the dominance of high-η ILs, whereas the green region highlights the prevalence of low-η ILs in the distribution. Most of the ILs with very low η (η < 30 mPa·s), which could be potentially useful in battery electrolytes, are found in the red dotted region. The complete list of these potentially useful ILs (η < 30 mPa·s) can be found in File1.csv in the Supplementary Material.
MPC10 is a topological descriptor, the molecular path count (MPC) of order ten, which counts the total number of molecular paths of length m (here, m = 10). The length m of a path is the number of edges along it and is related to the path order in an IL[48,49,62]. This is complementary to piPC10, which quantifies multiple path counts. piPC10 is a descriptor that can capture bond order features (e.g., aromatic bonds) and belongs to the path count descriptor group[48,49,63]. It is a count of weighted paths of a given length (here, ten) in the molecular graph, where each path is weighted by the product of the conventional bond orders of the involved edges and, therefore, can account for multiple bonds in an IL[48,49,63]. Large values of MPC10 imply a high degree of molecular branching or the presence of long chain branches (e.g., 17-hydroxy-N-(17-hydroxy-3,6,9,12,15-pentaoxaheptadec-1-yl)-N-methyl-N-tetradecyl-3,6,9,12,15-pentaoxaheptadecan-1-aminium methyl sulfate) in ILs, whereas larger values of piPC10 imply a significant presence of multiple bonds (e.g., single, double, aromatic bonds) in molecular branches of ILs (e.g., L-phenylalanine benzyl ester bis(perfluoroethylsulfonyl)imide). For ILs consisting of anionic and cationic species, this suggests that despite large molecular branching in IL cations, a reduced chain length or a minimal number of hydrogen-bond acceptor-donor pairs in IL anions (e.g., the low-viscosity trioctylammonium butanoate with η ~ 13 mPa·s, MPC10 = 3.1, CATS2D_03_DA = 0 in Figure 6A) helps reduce the IL viscosities, analogous to dicyanamide-based ILs[18,64].
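To make the path-count idea concrete, the sketch below enumerates simple paths of a given length in a small molecular graph, either unweighted (as in a molecular path count) or weighted by the product of bond orders (as in a multiple path count). It is a toy illustration and not the Dragon implementation: the adjacency-list representation and the function name are assumptions, and any transformation or normalization that the software applies to the raw counts is not reproduced here.

```python
def count_paths(adjacency, length, bond_orders=None):
    """Count simple paths with `length` edges in an undirected graph.

    adjacency   : dict mapping each atom index to a list of bonded neighbours
    bond_orders : optional dict mapping frozenset({i, j}) to a bond order; if
                  given, each path is weighted by the product of its bond orders
    """
    def extend(atom, visited, remaining, weight):
        if remaining == 0:
            return weight
        total = 0.0
        for nb in adjacency[atom]:
            if nb not in visited:  # simple paths only: no atom is revisited
                w = bond_orders[frozenset((atom, nb))] if bond_orders else 1.0
                total += extend(nb, visited | {nb}, remaining - 1, weight * w)
        return total

    # Each undirected path is found once from each end, hence the factor 1/2.
    return sum(extend(a, {a}, length, 1.0) for a in adjacency) / 2

# Carbon skeletons of n-butane (linear) and isobutane (branched).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

In this toy case, the linear skeleton has three, two, and one paths of lengths 1, 2, and 3, whereas the branched skeleton has three, three, and zero, so branching shifts the counts toward short paths while long chains contribute paths of high order. A path of length ten requires a chain of at least eleven connected atoms.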
The P-VSA-like descriptors are molecular descriptors that define the amount of van der Waals surface area (VSA) having a property in a certain range. According to P-VSA-based models[48,49,65], P_VSA_ppp_L is the descriptor measuring the lipophilic potential pharmacophore points, an important factor for estimating the level of lipophilicity; highly lipophilic molecules tend to be hydrophobic or less polar. ATSC7i, in contrast, is a 2D autocorrelation descriptor [Supplementary Table 2] based on the autocorrelation of a topological structure (ATS), which describes how a property is distributed along the topological structure[48,49,66]. Specifically, ATSC7i is the centered Broto-Moreau autocorrelation of lag seven, weighted by the ionization potential, so it accumulates the ionization-potential contributions associated with a path length (lag) of seven in the molecular graph. Smaller values of ATSC7i imply a strong tendency toward the formation of cations. Therefore, Figure 6B shows that small values of both P_VSA_ppp_L and ATSC7i tend to yield low-η, highly ionic ILs (e.g., N-methyl-2-oxopyrrolidinium acetate with η ~ 2.4 mPa·s, P_VSA_ppp_L = 13.9, ATSC7i = 0 in Figure 6B). In contrast, for high-η ILs (e.g., trihexyl(tetradecyl)phosphonium chloride with
From the general trend of the molecular descriptors for low/high-η ILs highlighted in Figure 6A-D, a set of useful design rules for tuning the IL viscosity through molecular grafting can be obtained. Specifically, for low-η ILs, small values of all these important descriptors, i.e., CATS2D_03_DA, MPC10, piPC10, P_VSA_ppp_L, ATSC7i, and ATS8m, are generally preferred (i.e., the red dotted region in Figure 6A-D and File1.csv in the Supplementary Material). From this observation, a list of 96 IL candidates with very low viscosities (i.e., below ~30 mPa·s), which might be potentially useful electrolytes in battery applications, can be identified [File1.csv in the Supplementary Material]. The observation [Figure 6] suggests that, in order to reduce the IL viscosity, grafting IL cations into smaller sizes (e.g., smaller head rings) and short alkyl chains and reducing ionization potentials/energies will help. Meanwhile, for the same cations, reducing anions in size, chain length, and hydrogen bonds [e.g., TFSI-, N(CN)2-] might further reduce the viscosities. Thus, to fine-tune the IL viscosity, a synergistic effect that achieves an optimum design of both cations and anions in ILs is deemed necessary.
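As an illustration of how a centered autocorrelation descriptor such as ATSC7i responds to the distribution of an atomic property, the sketch below evaluates one common textbook form: the sum, over atom pairs at a given topological distance (lag), of the product of their mean-centered property values. The function names, the four-atom chain, and the weights are placeholder assumptions (the weights are not real ionization potentials), and the exact convention used by descriptor software (e.g., absolute values or scaling) may differ.

```python
from collections import deque
from typing import Dict, List


def topological_distances(adjacency: Dict[int, List[int]], source: int) -> Dict[int, int]:
    """Breadth-first search: shortest path length (in bonds) from `source` to each atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def centered_autocorrelation(
    adjacency: Dict[int, List[int]], weights: Dict[int, float], lag: int
) -> float:
    """Sum of (w_i - mean)(w_j - mean) over atom pairs i < j separated by `lag` bonds."""
    mean = sum(weights.values()) / len(weights)
    total = 0.0
    for i in adjacency:
        for j, d in topological_distances(adjacency, i).items():
            if i < j and d == lag:  # count each unordered pair once
                total += (weights[i] - mean) * (weights[j] - mean)
    return total


# Toy four-atom chain 0-1-2-3 with placeholder atomic weights (assumption)
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
placeholder_weights = {0: 1.0, 1: 2.0, 2: 3.0, 3: 6.0}
for lag in (1, 2, 3):
    print(lag, centered_autocorrelation(chain, placeholder_weights, lag))
```

In this form, the sum vanishes when no atom pairs are separated by the chosen lag, so a molecule too small to contain pairs seven bonds apart would give a lag-seven value of zero.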
ILs are a new group of solvents with great potential in design synthesis. They are promising electrolytes in energy storage applications, especially in rechargeable batteries. Variations in the viscosity of ILs are known to affect their transport properties, e.g., ionic conductivity and charge transfer rate, which are extremely important factors in the development of novel electrolytes for energy storage applications[14,15,59]. However, in practice, the use of ILs remains limited due to their unfavorable viscosity at ambient conditions. To optimize their design synthesis, a systematic fundamental study of structure-property relationships in ILs is deemed necessary. With this as our motivation, we pursued a baseline study to investigate the trend of room-temperature viscosities for various types of ILs. To search for insights that will help fine-tune the viscosity of ILs, this work adopts an integrated approach that combines high-throughput screening of a large IL dataset, chemoinformatics, and DL models.
Based on the dataset obtained from ILThermo (v2.0)[45,46], we have constructed a robust DL model (Section "Viscosity prediction from deep-learning (DL) models") to predict the viscosity of ILs at ambient conditions with high accuracy. We achieved R2 scores of ~0.9869 (RMSE ~ 63.78 mPa·s) and 0.9980
Despite a large deviation in viscosity distributions and a huge variety of anion-cation combinations presented in the IL dataset, we found the viscosities of various ILs can be determined based on a limited number of important molecular descriptors. From the top ranked molecular features/descriptors obtained from DL models (Section "Low- and high-viscosity binary classification prediction from deep-learning model"), the molecular features, i.e., the geometrical structures, shapes or branching characters, constitutional molecular weights or sizes, partial charges, functional groups, local bonds, hydrogen bonding, and van der Waals interactions, related to both anions and cations of ILs are found equally important. Similarly, based on the DL model, an important structure-property relationship governed by a set of important molecular descriptors (e.g., ATSC7i, ATS8m, MPC10, P_VSA_ppp_L, etc. in Figures 5 and 6) of the cations and anions of ILs in defining the low/high-η ILs can be found (Section "Important molecular descriptors that determine viscosities of ILs"). The analysis of DL model prediction suggests that in order to reduce the IL viscosity, grafting IL cations into smaller sizes (e.g., smaller head rings) and short alkyl chains and also reducing ionization potentials/energies will help. Meanwhile, for the same cations, further reducing anions in size, chain length, and hydrogen bonds (e.g., TFSI-, N(CN)2-) might be useful to further help reduce the viscosity of ILs. This is supported by a list of potentially useful ILs with very low η (i.e.,
Thus, with a fine selection and molecular grafting of anionic and cationic species in ILs, we believe the design synthesis of appropriate molecular functional groups in ILs is vital to fine-tune the viscosity for potentially useful electrolytes. However, fine-tuning the IL viscosity alone is insufficient to obtain robust ILs for practical battery applications. To identify robust IL-based electrolytes, the cations and anions must be designed synergistically so that other important physicochemical properties (e.g., redox stability, salt concentration, ionic conductivity, melting/boiling points, thermal conductivity, heat capacity) relevant to specific energy storage applications are fulfilled simultaneously. This will require further development of more sophisticated multi-modal DL models.
Conceptualization; supervision; project administration; funding acquisition: Lau KC
Methodology: Acar Z, Nguyen P, Cui X, Lau KC
Formal analysis: Acar Z, Cui X, Lau KC
Data curation: Nguyen P
Writing-original draft preparation; writing-review and editing: Cui X, Lau KC
Visualization: Acar Z, Nguyen P, Lau KC

Availability of data and materials
The data supporting our work can be found in the Supplementary Material.

Financial support and sponsorship
This work was supported by the U.S. Research Corporation for Science Advancement (RCSA) through the Cottrell Scholar Award (Award No. 26829).

Conflicts of interest
All authors declared that there are no conflicts of interest.

Ethical approval and consent to participate
Not applicable.

Consent for publication

© The Author(s) 2023.

Supplementary Materials
1. Goodenough JB. Electrochemical energy storage in a sustainable modern society. Energy Environ Sci 2014;7:14-8.
2. Trahey L, Brushett FR, Balsara NP, et al. Energy storage emerging: a perspective from the joint center for energy storage research. Proc Natl Acad Sci USA 2020;117:12550-7.
3. Aurbach D, Markevich E, Salitra G. High energy density rechargeable batteries based on Li metal anodes. The role of unique surface chemistry developed in solutions containing fluorinated organic co-solvents. J Am Chem Soc 2021;143:21161-76.
4. Li Q, Chen J, Fan L, Kong X, Lu Y. Progress in electrolytes for rechargeable Li-based batteries and beyond. Green Energy Environ 2016;1:18-42.
5. Lu J, Lee YJ, Luo X, et al. A lithium-oxygen battery based on lithium superoxide. Nature 2016;529:377-82.
6. Ponnada S, Kiai MS, Gorle DB, Nowduri A. History and recent developments in divergent electrolytes towards high-efficiency lithium-sulfur batteries - a review. Mater Adv 2021;2:4115-39.
7. Watanabe M, Thomas ML, Zhang S, Ueno K, Yasuda T, Dokko K. Application of ionic liquids to energy storage and conversion materials and devices. Chem Rev 2017;117:7190-239.
9. Zhang J, Sun B, Zhao Y, et al. A versatile functionalized ionic liquid to boost the solution-mediated performances of lithium-oxygen batteries. Nat Commun 2019;10:602.
10. Josef E, Yan Y, Stan MC, et al. Ionic liquids and their polymers in lithium-sulfur batteries. Isr J Chem 2019;59:832-42.
11. Ortiz-Martínez V, Gómez-Coma L, Pérez G, Ortiz A, Ortiz I. The roles of ionic liquids as new electrolytes in redox flow batteries. Sep Purif Technol 2020;252:117436.
12. Giffin GA. The role of concentration in electrolyte solutions for non-aqueous lithium-based batteries. Nat Commun 2022;13:5250.
13. Cao X. Important factors for the reliable and reproducible preparation of non-aqueous electrolyte solutions for lithium batteries. Commun Mater 2023;4:10.
14. Martin S, Pratt HD 3rd, Anderson TM. Screening for high conductivity/low viscosity ionic liquids using product descriptors. Mol Inform 2017;36:1600125.
15. Tiago GAO, Matias IAS, Ribeiro APC, Martins LMDRS. Application of ionic liquids in electrochemistry-recent advances. Molecules 2020;25:5812.
16. Barthen P, Frank W, Ignatiev N. Development of low viscous ionic liquids: the dependence of the viscosity on the mass of the ions. Ionics 2015;21:149-59.
17. Tsuzuki S, Shinoda W, Saito H, Mikami M, Tokuda H, Watanabe M. Molecular dynamics simulations of ionic liquids: cation and anion dependence of self-diffusion coefficients of ions. J Phys Chem B 2009;113:10641-9.
18. Jiang S, Hu Y, Wang Y, Wang X. Viscosity of typical room-temperature ionic liquids: a critical review. J Phys Chem Ref Data 2019;48:033101.
19. Philippi F, Rauber D, Eliasen KL, et al. Pressing matter: why are ionic liquids so viscous? Chem Sci 2022;13:2735-43.
20. Koutsoukos S, Philippi F, Malaret F, Welton T. A review on machine learning algorithms for the ionic liquid chemical space. Chem Sci 2021;12:6820-43.
21. Hayes R, Warr GG, Atkin R. Structure and nanostructure in ionic liquids. Chem Rev 2015;115:6357-426.
22. Marullo S, D'Anna F, Rizzo C, Billeci F. Ionic liquids: “normal” solvents or nanostructured fluids? Org Biomol Chem 2021;19:2076-95.
24. Diogo JCF, Caetano FJP, Fareleira JMNA, Wakeham WA. Viscosity measurements on ionic liquids: a cautionary tale. Int J Thermophys 2014;35:1615-35.
25. Bedrov D, Piquemal JP, Borodin O, MacKerell AD Jr, Roux B, Schröder C. Molecular dynamics simulations of ionic liquids and electrolytes using polarizable force fields. Chem Rev 2019;119:7940-95.
26. Zhang Y, Otani A, Maginn EJ. Reliable viscosity calculation from equilibrium molecular dynamics simulations: a time decomposition method. J Chem Theory Comput 2015;11:3537-46.
27. Kirova EM, Norman GE. Viscosity calculations at molecular dynamics simulations. J Phys Conf Ser 2015;653:012106.
28. Goloviznina K, Canongia Lopes JN, Costa Gomes M, Pádua AAH. Transferable, polarizable force field for ionic liquids. J Chem Theory Comput 2019;15:5858-71.
29. Vázquez-Montelongo EA, Vázquez-Cervantes JE, Cisneros GA. Current status of AMOEBA-IL: a multipolar/polarizable force field for ionic liquids. Int J Mol Sci 2020;21:697.
30. Katritzky AR, Jain R, Lomaka A, et al. Correlation of the melting points of potential ionic liquids (imidazolium bromides and benzimidazolium bromides) using the CODESSA program. J Chem Inf Comput Sci 2002;42:225-31.
31. Berrod Q, Ferdeghini F, Zanotti JM, et al. Ionic liquids: evidence of the viscosity scale-dependence. Sci Rep 2017;7:2241.
32. Chen Y, Peng B, Kontogeorgis GM, Liang X. Machine learning for the prediction of viscosity of ionic liquid-water mixtures. J Mol Liq 2022;350:118546.
33. Paduszyński K, Domańska U. Viscosity of ionic liquids: an extensive database and a new group contribution model based on a feed-forward artificial neural network. J Chem Inf Model 2014;54:1311-24.
34. Paduszyński K. Extensive databases and group contribution QSPRs of ionic liquids properties. 2. Viscosity. Ind Eng Chem Res 2019;58:17049-66.
35. Chen B, Liang M, Wu T, Wang HP. A high correlate and simplified QSPR for viscosity of imidazolium-based ionic liquids. Fluid Phase Equilibria 2013;350:37-42.
36. Beckner W, Mao CM, Pfaendtner J. Statistical models are able to predict ionic liquid viscosity across a wide range of chemical functionalities and experimental conditions. Mol Syst Des Eng 2018;3:253-63.
37. Baskin I, Epshtein A, Ein-Eli Y. Benchmarking machine learning methods for modeling physical properties of ionic liquids. J Mol Liq 2022;351:118616.
38. Boualem AD, Argoub K, Benkouider AM, Yahiaoui A, Toubal K. Viscosity prediction of ionic liquids using NLR and SVM approaches. J Mol Liq 2022;368:120610.
39. Goodfellow I, Bengio Y, Courville A. Deep learning, 1st ed. Cambridge, MA: MIT Press; 2016, pp. 363-405.
41. Sarker IH. Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput Sci 2021;2:420.
42. Sejnowski TJ. The unreasonable effectiveness of deep learning in artificial intelligence. Proc Natl Acad Sci USA 2020;117:30033-8.
43. Liu Y, Zhao T, Ju W, Shi S. Materials discovery and design using machine learning. J Materiomics 2017;3:159-77.
44. Liu Y, Guo B, Zou X, Li Y, Shi S. Machine learning assisted materials design and discovery for rechargeable batteries. Energy Stor Mater 2020;31:434-50.
45. Dong Q, Kazakov A, Muzny C, et al. Ionic liquids database - ILThermo. 2006. Available online: https://ilthermo.boulder.nist.gov/ILThermo/mainmenu.uix [Last accessed on 16 August 2023].
46. Kazakov A, Magee J, Chirico R, et al. Ionic liquids database - ILThermo (v2.0). 2013. Available online: https://trcsrv1.boulder.nist.gov/ilthermo/ilthermo.html [Last accessed on 16 August 2023].
47. Acar Z, Nguyen P, Lau KC. Machine-learning model prediction of ionic liquids melting points. Appl Sci 2022;12:2408.
48. Talete srl. Dragon version 7.0: software for molecular descriptor calculation. Available online: https://chm.kode-solutions.net/pf/dragon-7-0/ [Last accessed on 16 August 2023].
49. Todeschini R, Consonni V. Molecular descriptors for chemoinformatics, 1st ed. Weinheim, Germany: Wiley-VCH Verlag GmbH & Co; 2009.
50. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825-30. Available online: https://github.com/scikit [Last accessed on 16 August 2023].
51. Gutman I, Milovanović E, Milovanović I. Beyond the Zagreb indices. AKCE Int J Graphs Comb 2020;17:74-85.
53. Abadi M, Agarwal A, Barham P, et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. 2015. Available online: https://research.google/pubs/pub45166/ [Last accessed on 16 August 2023].
55. Chen Y, Kontogeorgis GM, Woodley JM. Group contribution based estimation method for properties of ionic liquids. Ind Eng Chem Res 2019;58:4277-92.
56. Baghban A, Kardani MN, Habibzadeh S. Prediction viscosity of ionic liquids using a hybrid LSSVM and group contribution method. J Mol Liq 2017;236:452-64.
57. Lazzús JA, Pulgar-Villarroel G. A group contribution method to estimate the viscosity of ionic liquids at different temperatures. J Mol Liq 2015;209:161-8.
58. Huang Y, Dong H, Zhang X, Li C, Zhang S. A new fragment contribution-corresponding states method for physicochemical properties prediction of ionic liquids. AIChE J 2013;59:1348-59.
59. Zhou W, Zhang M, Kong X, Huang W, Zhang Q. Recent advance in ionic-liquid-based electrolytes for rechargeable metal-ion batteries. Adv Sci 2021;8:2004490.
60. Géron A. Hands-on machine learning with Scikit-Learn, Keras and TensorFlow, 2nd ed. Canada: O’Reilly; 2019.
61. Reutlinger M, Koch CP, Reker D, et al. Chemically advanced template search (CATS) for scaffold-hopping and prospective target prediction for 'orphan' molecules. Mol Inform 2013;32:133-8.
62. Randić M, Jerman-Blazić B, Grossman S, Rouvray D. A rational approach to the optimal design of drugs. Math Model 1987;8:571-82.
63. Randić M, Jurs PC. On a fragment approach to structure-activity correlations. Quant Struct Act Relat 1989;8:39-48.
64. Yuan WL, Yang X, He L, Xue Y, Qin S, Tao GH. Viscosity, conductivity, and electrochemical property of dicyanamide ionic liquids. Front Chem 2018;6:59.
66. Broto P, Moreau G, Vandicke C. Molecular structures: perception, autocorrelation descriptor and SAR studies. Autocorrelation descriptor. Eur J Med Chem 1984;19:66-70. Available online: https://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=9624511 [Last accessed on 18 August 2023].
Cite This Article
Acar Z, Nguyen P, Cui X, Lau KC. Room temperature ionic liquids viscosity prediction from deep-learning models. Energy Mater 2023;3:300039. http://dx.doi.org/10.20517/energymater.2023.38