Solubility Prediction of Drugs in Supercritical Carbon Dioxide Using Artificial Neural Network

Abstract

Descriptors computed by HyperChem® software were employed to represent the solubility of 40 drug molecules in supercritical carbon dioxide (SC-CO2) using an artificial neural network with a 15-4-1 architecture. The accuracy of the proposed method was evaluated by computing the average absolute error (AE) between the calculated and experimental logarithms of solubility. The AE (±SD) over the data sets was 0.4 (±0.3) when all data points were used as the training set and the solubilities were back-calculated. The AE for solubilities predicted by a network trained on 1/3 of the data points from each set was also 0.4 (±0.3), which indicates that the network can be trained well with a limited number of experimental data. To provide a fully predictive method, the data sets were divided into two groups: the network was trained using 20 data sets and the remaining 20 sets were used as prediction sets. The resulting average AEs (±SD) were 1.7 (±1.1) and 1.6 (±1.5) for the two analyses. In these analyses, only the computational descriptors and the temperature and pressure of SC-CO2 were used, and no experimental solubility data of the predicted solutes were employed.

Iranian Journal of Pharmaceutical Research (2007), 6 (4): 243-250
Received: November 2006
Accepted: May 2007

Copyright © 2007 by School of Pharmacy
Shaheed Beheshti University of Medical Sciences and Health Services

Original Article


 

Abolghasem Jouyban (a)*, Somaieh Soltani (b) and Karim Asadpour Zeynali (c)

(a) Faculty of Pharmacy and Drug Applied Research Center, Tabriz University of Medical Sciences, Tabriz, Iran. (b) Nanotechnology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran. (c) Faculty of Chemistry, University of Tabriz, Tabriz, Iran.

 


 

Keywords: Solubility prediction; Supercritical carbon dioxide; Artificial neural network; Pharmaceuticals.

Introduction

Supercritical fluid technology offers great potential in the pharmaceutical industry. The properties of supercritical fluids (SCFs) are intermediate between those of liquids and gases. The density of SCFs (a property representing solubilization power) is similar to that of liquids, their viscosity (a property governing flow) is similar to that of gases, and their diffusion coefficient is at least ten times higher than that of liquids. These properties can be easily controlled by changing temperature and pressure. SCFs have various industrial applications in the chemical and pharmaceutical areas, and the main ones can be categorized as:

1. Alternative solvents for separation processes: The release of common solvents used in industrial separation processes is a major environmental concern and is not compatible with green chemistry, whereas the most widely used SCF, supercritical carbon dioxide (SC-CO2), produces no hazardous waste.

2. Reaction media for chemical synthesis both for small molecules and polymers.

3. Reprocessing fluid in the production of particles (at micro and nano scales), fibers and foams.

Solubility data of drugs in SCFs are the key information for designing a supercritical fluid process. A number of solubility data sets for pharmaceuticals have been published in the literature; however, demand exceeds the available databases, and no data exist for new drugs or chemicals. Experimental determination of solubility in SCFs is time-consuming and costly. As an alternative, researchers have developed a number of models for representing the data. In an earlier paper (1), the available empirical and semi-empirical models were compared using experimental data sets, and Equation 1 was found to be the most accurate model in terms of both correlation and prediction.

$$\ln y_2 = K_0 + K_1 P + K_2 P^2 + K_3 P T + K_4 \frac{T}{P} + K_5 \ln \rho \qquad (1)$$

where y2 is the mole fraction solubility of the solute in SC-CO2, P is the pressure (bar), T is the temperature (K), ρ is the density of pure SC-CO2 and K0-K5 are the model constants (1). The main limitation of empirical and semi-empirical models is the presence of curve-fitting parameters, which must be computed from experimental data. To reduce this limitation, the models can be trained using a minimum number of experimental data and then used to predict unmeasured solubilities at the pressures and temperatures of interest (2). To provide a predictive method, physico-chemical properties of the drugs were calculated using HyperChem® software and used, along with temperature and pressure, as input variables for a neural network model. The accuracy of the proposed method was compared with those of previous methods.

The artificial neural network (ANN) technique is a powerful non-linear mapping technique: a mathematical system that simulates biological neural networks. It consists of processing elements (neurons, or nodes) that are organized in layers. There is always one input layer, one output layer and at least one hidden layer. Each layer of nodes receives its input from the previous layer or from the network input, and the output of each node feeds the next layer or the output of the network. There are several types of neural networks, but back-propagation neural networks are the models most frequently used in chemical and pharmaceutical applications (3).

A three-layer network with a sigmoidal transfer function in the hidden and output layers and a back-propagation error algorithm was designed in this study. The neural networks were implemented in Matlab 6.1 (4) using the Neural Network Toolbox for Windows, running on a personal computer (Pentium IV, 2400 MHz). The architecture of the network was 15-4-1. Before ANN analysis, all input and output data were normalized between 0.1 and 0.9. After simulation, the predicted values were transformed back to the scale of the experimental values.
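
The scaling step can be illustrated with a minimal sketch. It assumes a simple linear (min-max) mapping onto [0.1, 0.9], which the text does not spell out; the function names and the sample values are hypothetical and are not taken from the study.

```python
def fit_range(values):
    """Return the (min, max) of a column of raw values."""
    return min(values), max(values)


def normalize(values, lo, hi, a=0.1, b=0.9):
    """Map raw values linearly from [lo, hi] onto [a, b]."""
    span = hi - lo
    if span == 0:
        # Constant column: place every value at the midpoint of [a, b].
        return [(a + b) / 2 for _ in values]
    return [a + (b - a) * (v - lo) / span for v in values]


def denormalize(scaled, lo, hi, a=0.1, b=0.9):
    """Inverse of normalize: map scaled values back to the raw scale."""
    return [lo + (s - a) * (hi - lo) / (b - a) for s in scaled]


# Hypothetical log-solubility values, for illustration only.
log_y = [-12.0, -9.5, -7.0, -4.0]
lo, hi = fit_range(log_y)
scaled = normalize(log_y, lo, hi)        # smallest value maps to 0.1, largest to 0.9
restored = denormalize(scaled, lo, hi)   # recovers the raw values up to rounding
```

Keeping the targets inside [0.1, 0.9] rather than [0, 1] avoids asking a sigmoidal output node to reach its asymptotes, which is one common motivation for this range.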

The physico-chemical properties of the solutes computed by HyperChem® software were used as inputs, and the logarithms of their solubilities were used as outputs. The neural network methodology has several empirically determined parameters: the number of iterations (epochs), the number of hidden nodes, the learning rate and the momentum term. The optimum values of these parameters were taken as those that yielded the lowest prediction errors. The optimized values for the number of epochs, number of nodes in the hidden layer, learning rate and momentum were 20000, 4, 0.1 and 0.9, respectively. To check that a global optimum had been reached and not just a local optimum, the algorithm was run from different starting values of the initial weights. Each set of starting values resulted in almost the same set of optimum values, indicating that a global optimum had been found.
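
To make the role of the learning rate and momentum concrete, the following is a minimal pure-Python sketch of a 15-4-1 sigmoidal network trained by back-propagation with a momentum term. It is not the Matlab Neural Network Toolbox implementation used in the study: the class name `TinyMLP`, the squared-error loss, the weight initialization range, and the single hypothetical training sample are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Feed-forward net (default 15-4-1), sigmoid in hidden and output layers."""

    def __init__(self, n_in=15, n_hidden=4, seed=0):
        rng = random.Random(seed)
        # Each weight vector carries one trailing bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        # Previous weight changes, needed for the momentum term.
        self.v1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.v2 = [0.0] * (n_hidden + 1)

    def forward(self, x):
        xb = list(x) + [1.0]  # inputs plus bias input
        h = [sigmoid(sum(w * xi for w, xi in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]        # hidden activations plus bias input
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, hb)))
        return xb, hb, y

    def predict(self, x):
        return self.forward(x)[2]

    def train_step(self, x, target, lr=0.1, momentum=0.9):
        """One back-propagation update on one sample; returns the loss before it."""
        xb, hb, y = self.forward(x)
        # Loss E = 0.5 * (y - target)^2; delta at the output pre-activation.
        d_out = (y - target) * y * (1.0 - y)
        # Hidden deltas use the output weights before they are updated.
        d_hid = [d_out * self.w2[j] * hb[j] * (1.0 - hb[j])
                 for j in range(len(self.w1))]
        for j in range(len(self.w2)):
            self.v2[j] = momentum * self.v2[j] - lr * d_out * hb[j]
            self.w2[j] += self.v2[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                self.v1[j][i] = momentum * self.v1[j][i] - lr * d_hid[j] * xb[i]
                row[i] += self.v1[j][i]
        return 0.5 * (y - target) ** 2


# Hypothetical normalized sample: 15 inputs in [0.1, 0.9] and one target.
rng = random.Random(1)
x = [rng.uniform(0.1, 0.9) for _ in range(15)]
target = 0.3
net = TinyMLP()
losses = [net.train_step(x, target) for _ in range(500)]
```

Re-running the sketch with different `seed` values mimics the check described above, in which training is restarted from different initial weights and the resulting optima are compared.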

Experimental

Numerical methods and experimental data

Solubility data of 40 pharmaceutically interesting compounds were collected from the literature. Details of the data sets, including the solute's name, the number of data points in each set (N), and the temperature and pressure ranges, are listed in Table 1. Considering the experimental data collected, one should keep in mind that there are some differences between experimental data for a given solute from different research groups. The importance of solubility data and their accuracy and precision have been discussed by Hutchenson and Foster (5), and different solubility behaviours of oleic acid in SC-CO2 have been reported by various research groups. As another example, Bush and Eckert (6) compared the experimental solubility of octacosane in SC-CO2 at 35 °C from 4 different research groups, where solubilities differing by more than a factor of 10 had been reported. As a general rule, the lower the solubility, the higher the expected relative standard deviation (RSD) values. Possible reasons for these differences include an enhancement effect caused by impurities, differences in pressure and temperature calibrations, or technical variations during solubility measurements.

 

 

All data points from the 40 data sets were used to train the ANN, and the solubilities were then back-calculated using the trained ANN (numerical method I). The calculated solubilities were compared with the corresponding experimental values, and the individual absolute error (IAE) was computed using the following equation:

$$\mathrm{IAE} = \left| \log y_2^{\mathrm{calculated}} - \log y_2^{\mathrm{experimental}} \right|$$

The absolute error (AE) was calculated as the average of the IAE values:

$$\mathrm{AE} = \frac{\sum \mathrm{IAE}}{N}$$

where N is the number of data points.
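
As a small worked example, the two error measures (the absolute difference between calculated and experimental log-solubilities, and its average over N points) can be sketched as follows. The function names and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def individual_absolute_errors(log_calc, log_exp):
    """IAE for each point: |log y(calculated) - log y(experimental)|."""
    if len(log_calc) != len(log_exp):
        raise ValueError("calculated and experimental lists differ in length")
    return [abs(c - e) for c, e in zip(log_calc, log_exp)]


def absolute_error(log_calc, log_exp):
    """AE: the mean of the IAE values over the N data points."""
    return mean(individual_absolute_errors(log_calc, log_exp))


# Hypothetical log-solubilities for one data set with N = 4 points.
log_exp = [-9.0, -8.0, -7.0, -6.0]
log_calc = [-9.5, -8.0, -6.0, -6.5]

iae = individual_absolute_errors(log_calc, log_exp)  # [0.5, 0.0, 1.0, 0.5]
ae = absolute_error(log_calc, log_exp)               # 0.5
```

Because the errors are taken on a logarithmic scale, an AE of 1 corresponds to calculated solubilities that differ from the experimental ones by one unit of the chosen logarithm on average.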

To investigate the prediction capability of the proposed ANN method, all data points were divided into training (1/3 of data points) and test (2/3 of data points) sets. The ANN was trained using the training set, and the solubilities of the test set were predicted using the trained ANN (numerical method II). With this method, a number of experimental data points are needed for each solute in order to predict its solubility at other temperatures and pressures of interest.
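
A minimal sketch of such a within-set split is shown below. The text does not state how the training third was selected, so taking every third point of each data set is an assumption, and the solute names and data values are hypothetical.

```python
def split_one_third(points):
    """Send every third point (indices 0, 3, 6, ...) to training, the rest to test.

    The selection rule is an assumption; the source only states the 1/3 : 2/3 ratio.
    """
    train = points[::3]
    test = [p for i, p in enumerate(points) if i % 3 != 0]
    return train, test


# Hypothetical data sets: solute name -> list of (T in K, P in bar, log y2).
data_sets = {
    "solute_A": [(308.0, 100.0 + 10 * i, -9.0 + 0.2 * i) for i in range(9)],
    "solute_B": [(318.0, 120.0 + 20 * i, -8.0 + 0.3 * i) for i in range(6)],
}

train, test = [], []
for name, points in data_sets.items():
    tr, te = split_one_third(points)
    # Every solute contributes points to both the training and the test set.
    train += [(name,) + p for p in tr]
    test += [(name,) + p for p in te]
```

The key property of this scheme is that each solute appears in the training set, so the network has seen some measured solubilities of every compound it is later asked to predict.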

In the next set of analyses, the 40 data sets were divided into training and test sets; the ANN was trained using the training data sets, and the solubilities of the test sets were predicted (numerical methods III and IV). With this prediction method, only the chemical structure of the solute of interest is needed, and no experimental solubility data of the solute are required.

 

Computation of descriptors

The selected theoretical descriptors of the solutes were obtained by the AM1 semi-empirical quantum mechanical method using the molecular descriptors, properties and orbital programs of HyperChem® 7.0 (7). The structure of each solute was drawn in 2D, converted to 3D using HyperChem® 7.0 (7), and pre-minimized by Polak-Ribiere geometry optimization using the MM+ molecular mechanics method (8). The structures found by MM+ were used as the starting point for re-minimization by Polak-Ribiere optimization using the AM1 semi-empirical quantum mechanical method. Energy minimizations were performed until the absolute value of the largest partial derivative of the energy with respect to the coordinates was below 0.01 kcal mol⁻¹ Å⁻¹. The computed descriptors include: surface area approximate (SAA), surface area grid (SAG), molar volume (VOL), hydration energy (HE), logarithm of partition coefficient (logP), molar refractivity (REF), polarizability (POL), molecular mass (MASS), total energy (TE), dipole moment (DM), energy of the lowest unoccupied molecular orbital (LUMO) and energy of the highest occupied molecular orbital (HOMO). Table 2 lists the numerical values of the computational descriptors of the studied solutes.

 

Results and Discussion

Solubility data of 40 drugs in SC-CO2 at various temperatures and pressures were used to train the ANN; the solubilities were then back-calculated, and the AEs were computed and listed in the second column of Table 3. This analysis (numerical method I) showed the correlation ability of the ANN. The minimum and maximum AEs were 0.1 (for resorcinol) and 1.1 (for benzocaine), and the overall AE (±SD) was 0.4 (±0.3).

 

 

In the next numerical analysis (method II), all data points of the 40 solutes were divided into training and test sets. The AEs of the predicted solubilities for the test sets are listed in the third column of Table 3. The overall AE was 0.4 (±0.3), and there was no difference between the AE of the ANN trained using all data points and that of the ANN trained using a limited number of data points. This shows that the ANN can be trained well using a limited number of data points. This type of numerical analysis, which reduces the number of experimental measurements, could be employed in industry where researchers are interested in an accurate prediction method.

The real need in the pharmaceutical industry is a predictive method that requires no experimentally obtained parameter in the prediction procedure. To check the applicability of the proposed method for this purpose, the data sets with odd set numbers in Table 1 were used to train the ANN, and the solubility data of the even-numbered data sets were predicted (numerical method III). The AEs were computed and listed in the fourth column of Table 3, and the overall AE was 1.7 (±1.1). In numerical method IV, the even-numbered data sets were used for training and the odd-numbered data sets as the test set, and the overall AE was 1.6 (±1.5). The AE of the fully predictive version of the proposed method (numerical methods III and IV) varied from 0.1 (for procaine) to 6.0 (for nicotinic acid). The prediction error produced by the proposed method is relatively high; however, one should keep in mind that there are large discrepancies between experimental solubilities of a solute determined under similar experimental conditions in different laboratories.
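
The odd/even assignment of whole data sets used in numerical methods III and IV can be sketched as below. The set numbers 1 to 40 stand in for the numbering of the data sets; the function and variable names are hypothetical.

```python
def split_by_parity(set_numbers):
    """Separate data set numbers into odd-numbered and even-numbered groups."""
    odd = [n for n in set_numbers if n % 2 == 1]
    even = [n for n in set_numbers if n % 2 == 0]
    return odd, even


# Forty data sets numbered 1..40 (one per solute).
set_numbers = list(range(1, 41))
odd_sets, even_sets = split_by_parity(set_numbers)

# Method III trains on the odd-numbered sets and predicts the even-numbered ones;
# method IV swaps the two roles.
method_iii = {"train": odd_sets, "test": even_sets}
method_iv = {"train": even_sets, "test": odd_sets}
```

Unlike a split within each data set, this scheme keeps every test solute entirely out of training, which is what makes the prediction depend only on the computed descriptors, temperature and pressure.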

Figure 1 shows the relative frequency of IAE in five subgroups for the four numerical analyses. The probabilities of solubility prediction with IAE<1.6 using numerical methods I and II were 0.975 and 0.961, respectively. The average probability of solubility prediction with IAE<1.6 using numerical methods III and IV (ab initio prediction method) was 0.631.
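
The quantities discussed here (the share of predictions with IAE below a threshold, and the relative frequency of IAE in subgroups) can be computed as in the following sketch. The bin edges of the five subgroups are not given in the text, so the `edges` list is a hypothetical choice, as are the IAE values.

```python
def fraction_below(iae_values, threshold=1.6):
    """Fraction of data points whose IAE is below the threshold."""
    return sum(1 for v in iae_values if v < threshold) / len(iae_values)


def relative_frequencies(iae_values, edges):
    """Relative frequency of IAE in bins [edges[k], edges[k+1]).

    The last bin is open-ended; all values must be >= edges[0].
    """
    counts = [0] * len(edges)
    for v in iae_values:
        # Index of the last edge that does not exceed the value.
        k = max(i for i, e in enumerate(edges) if v >= e)
        counts[k] += 1
    n = len(iae_values)
    return [c / n for c in counts]


# Hypothetical IAE values and hypothetical edges defining five subgroups.
iae_values = [0.1, 0.3, 0.5, 0.9, 1.2, 1.7, 2.5, 0.2, 0.4, 3.4]
edges = [0.0, 0.4, 0.8, 1.6, 3.2]

share = fraction_below(iae_values)                 # 7 of 10 points have IAE < 1.6
freqs = relative_frequencies(iae_values, edges)    # one frequency per subgroup
```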

 

 

In conclusion, the proposed method provides relatively accurate solubility calculations. Computation of the descriptors is straightforward, and acceptable predictions can be achieved by collecting a minimum number of experimental data. The ab initio method provides a reasonably accurate prediction and could be used as an estimation method in industry.

Acknowledgement

The authors would like to thank the Research Office, Tabriz University of Medical Sciences, for financial support of this work (grant number 83-102).

References

(1) Jouyban A, Chan HK and Foster NR. Mathematical representation of solute solubility in supercritical carbon dioxide using empirical expressions. J. Supercrit. Fluids (2002) 24: 19-35
(2) Jouyban A, Rehman M, Shekunov BY, Chan HK, Clark BJ and York P. Solubility prediction in supercritical carbon dioxide using minimum number of experimental data. J. Pharm. Sci. (2002) 91: 1287-1295
(3) Zupan J and Gasteiger J. Neural networks: a new method for solving chemical problems or just a passing phase? Anal. Chim. Acta (1991) 248: 1-30
(4) MATLAB for Windows, The Language of Technical Computing, Ver. 6.1.0.450 Release 12.1, The MathWorks, Inc., (2001)
(5) Hutchenson KW and Foster NR. Innovations in Supercritical Fluids Science and Technology, American Chemical Society, Washington DC (1995) 1-31
(6) Bush D and Eckert CA. Estimation of solid solubilities in supercritical carbon dioxide from solute solvatochromic parameters. In: Abraham MA and Sunol AK. (eds.) Supercritical Fluids: Extraction and Pollution, American Chemical Society, Washington DC (1997) 37-50
(7) HyperChem 7.0, Molecular Mechanics and Quantum Chemical Calculations Package, HyperCube Inc., Ontario (2002)
(8) HyperCube Extended MM2 Molecular Mechanics Method, HyperCube Inc., Ontario (2002)
(9) Huang Z, Lu WD, Kwai S and Chiew YC. Solubility of aspirin in supercritical carbon dioxide with and without acetone. J. Chem. Eng. Data (2004) 49: 1323-1327
(10) Vatanara A, Rouholamini Najafabadi A, Khajeh M and Yamini Y. Solubility of some inhaled glucocorticoids in supercritical carbon dioxide. J. Supercrit. Fluids (2005) 33: 21-25
(11) Garmroodi A, Hassan J and Yamini Y. Solubilities of the drugs benzocaine, metronidazole benzoate, and naproxen in supercritical carbon dioxide. J. Chem. Eng. Data (2004) 49: 709-712
(12) Weinstein RD, Gribbin JJ and Muske KR. Solubility and salting behavior of several β-adrenergic blocking agents in liquid and supercritical carbon dioxide. J. Chem. Eng. Data (2005) 50: 226-229
(13) Cheng KW, Tang M and Chen YP. Solubilities of benzoin, propyl 4-hydroxybenzoate and mandelic acid in supercritical carbon dioxide. Fluid Phase Equilib. (2002) 201: 79-96
(14) Asghari-Khiavi M and Yamini Y. Solubility of the drugs bisacodyl, methimazole, methylparaben, and iodoquinol in supercritical carbon dioxide. J. Chem. Eng. Data (2003) 48: 61-65
(15) Asghari-Khiavi M, Yamini Y and Farajzadeh MA. Solubilities of two steroid drugs and their mixtures in supercritical carbon dioxide. J. Supercrit. Fluids (2004) 30: 111-117
(16) Johannsen M and Brunner G. Solubilities of the xanthines caffeine, theophylline and theobromine in supercritical carbon dioxide. Fluid Phase Equilib. (1994) 95: 215-226
(17) Li S, Maxwell RJ and Shadwell RJ. Solubility of amphenicol bacteriostats in CO2. Fluid Phase Equilib. (2002) 198: 67-80
(18) Duarte ARC, Coimbra P, de Sousa HC and Duarte CMM. Solubility of flurbiprofen in supercritical carbon dioxide. J. Chem. Eng. Data (2004) 49: 449-452
(19) Stassi A, Bettini R, Gazzaniga A, Giordano F and Schiraldi A. Assessment of solubility of ketoprofen and vanillic acid in supercritical CO2 under dynamic conditions. J. Chem. Eng. Data (2000) 45: 161-165
(20) Weinstein RD, Muske KR, Moriarty J and Schmidt EK. The solubility of benzocaine, lidocaine, and procaine in liquid and supercritical carbon dioxide. J. Chem. Eng. Data (2004) 49: 547-552
(21) Ting SST, Tomasko DL, Foster NR and MacNaughton SJ. Solubility of naproxen in supercritical carbon dioxide with and without cosolvents. Ind. Eng. Chem. Res. (1993) 32: 1471-1481
(22) Gordillo MD, Blanco MA, Molero A and Martinez de la Ossa E. Solubility of the antibiotic penicillin G in supercritical carbon dioxide. J. Supercrit. Fluids (1999) 15: 183-190
(23) Ko M, Shah V, Bienkowski PR and Cochran HD. Solubility of the antibiotic penicillin V in supercritical CO2. J. Supercrit. Fluids (1991) 4: 32-39
(24) MacNaughton SJ, Kikic I, Foster NR, Alessi P, Cortesi A and Colombo I. Solubility of anti-inflammatory drugs in supercritical carbon dioxide. J. Chem. Eng. Data (1996) 41: 1083-1086
(25) Yamini Y, Fat'hi MR, Alizadeh N and Shamsipur M. Solubility of dihydroxybenzene isomers in supercritical carbon dioxide. Fluid Phase Equilib. (1998) 152: 299-305
(26) Brunner G and Johannsen M. Solubilities of the fat-soluble vitamins A, D, E and K in supercritical carbon dioxide. J. Chem. Eng. Data (1997) 42: 106-111
(27) Gurdial GS and Foster NR. Solubility of o-hydroxybenzoic acid in supercritical carbon dioxide. Ind. Eng. Chem. Res. (1991) 30: 575-580
(28) Hampson JW, Maxwell RJ, Li S and Shadwell RJ. Solubility of three veterinary sulfonamides in supercritical carbon dioxide by a recirculating equilibrium method. J. Chem. Eng. Data (1999) 44: 1222-1225
(29) Burgos-Solórzano GI, Brennecke JF and Stadtherr MA. Solubility measurements and modeling of molecules of biological and pharmaceutical interest with supercritical CO2. Fluid Phase Equilib. (2004) 220: 55-67