As discussed earlier, the solubility of a drug may play a crucial role in the process of absorption from the gut. Although the solvent medium in the gut is not pure water, aqueous solubility (Sa) is usually taken as the surrogate parameter. Several methods of calculating water solubility from chemical structure have been described in the literature (see below). However, their accuracy is significantly lower than that of methods for calculating distribution coefficients, particularly if they are applied to chemically very diverse compounds. The reason for this shortcoming is that, at least for substances which form solid crystals under normal conditions, the solubility is mainly determined by two energy contributions. One of them describes the energy required to place the solute molecules in the solvent and is given mainly by the chemical potential. This part is closely related to distribution coefficients, such as log P, and can be estimated from the two-dimensional chemical structure with sufficient accuracy. The second energy contribution arises from the need for the molecules to leave the energetically favorable crystal structure. This energy depends strongly on the interactions of the drug molecules with each other and on their ability to pack densely, and is therefore strongly dependent on the three-dimensional structure of the molecules. This dependence is the principal reason why the accuracy of predictions based on two-dimensional structures is limited.
A broad survey of the different prediction methods is given in , including examples of applications. A review of various methods with a critical discussion of their quality can be found in . Below, we give a short overview of the different approaches. Basically, three classes of methods can be distinguished: fragment-based methods, quantitative correlations with log P or with molecular descriptors, and the universal quasi-chemical functional group activity coefficient (UNIFAC) method.
The fragment-based methods [82, 83] rely on the assumption that Sa is an additive property which can be calculated by summing up the contributions of the different chemical groups of a molecule. These methods have the advantage of being easily applicable. However, their applicability is restricted to structures for which all groups contained within the structure are parameterized in terms of their contribution to solubility.
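The additivity assumption and its main limitation can be illustrated with a minimal sketch. The group names, contribution values, and intercept below are hypothetical placeholders chosen for illustration; they are not taken from the methods cited above or from any published parameter set.

```python
# Hypothetical group contributions to log Sa (illustrative numbers only,
# NOT taken from any published fragment method).
GROUP_CONTRIBUTIONS = {
    "CH3": -0.5,
    "CH2": -0.4,
    "OH": 1.2,
    "C=O": 0.8,
}
INTERCEPT = 0.5  # hypothetical constant term


def estimate_log_sa(group_counts):
    """Estimate log Sa as intercept + sum(count * contribution).

    Raises KeyError if the structure contains a group that has not been
    parameterized, which mirrors the applicability limit of such methods.
    """
    missing = [g for g in group_counts if g not in GROUP_CONTRIBUTIONS]
    if missing:
        raise KeyError(f"no parameters for groups: {missing}")
    return INTERCEPT + sum(
        GROUP_CONTRIBUTIONS[group] * count
        for group, count in group_counts.items()
    )


# 1-Butanol decomposed (by assumption) into CH3 + 3 CH2 + OH.
butanol = {"CH3": 1, "CH2": 3, "OH": 1}
print(estimate_log_sa(butanol))
```

A structure containing an unparameterized group (for example a hypothetical "NO2" entry) raises an error rather than returning a silently incomplete sum, which is usually the safer behavior when screening large libraries.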
Because methods for calculating log P are commonly available (see Section 26.4.1), a log Sa/log P correlation is a very convenient way to estimate Sa. A quite simple physicochemical consideration  leads to the result that, for nonionic liquid solutes, a relationship of the form

log Sa = a - b log P    (6)

should exist. This relation is a quantification of the qualitative rule that water is a good solvent for polar solutes and a poor solvent for nonpolar, lipophilic solutes. Depending on the dataset, the parameters a and b have values of 0.25-1.5 and 0.95-1.2, respectively . In its simplest form, the correlation gives good estimates only for liquid compounds and can therefore generally be used only for rough estimations. For compounds that are solid, a term accounting for the energy required to remove a molecule from the crystal has to be included, which can be accomplished by considering the melting temperature of the crystal [84, 85]. However, because the melting temperature is generally not known for virtual structures and for substances produced by combinatorial chemistry, these approaches are of little help in the evaluation of combinatorial libraries and are therefore not discussed in greater detail here.
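Equation (6) and the melting-temperature correction for solids can be sketched as follows. The default values a = 0.5 and b = 1.0 are assumptions picked from within the ranges quoted above. The linear melting-point term with a coefficient of 0.01 per degree is the form commonly quoted for the general solubility equation; it is used here as an assumption and is not taken from the text.

```python
def log_sa_liquid(log_p, a=0.5, b=1.0):
    """Equation (6): log Sa = a - b * log P, for nonionic liquid solutes.

    a and b are dataset dependent; the defaults are assumed values lying
    inside the ranges 0.25-1.5 and 0.95-1.2 given in the text.
    """
    return a - b * log_p


def log_sa_solid(log_p, melting_point_c, a=0.5, b=1.0, c=0.01, temp_c=25.0):
    """Equation (6) plus an assumed linear crystal term c * (Tm - T).

    The correction is applied only if the compound melts above the working
    temperature; for liquids the result equals log_sa_liquid().
    """
    excess = max(melting_point_c - temp_c, 0.0)
    return log_sa_liquid(log_p, a, b) - c * excess


# Hypothetical compound: log P = 2.0, melting point 125 degrees C.
print(log_sa_liquid(2.0))
print(log_sa_solid(2.0, melting_point_c=125.0))
```

With these assumed parameters, every 100 degrees of melting point above the working temperature lowers the estimate by one log unit, which shows why ignoring the crystal term can be a serious error for high-melting solids.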
Several relationships between solubility and molecular connectivity have been described [86-89]. The advantage of these methods is that they are quite simple to apply because the connectivity indices needed can easily be calculated from the structure. On the other hand, their applicability is limited because each individual relationship is valid for only one or a few classes of chemical structures.
The diverse approaches relying on the UNIFAC method  represent mixed methods which are partly fragment based and partly quantitative property-solubility relationships. They rely on the assumption that solubility is determined by the so-called activity coefficient, which describes the interaction between solute and solvent. This activity coefficient consists of a term which describes the work needed to form a cavity within the water that is able to take up the solute molecule and a second term which describes the interactions between the solute molecule and water. The first term is connected with molecular size and surface area and can be calculated directly from the structure. The second term can be estimated using a fragment contribution method . Again, when only the activity coefficient is taken into account, the properties of the crystal to be dissolved are not considered. A possible way to overcome this problem is, again, to take into account the melting temperature, which is not usually practical in combinatorial chemistry. However, recently a related method - unified physical property estimation relationship (UPPER) - has been published  which combines symmetry and flexibility properties of the structures with the activity coefficients. These additional parameters are related to the entropy of fusion and at least partly incorporate the contribution of crystal energy.
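The way in which the activity coefficient and the crystal contribution combine can be written compactly. The expression below is a standard thermodynamic sketch and is not quoted from the methods cited above. It assumes that the heat capacity change on melting is negligible and that the enthalpy of fusion equals the melting temperature times the entropy of fusion. The symbols are introduced here for illustration: x_w is the mole-fraction solubility in water, γ_w the activity coefficient of the solute in water, ΔS_f the entropy of fusion, T_m the melting temperature, T the working temperature (both in kelvin), and R the gas constant.

```latex
\log x_{\mathrm{w}}
  = -\log \gamma_{\mathrm{w}}
    - \frac{\Delta S_{\mathrm{f}}\,\left(T_{\mathrm{m}} - T\right)}{2.303\,R\,T}
  \qquad (T_{\mathrm{m}} > T)
```

For a liquid solute the second term vanishes and the solubility is governed by the activity coefficient alone. Methods that estimate the entropy of fusion from symmetry and flexibility address only the ΔS_f factor; the melting temperature itself still has to be measured or estimated.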
As mentioned above, water solubility is only a surrogate for the solubility that actually determines absorption from the gut, not only because the gut contents differ in composition from pure water, but also because the pH within the GI tract varies over a wide range. In the stomach, a very acidic medium with pH values around 2 is present in the fasting state, rising to pH 5-6 after eating. At the beginning of the small intestine, the pH value jumps to about 5, and then increases steadily with increasing distance from the stomach to a value of around 7.5. Consequently, for compounds with basic or acidic groups whose pKa values are close to the pH values in the GI tract, the ionization state and thus the solution properties may change during the passage through the GI tract. Because the ionized form is usually much more soluble than the neutral form, the solubility of acids may increase during the passage, whereas that of bases may decrease. Bearing in mind that most currently marketed drugs are ionizable compounds, this effect is far from negligible. The estimation of solubility is complicated further by the fact that the majority of the estimation methods described above are explicitly valid only for nonelectrolytes, which limits their value significantly.
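The pH dependence described above can be illustrated with the Henderson-Hasselbalch-type solubility expression for a monoprotic compound. This is a minimal sketch under stated assumptions: a single ionizable group, a known intrinsic solubility of the neutral form, and no upper limit imposed by the solubility of the salt. The function name, the pKa values, and the intrinsic solubility are hypothetical.

```python
def solubility_at_ph(intrinsic_solubility, pka, ph, kind):
    """Total solubility of a monoprotic acid or base at a given pH.

    acid: S = S0 * (1 + 10**(pH - pKa))
    base: S = S0 * (1 + 10**(pKa - pH))
    Assumes the neutral form has solubility S0 and the ionized form is
    not limited by salt precipitation (a simplification).
    """
    if kind == "acid":
        return intrinsic_solubility * (1.0 + 10.0 ** (ph - pka))
    if kind == "base":
        return intrinsic_solubility * (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# pH values along the GI tract as quoted in the text.
gi_ph = {"fasting stomach": 2.0, "upper small intestine": 5.0, "lower small intestine": 7.5}

# Hypothetical compounds: an acid with pKa 4.5 and a base with pKa 6.5,
# both with an intrinsic solubility of 0.01 (arbitrary concentration units).
for site, ph in gi_ph.items():
    acid = solubility_at_ph(0.01, 4.5, ph, "acid")
    base = solubility_at_ph(0.01, 6.5, ph, "base")
    print(f"{site}: pH {ph}, acid {acid:.4g}, base {base:.4g}")
```

For the assumed acid the calculated solubility rises along the tract, while for the assumed base it falls, in line with the qualitative argument in the text.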
The change in solubility can be roughly estimated from the change in log P according to the relationship between those properties, as discussed above. For a good estimate of this effect, knowledge of the pKa value of the compound is necessary. Unfortunately, no reliable method for the prediction of pKa values is known.
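One way to make this rough estimate concrete is to replace log P in a linear log Sa/log P correlation by the pH-dependent distribution coefficient log D. The sketch below assumes a monoprotic compound, assumes that only the neutral form partitions into the organic phase, and uses an assumed slope b = 1.0. The function names and the example pKa are hypothetical.

```python
import math


def log_d(log_p, pka, ph, kind):
    """Apparent distribution coefficient of a monoprotic compound.

    Assumes only the neutral species partitions into the organic phase:
    acid: log D = log P - log10(1 + 10**(pH - pKa))
    base: log D = log P - log10(1 + 10**(pKa - pH))
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


def delta_log_sa(pka, ph_from, ph_to, kind, b=1.0):
    """Change in log Sa between two pH values via log Sa = a - b * log D.

    The intercept a and log P cancel in the difference, so only the pKa,
    the two pH values, and the assumed slope b are needed.
    """
    change_in_log_d = log_d(0.0, pka, ph_to, kind) - log_d(0.0, pka, ph_from, kind)
    return -b * change_in_log_d


# Hypothetical acid with pKa 4.5 moving from pH 2 to pH 7.5.
print(delta_log_sa(4.5, 2.0, 7.5, "acid"))
```

The result depends directly on the pKa supplied, so any error in a predicted pKa propagates one-to-one into the estimated change in log Sa whenever the pH lies well beyond the pKa.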
In conclusion, it can be stated that solubility is an important parameter and should be taken into account in library design. However, at present, the possibilities of making useful predictions of this property on the basis of chemical structure are limited.
758 | 26 Estimation of Physicochemical and ADME Parameters 26.4.3