
***Postponed*** 06/05/2020 - AI3SD Seminar: First Principles and Machine Learning Calculations of Aqueous Solubility - University of Southampton

posted 19 Feb 2020, 10:14 by Samantha Kanza   [ updated 27 Mar 2020, 03:25 ]
When: Wednesday 6th May 2020

Where: Building 29/1101, University of Southampton 

Dr John Mitchell, EaStCHEM School of Chemistry, University of St Andrews

Bio: John Mitchell obtained his PhD in Theoretical Chemistry from Cambridge, studying the energetics of hydrogen bonding with Prof. Sally Price. He then worked with Prof. Janet Thornton at University College London, applying computational chemistry to the growing field of structural bioinformatics. He returned to Cambridge in 2000, taking up a lectureship in Chemistry. He was appointed to a readership at St Andrews in 2009. His research uses theoretical and machine learning techniques in pharmaceutical chemistry, condensed phase modelling, and structural bioinformatics. His group have worked extensively on prediction of bioactivity, solubility, melting point and hydrophobicity from chemical structure, using both informatics and theoretical chemistry methodologies. Recently they have developed novel applications of machine learning in computational biochemistry, such as drug side effect prediction, identifying athletic performance enhancers, and competing against a panel of human experts to predict solubility accurately.

However simple it may appear, predicting how much of a substance will dissolve in water or other solvents turns out to be both an important and a difficult problem. Medicines need to be soluble in environments throughout the body, including the stomach and the bloodstream, in order to get to the sites where they are designed to act. The difficulty of designing medicines that are both effective and soluble continues to deny or delay treatments that patients need, as well as costing hundreds of millions of pounds in wasted research effort. Beyond pharmaceuticals, developing safer pesticides depends on understanding their solubility in wet soil or rivers, in order to know their environmental impact. Furthermore, design of new compounds for high-tech uses and separation of mirror-image molecules would be greatly facilitated by the ability to calculate and predict solubility accurately.  
There are two philosophically very different approaches to computing solubility, which will be discussed and compared throughout this seminar. One is to look towards first principles theoretical chemistry. A major advantage of using theoretically rigorous quantitative modelling of both crystals and solutions, typically derived from quantum mechanics, is that the methods can be rigorously adapted to give the solubility for other solvents, to account for the presence of other molecules, and to vary the predictions with changes of temperature. Using the best available computational chemistry models, often borrowed from crystal structure prediction or condensed phase physics, also allows us to benchmark, adapt, and systematically improve the methodology to approach the chemical accuracy that will be required if first principles models are to acquire mainstream utility in solubility science.
The alternative is to use an informatics-based approach, a field once described as QSPR (quantitative structure-property relationships) but now more likely to be badged Machine Learning or Artificial Intelligence. Informatics methods are designed with the sole objective of accurate numerical prediction. We seek simply to link our inputs (representations of molecular structures) to the outputs (accurate predictions of experimental solubility), with no requirement for the model to utilise real-world chemistry or physics. Any interpretability or mechanistic insight from the model is a secondary consideration. Currently, Machine Learning techniques such as Support Vector Machines or Random Forests offer solubility predictions that are more numerically accurate, and orders of magnitude faster, than first principles methods.