Dr John B. O. Mitchell

Reader

Researcher profile

Phone
+44 (0)1334 46 7259
Email
jbom@st-andrews.ac.uk

 

Biography

John Mitchell has a PhD in Theoretical Chemistry from Cambridge. He returned there from University College London in 2000, taking up a lectureship in Chemistry. He was appointed to a readership at St Andrews in 2009. His recent research has used computational techniques in pharmaceutical chemistry and structural bioinformatics. His group have worked extensively on prediction of bioactivity, solubility, melting point and hydrophobicity from chemical structure, using both informatics and theoretical chemistry methodologies. Recently they have developed novel applications of machine learning in computational biochemistry, such as drug side effect prediction, and identifying athletic performance enhancers.

Teaching

Lecturer CH5714 Chemical Applications of Electronic Structure Calculations; Lecturer CH4431 Scientific Writing; Lecturer CH3717 Statistical Mechanics and Computational Chemistry; Convenor & Tutor, CH1202 Introductory Chemistry; Lecturer ID2005 Scientific Thinking; Tutor CH2701 Physical Chemistry 2; Tutor CH1401 Introductory Inorganic and Physical Chemistry; Tutor CH5461 Integrating Chemistry; Project Supervisor CH4442 & CH5441 Research Projects; Lecturer SUPACCH Computational Chemistry (Postgraduate course).

Research areas

Our research covers everything that is broadly both chemical and computational. Some of the main themes are described below, but if you're just after a list of publications, here you are.

  • Machine Learning

A substantial part of computational chemistry involves building mathematical models to analyse data. The Machine Learning (ML) part of our work comprises everything that is not an attempt realistically to model the processes by which the real world actually works. In jargon, this is everything that is not physics-based. Such tasks might firstly be regression, that is predicting numerical values such as solubilities. Secondly they might be classification, assigning items such as molecules to classes like "toxic" or "non-toxic". Thirdly, they might be clustering, finding patterns in unlabelled data. In our group, we use such models to predict and calculate properties such as solubility, bioactivity and toxicity.
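To make the first two task types concrete, here is a minimal sketch in plain Python. It is only an illustration, not one of the group's models: the descriptor and property values are invented toy numbers, and the function names `fit_line` and `classify` and the threshold are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def classify(value, threshold):
    """Assign a predicted value to one of two classes (classification)."""
    return "high" if value >= threshold else "low"


# Invented toy data: one numerical descriptor per molecule and a property
# that happens to follow property = 2 * descriptor + 1 exactly.
descriptors = [1.0, 2.0, 3.0, 4.0]
properties = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(descriptors, properties)

# Regression: predict a numerical value for an unseen descriptor.
predicted = slope * 5.0 + intercept

# Classification: turn that number into a class label (threshold is arbitrary).
label = classify(predicted, threshold=10.0)
```

Real models of this kind use many descriptors per molecule and non-linear learners, but the division of labour between predicting a number and assigning a class is the same.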

Such modelling in fact has a long history in chemistry, dating back to the 19th century. However, for much of that time models were limited to simple linear regressions. In the latter part of the 20th century, the field developed through building QSAR (Quantitative Structure-Activity Relationship) and QSPR (ditto, but now it's Structure-Property) models with multi-linear regression, and then moved on to non-linear methods. The field was usually known as chemoinformatics (or cheminformatics, being unsure how to spell its own name). In the modern era, the sophistication of the models has increased to a point where it's more descriptive, and certainly more widely understood, to call these techniques Machine Learning.

What about Artificial Intelligence (AI) - can we draw a dividing line between ML and AI? Probably not a clear one. As Google DeepMind executive Mat Velloso said: “If it is written in Python, it's probably machine learning. If it is written in PowerPoint, it's probably AI.” While there's clearly a substantial overlap between the categories, we tend to refer to souped-up non-linear regression models as ML, but to LLMs as AI. Nonetheless, under the lid, LLMs are just large neural networks doing neural network things like optimising weights.

  • Molecular Simulation

By way of contrast, molecular simulation is definitely a physics-based approach. We set up in the computer a mathematical representation of the molecules involved, one that typically includes the chemical nature and spatial co-ordinates of each constituent atom. The computer then produces a possible future of that molecular system, calculating its response to its physical and chemical environment at each timestep to create a trajectory, in a process known as Molecular Dynamics.
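The timestep-by-timestep idea can be sketched in a few lines. The example below is a deliberately tiny illustration, assuming a single particle in one dimension on a harmonic spring and the velocity Verlet integrator, which is one common choice and not necessarily the scheme used in any particular simulation package. All names and parameter values are hypothetical and in arbitrary units.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D; return the trajectory as (x, v) pairs."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Restoring force of a spring with force constant k (assumed toy model)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


# Start displaced from equilibrium and at rest, then take 1000 small steps.
trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                             mass=1.0, dt=0.01, n_steps=1000)
x_end, v_end = trajectory[-1]
energy_drift = abs(total_energy(x_end, v_end) - total_energy(1.0, 0.0))
```

In a real simulation the single spring is replaced by the forces on every atom from all the others, but the loop structure (compute forces, update velocities and positions, repeat) is the same. Checking that the total energy stays nearly constant, as `energy_drift` does here, is a standard sanity test.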

If carried out intelligently, such methods can provide great scientific insight into the behaviour of the system, covering things such as structure, energy, interactions with other molecules, phase changes and much more. Typically, simulations are carried out with the molecules contained in 3-dimensional boxes that are stacked together without limit in all directions and fill space with no gaps, a scenario described as having "periodic boundary conditions." Our group use such methods for structural studies of the interactions between enzymes and their substrates, with applications like plastic-eating enzymes and new medicines.
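For a cubic box, periodic boundary conditions are often handled with two small operations: wrapping coordinates back into the box, and measuring distances to the nearest periodic image (the "minimum image convention"). The sketch below assumes a cubic box of side `box_length`; the function names are hypothetical and real simulation packages also support non-cubic cells.

```python
import math


def wrap_position(position, box_length):
    """Map a 3D position back into the primary box [0, box_length)."""
    return tuple(c % box_length for c in position)


def minimum_image_distance(a, b, box_length):
    """Distance between a and b using the nearest periodic image of b."""
    total = 0.0
    for ca, cb in zip(a, b):
        d = ca - cb
        # Shift the separation by a whole number of box lengths so that
        # it lies within half a box length of zero.
        d -= box_length * round(d / box_length)
        total += d * d
    return math.sqrt(total)


# Two atoms near opposite faces of a box of side 10 are close neighbours
# once the periodic images are taken into account.
wrapped = wrap_position((10.5, -0.5, 3.0), 10.0)
separation = minimum_image_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), 10.0)
```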

The forces, or more explicitly the interaction energies, between molecules are defined by a "force field", which has little to do with science fiction but a lot to do with the fundamental physical processes governing the attractive and repulsive interactions amongst atoms and molecules. This forms a major part of the scientific input into simulations. Historically, force fields have been either fitted to experiment or parameterised via theoretical calculation, but increasingly they are now being generated through ML.
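As one concrete illustration of what a force field term looks like, the Lennard-Jones 12-6 potential is a widely used textbook form for the non-bonded attraction and repulsion between a pair of atoms. The sketch below is not a complete force field, which would also include terms such as electrostatics and bonded interactions, and it uses reduced units (`epsilon = sigma = 1`) rather than parameters for any real atom type.

```python
def lennard_jones_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy: 4 * epsilon * ((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# The energy is zero at r = sigma, repulsive at shorter range, and reaches
# its minimum of -epsilon at r = 2**(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
energy_at_sigma = lennard_jones_energy(1.0)
energy_at_minimum = lennard_jones_energy(r_min)
energy_at_short_range = lennard_jones_energy(0.9)
```

Here `epsilon` sets the depth of the attractive well and `sigma` the separation at which the energy crosses zero; fitting or learning such parameters is part of what force field development involves.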

  • Quantum Chemistry

For all the usefulness of simulations, typically their force fields know nothing about covalent bond making or breaking, which means that they can't be used to study chemical reactions, molecular orbitals or even the vibrational motions of molecules. Instead, a more chemically intelligent approach is required, and this is provided by the electronic structure methods of quantum chemistry. Such approaches are known as "first principles," due to their sound basis in atomic and molecular quantum mechanics.

The most foundational such method historically has been Hartree-Fock self-consistent field theory (HF). However, in this century, Density Functional Theory (DFT) has become a much more widely known and used alternative, largely because it generally gives a more accurate result at a lesser cost.

We use quantum chemical methods such as HF and DFT for a variety of applications, including the energetics of chemical reactions, development of force fields, physics-based calculation of solubility and the prediction of crystal structures. While our group are very much users rather than developers of quantum chemical methods, we appreciate their central role in computational chemistry.

  • Bioinformatics

The sequential and alphabetical nature of both DNA and proteins makes them a rich source of computational research. Study of these essential and foundational biomolecules provides a window into the evolutionary history of life and its chemistry, as well as the impressive structural diversity of protein folds. Our own research frequently occupies the interface between chemistry and biology, the interactions between large biological polymers and smaller molecules being fundamental to processes of life and disease alike.

Much of our work in these areas has centred on enzymes, their chemical functions and their evolutionary histories. In this post-AlphaFold era, we continue to seek out new research questions that can shed light on the rich and diverse repertoire of biochemistry. In this endeavour, we frequently collaborate with colleagues in Biology as well as Chemistry.

Additional information about the current Mitchell Group can be found here: https://jbomgroup.wp.st-andrews.ac.uk/

PhD supervision

  • Benedict Connaughton
