

I'm Hew, a PhD student at the University of Oxford researching generative flow models for protein folding in the Oxford Protein Informatics Group (OPIG) supervised by Prof. Charlotte Deane.

I’m also a self-employed ML and software consultant, with a portfolio of work for Isomerase, where I developed their flagship directed evolution tool, Evoselect.

Feel free to reach out to me if you’re interested in collaborating or in my ML consulting.




✉ Email Me

Latest

Blog 3: Protein Modelling Pt.3 - Thermodynamics and Statistical Mechanics

Introduction

In the previous blog we described the first method for approximating the Potts model via message passing, as proposed by Weigt et al. The resulting method was somewhat inefficient, relying on slow iterative belief propagation. In this blog I will walk you through the next iteration in methods for approximating the Potts model, specifically the mean-field approximation approach pioneered by the same group that introduced message passing. We will walk through the paper by Morcos et al. and their more elegant and efficient solution to the Potts model, which marries statistical mechanics and thermodynamics with evolutionary sequence analysis.

Blog 2: Protein Modelling Pt.2 - Graph Network Messaging

Introduction

In the previous blog we outlined the statistical basis of the Potts model. The resultant Ising-like form is largely intractable for any reasonably sized protein and multiple sequence alignment. Although no closed-form solution exists, numerical methods that approximate the energy function have shown remarkable success when tested for their accuracy in predicting protein residue contacts. In this blog we’ll discuss the two earliest of these approaches.

Blog 1: Protein Modelling Pt.1 - From Similarity to Structure Prediction

Introduction

Biology has typically been the least quantitative of the sciences. However, over recent decades the surge in sequencing data, made possible by next generation sequencing, has facilitated the application of statistical and machine learning to biology. In this blog I will describe what first got me into computational biology and what helped power the first drastic improvements in protein structure prediction introduced by AlphaFold2. Specifically, I will discuss what is now called evolutionary or Direct Coupling Analysis (DCA) and focus on several key papers that formulated this fascinating application of statistical mechanical principles to biology.

Markers of the Ageing Macrophage: A Systematic Review and Meta-Analysis

My fourth-year integrated Master's project at the University of Sheffield was extended into a systematic meta-narrative literature review of age-related changes in macrophages, which was published in Frontiers in Immunology.

TL;DR: we included 240 articles in the review and aggregated information from 122 of these into a Meta-Analysis by Information Content (MAIC). MAIC is a novel method for quantitatively assessing heterogeneous and non-standardised literature results. The literature on the ageing macrophage is a good example: it encompassed 2 mouse strains, 1 rat strain, humans, 4 subtypes of macrophage, a diverse range of young vs old age classifications, and experimental methodologies like qPCR, flow cytometry and ELISA. We used MAIC to quantitatively summarise the most explored genes and proteins in the literature as downregulated or upregulated. This was done by counting the number of articles that reported upregulation or downregulation with age, weighted by various statistical factors such as the degree of sharing across different methodologies in different publications (higher for representation across diverse methodologies). Chord diagrams were used to represent the MAIC data, where segments depicted genes (Figure 1).
