I have over 15 years of experience in computational biology. My research goal is to predict the dynamic behaviour of the living cell by computer simulation of genome-scale network models that represent experimental data on interactions between molecules.
I am convinced that we can fully exploit the complete genomic sequences of humans and other organisms only if we use the legacy of molecular biology data to build predictive mechanistic models of the genotype-phenotype relationship. Because of the number of molecular components in the cell and the non-linearity of their interactions, this goal can only be achieved by computer simulation. Successful computer simulation of molecular cell biology will enable prediction of the effects of individual genetic differences on the trajectories of major diseases, providing a foundation for the predictive and personalized medicine of the future. Likewise, industrial biotechnology is being revolutionized by the increasing ability to simulate the effects of genetic engineering in commercial cell lines and therefore to design industrial fermentation processes rationally.
I developed the novel Quasi Steady State Petri Net (QSSPN) algorithm (Bioinformatics, doi: 10.1093/bioinformatics/btt552), which integrates Petri nets and constraint-based analysis to predict the feasibility of qualitative dynamic behaviours in models of gene regulation, signalling and whole-cell metabolism. We presented the first dynamic simulations including regulatory mechanisms and a genome-scale metabolic network in a human cell, using bile acid homeostasis in human hepatocytes as a case study. QSSPN simulations reproduced experimentally determined qualitative dynamic behaviours and permitted mechanistic analysis of genotype-phenotype relationships. (QSSPN Project website: http://sysbio3.fhms.surrey.ac.uk/qsspn/index.html)
I performed the computational part of the project leading to the first reconstruction of the genome-scale metabolic reaction network of Mycobacterium tuberculosis, the causative agent of tuberculosis (Genome Biology, 2007). The tools developed for this project have matured into the SurreyFBA software recently published by my group (Bioinformatics, 2011). I have also been working on the analysis of gene expression data in the context of genome-scale metabolic networks (PLoS Computational Biology, 2011) and on the development of software for web-based computation with FBA models (BMC Bioinformatics, 2011). Industrial biotechnology is an important application area for genome-scale metabolic modeling; I worked on FBA simulations in the context of bioprocess feed development for antibiotic production in Streptomyces coelicolor (Metabolic Engineering, 2008).
I have been modeling stochastic effects in molecular interaction network dynamics for 10 years. I constructed a detailed model of prokaryotic gene expression and investigated the dependence of the accuracy of gene expression on transcription and translation initiation rates (J. Biol. Chem, 2001). This work also led to the publication of the STOCKS software for stochastic simulation of molecular interaction networks (Bioinformatics, 2002). Subsequently, we developed the Maximal Timestep Method, a hybrid algorithm enabling stochastic simulation of systems with reaction rates varying by many orders of magnitude. The method has been applied to investigate the propagation of gene expression noise to the level of metabolic processes, leading to epigenetically inherited changes in single cell physiology (Biophysical Journal 2004). More recently, I worked on the influence of RNA regulators on gene expression noise (Biophysical Journal 2009) and constructed a stochastic kinetic model of Two Component System Signalling (Molecular Biosystems 2010).
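For readers unfamiliar with stochastic simulation of gene expression, the following is a minimal sketch of Gillespie's direct method applied to a generic two-stage model (transcription, translation, and first-order degradation). It is not the STOCKS implementation or the Maximal Timestep Method; the function name, the reaction scheme, and all rate constants are illustrative assumptions.

```python
import random


def gillespie_gene_expression(k_tx, k_tl, d_m, d_p, t_end, seed=0):
    """Simulate mRNA and protein copy numbers with Gillespie's direct method.

    Hypothetical reaction scheme (all rate constants are assumptions):
      transcription:        0 -> mRNA               propensity k_tx
      translation:          mRNA -> mRNA + protein  propensity k_tl * mRNA
      mRNA degradation:     mRNA -> 0               propensity d_m * mRNA
      protein degradation:  protein -> 0            propensity d_p * protein
    Returns a list of (time, mRNA, protein) tuples, one per reaction event.
    """
    rng = random.Random(seed)
    t, mrna, protein = 0.0, 0, 0
    trajectory = [(t, mrna, protein)]
    while True:
        propensities = [k_tx, k_tl * mrna, d_m * mrna, d_p * protein]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to propensity.
        r = rng.random() * total
        if r < propensities[0]:
            mrna += 1
        elif r < propensities[0] + propensities[1]:
            protein += 1
        elif r < propensities[0] + propensities[1] + propensities[2]:
            mrna -= 1
        else:
            protein -= 1
        trajectory.append((t, mrna, protein))
    return trajectory


if __name__ == "__main__":
    traj = gillespie_gene_expression(k_tx=1.0, k_tl=5.0, d_m=0.5, d_p=0.1, t_end=50.0)
    print("events simulated:", len(traj) - 1)
    print("final state (time, mRNA, protein):", traj[-1])
```

Because every reaction event is simulated individually, this exact approach becomes slow when some reactions are much faster than others, which is the kind of situation hybrid algorithms are designed to address.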
I have past bioinformatics experience in the fields of homology modeling of protein structure (Nucl. Acids. Research 2003, Nature Immunology 2003), regulatory sequence analysis (J. Biol. Chem 2005) and annotation of genome sequences (Nature 2004). I did my PhD in the area of Biophysics and worked on agent-based simulations of protein crystal growth (Biophysical Journal 1997). I have also performed molecular dynamics simulations and analysed light scattering spectra (J. Phys. Chem. 1999).
1. BMS3072: Systems Biology: Genomes in Action
2. BMS1023: Numeracy skills and Statistics
4. MSc Courses: Statistics
Professor of Systems Biology
Module organiser for BMS3072
Find me on campus Room: 05 PG 02
BACKGROUND: Mycobacterium tuberculosis continues to kill more people than any other bacterium. Although its archetypal host cell is the macrophage, it also enters, and survives within, dendritic cells (DCs). By modulating the behaviour of the DC, M. tuberculosis is able to manipulate the host's immune response and establish an infection. To identify the M. tuberculosis genes required for survival within DCs we infected primary human DCs with an M. tuberculosis transposon library and identified mutations with a reduced ability to survive. RESULTS: Parallel sequencing of the transposon inserts of the surviving mutants identified a large number of genes as being required for optimal intracellular fitness in DCs. Loci whose mutation attenuated intracellular survival included those involved in synthesising cell wall lipids, not only the well-established virulence factors, pDIM and cord factor, but also sulfolipids and PGL, which have not previously been identified as having a direct virulence role in cells. Other attenuated loci included the secretion systems ESX-1, ESX-2 and ESX-4, alongside many PPE genes, implicating a role for ESX-5. In contrast the canonical ESAT-6 family of ESX substrates did not have intra-DC fitness costs suggesting an alternative ESX-1 associated virulence mechanism. With the aid of a gene-nutrient interaction model, metabolic processes such as cholesterol side chain catabolism, nitrate reductase and cysteine-methionine metabolism were also identified as important for survival in DCs. CONCLUSION: We conclude that many of the virulence factors required for survival in DC are shared with macrophages, but that survival in DCs also requires several additional functions, such as cysteine-methionine metabolism, PGLs, sulfolipids, ESX systems and PPE genes.
Non-alcoholic fatty liver disease (NAFLD) is a progressive disease of increasing public health concern. In western populations the disease has an estimated prevalence of 20%-40%, rising to 70%-90% in obese and type II diabetic individuals. Simplistically, NAFLD is the macroscopic accumulation of lipid in the liver, and is viewed as the hepatic manifestation of the metabolic syndrome. However, the molecular mechanisms mediating both the initial development of steatosis and its progression through non-alcoholic steatohepatitis to debilitating and potentially fatal fibrosis and cirrhosis are only partially understood. Despite increased research in this field, the development of non-invasive clinical diagnostic tools and the discovery of novel therapeutic targets has been frustratingly slow. We note that, to date, NAFLD research has been dominated by in vivo experiments in animal models and human clinical studies. Systems biology tools and novel computational simulation techniques allow the study of large-scale metabolic networks and the impact of their dysregulation on health. Here we review current systems biology tools and discuss the benefits to their application to the study of NAFLD. We propose that a systems approach utilising novel in silico modelling and simulation techniques is key to a more comprehensive, better targeted NAFLD research strategy. Such an approach will accelerate the progress of research and vital translation into clinic.
Background: Leprosy has afflicted humankind throughout history leaving evidence in both early texts and the archaeological record. In Britain, leprosy was widespread throughout the Middle Ages until its gradual and unexplained decline between the 14th and 16th centuries. The nature of this ancient endemic leprosy and its relationship to modern strains is only partly understood. Modern leprosy strains are currently divided into 5 phylogenetic groups, types 0 to 4, each with strong geographical links. Until recently, European strains, both ancient and modern, were thought to be exclusively type 3 strains. However, evidence for type 2 strains, a group normally associated with Central Asia and the Middle East, has recently been found in archaeological samples in Scandinavia and from two skeletons from the medieval leprosy hospital (or leprosarium) of St Mary Magdalen, near Winchester, England. Results: Here we report the genotypic analysis and whole genome sequencing of two further ancient M. leprae genomes extracted from the remains of two individuals, Sk14 and Sk27, that were excavated from 10th-12th century burials at the leprosarium of St Mary Magdalen. DNA was extracted from the surfaces of bones showing osteological signs of leprosy. Known M. leprae polymorphisms were PCR amplified and Sanger sequenced, while draft genomes were generated by enriching for M. leprae DNA, and Illumina sequencing. SNP-typing and phylogenetic analysis of the draft genomes placed both of these ancient strains in the conserved type 2 group, with very few novel SNPs compared to other ancient or modern strains. Conclusions: The genomes of the two newly sequenced M. leprae strains group firmly with other type 2F strains. Moreover, the M. leprae strain most closely related to one of the strains, Sk14, in the worldwide phylogeny is a contemporaneous ancient St Magdalen skeleton, vividly illustrating the epidemic and clonal nature of leprosy at this site.
The prevalence of these type 2 strains indicates that type 2F strains, in contrast to later European and associated North American type 3 isolates, may have been the co-dominant or even the predominant genotype at this location during the 11th century. © 2014 Mendum et al.; licensee BioMed Central Ltd.
Dynamic simulation of genome-scale molecular interaction networks will enable the mechanistic prediction of genotype-phenotype relationships. Despite advances in quantitative biology, full parameterization of whole-cell models is not yet possible. Simulation methods capable of using available qualitative data are required to develop dynamic whole-cell models through an iterative process of modelling and experimental validation.
One of the most challenging problems in microbiology is to understand how a small fraction of microbes that resists killing by antibiotics can emerge in a population of genetically identical cells, the phenomenon known as persistence or drug tolerance. Its characteristic signature is the biphasic kill curve, whereby microbes exposed to a bactericidal agent are initially killed very rapidly but then much more slowly. Here we relate this problem to the more general problem of understanding the emergence of distinct growth phenotypes in clonal populations. We address the problem mathematically by adopting the framework of the phenomenon of so-called weak ergodicity breaking, well known in dynamical physical systems, which we extend to the biological context. We show analytically and by direct stochastic simulations that distinct growth phenotypes can emerge as a consequence of slow-down of stochastic fluctuations in the expression of a gene controlling growth rate. In the regime of fast gene transcription, the system is ergodic, the growth rate distribution is unimodal, and accounts for one phenotype only. In contrast, at slow transcription and fast translation, weakly non-ergodic components emerge, the population distribution of growth rates becomes bimodal, and two distinct growth phenotypes are identified. When coupled to the well-established growth rate dependence of antibiotic killing, this model describes the observed fast and slow killing phases, and reproduces much of the phenomenology of bacterial persistence. The model has major implications for efforts to develop control strategies for persistent infections.
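The biphasic kill curve described above can be illustrated with a deliberately simple phenomenological sketch: a population made of two subpopulations that die at different first-order rates. This is not the weak-ergodicity-breaking model of the abstract, in which the two phenotypes emerge from gene expression fluctuations; here the persister fraction and both kill rates (`persister_fraction`, `k_fast`, `k_slow`) are assumed values chosen only to show the shape of the curve.

```python
import math


def surviving_fraction(t, persister_fraction=1e-3, k_fast=2.0, k_slow=0.05):
    """Fraction of cells alive after t hours of antibiotic exposure.

    Assumes (hypothetically) a fixed mixture of fast-killed normal cells and
    slowly killed persisters, each dying with first-order kinetics.
    """
    normal = (1.0 - persister_fraction) * math.exp(-k_fast * t)
    persisters = persister_fraction * math.exp(-k_slow * t)
    return normal + persisters


if __name__ == "__main__":
    # On a log scale the curve falls steeply at first, then much more slowly.
    for hours in (0, 2, 4, 8, 16, 24):
        s = surviving_fraction(hours)
        print(f"t = {hours:2d} h   log10(surviving fraction) = {math.log10(s):6.2f}")
```

Plotting the logarithm of the surviving fraction against time gives two nearly straight segments whose slopes are set by the two kill rates, with the crossover determined by the assumed persister fraction.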
Streptomycetes sense and respond to the stress of phosphate starvation via the two-component PhoR-PhoP signal transduction system. To identify the in vivo targets of PhoP we have undertaken a chromatin-immunoprecipitation-on-microarray analysis of wild-type and phoP mutant cultures and, in parallel, have quantified their transcriptomes. Most (ca. 80%) of the previously in vitro characterized PhoP targets were identified in this study among several hundred other putative novel PhoP targets. In addition to activating genes for phosphate scavenging systems PhoP was shown to target two gene clusters for cell wall/extracellular polymer biosynthesis. Furthermore PhoP was found to repress an unprecedented range of pathways upon entering phosphate limitation including nitrogen assimilation, oxidative phosphorylation, nucleotide biosynthesis and glycogen catabolism. Moreover, PhoP was shown to target many key genes involved in antibiotic production and morphological differentiation, including afsS, atrA, bldA, bldC, bldD, bldK, bldM, cdaR, cdgA, cdgB and scbR-scbA. Intriguingly, in the PhoP-dependent cpk polyketide gene cluster, PhoP accumulates substantially at three specific sites within the giant polyketide synthase-encoding genes. This study suggests that, following phosphate limitation, Streptomyces coelicolor PhoP functions as a 'master' regulator, suppressing central metabolism, secondary metabolism and developmental pathways until sufficient phosphate is salvaged to support further growth and, ultimately, morphological development.
Phenotypic differences of genetically identical cells under the same environmental conditions have been attributed to the inherent stochasticity of biochemical processes. Various mechanisms have been suggested, including the existence of alternative steady states in regulatory networks that are reached by means of stochastic fluctuations, long transient excursions from a stable state to an unstable excited state, and the switching on and off of a reaction network according to the availability of a constituent chemical species. Here we analyse a detailed stochastic kinetic model of two-component system signalling in bacteria, and show that alternative phenotypes emerge in the absence of these features. We perform a bifurcation analysis of deterministic reaction rate equations derived from the model, and find that they cannot reproduce the whole range of qualitative responses to external signals demonstrated by direct stochastic simulations. In particular, the mixed mode, where stochastic switching and a graded response are seen simultaneously, is absent. However, probabilistic and equation-free analyses of the stochastic model that calculate stationary states for the mean of an ensemble of stochastic trajectories reveal that slow transcription of either response regulator or histidine kinase leads to the coexistence of an approximate basal solution and a graded response that combine to produce the mixed mode, thus establishing its essential stochastic nature. The same techniques also show that stochasticity results in the observation of an all-or-none bistable response over a much wider range of external signals than would be expected on deterministic grounds. Thus we demonstrate the application of numerical equation-free methods to a detailed biochemical reaction network model, and show that it can provide new insight into the role of stochasticity in the emergence of phenotypic diversity.
BACKGROUND: Neisseria meningitidis is an important human commensal and pathogen that causes several thousand deaths each year, mostly in young children. How the pathogen replicates and causes disease in the host is largely unknown, particularly the role of metabolism in colonization and disease. Completed genome sequences are available for several strains but our understanding of how these data relate to phenotype remains limited. RESULTS: To investigate the metabolism of N. meningitidis we generated and selected a representative Tn5 library on rich medium, a minimal defined medium and in human serum to identify genes essential for growth under these conditions. To relate these data to a systems-wide understanding of the pathogen's biology we constructed a genome-scale metabolic network: Nmb_iTM560. This model was able to distinguish essential and non-essential genes as predicted by the global mutagenesis. These essentiality data, the library and the Nmb_iTM560 model are powerful and widely applicable resources for the study of meningococcal metabolism and physiology. We demonstrate the utility of these resources by predicting and demonstrating metabolic requirements on minimal medium such as a requirement for PEP carboxylase, and by describing the nutritional and biochemical status of N. meningitidis when grown in serum, including a requirement for both the synthesis and transport of amino acids. CONCLUSIONS: This study describes the application of a genome scale transposon library combined with an experimentally validated genome-scale metabolic network of N. meningitidis to identify essential genes and provide novel insight to the pathogen's metabolism both in vitro and during infection.
Constraint-based approaches facilitate the prediction of cellular metabolic capabilities, based, in turn, on predictions of the repertoire of enzymes encoded in the genome. Recently, genome annotations have been used to reconstruct genome-scale metabolic reaction networks for numerous species, including Homo sapiens. These networks allow simulations that provide valuable insights into topics including gene essentiality in pathogens, the interpretation of genetic polymorphism in metabolic disease syndromes, and novel approaches to microbial metabolic engineering. These constraint-based simulations are being integrated with functional genomics portals, an activity that requires efficient implementation of the simulations in a web-based environment.
A large number of cDNA inserts were sequenced from a high-quality library of chicken bursal lymphocyte cDNAs. Comparisons to public gene databases indicate that the cDNA collection represents more than 2,000 new, full-length transcripts. This resource defines the structure and the coding potential of a large fraction of B-cell specific and housekeeping genes whose function can be analyzed by disruption in the chicken DT40 B-cell line.
Page Owner: bss1ak
Page Created: Monday 27 October 2014 14:55:26 by kj0008
Last Modified: Monday 4 April 2016 12:20:18 by sa0043