Dr Samaneh Kouchaki
Academic and research departments: Centre for Vision, Speech and Signal Processing (CVSSP); Department of Electrical and Electronic Engineering.
I joined CVSSP in July 2020 to lead research and teaching in machine learning for health and dementia care in collaboration with the UK Dementia Research Institute (UK DRI) Care Research & Technology Centre (a joint initiative between CVSSP, the Surrey Sleep Research Centre and Department of Mathematics at Surrey, and Imperial College London).
I previously spent three years as a postdoctoral researcher within the Institute of Biomedical Engineering at the University of Oxford. I was the senior machine learning researcher for the ‘100,000 Genomes Project for Tuberculosis’, an international consortium involving the Centres for Disease Control of most major nations (including the USA, UK and China), jointly funded by the Gates Foundation and the Wellcome Trust. My research focus was on the prediction of antibiotic resistance in pathogens such as those that cause tuberculosis.
Prior to this, I was at the University of Manchester within the Division of Evolutionary and Genomic Sciences where I was funded by the EU Horizon 2020 Virogenesis project, working on next-generation DNA sequencing using signal and image processing techniques coupled with unsupervised machine learning.
I obtained my PhD in Computer Science at Surrey in 2015. My PhD focused on developing novel multi-way techniques for source separation with application to biomedical signals.
Areas of specialism
Biomedical signal processing; deep supervised/semi-supervised learning for healthcare data; graph learning and embedding for omics data; time-series data processing and pattern analysis.
Dr Kouchaki’s research is aimed at improving patient care by providing decision support. Her objective is to develop intelligent tools, based on hybrid architectures of advanced probabilistic and deep learning techniques, that facilitate improved patient outcomes.
PhD research positions
Dr Kouchaki is currently looking to supervise PhD students interested in exploring the following topics:
Deep interpretable learning for healthcare data:
Interpretability is important in healthcare because it helps clinicians understand how a model reaches its decisions and reveals which clinical variables matter most. Adding interpretability to deep learning techniques would enable their deployment in clinical settings.
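One widely used model-agnostic route to interpretability is permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. The sketch below is illustrative only (the data and the fixed linear "model" are hypothetical, not Dr Kouchaki's methods), but the same function applies unchanged to a trained deep network's decision function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular "clinical" data: by construction only feature 0 carries signal.
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)

# A fixed linear decision rule standing in for a trained model.
w = np.array([2.0, 0.0, 0.0])
predict = lambda X: (X @ w > 0).astype(int)

def permutation_importance(predict, X, y, n_repeats=10, seed=1):
    """Mean drop in accuracy when each feature is shuffled in turn.

    Model-agnostic: works for any callable `predict`, deep or shallow.
    """
    rng = np.random.default_rng(seed)
    base = (predict(X) == y).mean()          # accuracy on intact data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break feature j only
            drops.append(base - (predict(Xp) == y).mean())
        importances[j] = np.mean(drops)
    return importances

imp = permutation_importance(predict, X, y)
```

Because features 1 and 2 have zero weight, shuffling them leaves predictions unchanged, so their importance is exactly zero; feature 0's importance is large.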
Machine learning and signal processing for genetic data:
Omics-based research spans multiple genetic resources (proteomics, genomics, metabolomics) and provides biological insights for many healthcare applications. Machine learning techniques have traditionally been applied successfully to individual omics resources, but rarely to multi-omics data. Multi-omics research is vital for understanding complex biological systems.
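A minimal starting point for multi-omics integration is early fusion: standardise each omics block separately, then concatenate them into one feature matrix so that no platform dominates purely through its measurement scale. The block names and sizes below are purely illustrative assumptions, not a description of any specific study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-patient omics blocks with very different scales.
genomics = rng.normal(5.0, 2.0, size=(100, 50))
proteomics = rng.normal(-1.0, 0.5, size=(100, 20))
metabolomics = rng.normal(0.0, 10.0, size=(100, 30))

def early_fusion(*blocks):
    """Z-score each omics block independently, then concatenate features.

    Per-block scaling prevents a high-variance platform (e.g. metabolomics
    here) from dominating downstream distance- or gradient-based learning.
    """
    scaled = [(b - b.mean(axis=0)) / b.std(axis=0) for b in blocks]
    return np.concatenate(scaled, axis=1)

X = early_fusion(genomics, proteomics, metabolomics)
```

More sophisticated integration (late fusion, graph-based methods) builds on the same aligned-samples layout this produces.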
Heterogeneous graph embedding and graph convolutional networks for multi-sensor data analysis:
The aim is to develop a sophisticated and robust framework that embeds multiple features into a joint dense representation of the data, so that deep learning algorithms can better predict patient outcomes.
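The core building block of a graph convolutional network is the propagation rule H' = ReLU(D^(-1/2)(A + I)D^(-1/2) H W): each node's features are averaged with its neighbours' (degree-normalised) before a learned linear map. A minimal NumPy sketch of one such layer, on a hypothetical 4-node chain graph with random (untrained) weights:

```python
import numpy as np

# Adjacency matrix of a 4-node chain graph (illustrative only).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.eye(4)                     # one-hot node features
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))       # learnable weights (random here)

def gcn_layer(A, H, W):
    """One symmetric-normalised graph convolution with ReLU activation."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # degrees incl. self-loop
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^(-1/2)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

H1 = gcn_layer(A, H, W)           # new 2-dimensional node embeddings
```

For heterogeneous multi-sensor data, each node would carry a sensor's feature vector and the adjacency would encode sensor relationships; the propagation step itself is unchanged.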
Postgraduate research supervision
Laboratories, Design & Professional Studies III and IV (EEE2036 and EEE2037)
The early clinical course of COVID-19 can be difficult to distinguish from other illnesses driving presentation to hospital. However, viral-specific PCR testing has limited sensitivity and results can take up to 72 h for operational reasons. We aimed to develop and validate two early-detection models for COVID-19, screening for the disease among patients attending the emergency department and the subset being admitted to hospital, using routinely collected health-care data (laboratory tests, blood gas measurements, and vital signs). These data are typically available within the first hour of presentation to hospitals in high-income and middle-income countries, within the existing laboratory infrastructure.

We trained linear and non-linear machine learning classifiers to distinguish patients with COVID-19 from pre-pandemic controls, using electronic health record data for patients presenting to the emergency department and admitted across a group of four teaching hospitals in Oxfordshire, UK (Oxford University Hospitals). Data extracted included presentation blood tests, blood gas testing, vital signs, and results of PCR testing for respiratory viruses. Adult patients (>18 years) presenting to hospital before Dec 1, 2019 (before the first COVID-19 outbreak), were included in the COVID-19-negative cohort; those presenting to hospital between Dec 1, 2019, and April 19, 2020, with PCR-confirmed severe acute respiratory syndrome coronavirus 2 infection were included in the COVID-19-positive cohort. Patients who were subsequently admitted to hospital were included in their respective COVID-19-negative or COVID-19-positive admissions cohorts. Models were calibrated to sensitivities of 70%, 80%, and 90% during training, and performance was initially assessed on a held-out test set generated by an 80:20 split stratified by patients with COVID-19 and balanced equally with pre-pandemic controls.

To simulate real-world performance at different stages of an epidemic, we generated test sets with varying prevalences of COVID-19 and assessed predictive values for our models. We prospectively validated our 80% sensitivity models for all patients presenting or admitted to the Oxford University Hospitals between April 20 and May 6, 2020, comparing model predictions with PCR test results.

We assessed 155,689 adult patients presenting to hospital between Dec 1, 2017, and April 19, 2020. 114,957 patients were included in the COVID-negative cohort and 437 in the COVID-positive cohort, for a full study population of 115,394 patients, with 72,310 admitted to hospital. With a sensitive configuration of 80%, our emergency department (ED) model achieved 77.4% sensitivity and 95.7% specificity (area under the receiver operating characteristic curve [AUROC] 0.939) for COVID-19 among all patients attending hospital, and the admissions model achieved 77.4% sensitivity and 94.8% specificity (AUROC 0.940) for the subset of patients admitted to hospital. Both models achieved high negative predictive values (NPV; >98.5%) across a range of prevalences (≤5%). We prospectively validated our models for all patients presenting and admitted to Oxford University Hospitals in a 2-week test period. The ED model (3,326 patients) achieved 92.3% accuracy (NPV 97.6%, AUROC 0.881), and the admissions model (1,715 patients) achieved 92.5% accuracy (NPV 97.7%, AUROC 0.871) in comparison with PCR results. Sensitivity analyses to account for uncertainty in negative PCR results improved apparent accuracy (ED model 95.1%, admissions model 94.1%) and NPV (ED model 99.0%, admissions model 98.5%). Our models performed effectively as a screening test for COVID-19, excluding the illness with high confidence by use of clinical data routinely available within 1 h of presentation to hospital.
Our approach is rapidly scalable, fitting within the existing laboratory testing infrastructure and standard of care of hospitals in high-income and middle-income countries. Funding: Wellcome Trust, University of Oxford, Engineering and Physical Sciences Research Council, National Institute for Health Research Oxford Biomedical Research Centre.
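The abstract above describes calibrating classifiers to fixed operating points (70%, 80%, 90% sensitivity) rather than using a default 0.5 cut-off. A simple way to do this, sketched below on simulated scores (illustrative assumptions, not the Oxford data), is to set the decision threshold at the corresponding quantile of the positive class's scores:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated held-out classifier scores: positives score higher on average.
scores_pos = rng.normal(1.5, 1.0, size=400)   # true COVID-positive cases
scores_neg = rng.normal(0.0, 1.0, size=4000)  # pre-pandemic controls

def threshold_for_sensitivity(scores_pos, target=0.80):
    """Choose the threshold so that `target` of true positives score above it,
    fixing the model's operating point at the desired sensitivity."""
    return np.quantile(scores_pos, 1.0 - target)

thr = threshold_for_sensitivity(scores_pos, 0.80)
sens = (scores_pos >= thr).mean()   # achieved sensitivity, ~0.80 by design
spec = (scores_neg < thr).mean()    # resulting specificity
```

Specificity then falls out of the data rather than being set directly, which is why the paper reports it separately for each calibrated sensitivity.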
Early prediction of pathogen infestation is key to reducing the spread of disease in plants. Macrophomina phaseolina (Tassi) Goid., one of the main causes of charcoal rot disease, significantly suppresses plant productivity, and charcoal rot is one of the most severe threats to soybean productivity. Predicting this disease in soybean is tedious and impractical using traditional approaches. Machine learning (ML) techniques have recently gained substantial traction across numerous domains, and can be applied to detect plant diseases prior to the full appearance of symptoms. In this paper, several ML techniques were developed and examined for the prediction of charcoal rot disease in soybean, using a cohort of 2,000 healthy and infected plants. A hybrid set of physiological and morphological features was used as input to the ML models. All of the developed ML models performed better than 90% in terms of accuracy; Gradient Tree Boosting (GBT) was the best-performing classifier, obtaining 96.25% sensitivity and 97.33% specificity. Our findings support the applicability of ML, and GBT in particular, for charcoal rot disease prediction in a real environment. Moreover, our analysis demonstrates the importance of including physiological features in the learning. The collected dataset and source code can be found at https://github.com/Elham-khalili/Soybean-Charcoal-Rot-Disease-Prediction-Dataset-code.
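The headline numbers above are sensitivity (true-positive rate) and specificity (true-negative rate). For reference, both follow directly from the confusion-matrix counts; the labels below are a made-up toy example, not the paper's soybean data:

```python
import numpy as np

# Toy labels: 1 = infected plant, 0 = healthy (illustrative only).
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 0])

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))   # infected, flagged
    fn = np.sum((y_true == 1) & (y_pred == 0))   # infected, missed
    tn = np.sum((y_true == 0) & (y_pred == 0))   # healthy, cleared
    fp = np.sum((y_true == 0) & (y_pred == 1))   # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)

sens, spec = sensitivity_specificity(y_true, y_pred)  # 0.75, 0.8333...
```

Reporting both matters for disease screening: a model can reach high accuracy on an imbalanced cohort while still missing most infected plants.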