Bahar Khorram

Postgraduate Research Student

b.khorram@surrey.ac.uk

Academic and research departments

Centre for Vision, Speech and Signal Processing (CVSSP).

About

My research project

Embedding and data representation for time-coursed tabular healthcare data

While the superiority of deep neural networks (DNNs) in many domains is beyond doubt, machine learning for tabular data has not yet fully benefited from their power. The aim of the project is to address several challenges relating to the use of deep learning for healthcare data, especially in terms of interpretability and robustness.

Supervisors

Samaneh Kouchaki

Publications

Bahar Khorram, Ramin Nilforooshan, Payam Barnaghi, Samaneh Kouchaki (2026) Prediction of Clinically Significant Depressive Symptoms at 2-Year Follow-Up in Older Adults: Machine Learning Study Using the English Longitudinal Study of Ageing (ELSA), In: JMIR Formative Research (In Press), JMIR Publications

Background: Depression in older adults is often underdiagnosed due to atypical symptom presentation and generational stigma, leading to delayed intervention. Early identification of individuals at risk of developing elevated depressive symptoms is therefore critical, but traditional approaches show limited predictive accuracy. To date, no study has applied machine learning models to predict clinically significant depressive symptoms at 2-year follow-up in older adults in the UK using data from the English Longitudinal Study of Ageing (ELSA). Moreover, the impact of encoding strategies for categorical healthcare variables has not been examined.

Objective: This study aimed to develop and evaluate machine learning (ML) models to predict clinically significant depressive symptoms at 2-year follow-up in older adults using ELSA data. We further compared ordinal and one-hot encoding strategies across different ML architectures and identified key predictors of depressive symptoms at follow-up.

Methods: Data were drawn from four consecutive waves of ELSA, including participants aged ≥50 years without significant depressive symptoms at the baseline wave (Waves 6-9). Clinically significant depressive symptoms were defined as CES-D-8 ≥4 at the subsequent wave (Waves 7-10). Over 120 features spanning sociodemographic, psychological, and health-related domains were analysed. Eight ML models were applied, including tree-based ensembles, deep learning architectures for tabular data, and distance-based, probabilistic, and linear methods. Model performance was assessed using area under the receiver operating characteristic curve (AUROC) and F1-score. Model interpretability was examined using SHapley Additive exPlanations (SHAP). Sensitivity analyses assessed the robustness of results across alternative CES-D-8 thresholds (≥3, ≥4, ≥5) and encoding strategies.

Results: Across waves, the best-performing models achieved mean AUROC scores of 0.72–0.73, with a peak of 0.75 in the highest-performing wave. Ordinal encoding consistently outperformed one-hot encoding across all ML models, yielding improvements in AUROC and F1-score, with the greatest increase in tree-based methods. SHAP consistently identified loneliness, sleep disturbances, and low social engagement as strong predictors of elevated depressive symptoms at follow-up. Sensitivity analyses across CES-D-8 thresholds demonstrated robust feature importance, with AUROC ranging from 0.67 to 0.82. Traditional machine learning models (Random Forest, XGBoost, Support Vector Machines) generally achieved higher performance than the deep learning models for this task.

Conclusions: Our findings demonstrate the feasibility of predicting clinically significant depressive symptoms at 2-year follow-up in UK older adults with moderate accuracy. Ordinal encoding demonstrates superior performance for healthcare datasets with inherently ordered categorical features. The identification of consistent risk factors highlights opportunities for developing targeted clinical screening tools and preventive interventions. This study provides new evidence on depressive symptoms prediction in the UK context, leveraging longitudinal data from ELSA, and contributes to advancing digital mental health research for aging populations.
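The difference between the two encoding strategies compared in this abstract can be illustrated with a minimal sketch. The category levels and responses below are hypothetical and are not taken from ELSA; the helper names are illustrative only and do not reflect the study's actual pipeline.

```python
# Hypothetical ordered survey item; the levels are illustrative, not ELSA variables.
LEVELS = ["never", "sometimes", "often"]


def ordinal_encode(value, levels=LEVELS):
    """Map a category to its rank, so the order of the levels is preserved."""
    return levels.index(value)


def one_hot_encode(value, levels=LEVELS):
    """Map a category to an indicator vector with one slot per level."""
    return [1 if value == level else 0 for level in levels]


# Hypothetical responses from three participants.
responses = ["often", "never", "sometimes"]

ordinal = [ordinal_encode(r) for r in responses]
one_hot = [one_hot_encode(r) for r in responses]

print(ordinal)  # [2, 0, 1]
print(one_hot)  # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

Ordinal encoding keeps one column per variable and retains the ranking between levels, whereas one-hot encoding produces one column per level and discards that ranking, which is one plausible reason the former may suit inherently ordered categorical features.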

Bahar Khorram, Samaneh Kouchaki (2026) Early Sepsis Prediction Using a Hybrid LSTM-GAT Model: A Study on the PhysioNet 2019 Dataset, In: BMJ Health & Care Informatics (In Press), BMJ Publishing Group

Objective: Sepsis is a potentially fatal systemic response to infection, in which early clinical intervention is critical to reduce mortality. This study presents a hybrid deep learning model that combines temporal and structural information from clinical data to improve early sepsis prediction.

Methods: We used data from the PhysioNet/Computing in Cardiology Challenge 2019 to predict sepsis onset up to 12 hours in advance. We developed a hybrid model integrating Long Short-Term Memory (LSTM) networks and Graph Attention Networks (GAT) to capture temporal dynamics and inter-variable relationships. Performance was compared with three baseline models. To ensure robustness, all models were trained using five repeated train-test splits with different random seeds.

Results: The dataset includes 40,336 adult ICU patients, of whom 2,932 developed sepsis during their stay. Each patient's record includes hourly data on 40 clinical variables, including vital signs, laboratory results, and demographic information. The LSTM-GAT model achieved an AUROC of 0.853 ± 0.005, F1-score of 0.627 ± 0.006, and specificity of 0.872 ± 0.007, outperforming the baseline models. Despite being trained on fixed temporal windows, the model generalized well across multiple prediction horizons without retraining.

Discussion: By integrating temporal and structural representations, the proposed approach achieves improved predictive performance compared with the baseline models. This capability may support earlier identification of high-risk patients and enhance timely clinical decision-making in critical care environments.

Conclusions: The proposed model demonstrates the advantage of combining sequence and graph-based methods. It offers a promising tool for real-time clinical decision support in sepsis detection.
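The reporting style used in this abstract (a metric given as mean ± standard deviation over five seeded train-test splits) can be sketched as follows. This is only an illustration of the evaluation protocol under stated assumptions: the labels and scores are synthetic stand-ins rather than PhysioNet data, no model is trained, and the helper names `auroc` and `split_indices` are hypothetical.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_indices(n, test_fraction, seed):
    """Shuffle indices with a seeded generator and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_fraction)
    return idx[n_test:], idx[:n_test]


# Synthetic stand-in data (assumption): noisy scores that correlate with the labels.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
scores = [y + rng.gauss(0, 1) for y in labels]

results = []
for seed in range(5):  # five repeated splits, one per random seed
    train, test = split_indices(len(labels), 0.2, seed)
    # A real pipeline would fit the model on `train` here; this sketch only evaluates.
    results.append(auroc([labels[i] for i in test], [scores[i] for i in test]))

mean, std = statistics.mean(results), statistics.stdev(results)
print(f"AUROC: {mean:.3f} ± {std:.3f}")
```

Reporting the spread across seeds, rather than a single split, gives a rough sense of how sensitive a metric is to the particular partition of the data.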