Abstract
Head-and-neck squamous-cell carcinoma (HNSCC) remains a major global health burden and a persistent source of cancer-related morbidity and mortality. Despite progress in imaging, radiotherapy delivery, and systemic therapies, clinical decision-making in HNSCC continues to be challenged by three intertwined factors: (i) the difficulty of accurately delineating tumour boundaries in anatomically complex regions; (ii) the limited ability of conventional staging systems to capture tumour heterogeneity and latent biological risk; and (iii) the uncertainty surrounding treatment-response mechanisms, particularly the extent to which radiotherapy dose escalation causally improves survival for specific patient subgroups. While artificial intelligence (AI) has shown promise in addressing each of these tasks, the current landscape is largely fragmented: segmentation models optimise geometric accuracy in isolation, staging models focus on classification performance without explicit biological grounding, and outcome-prediction frameworks often rely on associational learning that cannot distinguish baseline risk from true treatment benefit. As a result, existing AI systems in oncology rarely provide a coherent, biologically informed, and decision-supportive pipeline capable of linking tumour phenotype, radiobiological exposure, and clinically actionable treatment policies.

This thesis aims to bridge this gap by proposing a unified, Transformer-enabled AI framework that connects three essential components of the HNSCC clinical workflow: tumour segmentation, multimodal staging and phenotyping, and causal dose–response estimation. The central hypothesis of this work is that deep imaging phenotypes derived from CT, when integrated with structured clinical information and radiobiological dose descriptors, can yield not only improved predictive performance, but also mechanistically interpretable and policy-relevant insights regarding personalised radiotherapy.
To test this hypothesis, the proposed framework is evaluated across two publicly available HNSCC cohorts with distinct data regimes and heterogeneity profiles, namely the HEAD-NECK-RADIOMICS-HN1 dataset (HN-1) and TCGA-HNSCC, enabling a rigorous assessment of transportability and cohort-dependent model behaviour.
First, we develop a hybrid segmentation architecture, UNet++-TF, which augments the nested skip-connection design of UNet++ with lightweight Transformer bottlenecks to introduce global contextual modelling while preserving convolutional inductive bias for boundary localisation. This design improves geometric fidelity in tumour contouring and demonstrates superior delineation of complex tumour margins relative to CNN-only baselines, highlighting the value of parameter-efficient Transformer modules for clinically realistic segmentation pipelines.
Second, we introduce a multimodal Transformer-fusion strategy for risk modelling in HNSCC that integrates deep CT-derived representations with clinicopathological variables and radiobiological dose descriptors—most notably the dose-rate-aware biologically effective dose, denoted BEDDD. Beyond improving predictive performance, the fusion model is explicitly analysed under differing cohort conditions. Notably, while multimodal fusion produces limited gains in the smaller, more homogeneous HN-1 cohort, it yields meaningful improvements in calibration and risk stratification within TCGA-HNSCC, particularly for imaging phenotypes characterised by necrosis and heterogeneity. These results highlight an important translational insight: multimodal integration is not universally beneficial and must be evaluated in relation to cohort size, class distribution, and phenotype diversity, especially when deployment is intended for real clinical environments.
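As context for the dose descriptor (the thesis's exact BEDDD definition is not reproduced here), the standard linear-quadratic biologically effective dose for $n$ fractions of dose $d$ per fraction is:

```latex
\mathrm{BED} = n\,d\left(1 + \frac{d}{\alpha/\beta}\right)
```

Dose-rate-aware variants extend this by multiplying the $d/(\alpha/\beta)$ term with a dose-protraction (incomplete-repair) factor $G \in (0, 1]$, so that lower dose rates or longer delivery times reduce the effective biological dose.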
Third, we move beyond association-based prediction by incorporating causal machine learning into the radiotherapy decision pipeline. Using the double-machine-learning paradigm with CausalForest-DML, we estimate both average and heterogeneous treatment effects associated with BEDDD escalation. The causal analyses reveal a modest but consistent survival benefit attributable to higher biologically effective dose delivery, with the strongest gains observed in clinically high-risk subgroups—particularly HPV-negative patients and tumours exhibiting necrosis-rich deep imaging phenotypes. These findings provide a causal interpretation that traditional prognostic models cannot offer and demonstrate the feasibility of imaging-guided dose personalisation supported by heterogeneous treatment effect estimation.
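The core of the double-machine-learning paradigm is Neyman-orthogonal estimation: both outcome and treatment are residualised on confounders before the treatment effect is read off the residuals. The sketch below is a deliberately minimal numpy illustration on synthetic data, with linear nuisance models and no cross-fitting standing in for the random-forest nuisances and heterogeneous-effect machinery of CausalForest-DML; all variable names and the true effect of 0.5 are illustrative, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic confounders (stand-ins for, e.g., stage and HPV status),
# a confounded continuous treatment (think: BED escalation), and an
# outcome whose true treatment effect is 0.5 by construction.
X = rng.normal(size=(n, 3))
T = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=n)
Y = 0.5 * T + X @ np.array([1.0, 0.7, -0.4]) + rng.normal(size=n)

def residualise(Z, target):
    """Residualise `target` on Z with ordinary least squares
    (full DML would use flexible ML models with cross-fitting)."""
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return target - Z @ coef

# Stage 1: partial out confounders from outcome and treatment.
Y_res = residualise(X, Y)
T_res = residualise(X, T)

# Stage 2: regress outcome residuals on treatment residuals -> ATE.
ate = (T_res @ Y_res) / (T_res @ T_res)
print(f"estimated ATE ~ {ate:.2f}")  # close to the true effect 0.5
```

CausalForest-DML follows the same two-stage recipe but fits the second stage with a causal forest, yielding conditional (heterogeneous) effect estimates per patient rather than a single average.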
Overall, this thesis delivers an end-to-end AI framework that unifies segmentation, multimodal phenotyping, and causal dose-response modelling into a coherent decision-support pipeline for HNSCC management. By explicitly linking tumour appearance, biological dose exposure, and outcome benefit, the proposed approach lays a methodological foundation for biologically informed and patient-specific radiotherapy strategies. The framework also offers a clear pathway for future extensions, including multi-omic integration and federated learning deployment to enable broader, privacy-preserving clinical validation across institutions.
| Date of Award | 19 Feb 2026 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Ossama Alshabrawy (Supervisor) & Wai Lok Woo (Supervisor) |
Keywords
- Artificial Intelligence in Medical Imaging
- Transformer-Enhanced Tumour Segmentation
- Causal Machine Learning and Treatment Effect Estimation
- Multimodal Fusion and AJCC Staging
- Radiobiological Dose Modelling (BEDDD)