Background: STEM education providers increasingly use complex intervention models to redress persistent under-representation in STEM sectors. These intervention models require robust evaluation to determine their effectiveness. This study examines a complex, sustained intervention intended to build science capital in young people aged 11–15 over three years, which drew on science capital theory and related research to inform both intervention design and evaluation. When evaluation results differed from those anticipated, process evaluation helped the authors interpret these findings. By outlining the challenges faced in evaluating a complex, sustained STEM outreach intervention, this paper addresses the critique that outreach programmes focus too often on short-term and positive findings.

Results: Intervention outcomes were assessed using a quantitative questionnaire adapted from science capital research, issued to pupils at the intervention’s baseline (2015), midpoint (2017) and endpoint (2019). Adopting a cohort-based model, the 2015 questionnaire collected a baseline for the Year 7 intervention group (children aged 11–12, N = 464) and established baseline comparator groups for Year 9 (children aged 13–14, N = 556) and Year 11 (children aged 15–16, N = 342). The Year 7 intervention group was re-evaluated in 2017 when in Year 9 (N = 556), and in 2019 when in Year 11 (N = 349). Analysis explored differences in science capital between the intervention and comparator groups, identifying lower composite science capital scores and greater proportions of low- and medium-science-capital pupils in the intervention group than in the two comparator groups. A rationale for this emerged from the subsequent process evaluation.

Conclusions: This study’s main contribution is nuanced insight into the evaluation of STEM interventions for use by others evaluating in similar circumstances, particularly those adopting sustained or complex delivery models.
This paper concludes that assessing the effectiveness of complex interventions cannot rely on quantitative evaluation of outcomes alone. Process evaluation can complement quantitative instruments and help intervention teams understand variability and interpret results. While this study highlights the value of science capital in designing intervention models, it also illustrates the inherent challenges of using ‘building science capital’ as an outcome measure, and of quantifying science capital levels over an intervention’s course.