Auditory verbal hallucinations (AVHs) are the experience of hearing voices in the absence of any speaker, often associated with a schizophrenia diagnosis. Prominent cognitive models of AVHs suggest they may be the result of inner speech being misattributed to an external or non-self source, due to atypical self- or reality monitoring. These arguments are supported by studies showing that people experiencing AVHs often show an externalising bias during monitoring tasks, and by neuroimaging evidence that implicates superior temporal brain regions, both during AVHs and during tasks that measure verbal self-monitoring performance. Recently, the efficacy of noninvasive neurostimulation techniques as a treatment option for AVHs has been tested. Meta-analyses show a moderate effect size for reduction in AVH frequency, but there has been little attempt to explain the therapeutic effect of neurostimulation in relation to existing cognitive models. This article reviews inner speech models of AVHs, and argues that a possible explanation for the reduction in frequency following treatment may be modulation of activity in the brain regions involved in the monitoring of inner speech.