Despite state-of-the-art solutions for detecting phishing attacks, detection systems still lack accuracy in online mode, which leaves loopholes in web-based transactions. This research proposes a novel framework that, for the first time, combines a neural network with reinforcement learning to detect phishing attacks in online mode. The proposed model can adapt itself to produce a new phishing email detection system that reflects changes in newly explored behaviours; it does so by adopting the idea of reinforcement learning to enhance the system dynamically over time. The model addresses the problem of a limited dataset by automatically adding more emails to the online dataset in online mode. A novel algorithm is also proposed to explore new phishing behaviours in the new dataset. Through rigorous testing on well-known datasets, we demonstrate that the proposed technique can handle zero-day phishing attacks with high performance, achieving accuracy, true positive rate (TPR), and true negative rate (TNR) of 98.63%, 99.07%, and 98.19%, respectively. In addition, it shows a low false positive rate (FPR) and false negative rate (FNR), at 1.81% and 0.93%, respectively. Comparison with other similar techniques on the same dataset shows that the proposed model outperforms the existing methods.
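
The reported rates are related by the standard confusion-matrix definitions: FPR = 1 - TNR and FNR = 1 - TPR. As a minimal sketch of how these figures fit together, the following computes them from raw counts. The counts are hypothetical, chosen only so that the ratios match the percentages above; they are not taken from the evaluation datasets, and the function name `rates` is illustrative.

```python
def rates(tp, fn, tn, fp):
    """Return standard binary-classification rates as fractions in [0, 1]."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "TPR": tp / (tp + fn),  # phishing emails correctly flagged
        "TNR": tn / (tn + fp),  # legitimate emails correctly passed
        "FPR": fp / (fp + tn),  # legitimate emails wrongly flagged
        "FNR": fn / (fn + tp),  # phishing emails missed
    }


# Hypothetical counts (assumption): 10,000 phishing and 10,000 legitimate emails.
m = rates(tp=9907, fn=93, tn=9819, fp=181)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

With equal class sizes, as assumed here, accuracy is the mean of TPR and TNR, which agrees with the reported 98.63%.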