The use of bot malware and botnets as a tool to facilitate other malicious cyber activities (e.g. distributed denial of service attacks, dissemination of malware and spam, and click fraud). However, detection of botnets, particularly peer-to-peer (P2P) botnets, is challenging. Hence, in this paper we propose a sophisticated traffic reduction mechanism, integrated with a reinforcement learning technique. We then evaluate the proposed approach using real-world network traffic, and achieve a detection rate of 98.3%. The approach also achieves a relatively low false positive rate (i.e. 0.012%).