QP-SNN: Quantized and Pruned Spiking Neural Networks

Wenjie Wei, Zijian Zhou, Ammar Belatreche, Yimeng Shan, Yu Liang, Honglin Cao, Jieyuan Zhang, Malu Zhang, Yang Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Brain-inspired Spiking Neural Networks (SNNs) leverage sparse spikes to encode information and operate in an asynchronous event-driven manner, offering a highly energy-efficient paradigm for machine intelligence. However, the current SNN community focuses primarily on performance improvement by developing large-scale models, which limits the applicability of SNNs on resource-limited edge devices. In this paper, we propose a hardware-friendly and lightweight SNN, aimed at effectively deploying high-performance SNNs in resource-limited scenarios. Specifically, we first develop a baseline model that integrates uniform quantization and structured pruning, called the QP-SNN baseline. While this baseline significantly reduces storage demands and computational costs, it suffers from a decline in performance. To address this, we conduct an in-depth analysis of the challenges in quantization and pruning that lead to performance degradation and propose solutions to enhance the baseline's performance. For weight quantization, we propose a weight rescaling strategy that utilizes the bit width more effectively to enhance the model's representation capability. For structured pruning, we propose a novel pruning criterion using the singular value of spatiotemporal spike activities to enable more accurate removal of redundant kernels. Extensive experiments demonstrate that integrating the two proposed methods into the baseline allows QP-SNN to achieve state-of-the-art performance and efficiency, underscoring its potential for enhancing SNN deployment in edge intelligence computing.
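The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the max-absolute-value rescaling, the use of the sum of singular values as a channel score, and every function name below (`uniform_quantize`, `rescaled_quantize`, `singular_value_scores`, `channels_to_keep`) are assumptions made purely for illustration.

```python
import numpy as np


def uniform_quantize(w, bits=4):
    """Symmetric uniform quantization onto a fixed grid over [-1, 1]."""
    levels = 2 ** (bits - 1) - 1            # e.g. 7 positive levels for 4 bits
    w_clipped = np.clip(w, -1.0, 1.0)
    return np.round(w_clipped * levels) / levels


def rescaled_quantize(w, bits=4):
    """Assumed rescaling: stretch the weights to fill [-1, 1] before
    quantizing, then map them back, so that more grid levels are used."""
    scale = np.max(np.abs(w))
    if scale == 0:
        return np.zeros_like(w)
    return uniform_quantize(w / scale, bits) * scale


def singular_value_scores(spikes):
    """Score each channel of a spike tensor of shape (T, C, H, W).

    Assumed criterion: the sum of singular values of the (T, H*W)
    spatiotemporal spike matrix of each channel.
    """
    T, C, H, W = spikes.shape
    scores = np.empty(C)
    for c in range(C):
        mat = spikes[:, c].reshape(T, H * W).astype(np.float64)
        scores[c] = np.linalg.svd(mat, compute_uv=False).sum()
    return scores


def channels_to_keep(scores, n_prune):
    """Return the sorted indices of channels left after dropping the
    n_prune lowest-scoring ones."""
    order = np.argsort(scores)              # ascending: weakest channels first
    return np.sort(order[n_prune:])
```

In this sketch, weights concentrated in a narrow range collapse onto only a few grid levels under plain quantization, whereas the rescaled variant spreads them over the full grid. On the pruning side, a channel that never fires scores zero and is removed first.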
Original language: English
Title of host publication: The 13th International Conference on Learning Representations
Publisher: International Conference on Learning Representations (ICLR)
Number of pages: 25
Publication status: Accepted/In press - 22 Jan 2025
Event: The 13th International Conference on Learning Representations - Singapore EXPO, Singapore, Singapore
Duration: 24 Apr 2025 → 28 Apr 2025
Conference number: 13th
https://iclr.cc/Conferences/2025

Conference

Conference: The 13th International Conference on Learning Representations
Abbreviated title: ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 24/04/25 → 28/04/25

Keywords

  • Spiking Neural Networks
  • Neuromorphic Computing
  • Spiking Pruning
  • Spiking Quantization
