Spike-Driven Lightweight Large Language Model With Evolutionary Computation

Malu Zhang, Wenjie Wei, Zijian Zhou, Wanlong Liu, Jie Zhang*, Ammar Belatreche, Yang Yang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, but their deployment in resource-constrained environments remains challenging due to substantial memory and computational requirements. Benefiting from the sparse, event-driven computation paradigm of Spiking Neural Networks (SNNs), some research has focused on designing spike-based language models. However, existing spike-based language models achieve only partial computational efficiency gains and fail to address memory constraints comprehensively. In this paper, we propose an evolved and quantized spike-driven language model (EQ-SpikeLM) to address these challenges. The model incorporates two primary innovations. First, inspired by the artificial bee colony algorithm from evolutionary computation, we propose an architecture evolution method, ABC-Arc, which optimizes network topology by systematically removing redundant neural pathways. Second, a dynamic post-training quantization (DynPTQ) strategy is developed for the evolved SpikeLM, converting floating-point parameters to lower-bit precision without requiring model retraining. By combining these two methods, EQ-SpikeLM significantly reduces storage and computational demands while preserving model performance. Experimental evaluation on the GLUE benchmark demonstrates that EQ-SpikeLM maintains performance equivalent to its uncompressed counterpart while substantially reducing both model size and power consumption. These results position EQ-SpikeLM as a viable approach for deploying large language models in resource-constrained edge computing scenarios.
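To make the quantization idea concrete, below is a minimal sketch of post-training quantization applied to a weight tensor. It uses a simple symmetric, per-tensor min-max scheme; the paper's DynPTQ strategy is not specified here, so the scale selection, bit-width, and function names are illustrative assumptions, not the authors' method.

```python
import numpy as np

def quantize(weights: np.ndarray, bits: int = 8):
    """Map float weights to signed integers without retraining (symmetric PTQ sketch)."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for 8-bit
    scale = float(np.max(np.abs(weights))) / qmax   # per-tensor scale from the largest magnitude
    if scale == 0.0:                                # guard against an all-zero tensor
        scale = 1.0
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for inference."""
    return q.astype(np.float32) * scale

w = np.array([0.50, -1.27, 0.003, 1.27], dtype=np.float32)
q, s = quantize(w, bits=8)
w_hat = dequantize(q, s)
# Per-element reconstruction error is bounded by half a quantization step (scale / 2).
assert np.max(np.abs(w - w_hat)) <= s / 2 + 1e-6
```

Storing `q` (int8) instead of `w` (float32) cuts weight memory roughly 4x, which is the storage-reduction mechanism the abstract alludes to; a dynamic scheme would additionally adapt the scale or bit-width per layer.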

Original language: English
Number of pages: 14
Journal: IEEE Transactions on Evolutionary Computation
Early online date: 5 Sept 2025
DOIs
Publication status: E-pub ahead of print - 5 Sept 2025

Keywords

  • Evolutionary computation
  • Large language model
  • Model compression
  • Spiking neural networks
