Abstract
The combination of Spiking Neural Networks (SNNs) and Vision Transformers (ViTs) holds potential for achieving both energy efficiency and high performance, particularly suitable for edge vision applications. However, a significant performance gap still exists between SNN-based ViTs and their ANN counterparts. Here, we first analyze why SNN-based ViTs suffer from limited performance and identify a mismatch between the vanilla self-attention mechanism and spatio-temporal spike trains. This mismatch results in degraded spatial relevance and limited temporal interactions. To address these issues, we draw inspiration from biological saccadic attention mechanisms and introduce an innovative Saccadic Spike Self-Attention (SSSA) method. Specifically, in the spatial domain, SSSA employs a novel spike distribution-based method to effectively assess the relevance between Query and Key pairs in SNN-based ViTs. Temporally, SSSA employs a saccadic interaction module that dynamically focuses on selected visual areas at each timestep and significantly enhances whole scene understanding through temporal interactions. Building on the SSSA mechanism, we develop an SNN-based Vision Transformer (SNN-ViT). Extensive experiments across various visual tasks demonstrate that SNN-ViT achieves state-of-the-art performance with linear computational complexity. The effectiveness and efficiency of the SNN-ViT highlight its potential for power-critical edge vision applications.
Original language | English |
---|---|
Title of host publication | The 13th International Conference on Learning Representations |
Publisher | International Conference on Learning Representations (ICLR) |
Number of pages | 22 |
Publication status | Accepted/In press - 22 Jan 2025 |
Event | The 13th International Conference on Learning Representations - Singapore EXPO, Singapore, Singapore; Duration: 24 Apr 2025 → 28 Apr 2025; Conference number: 13th; https://iclr.cc/Conferences/2025 |
Conference
Conference | The 13th International Conference on Learning Representations |
---|---|
Abbreviated title | ICLR 2025 |
Country/Territory | Singapore |
City | Singapore |
Period | 24/04/25 → 28/04/25 |
Internet address | https://iclr.cc/Conferences/2025 |