Abstract
Multiview clustering (MVC) aims to reveal the underlying structure of multiview data by categorizing data samples into clusters. Deep learning-based methods exhibit strong feature learning capabilities on large-scale datasets. For most existing deep MVC methods, exploring the invariant representations of multiple views is still an intractable problem. In this paper, we propose a cross-view contrastive learning (CVCL) method that learns view-invariant representations and produces clustering results by contrasting the cluster assignments among multiple views. Specifically, we first employ deep autoencoders to extract view-dependent features in the pretraining stage. Then, a cluster-level CVCL strategy is presented to explore consistent semantic label information among the multiple views in the fine-tuning stage. Thus, the proposed CVCL method is able to produce more discriminative cluster assignments by virtue of this learning strategy. Moreover, we provide a theoretical analysis of soft cluster assignment alignment. The extensive experimental results obtained on several datasets demonstrate that the proposed CVCL method outperforms several state-of-the-art approaches.
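The idea of contrasting cluster assignments across views can be illustrated with a minimal sketch. It is not the authors' implementation, and the loss in the paper may differ in form and include additional terms. The sketch assumes each view yields an (N, K) soft assignment matrix, treats each column as a cluster-level representation, and applies an InfoNCE-style objective in which the same cluster in the other view is the positive pair. The names `softmax`, `cluster_contrastive_loss`, and `temperature` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cluster_contrastive_loss(p_a, p_b, temperature=0.5):
    """InfoNCE-style loss over the cluster columns of two views.

    p_a, p_b: (N, K) soft cluster assignments for the same N samples.
    Assumption for this sketch: column k of p_a and column k of p_b
    form the positive pair; every other column is a negative.
    """
    k = p_a.shape[1]
    # Stack cluster columns from both views: shape (2K, N).
    cols = np.concatenate([p_a.T, p_b.T], axis=0)
    # L2-normalize so that dot products are cosine similarities.
    cols = cols / np.linalg.norm(cols, axis=1, keepdims=True)
    sim = cols @ cols.T / temperature
    # A column must not be contrasted with itself.
    np.fill_diagonal(sim, -np.inf)
    # Row i (view a, cluster i) pairs with row i + K (view b), and vice versa.
    pos = np.concatenate([np.arange(k, 2 * k), np.arange(0, k)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * k), pos].mean()


# Usage: two views whose assignments agree up to small noise.
rng = np.random.default_rng(0)
logits = 3.0 * rng.normal(size=(64, 4))
view_a = softmax(logits)
view_b = softmax(logits + 0.1 * rng.normal(size=(64, 4)))
loss = cluster_contrastive_loss(view_a, view_b)
```

Under these assumptions the loss is lower when the cluster columns of the two views are aligned than when they are permuted, which is the behavior a cross-view alignment objective is meant to encourage.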
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV) |
| Place of Publication | Piscataway |
| Publisher | IEEE |
| Pages | 16752-16761 |
| Number of pages | 10 |
| ISBN (Electronic) | 9798350307184 |
| ISBN (Print) | 9798350307191 |
| DOIs | |
| Publication status | Published - 15 Jan 2024 |
| Event | 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris Convention Centre, Paris, France. Duration: 2 Oct 2023 → 6 Oct 2023. https://iccv2023.thecvf.com/ |
Conference
| Conference | 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 |
|---|---|
| Abbreviated title | ICCV 2023 |
| Country/Territory | France |
| City | Paris |
| Period | 2/10/23 → 6/10/23 |
| Internet address | |
Keywords
- representation learning
- computer vision
- computational modeling
- semantics
- self-supervised learning
- feature extraction