TY - JOUR
T1 - Semantic combined network for zero-shot scene parsing
AU - Wang, Yinduo
AU - Zhang, Haofeng
AU - Wang, Shidong
AU - Long, Yang
AU - Yang, Longzhi
PY - 2020/3/27
Y1 - 2020/3/27
N2 - Recently, image-based scene parsing has attracted increasing attention due to its wide applicability. However, conventional models are valid only on images from the same domain as the training set and are typically trained using discrete and meaningless labels. Inspired by traditional zero-shot learning methods, which employ auxiliary side information to bridge the source and target domains, the authors propose a novel framework called the semantic combined network (SCN), which aims to learn a scene parsing model only from images of the seen classes while targeting the unseen ones. In addition, with the assistance of semantic embeddings of classes, the proposed SCN can further improve the performance of traditional fully supervised scene parsing methods. Extensive experiments are conducted on the Cityscapes data set, and the results show that the proposed SCN performs well in both the zero-shot scene parsing (ZSSP) and generalised ZSSP settings when built on several state-of-the-art scene parsing architectures. Furthermore, the authors test the proposed model under the traditional fully supervised setting, and the results show that the proposed SCN can also significantly improve the performance of the original network models.
AB - Recently, image-based scene parsing has attracted increasing attention due to its wide applicability. However, conventional models are valid only on images from the same domain as the training set and are typically trained using discrete and meaningless labels. Inspired by traditional zero-shot learning methods, which employ auxiliary side information to bridge the source and target domains, the authors propose a novel framework called the semantic combined network (SCN), which aims to learn a scene parsing model only from images of the seen classes while targeting the unseen ones. In addition, with the assistance of semantic embeddings of classes, the proposed SCN can further improve the performance of traditional fully supervised scene parsing methods. Extensive experiments are conducted on the Cityscapes data set, and the results show that the proposed SCN performs well in both the zero-shot scene parsing (ZSSP) and generalised ZSSP settings when built on several state-of-the-art scene parsing architectures. Furthermore, the authors test the proposed model under the traditional fully supervised setting, and the results show that the proposed SCN can also significantly improve the performance of the original network models.
KW - learning (artificial intelligence)
KW - natural language processing
KW - object recognition
KW - object detection
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85082011912&partnerID=8YFLogxK
U2 - 10.1049/iet-ipr.2019.0870
DO - 10.1049/iet-ipr.2019.0870
M3 - Article
AN - SCOPUS:85082011912
VL - 14
SP - 757
EP - 765
JO - IET Image Processing
JF - IET Image Processing
SN - 1751-9659
IS - 4
ER -