TY - GEN
T1 - Exploiting Spatial Attention and Contextual Information for Document Image Segmentation
AU - Sang, Yuman
AU - Zeng, Yifeng
AU - Liu, Ruiying
AU - Yang, Fan
AU - Yao, Zhangrui
AU - Pan, Yinghui
N1 - Funding Information: Supported by the NSF in China (NSF: 62176225 and 61836005).
PY - 2022
Y1 - 2022
AB - We propose a new framework of combining an attention mechanism with a conditional random field to deal with a document image segmentation task. The framework aims to recognize homogeneous regions, e.g. text, figures, or tables, in document images through a pixel-wise spatial attention module. The attention module obtains essential global information and gathers long-distance pixel dependencies. To get extra knowledge around images, we use a conditional random field to model contextual information in the document. The new framework enables an effective combination of pixel features with their contextual information in the document image segmentation task. We conduct extensive experiments over multiple challenging datasets and demonstrate the performance of our new framework in comparison to a series of state-of-the-art segmentation methods.
KW - Conditional random field
KW - Document image segmentation
KW - Spatial attention mechanism
UR - http://www.scopus.com/inward/record.url?scp=85130236179&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-05981-0_21
DO - 10.1007/978-3-031-05981-0_21
M3 - Conference contribution
AN - SCOPUS:85130236179
SN - 9783031059803
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 261
EP - 274
BT - Advances in Knowledge Discovery and Data Mining - 26th Pacific-Asia Conference, PAKDD 2022, Proceedings
A2 - Gama, João
A2 - Li, Tianrui
A2 - Yu, Yang
A2 - Chen, Enhong
A2 - Zheng, Yu
A2 - Teng, Fei
PB - Springer
T2 - 26th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2022
Y2 - 16 May 2022 through 19 May 2022
ER -