Exploiting Spatial Attention and Contextual Information for Document Image Segmentation

Yuman Sang, Yifeng Zeng*, Ruiying Liu, Fan Yang, Zhangrui Yao, Yinghui Pan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Citation (Scopus)

Abstract

We propose a new framework that combines an attention mechanism with a conditional random field (CRF) for document image segmentation. The framework aims to recognize homogeneous regions, e.g. text, figures, or tables, in document images through a pixel-wise spatial attention module. The attention module captures essential global information and long-distance pixel dependencies. To bring in knowledge beyond the individual pixel features, we use the CRF to model contextual information in the document. The framework thus combines pixel features with their contextual information for the segmentation task. We conduct extensive experiments on multiple challenging datasets and compare the framework's performance against a series of state-of-the-art segmentation methods.
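The idea of a pixel-wise spatial attention module gathering long-distance pixel dependencies can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product self-attention over a flattened image, with no learned projections, and the function name `spatial_self_attention` and the toy input are hypothetical.

```python
import math


def spatial_self_attention(pixels):
    """Toy spatial self-attention (an illustrative assumption, not the paper's module).

    pixels: list of N feature vectors, one per pixel (image flattened row-major).
    Returns (outputs, weights): each output vector is a weighted average of ALL
    pixel features, so distant pixels can influence one another.
    """
    dim = len(pixels[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in pixels:
        # Similarity of this pixel to every pixel in the image.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in pixels]
        # Softmax with max-subtraction for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Aggregate features from the whole image using the attention weights.
        outputs.append(
            [sum(wt * p[d] for wt, p in zip(row, pixels)) for d in range(dim)]
        )
    return outputs, weights


# A 2x2 "image" with 2 feature channels, flattened to 4 pixels.
pixels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
outputs, weights = spatial_self_attention(pixels)
```

Each row of `weights` sums to 1 and spans every pixel position, which is what lets the module draw on global information instead of a local neighborhood. A practical module would typically add learned query, key, and value projections and operate on convolutional feature maps.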

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 26th Pacific-Asia Conference, PAKDD 2022, Proceedings
Editors: João Gama, Tianrui Li, Yang Yu, Enhong Chen, Yu Zheng, Fei Teng
Publisher: Springer
Pages: 261-274
Number of pages: 14
ISBN (Print): 9783031059803
Publication status: Published - 2022
Event: 26th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2022 - Chengdu, China
Duration: 16 May 2022 → 19 May 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13282 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2022
Country/Territory: China
City: Chengdu
Period: 16/05/22 → 19/05/22
