A Connected Components Based Layout Analysis Approach for Educational Documents

Ruiying Liu, Shenbao Yu, Fan Yang, Yinghui Pan, Yifeng Zeng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Layout analysis, which aims to detect and categorize areas of interest in document images, is an increasingly important part of document image processing. Existing research has addressed layout analysis for various document types, but no approach has been proposed for documents produced in teaching, i.e. exam papers and workbooks, which are worth studying. In this paper, we propose a novel layout analysis system that performs two tasks, one for workbook pages and one for exam papers. For workbook pages, we segment text and non-text areas. For exam papers, we extract regions of interest. Our system is based on connected component (CC) analysis: it extracts geometric features and spatial information of CCs to recognize page elements. We carried out experiments on images collected from real-world scenarios, and the promising results confirm the applicability and effectiveness of our system.
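
As a rough illustration of what CC-based feature extraction can look like, the sketch below labels 8-connected foreground pixels in a small binary image and computes a few geometric features per component. It is not the authors' system: the function name `connected_components`, the choice of 8-connectivity, and the feature set (bounding box, area, aspect ratio, density) are assumptions made only for this example.

```python
from collections import deque


def connected_components(grid):
    """Label 8-connected foreground (1) pixels of a binary grid.

    Returns one dict of geometric features per component, in raster-scan
    order of each component's first pixel. Hypothetical helper for
    illustration only.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            height = bottom - top + 1
            width = right - left + 1
            components.append({
                "bbox": (top, left, bottom, right),   # inclusive pixel coordinates
                "area": area,                         # number of foreground pixels
                "aspect_ratio": width / height,
                "density": area / (width * height),   # fill ratio of the bounding box
            })
    return components


# Toy binary "page": a 2x2 blob and a 3-pixel vertical stroke.
page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
components = connected_components(page)
```

Features of this kind, together with the relative positions of the bounding boxes, are the sort of input a rule-based or learned classifier could use to separate text from non-text components. The features and rules used in the paper itself are not stated in this record.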
Original language: English
Title of host publication: ICCSE 2021
Subtitle of host publication: The 16th International Conference on Computer Science and Education
Place of Publication: Piscataway
Publisher: IEEE
Number of pages: 6
Publication status: Accepted/In press - 25 May 2021
Event: ICCSE 2021: The 16th International Conference on Computer Science and Education - Lancaster University, Lancaster, United Kingdom
Duration: 17 Aug 2021 → 21 Aug 2021
http://www.ieee-iccse.org/?_v=1625228873586

Conference

Conference: ICCSE 2021
Country: United Kingdom
City: Lancaster
Period: 17/08/21 → 21/08/21