KERTAS: dataset for automatic dating of ancient Arabic manuscripts

Kalthoum Adam, Asim Baig, Somaya Al-Maadeed*, Ahmed Bouridane, Sherine El-Menshawy

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The age of a historical manuscript can be an invaluable source of information for paleographers and historians. Automatic manuscript age detection has inherent complexities, which are compounded by the lack of suitable datasets for algorithm testing; in particular, existing datasets rarely provide reliable writing dates and author identities as metadata. This paper presents a dataset of historical handwritten Arabic manuscripts designed specifically to test state-of-the-art authorship and age detection algorithms. Qatar National Library is the main source of manuscripts for this dataset, while the remaining manuscripts are open source. The dataset consists of over 2000 images taken from various handwritten Arabic manuscripts spanning fourteen centuries. In addition, a sparse representation-based approach for dating historical Arabic manuscripts is proposed. KERTAS is a new dataset of historical documents that can help researchers, historians and paleographers to automatically date Arabic manuscripts more accurately and efficiently.

Original language: English
Pages (from-to): 283-290
Number of pages: 8
Journal: International Journal on Document Analysis and Recognition
Volume: 21
Issue number: 4
Early online date: 8 Sept 2018
Publication status: Published - 1 Dec 2018

Keywords

  • Historical documents dataset
  • Image processing
  • Classification
  • Feature extraction
