Abstract
Off-line writer identification is the process of matching a handwritten sample with its author. Manual identification is very time-consuming because it requires a meticulous comparison of character shape details. Consequently, the automation of writer identification has become an important area of research interest. The codebook (or bag of features) approach is a state-of-the-art computerized technique for writer identification. One way to achieve a high identification rate is to expose the personalized set of character shapes, or allographs, that a writer has adopted over the years. The main problem associated with this approach is the extremely large number of points of interest that are generated. In this paper we extend the basic model to include an ensemble of codebooks. Additionally, kernel discriminant analysis using spectral regression (SR-KDA) is used as a dimensionality reduction technique in order to avoid over-fitting. Fusion of multiple codebooks is shown to increase the identification rate by 11% compared with a single codebook approach.
Original language | English |
---|---|
Pages (from-to) | 18-25 |
Journal | Pattern Recognition Letters |
Volume | 59 |
Early online date | 17 Mar 2015 |
Publication status | Published - 1 Jul 2015 |
Keywords
- Forensic document examination
- Kernel discriminant analysis
- Grapheme features