Paper

A Proposed OCR Algorithm for the Recognition of Handwritten Arabic Characters


Authors:
Abdelhay A. Sallam; Mohammed R. Elbasyouni; Cheng Y. Suen; Ahmed T. Sahlol
Abstract
Handwritten Arabic text still lacks accurate recognition solutions. A good handwritten Arabic recognition system faces many difficulties, such as the unlimited variation in human handwriting, similarities between distinct character shapes, character overlaps, interconnections between neighbouring characters, and the position of characters in the word. Arabic characters are written in four forms: Isolated, Initial, Medial, and Final. Typical Optical Character Recognition (OCR) systems are based mainly on three stages: pre-processing, feature extraction, and recognition. Each stage has its own problems and its own effect on system efficiency: it may be time-consuming, resource-intensive, and a source of recognition errors. There are many feature extraction methods for handwritten letters. In this paper, an efficient approach for the recognition of off-line Arabic handwritten characters is presented. The approach is based on novel preprocessing operations (including different kinds of noise removal and dilation) and on structural, statistical, and topological features extracted from the main body of the character and from its secondary components. The importance and accuracy of the selected features are evaluated. An off-line recognition system based on the selected features was built, and it was trained and tested with the CENPARMI dataset. We used the popular Feed Forward Neural Network for classification to enhance recognition accuracy. The proposed algorithm obtained promising results in terms of accuracy (a success rate of 100% for some letters, with an average rate of 88%). Compared with related work, our recognition rates are higher.
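The abstract names the preprocessing and feature stages without defining them, so the following is only a minimal sketch of what such steps can look like, not the authors' method. It assumes a binary character image stored as a list of rows (1 = ink, 0 = background). The function names `remove_isolated_pixels`, `dilate`, and `zone_densities` are hypothetical, and the zoning density feature is just one common example of a statistical feature, chosen here for illustration.

```python
from typing import List

Image = List[List[int]]  # assumed representation: 1 = ink, 0 = background


def remove_isolated_pixels(img: Image) -> Image:
    """Noise removal sketch: drop ink pixels with no ink in their 8-neighbourhood."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1:
                neighbours = sum(
                    img[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                    if (ny, nx) != (y, x)
                )
                if neighbours == 0:
                    out[y][x] = 0
    return out


def dilate(img: Image) -> Image:
    """Binary dilation with a 3x3 square structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel becomes ink if any pixel in its 3x3 window is ink.
            out[y][x] = int(any(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ))
    return out


def zone_densities(img: Image, rows: int = 2, cols: int = 2) -> List[float]:
    """Statistical feature sketch: fraction of ink pixels per cell of a grid."""
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            area = (y1 - y0) * (x1 - x0)
            ink = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
            feats.append(ink / area if area else 0.0)
    return feats


# Usage: one stray pixel (top-left) plus a 2x2 stroke fragment (bottom-right).
sample = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
cleaned = remove_isolated_pixels(sample)
thickened = dilate(cleaned)
features = zone_densities(cleaned)
```

In a pipeline of the kind the abstract describes, a feature vector such as `features` would be passed to the classifier; the classifier itself is omitted because the abstract gives no network details.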
Keywords
Handwritten Arabic Characters; Noise Removal; Secondary Components
StartPage
8
EndPage
22