Towards Robust Recognition of Handwritten Arabic Characters with Diacritics Using an Incremental Learning Approach Based on CNNs
Abstract
Handwritten Arabic text recognition (HATR) presents unique challenges due to complex character shapes, contextual variations, cursive connections, and the presence of diacritical marks. This study introduces AHAD (Arabic Handwritten Alphabet with Diacritics), a novel benchmark dataset of 71,061 handwritten Arabic character images annotated with five primary vowel diacritics: Fathah, Kasrah, Dammah, Shaddah, and Sukoon. The dataset covers 492 distinct classes that combine character identity, contextual form, and diacritic. Leveraging this dataset, we propose an incremental learning framework based on Convolutional Neural Networks (CNNs) for fine-grained recognition of handwritten Arabic characters together with their corresponding diacritics. The model was first trained on a 114-class dataset of non-diacritic handwritten Arabic characters in all contextual forms and then fine-tuned in two phases on the AHAD dataset. The two-phase strategy combines output layer expansion, learning rate adjustment, and gradual unfreezing of deeper layers to enhance knowledge retention and prevent catastrophic forgetting. The proposed method achieved a validation accuracy of 92.96% and a test accuracy of 93.26%. Our findings demonstrate the effectiveness of incremental learning for diacritic-aware Arabic handwriting recognition and establish AHAD as a strong baseline for future research in this field.
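The two-phase strategy described in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation. It assumes that output layer expansion keeps the 114 base-class rows and appends freshly initialised rows up to 492 classes, which the abstract does not confirm. The layer names, the learning rates, and the helper functions `expand_output_layer` and `trainable_layers` are hypothetical placeholders chosen for illustration.

```python
import random

BASE_CLASSES = 114   # non-diacritic classes named in the abstract
AHAD_CLASSES = 492   # diacritic-aware classes named in the abstract


def expand_output_layer(weights, biases, new_num_classes, seed=0, scale=0.01):
    """Return a larger output layer that keeps the old class rows unchanged.

    weights: list of rows, one row of feature weights per existing class.
    biases:  list of floats, one per existing class.
    """
    old_num_classes = len(weights)
    if new_num_classes < old_num_classes:
        raise ValueError("expansion cannot drop existing classes")
    num_features = len(weights[0])
    rng = random.Random(seed)
    # Copy the old rows so the learned base-class weights are retained.
    new_weights = [list(row) for row in weights]
    new_biases = list(biases)
    # Append small random rows for the classes the head has not seen yet.
    for _ in range(new_num_classes - old_num_classes):
        new_weights.append([rng.gauss(0.0, scale) for _ in range(num_features)])
        new_biases.append(0.0)
    return new_weights, new_biases


# Hypothetical two-phase schedule: layer names and learning rates are
# placeholders, not values reported by the authors.
LAYERS = ["conv1", "conv2", "conv3", "dense", "output"]
PHASES = [
    {"name": "phase1", "trainable_from": "dense", "learning_rate": 1e-3},
    {"name": "phase2", "trainable_from": "conv2", "learning_rate": 1e-4},
]


def trainable_layers(phase):
    """List the layers updated in a phase; earlier layers stay frozen."""
    start = LAYERS.index(phase["trainable_from"])
    return LAYERS[start:]


if __name__ == "__main__":
    features = 8  # toy feature width for the demo
    old_w = [[0.5] * features for _ in range(BASE_CLASSES)]
    old_b = [0.1] * BASE_CLASSES
    new_w, new_b = expand_output_layer(old_w, old_b, AHAD_CLASSES)
    print(len(new_w), len(new_b))
    for phase in PHASES:
        print(phase["name"], phase["learning_rate"], trainable_layers(phase))
```

Copying the existing rows instead of re-initialising the whole head is one common way to retain what the base model has learned, and lowering the learning rate when more layers are unfrozen limits how far the earlier features drift. Both choices are shown here only as plausible readings of the abstract.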
Copyright (c) 2025 EMITTER International Journal of Engineering Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
