Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

A novel Arabic handwriting recognition system based on image matching technique

1st Maamar Kef
Department of Computer Sciences
Université Mostefa Benboulaid - Batna 2
Batna, Algeria
lm kef@yahoo.fr

2nd Leila Chergui
Department of Computer Sciences
Université Mostefa Benboulaid - Batna 2
Batna, Algeria
pgleila@yahoo.fr

Abstract: This paper presents a new off-line recognition system for Arabic handwritten words. The proposed system uses a scale-invariant descriptor, namely SIFT, and relies on an image matching technique for classification. Recognition is performed through a keypoint matching procedure that uses a nearest-neighbor distance ratio. The paper also presents a new large Arabic handwritten word database, which provides a new framework for benchmarking and is freely available. Several tests have been performed using our new database and the well-known IFN/ENIT database for comparison purposes. A high correct recognition rate was reported.

Index Terms: Arabic handwriting recognition, Features extraction, SIFT descriptors, Keypoints matching, New Arabic database.

Automatic recognition of handwritten scripts is an area of pattern recognition that is extremely useful in numerous fields, including document analysis, mailing address interpretation, bank check processing and, more recently, the reconstruction and recognition of historical manuscripts.

Recognition of Arabic handwriting remains one of the most challenging problems in the pattern recognition domain. Arabic is written by more than 240 million people in over 20 different countries. The standard Arabic script contains 28 letters. Each letter has either two or four different shapes, depending on its position within a word.

One of the most challenging aspects of off-line handwriting recognition is finding a good database that represents the variety of handwriting styles well. Compared with the great number of existing databases for English script, the IFN/ENIT database [1] was the only freely accessible Arabic database; this prompted us to develop a new large database which will be freely available for research and academic use.

In this research we present a new fast and robust Arabic handwriting recognition system based on the SIFT descriptor and a recognition procedure that uses keypoints matching. Contrary to the majority of handwritten character recognition systems, the proposed method operates without any preprocessing steps, since the features used are invariant to image transformations and are highly distinctive in a large database. We also introduce a new large database of Arabic handwritten words, which provides a comparison tool for research in the character recognition domain.

The remainder of this paper is divided into six sections. The next section summarizes several works in the handwritten Arabic recognition field. Section 3 details the feature extraction method and section 4 describes our new Arabic handwritten words database. Experimental results, including keypoints detection and matching, are reported in section 5, where a comparative analysis of the experimental results is also discussed. Finally, some concluding remarks end the paper.

I. RELATED WORKS

The main idea of the scale-invariant feature transform (SIFT) descriptor [6] is to detect distinctive invariant features in images that can later be used to perform reliable matching between different views of an object or scene. Because of the proven efficiency of the SIFT keypoint detector, a large number of researchers have been drawn to extending these descriptors or using them in many applications. In the handwriting recognition domain, SIFT has been addressed in only a few published papers.

Diem and Sablatnig [5] tried to solve the problem of recognizing degraded handwritten characters using SIFT descriptors. In order to recognize a character, the local descriptors are initially classified with a Support Vector Machine (SVM) and then identified by a voting scheme of neighboring local descriptors.

De Campos et al. [4] presented a solution to the problem of recognizing characters in images of natural scenes. Such situations could not be handled well by traditional OCR (Optical Character Recognition) techniques. The problem is addressed in an object categorization framework based on a bag-of-visual-words representation. For feature extraction, the authors used SIFT and other descriptors.

Zhang et al. [13] proposed a novel SIFT-based feature for off-line handwritten Chinese character recognition. The presented feature is a modification of the SIFT descriptor that takes into account the characteristics of handwritten Chinese samples. An MQDF classifier was used in the classification phase, and the authors showed that the proposed method outperforms the original SIFT feature and two traditional features, the Gabor feature and the gradient feature.

In [11] a new method for the off-line recognition of Tamil handwritten characters based on local feature extraction was investigated. The authors represented each character by a set of local SIFT feature vectors.

The problem of character type classification in a document image was addressed in [12]. In that work, the authors proposed a method based on a probabilistic topic model and the SIFT descriptor. The character types are: mathematical formula, printed Japanese, printed English and handwritten English.

Ramana et al. [9] examined the issues in recognizing Devanagari characters in the wild, such as on sign boards, advertisements, logos, shop names, notices, and address posts. They used a variation of SIFT, namely Dense SIFT features. These are derived by densely sampling keypoints from the character and extracting SIFT descriptors around them.

Mao et al.
[8] incorporated SIFT descriptors in the Chinese calligraphy word style recognition domain (seal script, clerical script, standard script, semi-cursive script and cursive script). In this study, the authors proposed a method based on K-Nearest Neighbors (KNN) and feature vector filtering. Experiments show that the SIFT feature gives better recognition results than the Gabor and GIST features.

For Arabic handwriting recognition, we found only one work which uses SIFT as a descriptor, introduced by Rothacker et al. [10]. They applied the Harris detector to extract corners and, for each corner, they detected keypoints using SIFT descriptors; they also used a segmentation phase with a set of Hidden Markov Models.

Aouadi and Kacem Echi [14] presented a new method for Arabic handwritten word recognition. The authors extracted some structural features from word images and trained a classic right-left Hidden Markov Model. Experiments were carried out on a set of ancient Arabic manuscripts and the IFN/ENIT standard database. An average recognition rate of 87% was reported.

Rabi et al. [15] presented a recognition system for Arabic cursive handwriting using embedded training based on hidden Markov models. The extracted features were based on the densities of foreground pixels, and on concavity and derivative features computed using a sliding window; some of these features depend on baseline estimation. The system achieved 87.93% correct recognition.

II. SIFT DESCRIPTOR

SIFT was developed by David Lowe in 2004 [7] as a continuation of his previous work on invariant feature detection [6]. It presents a method for detecting distinctive invariant features in images that can later be used to perform reliable matching between different views of an object or scene. This approach consists of four major computational stages (figure 1).

Fig. 1. SIFT features detection algorithm.

These stages are executed in descending order (cascade approach), and at every stage a filtering process is applied so that only the keypoints that are robust enough are allowed to pass to the next stage. According to Lowe, this significantly reduces the cost of detecting the features. The descriptor is formed from a vector containing the values of all the orientation histogram entries.

For image matching and recognition, SIFT features are first extracted from a set of learning images and stored in a database. A new image is matched by individually comparing each feature extracted from it to those previously stored in the database and finding candidate matches based on the Euclidean distances between their feature vectors. The Euclidean distance between the SIFT feature descriptors is considered as a cost measure.

The experiments conducted in this paper use 4x4x8 = 128 elements in each feature vector of a keypoint. Regarding the image matching procedure, the local descriptors from several images are matched. A complete comparison is performed by computing the Euclidean distance between all potential matching pairs. A nearest-neighbor distance-ratio matching criterion is then used to reduce mismatches.

III. THE NEW ARABIC HANDWRITING WORD DATABASE

In order to make the database as representative as possible, we have focused on most of the aspects responsible for variations in handwriting style, such as age, sex, educational level, profession, town of residence, etc.

Data collection was conducted using 2100 forms. Each writer was asked to fill in one form comprising 11 Algerian village names, each word being written twice. There is also a field for the writer's personal information, including name, age, town of residence, and profession. Each form has 15 exemplars. An example of a filled form in grayscale is shown in figure 2.

Fig. 2. Example of a filled form.

All the extracted images have been archived in two different formats, grayscale and binary, as TIFF files at a resolution of 300 dpi. The Arabic handwritten data were sorted and saved into four sets. Figure 3 shows some statistics concerning the number of words, sub-words and characters in each set.

Fig. 3. Number of characters, sub-words and words in our new database.

IV. A NOVEL RECOGNITION SYSTEM

In order to show the efficiency of the proposed system, experimental tests were carried out on both databases: IFN/ENIT and our new database. IFN/ENIT was produced by the Institute for Communications Technology at the Technical University of Braunschweig (Institut für Nachrichtentechnik, IFN) and the Ecole Nationale d'Ingénieurs de Tunis (ENIT). This database was used as a comparison tool to evaluate researchers' works during the three competitions of the ICDAR (International Conference on Document Analysis and Recognition) organized in 2005, 2007 and 2009 [1].

A. Keypoints detection

In our study, we are not interested in matching two distinct images representing the same scene (or parts of the same scene) taken from two different views; our aim is to compare two images of two handwritten words whose similar contents will be in the same area, for all images representing a given word class.

The suggested method divides the word images to be recognized vertically into five frames of equal size. The objective here is to compare the keypoints detected in a given frame with those in the corresponding frame of another image representing the same word class. Figure 4 shows an example.

Fig. 4. Keypoints matching of two corresponding frames in two images.

The number of frames was selected by testing several scenarios and measuring their impact on the recorded recognition rate (table 1).

TABLE I
IMPACT OF THE FRAMES' NUMBER ON THE RECOGNITION RATE

Frames' number         1      2      3      4      5      6      7      8
Recognition rate (%)   57.94  63.38  76.77  87.61  93.72  90.61  86.88  81.16

For each word class, we build a model of keypoints using 25 images as training samples. Each class model contains a given number of keypoints divided into five subsets, representing the different frames composing the word images. The construction process of each class model is detailed in the flow chart presented in figure 5. This process allows us to filter the keypoints extracted from the training images of a given class and improve their robustness.

Fig. 5. Construction process of classes' models.

The number of training images used to build each class model was also fixed through several experiments. We noticed that using more than 25 images during the learning process increases the number of detected keypoints without bringing a significant improvement to the recognition rate (figure 6).

Fig. 6. Effect of the training images' number per class on the recognition rate.

A set of 128 features is extracted for each keypoint, since a keypoint descriptor consists of a 4x4 array of orientation histograms with 8 bins each. Figure 7 presents the keypoints detection process using SIFT descriptors for the five frames representing a word image taken from our database.

Fig. 7. Keypoints detection using SIFT descriptors for a handwritten Arabic word.

Several tests were conducted in order to determine the matching ratio; this parameter fixes the number of matched keypoints, which affects the recognition rate. Tests show that the number of matched keypoints rises with the matching ratio (figure 8), but the discriminating capacity of these keypoints decreases. Figure 9 shows that keypoints matching becomes more efficient when the matching ratio is fixed to 0.9, even if the number of keypoints is reduced. Worse still, the recognition rate tends to decrease when the ratio takes higher values.

Fig. 8. Effect of the matching ratio on the matched keypoints' number.

Fig. 9. Impact of the matching ratio on the recognition rate.
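As a rough illustration of the two ideas discussed in this subsection (cutting a word image into five vertical frames, and filtering nearest-neighbor matches with a distance ratio of 0.9), the following Python sketch uses NumPy only. It is not the implementation used in the experiments: the function names `split_into_frames` and `ratio_test_matches` are hypothetical, SIFT extraction itself is not shown (descriptors are assumed to be given as arrays with one row per keypoint), and splitting along the image width is one reading of "divides vertically".

```python
import numpy as np


def split_into_frames(image, n_frames=5):
    """Split a 2-D image into n_frames side-by-side strips.

    Assumption: "vertical division" is read as cutting along the width.
    np.array_split tolerates widths that are not a multiple of n_frames,
    so the strips are only approximately equal in that case.
    """
    return np.array_split(np.asarray(image), n_frames, axis=1)


def ratio_test_matches(desc_a, desc_b, ratio=0.9):
    """Nearest-neighbor distance-ratio matching between two descriptor sets.

    desc_a, desc_b: arrays of shape (n, d), one descriptor per row
    (d would be 128 for SIFT). A descriptor i of desc_a is matched to its
    nearest neighbor j in desc_b only if the nearest distance is smaller
    than `ratio` times the distance to the second nearest neighbor.
    Returns a list of (i, j) index pairs.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    # The ratio test needs at least two candidates in desc_b.
    if len(desc_a) == 0 or len(desc_b) < 2:
        return []
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    matches = []
    for i in range(len(desc_a)):
        best, second = order[i, 0], order[i, 1]
        if dists[i, best] < ratio * dists[i, second]:
            matches.append((i, int(best)))
    return matches
```

In such a sketch, the descriptors of frame k of one image would be compared only with the descriptors of frame k of the other image, which reflects the assumption that similar content of a given word class falls in the same area of the image.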
The number of keypoints representing each model of the 200 classes used, with which the system registered the highest recognition rate, is given in figure 10.

Fig. 10. Keypoints' number in each class's model.

B. Keypoints matching

Once the keypoints have been detected in two images, they must be paired. The best candidate match for each keypoint in the first image is found by identifying its nearest neighbor in the second one. In this work, matching keypoints are determined from the feature vectors by comparing the Euclidean distance of the closest neighbor to that of the second closest neighbor. Keypoints matching of the five frames representing an image pair is illustrated in figure 11.

Fig. 11. Keypoints matching of an image pair.

In the recognition process, each test image must first be divided into five frames; the keypoints are then calculated for each frame. The matching process is then performed as follows.

Repeat the following steps for each class model and each test image:

1) Each frame representing a part of a test image is compared with the corresponding part of a class model.

2) The matched keypoints rate (MKR) is then calculated for each frame as follows:

$$ \mathrm{MKR} = \frac{\text{matched keypoints' number}}{\text{detected keypoints' number from a test image} + \text{model keypoints' number}} \qquad (1) $$

3) An average matching rate (AMR) is then established:

$$ \mathrm{AMR} = \frac{\mathrm{MKR}(\mathrm{frame}_1) + \mathrm{MKR}(\mathrm{frame}_2) + \mathrm{MKR}(\mathrm{frame}_3) + \mathrm{MKR}(\mathrm{frame}_4) + \mathrm{MKR}(\mathrm{frame}_5)}{5} \qquad (2) $$

Finally, the model recording the highest average matching rate is considered as the target class. Figure 12 shows an example summarizing these stages.

Fig. 12. Classification procedure.

The keypoint descriptors are highly distinctive, which allows a single feature to find its correct match with good probability in a large database of features.

Tests conducted on both databases (IFN/ENIT and our new database) are listed in table 2. The system scales well: accuracy dropped by only about 9 percentage points (9.33 on IFN/ENIT and 8.73 on our database) when the number of classes to be recognized increased from 40 to 200. We also noticed that the recognition rate was slightly higher in tests done on our new database than on the IFN/ENIT database.

TABLE II
REGISTERED PERFORMANCES USING IFN/ENIT AND OUR NEW DATABASES

                  Recognition rate (%)
Classes number    IFN/ENIT database    Our database
40                97.33                98.83
60                96.77                98.11
80                94.58                96.41
100               93.46                95.13
120               90.61                93.72
160               88.90                91.74
200               88                   90.10

C. Results comparison

In order to assess the efficiency of the proposed method, we compare the obtained results with some pertinent works on handwritten Arabic word recognition. However, only systems tested on the IFN/ENIT database are mentioned. The reported results (table 3) show that our proposed system outperforms the other systems, which supports the effectiveness of our approach.

TABLE III
COMPARISON RESULTS

Systems                       Used classifier                       Features extraction method                                          Recognition rate (%) (IFN/ENIT database)
Azizi [2]                     MLP                                   Structural features                                                 87.12
                                                                    Statistical features                                                87.46
                                                                    Selected features                                                   87.05
Burrow [3]                    KNN                                   Zernike moments                                                     80
Aouadi and Kacem Echi [14]    HMM                                   Structural features                                                 87
Rabi et al. [15]              HMM                                   Densities of foreground pixels, concavity and derivative features   87.93
Our system                    Matching based on Euclidean distance  SIFT descriptor                                                     88

V. CONCLUSION

The contribution of this paper is twofold. Firstly, a new large and free database of Arabic handwritten words is presented. Secondly, an effective and robust off-line handwritten Arabic word recognition system is presented and evaluated on this new database.

The developed system uses a new type of features, namely SIFT descriptors, and an efficient recognition method based on an image matching procedure. A high recognition rate was recorded through several experiments conducted on IFN/ENIT and our new database.

REFERENCES

[1] H. Al Abed, V. Margner, "ICDAR 2009 - Arabic handwriting recognition competition," International Journal on Document Analysis and Recognition, Springer, vol. 14, no. 1, pp. 3–13, 2011.
[2] N. Azizi, N. Farah, M. T. Khadir, M. Sellami, "Arabic handwritten word recognition using classifiers selection and feature extraction/selection," Proc. The 17th IEEE Conference in Intelligent Information System, Proceedings of Recent Advances in Intelligent Information Systems, Academic Publishing House, Warsaw, pp. 735–742, 2009.
[3] P. Burrow, "Arabic handwriting recognition," Thesis, School of Informatics, University of Edinburgh, 2004, England.
[4] T. E. De Campos, B. R. Babu, M. Varma, "Character recognition in natural images," Proc. The International Conference on Computer Vision Theory and Applications, Lisbon, Portugal, vol. 2, pp. 273–280, 2009.
[5] M. Diem, R. Sablatnig, "Recognition of degraded handwritten characters using local features," Proc. The 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, pp. 221–225, 2009.
[6] D. G. Lowe, "Object recognition from local scale-invariant features," Proc. of the International Conference on Computer Vision, Corfu, Greece, pp. 1150–1157, 1999.
[7] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[8] T. Mao, J. Wu, P. Gao, Y. Xia, Y. Lin, "Calligraphy word style recognition by KNN based feature library filtering," Proc. The 3rd International Conference on Multimedia Technology, Guangzhou, China, pp. 934–941, 2013.
[9] O. V. Ramana, S. Roy, V. Narang, M. Hanmandlu, "Devanagari character recognition in the wild," International Journal of Computer Applications, vol. 38, no. 4, pp. 38–45, 2012.
[10] L. Rothacker, S. Vajda, G. A. Fink, "Bag-of-features representations for offline handwriting recognition applied to Arabic script," Proc. The 3rd International Conference on Frontiers in Handwriting Recognition, Bari, Italy, pp. 149–154, 2012.
[11] A. N. Subashini, D. Kodikara, "Novel SIFT-based codebook generation for handwritten Tamil character recognition," Proc. The 6th International Conference on Industrial and Information Systems, Sri Lanka, pp. 261–264, 2011.
[12] T. Yamaguchi, M. Maruyama, "Character type classification via probabilistic topic model," International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 5, no. 2, pp. 123–140, 2012.
[13] Z. Zhang, L. Jin, K. Ding, X. Gao, "A novel feature for offline handwritten Chinese character recognition," Proc. The 6th International Conference on Industrial and Information Systems, Sri Lanka, pp. 763–767, 2009.
[14] N. Aouadi and A. Kacem Echi, "Word Extraction and Recognition in Arabic Handwritten Text," International Journal of Computing & Information Sciences, vol. 12, no. 1, pp. 17–23, 2016.
[15] M. Rabi, M. Amrouch, Z. Mahani, "Recognition of cursive Arabic handwritten text using embedded training based on HMMs," Journal of Electrical Systems and Information Technology, 2017 (article in press).