Traffic-sign Recognition for Visually Impaired Pedestrians in Kyrgyzstan: Two-keypoint SIFT/BRISK Descriptor with CameraX

Ayman Aljarbouh1, Dmytro Zubov1,*, Andrey Kupin2 and Nurlan Shaidullaev1
1 University of Central Asia, 125/1 Toktogul Street, Bishkek, 720001, Kyrgyzstan
2 Kryvyi Rih National University, 11 Vitaly Matusevich, Kryvyi Rih, 50027, Ukraine

Abstract
The traffic-sign recognition system developed in this study aims to assist the spatial cognition and mobility of visually impaired pedestrians in Kyrgyzstan. The system employs a two-keypoint binary descriptor that implements the BRISK algorithm to find sampling patterns on the image. Pairs of keypoints are localized using the SIFT method. The developed Java Android mobile application implements the SIFT and BRISK approaches in real time on Android CameraX using AdaBoost classifiers and multithreading. With a knowledge base of 86 sampling patterns, the execution time is 0.1 s for an example with the traffic sign β€œCrosswalk left”. In experiments conducted at distances from 1.5 m to 3.5 m in the city of Naryn, Kyrgyzstan, the presented SIFT/BRISK detector demonstrated a true negative rate of 100 % and a true positive rate close to 100 % (the Blackview BV6600 Pro and Doogee S96 Pro smartphones achieved 100 % and 75 %, respectively, at 3.5 m). This pilot project is expected to continue with more precise image descriptors for longer distances.

Keywords
Two-keypoint descriptor, visually impaired, SIFT, BRISK, Android CameraX

1. Introduction
The visually impaired and blind (VIBs) have made significant progress in social integration over the last five decades. This achievement is mostly based on inclusive smart technologies that create synergies between the community and VIBs [1].
Despite numerous assistive mobile applications (e.g., BDS (BeiDou Navigation Satellite System) WeChat and Google Maps) for spatial cognition, VIB navigation [2] remains problematic for the last mile, such as finding an entrance or identifying traffic signs [3-11]. In this study, a Java Android mobile application was developed to detect and recognize Kyrgyz traffic signs using the SIFT and BRISK (Scale-Invariant Feature Transform and Binary Robust Invariant Scalable Keypoints) methods [12, 13] to localize keypoints and to find sampling patterns with a two-keypoint binary descriptor, respectively. The true positive (i.e., recognition accuracy) and true negative (i.e., crucial mistakes) rates [14] are expected to be near 100 % at distances from 1.5 m to 3.5 m from the traffic sign. To support VIBs, a new mobile application was developed to recognize traffic signs in Kyrgyzstan. Two key problems were solved in this study:
1. A novel technique has been applied for image processing. In the preprocessing step, the method Bitmap.createScaledBitmap creates a new bitmap scaled to a maximum resolution of 500 pixels with bilinear filtering. From up to four hundred keypoints detected by the SIFT method, two keypoints are selected. Then, the BRISK binary two-keypoint descriptor is designed employing a 291-point shape. The Hamming distance threshold is 19600 after five AdaBoost weak classifiers, which yields a true positive rate close to 100 % (smartphone Blackview BV6600 Pro – 100 %, Doogee S96 Pro – 75 % at 3.5 m) and a true negative rate of 100 % at distances from 1.5 m to 3.5 m.
2. A multithreaded Java Android application was developed using the CameraX library [15]. The image, the smartphone soft-/hardware, and the number of keypoints affect the execution time. In the experiment with the traffic sign β€œCrosswalk left”, the smartphone Doogee S96 Pro takes 0.1 s to find the sampling pattern using the knowledge base with 86 elements.

COLINS-2024: 8th International Conference on Computational Linguistics and Intelligent Systems, April 12–13, 2024, Lviv, Ukraine
βˆ— Corresponding author
ayman.aljarbouh@ucentralasia.org (A. Aljarbouh); dzubov@ieee.org (D. Zubov); kupin@knu.edu.ua (A. Kupin); nurlan.shaidullaev@ucentralasia.org (N. Shaidullaev)
0000-0002-3909-2227 (A. Aljarbouh); 0000-0002-5601-7827 (D. Zubov); 0000-0001-7569-1721 (A. Kupin); 0009-0003-5165-897X (N. Shaidullaev)
Β© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

2. Related Works
The World Health Organization reported in 2021 that over 2.2 billion people experience vision impairment worldwide [10], and hence assistive tools are in continuous demand. Traffic-sign recognition systems in cars [3] have been widely proposed for the market. The leading approach is based on convolutional neural networks and specific datasets, e.g., Tunisian traffic signs [4]. The percentage of wrongly recognized signs can reach 25 % [3], which is unacceptable for VIBs.
Analysis of existing commercial products for VIBs, such as those referenced in [5-11], shows that they do not support the recognition of traffic signs related to pedestrians. Hence, the development of a mobile application that includes this functionality is a crucial task that should be undertaken to support VIB navigation near roads. Moreover, the usage of existing technologies is reasonable since it speeds up the development process, as was done in this study. The distance between the VIB and the traffic sign is assumed to be up to 4 m, which is the estimated width of the pedestrian path. The analyzed traffic signs are supposed to be of good quality and produced according to state standards.
Google’s Android platform has held over 70 % of the smartphone market share over the last five years. The CameraX Android API (application programming interface) is Google’s native approach to working with different cameras on Android smartphones. CameraX is a Jetpack support library, which is considered the easiest way to build an Android camera application.

3. Methods
3.1. Architecture of the two-keypoint SIFT detector and BRISK descriptor with the CameraX Android API
The two-keypoint SIFT detector and BRISK descriptor with the CameraX Android API employ the method presented in [16]. It consists of three steps (see Figure 1):
1. Keypoint localization: The SIFT method is used to localize keypoints on the template image.
2. Image descriptor design: The BRISK method is employed to design image descriptors using pairs of keypoints that are empirically determined by an expert. This procedure will be automated in the future.
3. Image matching: Image capturing is performed using the Android CameraX library. The image is then downsampled with bilinear filtering in the method Bitmap.createScaledBitmap and converted from RGB (Red-Green-Blue) format to grayscale using the luminosity function [17]. The best pairs of keypoints calculated by the SIFT algorithm are selected for the design of the BRISK binary descriptor. Image matching uses AdaBoost classifiers and the Hamming distance [18] (see Figure 2).
Figure 3 presents two flowcharts: one for keypoint localization with the SIFT method and image descriptor design with the BRISK method (on the left), and another for the image matching algorithm (on the right).
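The grayscale-conversion step of the pipeline above can be sketched in plain Java. This is a minimal sketch, not the application's code: it assumes a packed 32-bit ARGB pixel, uses bit masks in place of the Android Color.red/green/blue helpers, applies the common luminosity weights 0.21/0.72/0.07, and the class name Luminosity is ours.

```java
// Sketch of the luminosity-based RGB-to-grayscale conversion (assumed weights
// 0.21/0.72/0.07). Bit masks replace the Android Color helpers; the class
// name is illustrative.
public class Luminosity {
    /** Extracts R, G, B from a packed ARGB pixel and returns its luminance. */
    public static double gray(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return r * 0.21 + g * 0.72 + b * 0.07;
    }

    public static void main(String[] args) {
        int white = 0xFFFFFFFF;          // opaque white pixel
        System.out.println(gray(white)); // close to 255.0, since the weights sum to 1
    }
}
```

In the application this step runs over every pixel of the downsampled frame, which is why the paper parallelizes it across eight threads.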
Figure 1: Two-keypoint SIFT detector and BRISK descriptor with CameraX Android API: a diagram (image capturing β†’ downsampling the image with bilinear filtering β†’ localization of keypoints (SIFT) β†’ selecting pairs of keypoints β†’ building two-keypoint descriptors (BRISK) β†’ image matching)

Figure 2: The architecture of the proposed image matching

3.2. Classification of Kyrgyz traffic signs for pedestrians
As of August 2023, Kyrgyzstan had over 200 road signs [19], including 13 related to pedestrians (see Table 1). The crosswalk signs β€œCrosswalk left”, β€œCrosswalk right”, and β€œZebra crossing” are combined into a group β€œCrosswalk”, as well as the signs β€œEmergency exit left/right” into β€œEmergency exit”.

Figure 3: Flowcharts of the keypoint localization with the SIFT method and the image descriptor design with the BRISK method (left flowchart) and the image matching algorithm (right flowchart)

3.3. SIFT keypoint localization
Regarding the SIFT keypoint localization, the AdaBoost cascade classifier is employed in this study. In this approach, scale-invariant keypoint locations are searched across different scales. The convolution of the two-dimensional Gaussian function G(x, y, σ) and the input grayscale image I(x, y) gives a filtered image:

L(x, y, σ) = G(x, y, σ) * I(x, y), (1)

where β€˜*’ is the convolution operation, σ is the population standard deviation, x and y are the pixel coordinates, and the Gaussian function is

G(x, y, σ) = (1 / (2πσ²)) Β· exp(βˆ’(xΒ² + yΒ²) / (2σ²)). (2)

In this study, the luminosity function [17] converts the image from RGB to greyscale I(x, y) using the Android class Color:

I(x, y) = Color.red(pixel) Β· 0.21 + Color.green(pixel) Β· 0.72 + Color.blue(pixel) Β· 0.07, (3)

where Color.red, Color.green, and Color.blue are Java methods, and pixel is the smallest element that can be addressed in a raster RGB image.
The DoG (Difference of Gaussians) function D(x, y, σ) is employed to find keypoints that are stable across different scales. DoG is the result of subtracting two neighbouring scales that are smoothed by Gaussian filters with a different weight k:

D(x, y, σ) = L(x, y, kσ) βˆ’ L(x, y, σ). (4)

The DoG function is an approximation of the scale-normalized Laplacian of Gaussian σ²βˆ‡Β²G. Building the DoG D(x, y, σ) follows the method proposed in [16]. To generate five SIFT scales, four images (maximum dimension sizes are 180, 340, 680, and 1360 pixels, i.e., four SIFT octaves) are smoothed five times in the Gaussian blur operator with five square matrix orders r and population standard deviations σ:
1. 180: r=5, σ=0.707107; r=7, σ=1; r=11, σ=1.414214; r=13, σ=2; r=19, σ=2.828427.
2. 340: r=11, σ=1.414214; r=13, σ=2; r=19, σ=2.828427; r=25, σ=4; r=35, σ=5.656854.
3. 680: r=19, σ=2.828427; r=25, σ=4; r=35, σ=5.656854; r=49, σ=8; r=69, σ=11.313708.
4. 1360: r=35, σ=5.656854; r=49, σ=8; r=69, σ=11.313708; r=97, σ=16; r=137, σ=22.627417.

Table 1
Kyrgyz traffic signs related to pedestrians (the traffic sign icons are omitted here)
No. | Designation | No. of sampling patterns
1 | Above ground pedestrian crossing | 6
2 | Bike crossing | 7
3 | Bike path | 13
4 | Bus stop | 1
5 | Crosswalk left | 16
6 | Crosswalk right | 1
7 | Emergency exit left | 7
8 | Emergency exit right | 8
9 | No entry for pedestrians | 10
10 | Pedestrian path | 10
11 | Tram stop | 5
12 | Underground pedestrian crossing | 1
13 | Zebra crossing | 1

Detection of keypoint candidates (i.e., local maxima and minima of the DoG function) is similar to the methodology presented in [16, 20]. Keypoints are rejected using the Taylor expansion of D(x, y, σ):

D(x) = D + (βˆ‚D/βˆ‚x)α΅€ x + (1/2) xα΅€ (βˆ‚Β²D/βˆ‚xΒ²) x, (5)

where x = (x, y, σ)α΅€ is the offset from the sample point, and D and its derivatives are computed at that point. To find the extremum xΜ‚, the equation D'(x) = 0 should be solved (D'(x) is the derivative of D with respect to x):

xΜ‚ = βˆ’(βˆ‚Β²D/βˆ‚xΒ²)⁻¹ (βˆ‚D/βˆ‚x). (6)

Unstable extrema are rejected if |D(xΜ‚)| < 0.03. The extremum D(xΜ‚) can be found by combining Eq. (6) and Eq. (5):

D(xΜ‚) = D + (1/2) (βˆ‚D/βˆ‚x)α΅€ xΜ‚. (7)

In this study, edges are detected employing a 2Γ—2 Hessian matrix [12, 20]:

H = | Dxx Dxy |
    | Dxy Dyy |, (8)

where the derivatives Dxx, Dxy, and Dyy are as follows:

Dxx = D(x+1, y, σ) + D(x-1, y, σ) βˆ’ 2Β·D(x, y, σ), (9)
Dyy = D(x, y+1, σ) + D(x, y-1, σ) βˆ’ 2Β·D(x, y, σ), (10)
Dxy = (D(x+1, y+1, σ) βˆ’ D(x+1, y-1, σ) βˆ’ D(x-1, y+1, σ) + D(x-1, y-1, σ)) / 4. (11)

To reduce the number of keypoints located on edges, the following inequality should be satisfied [12, 16, 20]:

0 < Tr(H)Β² / Det(H) < 12.1. (12)

3.4. Two-keypoint BRISK descriptor design
In this study, the BRISK algorithm employs a binary descriptor with 291 points to depict the template image. The orientation and scale of the sample pattern are calculated using the positions of two keypoints. In the present software version, a human expert chooses the two keypoints by examining keypoints with a consistent location at various octaves. In the binary descriptor, the 291 points are split as follows: 0-24 (1st group), 25-82 (2nd group), and 83-290 (3rd group).
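The edge-rejection test can be sketched in plain Java over a DoG image stored as a 2D array: second derivatives by finite differences, then the trace/determinant criterion in the standard SIFT form Tr(H)Β²/Det(H) < 12.1. The class name and the sample patches below are illustrative, not taken from the application.

```java
// Sketch of the Hessian edge-response test: Dxx, Dyy, Dxy by central finite
// differences on a DoG image d, then the Tr(H)^2 / Det(H) < 12.1 criterion.
public class EdgeTest {
    /** Returns true if the keypoint at (x, y) of DoG image d is kept. */
    public static boolean passes(double[][] d, int x, int y) {
        double dxx = d[x + 1][y] + d[x - 1][y] - 2 * d[x][y];
        double dyy = d[x][y + 1] + d[x][y - 1] - 2 * d[x][y];
        double dxy = (d[x + 1][y + 1] - d[x + 1][y - 1]
                    - d[x - 1][y + 1] + d[x - 1][y - 1]) / 4.0;
        double tr = dxx + dyy;              // trace of H
        double det = dxx * dyy - dxy * dxy; // determinant of H
        return det > 0 && tr * tr / det < 12.1;
    }
}
```

An isotropic blob-like extremum (curvature similar in both directions) passes the test, while an edge-like response (strong curvature in one direction only) drives the determinant toward zero and is rejected.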
Figure 4 shows an example of the descriptor for the traffic sign β€œAbove ground pedestrian crossing” (greyscale representation): (A) points 0-24, (B) points 25-82, (C) points 83-290, (D) points 0-290. The distance between any two points is computed via the Euclidean distance between the two keypoints: for a 1st-group point on the line connecting the two keypoints, the distance to any nearest point is one-third of the distance between the two keypoints. Each point n is associated with the mean pixel intensity I(n) in a circle of radius 1/24, 1/16, or 1/12 of the Euclidean distance E between the two keypoints n1 and n2 [16]. The Hamming distance is calculated via the binary string BS, which is based on the comparison of the average pixel intensities I(n1) and I(n2) at points n1 and n2, respectively:

BS = 1 if DS > 0, and BS = 0 otherwise, (13)

where the string DS is as follows:

DS = I(n1) βˆ’ I(n2). (14)

For a specific point n1, the absolute differences abs(DS) are sorted in descending order within the appropriate group:
1. Points 0-24: the image binary descriptor considers the first 12 absolute differences for each point.
2. Points 25-82: the image binary descriptor considers the first 32 absolute differences for each point.
3. Points 83-290: the image binary descriptor considers the first 100 absolute differences for each point.
Therefore, a total of 22956 absolute differences abs(DS) are considered in the image binary descriptor: 25 Β· 12 + 58 Β· 32 + 208 Β· 100 = 300 + 1856 + 20800 = 22956.

4. Experiment
4.1. Knowledge base design
The knowledge base is stored in the smartphone’s internal storage and includes the following files [16]: β€˜root.txt’; N descriptor files β€˜dN.txt’; β€˜Description.txt’; N audio files β€˜NameN.mp3’.
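The intensity-comparison bits of Eqs. (13)-(14) and the Hamming distance used for matching can be sketched in plain Java. This is a minimal sketch: the mean intensities I(n) are passed in directly (in the application they come from the smoothed circles around descriptor points), and the class name is ours.

```java
// Sketch of the binary-string bit (Eqs. 13-14) and the Hamming distance
// between two binary descriptors. Names and inputs are illustrative.
public class BinaryDescriptor {
    /** BS bit for a pair (n1, n2): 1 if I(n1) - I(n2) > 0, else 0. */
    public static int bit(double i1, double i2) {
        return (i1 - i2) > 0 ? 1 : 0;
    }

    /** Hamming distance: the number of positions where two bit arrays differ. */
    public static int hamming(int[] a, int[] b) {
        int d = 0;
        for (int k = 0; k < a.length; k++) {
            if (a[k] != b[k]) d++;
        }
        return d;
    }
}
```

In the full descriptor the bit arrays have 22956 entries, one per retained absolute difference.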
Figure 4: The binary descriptor for the traffic sign β€œAbove ground pedestrian crossing” (greyscale representation): 1st (A), 2nd (B), and 3rd (C) groups, all points (D)

The knowledge base includes 86 sampling patterns (N=86; see Table 1) and has a size of 44.1 MB in the internal storage (106 MB in Random Access Memory (RAM) along with the other data and code of the Android application), which is available on any up-to-date Android smartphone with operating system (OS) version 10 or newer (versions 10 and 11 are discussed in this study). The execution time for processing a test image (taken in sunny weather on the campus of the University of Central Asia in Naryn, Kyrgyzstan; see Figure 5) on a Doogee S96 Pro smartphone is approximately 0.1 s. In this study, the Hamming distance measures the similarity between two binary strings BS. The image-matching process employs five AdaBoost weak classifiers [21] that use binary decision trees.

4.2. Experiment description
To minimize the execution time of the Java Android application, the maximum number of keypoints and the maximum image dimension are 400 and 500 pixels, respectively, which is compatible with any modern smartphone camera since even a 2-megapixel sensor captures images of 1600Γ—1200 pixels. To avoid optical distortion effects [22] and to locate the binary descriptor within the borders, a new image is created by adding 50-pixel margins to the original image. The method Bitmap.createScaledBitmap is used to downsample the original image with bilinear filtering. Then, a greyscale representation is calculated in eight parallel computational threads.

Figure 5: An example of the traffic sign β€œCrosswalk left” successfully identified during the experiment (the photo was taken by the smartphone Doogee S96 Pro in sunny weather; author – Dmytro Zubov)

To find keypoints, the image with a maximum dimension of 500 pixels is smoothed five times.
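The 500-pixel downsampling bound can be sketched as a pure-Java size computation mirroring the target dimensions that would be handed to Bitmap.createScaledBitmap. The helper name and the rounding choice are our assumptions; only the 500-pixel cap and the 1600Γ—1200 example come from the text above.

```java
import java.util.Arrays;

// Sketch of the downsampling bound: the longer image side is limited to
// maxSide pixels while preserving the aspect ratio. Helper name and rounding
// are illustrative assumptions.
public class ScaleSize {
    /** Returns {width, height} after limiting the longer side to maxSide. */
    public static int[] scaled(int w, int h, int maxSide) {
        int longer = Math.max(w, h);
        if (longer <= maxSide) return new int[]{w, h};
        return new int[]{Math.round((float) w * maxSide / longer),
                         Math.round((float) h * maxSide / longer)};
    }

    public static void main(String[] args) {
        // A 2-megapixel frame (1600x1200) is reduced to 500x375.
        System.out.println(Arrays.toString(scaled(1600, 1200, 500)));
    }
}
```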
This process generates five scales using three groups (the groups are applied sequentially until the target object is detected or not found) of square matrix orders and population standard deviations in the Gaussian blur operator:
1. r=7, σ=1; r=11, σ=1.414214; r=13, σ=2; r=19, σ=2.828427; r=25, σ=4.
2. r=11, σ=1.414214; r=13, σ=2; r=19, σ=2.828427; r=25, σ=4; r=35, σ=5.656854.
3. r=5, σ=0.5; r=7, σ=0.70711; r=9, σ=1; r=11, σ=1.414214; r=13, σ=2.
To reduce the execution time, four DoG images are calculated employing Eq. (4) in four parallel computational threads. Keypoints are identified in two parallel computational threads for the third/second/first and fourth/third/second scales. Only the 400 keypoints closest to the center of the image according to the Euclidean distance are considered. Figure 6 presents the scheme of how points are analyzed: the number of side points at the current level is two pixels larger than at the previous one (1, 3, 5, 7, 9, etc.). The AdaBoost classifier is applied after discarding keypoint pairs whose descriptor points fall beyond the image boundaries:

F_T(BS) = Ξ£_{t=1}^{T} f_t(BS), (15)

where f_t(BS) is an AdaBoost weak classifier and T=5 [16]. If the Hamming distance surpasses the threshold 19600, the descriptor is assigned to the template class. The threshold value was selected empirically based on the sum of the outcomes of the five weak classifiers mentioned above. Table 2 summarizes the speedup techniques used in this study.

Figure 6: Scheme of the points analysis on the image

Table 2
Speedup techniques in the Java Android application
Operation | Speedup technique
Greyscale representation of the color image | Eight parallel computational threads
Calculation of DoG images | Four parallel computational threads
Localization of keypoints in the third/second/first and fourth/third/second scales | Two parallel computational threads
Image matching | Five AdaBoost classifiers

5. Results
In this study, a Java Android application β€œTrafficSignsKyrgyzstanWeCanSee” employs the proposed method to detect Kyrgyz traffic signs and hence to support the spatial cognition and mobility of VIBs. Figure 7 presents a screenshot, an original image taken by the smartphone Doogee S96 Pro in cloudy weather (the location is the campus of the University of Central Asia in Naryn, Kyrgyzstan), and a greyscale image with keypoints (fuchsia and turquoise are used for the third/second/first and fourth/third/second DoG functions, respectively). Two other Java Android 10 applications (approximately 72 % of Android smartphones could run these applications as of August 2023) were designed in Android Studio 4.0:
1. Localization of all keypoints using the SIFT method.
2. Identification of keypoint pairs and design of the sample pattern via the BRISK descriptor.
Two smartphones, Doogee S96 Pro and Blackview BV6600 Pro, were used in the experiment. The true positive rate was calculated at different distances from 1.5 m to 5 m in three attempts. The results are shown in Figure 8. Figure 5 presents a photo taken during this experiment. The true positive rate was close to the required 100 % at distances from 1.5 m to 3.5 m for the traffic sign β€œCrosswalk left”: it was 100 % for the smartphone Blackview BV6600 Pro and 75 % for the smartphone Doogee S96 Pro at a distance of 3.5 m. The presented two-keypoint SIFT detector and BRISK descriptor showed a true negative rate of 100 %.

6. Discussion
In this study, a two-keypoint SIFT detector and a BRISK descriptor with the CameraX Android API compose the approach to support the navigation and mobility of VIB pedestrians in Kyrgyzstan. In this method, keypoints are localized via the SIFT algorithm, and then selected pairs of keypoints are employed to design the sample pattern, i.e., the binary BRISK descriptor. The image matching is based on the Hamming distance and an AdaBoost cascade classifier.
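The matching decision based on Eq. (15) can be sketched in plain Java: the outputs of the T = 5 weak classifiers are summed and compared with the empirical threshold 19600. The class name and the example scores are placeholders, not values from the application.

```java
// Sketch of the AdaBoost strong-classifier decision (Eq. 15): sum the weak
// scores f_t(BS) and compare with the empirical threshold. Illustrative only.
public class AdaBoostMatch {
    /** F_T(BS): the sum of the weak-classifier scores f_t(BS). */
    public static double strong(double[] weakScores) {
        double sum = 0;
        for (double f : weakScores) sum += f;
        return sum;
    }

    /** The descriptor is assigned to the template class above the threshold. */
    public static boolean isTemplate(double[] weakScores, double threshold) {
        return strong(weakScores) > threshold;
    }
}
```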
The real-life experiment showed the following test results: the true negative rate is 100 % (a crucial parameter for VIBs) and the true positive rate is close to 100 %. In general, the traffic-sign recognition system satisfies the requirements and hence can be implemented in practice. However, some elements of the presented approach (e.g., the square matrix orders and population standard deviations in the Gaussian blur operator) are empirical and therefore open to discussion.

7. Conclusions
In this study, a crucial piece of VIB-assistive software, a Java Android mobile application, was developed to recognize Kyrgyz traffic signs using a two-keypoint SIFT detector, a BRISK descriptor, the CameraX Android API, and mp3 audio files to support the spatial cognition of VIBs near roads. With a knowledge base of 86 sampling patterns, the mobile application shows real-time performance: the execution time is 0.1 s for an example with the traffic sign β€œCrosswalk left” (the location is the campus of the University of Central Asia in Naryn, Kyrgyzstan). In experiments at distances from 1.5 m to 3.5 m, the presented SIFT/BRISK detector with a two-keypoint descriptor showed a 100 % true negative rate and a true positive rate close to 100 %: smartphone Blackview BV6600 Pro – 100 %, Doogee S96 Pro – 75 % at a distance of 3.5 m. The real-time performance is achieved using five AdaBoost classifiers for image matching and parallel computational threads for the greyscale representation of the color image (eight threads), the calculation of DoG images (four threads), and the localization of keypoints (two threads).
Figure 7: An example of the screenshot (A), an original image taken by the smartphone Doogee S96 Pro in cloudy weather (B), and a greyscale image with keypoints (C)

Figure 8: Results of the experiment: true positive rate (%) at distances from 1.5 m to 6 m for the smartphones Doogee S96 Pro (A) and Blackview BV6600 Pro (B)

Analysis of the minimum hardware requirements shows that the mobile application is runnable on any up-to-date Android camera smartphone because it requires only 44.1 MB of internal storage (106 MB in RAM along with the other data and code) and a 2-megapixel sensor. The most likely prospect for further development of this study is the design of an image descriptor that is geometrically close to the traffic signs.

8. References
[1] John Bricout, Paul M. A. Baker, Nathan W. Moon, and Bonita Sharma. β€œExploring the Smart Future of Participation: Community, Inclusivity, and People with Disabilities.” International Journal of E-Planning Research 10.2 (2021): 94-108. doi: 10.4018/IJEPR.20210401.oa8.
[2] M. Gallay, M. Denis, M. Auvray, Navigation Assistance for Blind Pedestrians: Guidelines for the Design of Devices and Implications for Spatial Cognition, in: T. Tenbrink, J. Wiener, C. Claramunt (Eds.), Representing Space in Cognition: Interrelations of Behaviour, Language, and Formal Models, Oxford Academic, Oxford, 2013, pp. 244-267. doi: 10.1093/acprof:oso/9780199679911.003.0011.
[3] Darko BabiΔ‡, Dario BabiΔ‡, Mario FioliΔ‡, and Ε½eljko Ε ariΔ‡. β€œAnalysis of Market-Ready Traffic Sign Recognition Systems in Cars: A Test Field Study.” Energies 14.12 (2021). doi: 10.3390/en14123697.
[4] Hana Ben Fredj, Amani Chabbah, Jamel Baili, Hassen Faiedh, and Chokri Souani. β€œAn Efficient Implementation of Traffic Signs Recognition System Using CNN.” Microprocessors and Microsystems 98 (2023).
doi: 10.1016/j.micpro.2023.104791.
[5] Kanak Manjari, Madhushi Verma, and Gaurav Singal. β€œA Survey on Assistive Technology for Visually Impaired.” Internet of Things 11 (2020). doi: 10.1016/j.iot.2020.100188.
[6] Filippo Amore, Valeria Silvestri, Margherita Guidobaldi, et al. β€œEfficacy and Patients’ Satisfaction with the ORCAM MyEye Device Among Visually Impaired People: A Multicenter Study.” Journal of Medical Systems 47:11 (2023). doi: 10.1007/s10916-023-01908-5.
[7] Myneni Madhu Bala, D. N. Vasundhara, Akkineni Haritha, and CH. V. K. N. S. N. Moorthy. β€œDesign, Development and Performance Analysis of Cognitive Assisting Aid with Multi Sensor Fused Navigation for Visually Impaired People.” Journal of Big Data 10 (2023). doi: 10.1186/s40537-023-00689-5.
[8] Sonal Mali, Srushti Padade, Swapnali Mote, and Revati Omkar. β€œAn Eye for a Blind: Assistive Technology.” International Research Journal of Engineering and Technology 3.12 (2016): 532-534.
[9] Zahra J. Muhsin, Rami Qahwaji, Faruque Ghanchi, and Majid Al-Taee. β€œReview of Substitutive Assistive Tools and Technologies for People with Visual Impairments: Recent Advancements and Prospects.” Journal on Multimodal User Interfaces 18 (2023): 135-156. doi: 10.1007/s12193-023-00427-4.
[10] Adnan Al-Smadi, Talal Al-Qaryouti, Abdurahman Rehan, Homam Assi, and Alhareth Alsharea. A Navigation Tool for Visually Impaired and Blind People, in: Proceedings of the Eurasia Proceedings of Science, Technology, Engineering & Mathematics, volume 22 of EPSTEM, ISRES Publishing, Marmaris, Turkey, 2023, pp. 119-126. doi: 10.55549/epstem.1338545.
[11] Matteo Poggi and Stefano Mattoccia. A Wearable Mobility Aid for the Visually Impaired Based on Embedded 3D Vision and Deep Learning, in: Proceedings of the IEEE Symposium on Computers and Communication, Messina, Italy, 2016, IEEE Publishing, pp. 208-213. doi: 10.1109/ISCC.2016.7543741.
[12] Zetian Tang, Zemin Zhang, Wei Chen, and Wentao Yang.
β€œAn SIFT-Based Fast Image Alignment Algorithm for High-Resolution Image.” IEEE Access 11 (2023): 42012-42041. doi: 10.1109/ACCESS.2023.3270911.
[13] Guoming Chu, Yan Peng, and Xuhong Luo. β€œALGD-ORB: An Improved Image Feature Extraction Algorithm with Adaptive Threshold and Local Gray Difference.” PLoS ONE 18.10 (2023). doi: 10.1371/journal.pone.0293111.
[14] Alaa Tharwat. β€œClassification Assessment Methods.” Applied Computing and Informatics 17.1 (2021): 168-192. doi: 10.1016/j.aci.2018.08.003.
[15] R. Iyengar, Scaling Up Wearable Cognitive Assistance for Assembly Tasks, PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2023. UMI order number: CMU-CS-23-112. doi: 10.1184/R1/23302121.v1.
[16] D. Zubov, A. Aljarbouh, A. Kupin, and N. Shaidullaev, Spatial Cognition by the Visually Impaired: Image Processing with SIFT/BRISK-like Detector and Two-keypoint Descriptor on Android CameraX, in: A. Nandal, L. Zhou, A. Dhaka, T. Ganchev, F. Nait-Abdesselam (Eds.), Machine Learning in Medical Imaging and Computer Vision, IET, Stevenage, UK, 2023, pp. 249-276. doi: 10.1049/PBHE049E_ch12.
[17] Mehak Maqbool Memon, Manzoor Ahmed Hashmani, Aisha Zahid Junejo, Syed Sajjad Rizvi, and Adnan Ashraf Arain. β€œA Novel Luminance-Based Algorithm for Classification of Semi-Dark Images.” Journal of Applied Sciences 11.18 (2021). doi: 10.3390/app11188694.
[18] Yantong Chen, Wei Xu, and Yongjie Piao. β€œTarget Matching Recognition for Satellite Image Based on the Improved FREAK Algorithm.” Mathematical Problems in Engineering 2016 (2016). doi: 10.1155/2016/1848471.
[19] Wikipedia, Road Signs in Kyrgyzstan, 2023. URL: https://en.wikipedia.org/wiki/Road_signs_in_Kyrgyzstan.
[20] D. G. Lowe. β€œDistinctive Image Features from Scale-invariant Keypoints.” International Journal of Computer Vision 60.2 (2004): 91-110. doi: 10.1023/B:VISI.0000029664.99615.94.
[21] Youwei Wang, Lizhou Feng, Jianming Zhu, Yang Li, and Fu Chen.
β€œImproved AdaBoost Algorithm Using Misclassified Samples Oriented Feature Selection and Weighted Non-negative Matrix Factorization.” Neurocomputing 508 (2022): 153-169. doi: 10.1016/j.neucom.2022.08.015.
[22] Pengbo Xiong, Shaokai Wang, Weibo Wang, Qixin Ye, and Shujiao Ye. β€œModel-Independent Lens Distortion Correction Based on Sub-Pixel Phase Encoding.” Sensors 21.22 (2021). doi: 10.3390/s21227465.