=Paper=
{{Paper
|id=Vol-3746/Paper_3.pdf
|storemode=property
|title=A Mobile Facial Recognition System Based on a Set of Raspberry Technical Tools
|pdfUrl=https://ceur-ws.org/Vol-3746/Paper_3.pdf
|volume=Vol-3746
|authors=Nikolay Kiktev,Taras Lendiel,Oleksandr Korol
|dblpUrl=https://dblp.org/rec/conf/dsmsi/KiktevLK23
}}
==A Mobile Facial Recognition System Based on a Set of Raspberry Technical Tools==
Nikolay Kiktev 1,2, Taras Lendiel 1 and Oleksandr Korol 1

1 National University of Life and Environmental Sciences of Ukraine, Heroiv Oborony str., Kyiv, 03041, Ukraine
2 Taras Shevchenko National University of Kyiv, 64/13 Volodymyrs'ka str., Kyiv, 01601, Ukraine

Abstract

Most machine vision systems use complex and expensive hardware. This work proposes Raspberry hardware for the implementation of mobile machine vision systems, which makes it possible to deploy such systems as mobile pattern recognition stations. The hardware and software of a mobile machine vision system that reads and stores information about the recognized image have been implemented. The article develops a mobile machine vision system that recognizes a face and compares it with one stored in a database, after which the control element generates a control action on the output ports. The histogram of oriented gradients method is used to recognize faces during the operation of the machine vision system.

Keywords

Machine vision, pattern recognition, automation systems, histogram of oriented gradients, neural network, learning, image.

Dynamical System Modeling and Stability Investigation (DSMSI-2023), December 05-07, 2023, Kyiv, Ukraine
EMAIL: nkiktev@ukr.net (N. Kiktev); taraslendiel@nubip.edu.ua (T. Lendiel); korol.alex.fox@gmail.com (O. Korol)
ORCID: 0000-0001-7682-280X (N. Kiktev); 0000-0002-6356-1230 (T. Lendiel)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

Automated machine vision systems for identifying objects by image belong to the class of pattern recognition systems [1, 2]. The development of pattern recognition systems has reached a scale at which these systems can work with one specific pattern recognition algorithm, while neural networks are used to increase the recognition speed [3]. Machine vision systems for pattern recognition, in addition to detecting static images, are also used to determine the movements of a specified object or of a group of objects [4].

2. Literature review and problem statement

V. Martsenyuk et al. [5] proposed an algorithm that correctly identifies images with different characteristics for identifying a person in a video stream. It includes anisotropic diffusion as an image preprocessing method, the Gabor wavelet transform as an image processing method, and the histogram of oriented gradients (HOG) and one-dimensional local binary patterns (1DLBP) as methods for extracting feature vectors from images. The study used three facial image databases: The Database of Faces, the Facial Recognition Technology (FERET) database, and the Surveillance Cameras Face Database (SCface). The algorithm gave different identification accuracy results, with an average difference of 20%. The authors conducted experiments with different degrees of image compression, different image resolutions, and different facial areas covered by the images.

O. Bychkov et al. [6] studied methods based on local texture descriptors in comparison with methods based on neural networks.
In particular, an approach to personal identification is proposed that is based on local texture descriptors of facial images and eliminates the disadvantages of algorithms based on neural networks. As a result of the experimental study, the identification accuracy of the proposed algorithm increased by more than 10% compared to neural networks under conditions of different positions of the subject's head.

N. Korshun et al. [7] proposed a methodology that describes the stages of integrating artificial intelligence and machine learning, including data input, system state analysis, decision making, optimization actions, and continuous monitoring.

Researchers from Saudi Arabia, Alonazi, M. et al. [8], proposed performing facial emotion recognition using the Pelican Optimization Algorithm with a deep convolutional neural network. The authors optimized the performance of the CapsNet model by tuning its hyperparameters with the Pelican Optimization Algorithm (POA). This ensured that the model was fine-tuned to detect a wide range of emotions and to generalize effectively across different populations and scenarios. Detection and classification of the different types of facial emotions is carried out by a bidirectional long short-term memory (BiLSTM) network. The AFER-POADCNN system was evaluated in a simulation analysis on the FER benchmark dataset.

American researchers Zheng, Y. and Blasch, E. [9] performed facial micro-expression recognition using score fusion and a hybrid model combining a convolutional LSTM and a Vision Transformer. The next step in human-machine interaction is the ability to detect facial emotions (for example, by a humanoid robot). Allowing systems to recognize micro-expressions lets the machine delve deeper into a person's true feelings, so that human emotions can be taken into account when making optimal decisions. Such machines will be able to detect dangerous situations, alert caregivers to problems, and provide appropriate responses (for example, at an airport). The authors proposed a new hybrid neural network (NN) model capable of recognizing micro-expressions in real-time applications. The study first compares several NN models and then creates a hybrid model by combining a convolutional neural network (CNN), a recurrent neural network (RNN, which can be a long short-term memory (LSTM) network), and a vision transformer.

There is also an interesting study on this topic by the Italian scientists Raccanello, D. and Burro, R. [10]. This work examined the suitability of the Drawing Set for the Achievement Emotions Adjective List (DS-AEAL) for assessing children's achievement emotions. The authors took control-value theory as the main theoretical basis. Specifically, a set of 10 drawings of faces representing pleasure, pride, hope, relief, relaxation, anxiety, anger, shame, sadness, and boredom was developed using 259 adults as raters. The authors also administered a matching task and a labeling task to 89 adults. The results confirmed the suitability of the correspondence between the DS-AEAL and verbal labels. Overall, recognition and recall were better for primary emotions than for secondary ones. Despite their preliminary nature, the results of these studies support the suitability of the DS-AEAL for assessing achievement emotions in a variety of learning contexts along with the associated verbal labels.

A simulated dataset generator for testing data analysis methods is presented in the article by Toliupa, S. et al. [11].
3. The purpose and objectives of the research

The purpose of the research is to develop software and hardware for a mobile machine vision system for image recognition, which will allow monitoring of the video image and recording of the recognized image from the input video stream.

4. Research materials and methods

It is proposed to develop a mobile machine vision system based on a set of Raspberry technical tools. Raspberry is embedded-device hardware for creating automated control systems [12]. This hardware is also used to implement automated systems based on Internet of Things technology [13].

A Raspberry Pi 3 mini-computer was chosen as the controlling element of the mobile machine vision system. The control element receives video images from the electrically connected OV5647 camera module [14]. The structure of the proposed system is shown in Fig. 1.

Figure 1: Structural diagram of the functioning of the mobile machine vision system

The pattern recognition algorithm is based on the histogram of oriented gradients (HOG) method, one of the most effective methods for this task. Its main idea is to use pixel intensity gradients to construct feature vectors that describe the image structure [15, 16].

During the execution of the HOG method, the image goes through a number of processing stages [17, 18]. First, the image is divided into small cells (for example, 8×8 pixels). The next stage is the calculation of gradients, where the horizontal (G_x) and vertical (G_y) intensity gradients are calculated for each pixel in the cell:

G_x = I(x + 1, y) − I(x − 1, y)
G_y = I(x, y + 1) − I(x, y − 1)

where I(x, y) is the pixel intensity at coordinates (x, y) in the image. The intensity gradients capture changes in pixel intensity in the horizontal and vertical directions, which is the basis for edge detection in an image.

Two characteristics are important at each point of the image: the gradient direction θ and the gradient magnitude |G|:

θ = arctan(G_y / G_x)
|G| = √(G_x² + G_y²)

The gradient direction indicates the direction of the fastest change in intensity, and the gradient magnitude indicates how rapidly the pixel value is changing. A histogram of gradient directions is created for each cell. The gradient angles are divided into bins, for example, with an interval of 20° (0°, 20°, 40°, ..., 180°). Each pixel contributes to the corresponding bin depending on the direction of its gradient, and the weight of the contribution is determined by the gradient magnitude.
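As an illustration of the stages described above (this is a minimal sketch, not the authors' code), the per-cell computation can be written in Python with NumPy as follows; the function name and the use of nine unsigned 20° bins are assumptions based on the description above.

    import numpy as np

    def hog_cell_histogram(cell, n_bins=9):
        """Histogram of oriented gradients for one cell (e.g. 8x8 pixels).

        Central-difference gradients G_x, G_y, then direction and magnitude,
        then a magnitude-weighted histogram over 20-degree bins (unsigned
        gradients folded into 0..180 degrees).
        """
        cell = cell.astype(np.float64)
        # G_x = I(x+1, y) - I(x-1, y); G_y = I(x, y+1) - I(x, y-1)
        gx = np.zeros_like(cell)
        gy = np.zeros_like(cell)
        gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]
        gy[1:-1, :] = cell[2:, :] - cell[:-2, :]
        # Gradient magnitude |G| and direction theta, folded into 0..180 degrees
        magnitude = np.sqrt(gx**2 + gy**2)
        theta = np.degrees(np.arctan2(gy, gx)) % 180.0
        # Each pixel votes for the bin of its direction, weighted by |G|
        bin_width = 180.0 / n_bins  # 20 degrees per bin for n_bins = 9
        bin_idx = np.minimum((theta // bin_width).astype(int), n_bins - 1)
        hist = np.zeros(n_bins)
        np.add.at(hist, bin_idx.ravel(), magnitude.ravel())
        return hist

    # Example: histogram for a random 8x8 grayscale cell
    print(hog_cell_histogram(np.random.randint(0, 256, (8, 8))))

In the full HOG pipeline, the per-cell histograms are additionally normalized over larger blocks and concatenated into the final feature vector [18].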
To detect a face in the processed image, a recognition method is used in which regression based on gradient boosting places the control (landmark) points precisely:

p = p_0 + Σ_{i=1}^{n} α_i d_i

where p_0 is the initial position of the control points, α_i are weighting factors, and d_i are basis vectors.

The face image of an individual is stored in the memory of the mini-computer. In order to compare the face of the saved person with other faces, the faces are encoded into feature vectors. The recognition algorithm is implemented by the developed neural network, which processes the specified image [19, 20]. The neural network consists of several layers: convolutional layers, pooling layers, and fully connected layers. Convolutional layers extract features, pooling layers reduce dimensionality, and fully connected layers perform classification [19, 21]. The network is trained on a large dataset of face images using a loss function F to optimize its parameters [19, 21]:

F = Σ_{i,j} ( y_{i,j} · ‖f(x_i) − f(x_j)‖² + (1 − y_{i,j}) · max(0, m − ‖f(x_i) − f(x_j)‖)² )

where y_{i,j} are the labels of the pairs x_i, x_j; f(x) is the neural network function that maps the input data x into the feature space; m is the margin that bounds the distance between non-matching pairs; and ‖f(x_i) − f(x_j)‖ is the Euclidean distance between the vector representations of x_i and x_j in the feature space.

The last stage is the comparison of faces. For this, the Euclidean distance between the feature vectors is used [19-21]:

d = √( Σ_{i=1}^{128} (f_i − g_i)² )

where f_i and g_i are the components of the feature vectors of the two faces. If the distance is less than a specified threshold, the faces are considered the same.

Hardware and software of the system. The software was implemented in the Python programming language and tested on a laboratory model (Fig. 2). Using the camera and an Adafruit display, students can begin to work with machine vision technologies, face recognition, and a desktop graphical interface for system management. To prepare the stand, we used:

• a Raspberry Pi 3;
• a camera for the Raspberry Pi;
• an Adafruit display;
• an LED;
• a plastic case.

The programming was done in Python, using the face_recognition library for face recognition and tkinter for the graphical interface. The working algorithm of the developed layout is shown in Fig. 3. During the operation of the control system, certain information is recorded with a time counter; the recording is performed with the assignment of the time counter value i. The screen of the interface of the developed software is shown in Fig. 4. The ESP8266 program code is given in the Appendix.

Stages of working with the program. At launch, we see the program window with the image from the camera and a button; the face is initially marked as Unknown.

• Press the button and then enter a name.
• Take a series of pictures with the head in different positions.
• A message then confirms that the photos were successfully added.
• Now, when the application recognizes one of the added faces, the LED turns on; if it fails to recognize the person, the LED turns off after 5 seconds.
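Given the components and program stages listed above, the main recognition loop can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes the face_recognition library mentioned above, OpenCV for frame capture, and the RPi.GPIO library for the LED; the pin number, file name, and the 0.6 tolerance are illustrative assumptions, and the 5-second turn-off delay from the description is omitted for brevity.

    import cv2                  # frame capture (assumed; any camera API works)
    import face_recognition     # HOG face detector + 128-d face encodings
    import RPi.GPIO as GPIO     # LED on a Raspberry Pi output port

    LED_PIN = 18                # illustrative pin choice
    TOLERANCE = 0.6             # distance threshold: smaller = stricter match

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(LED_PIN, GPIO.OUT)

    # Encode the stored face once: one 128-component feature vector
    known_image = face_recognition.load_image_file("known_face.jpg")
    known_encoding = face_recognition.face_encodings(known_image)[0]

    camera = cv2.VideoCapture(0)
    try:
        while True:
            ok, frame = camera.read()
            if not ok:
                continue
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            # Detect faces with the HOG method and encode them
            locations = face_recognition.face_locations(rgb, model="hog")
            encodings = face_recognition.face_encodings(rgb, locations)
            recognized = False
            for encoding in encodings:
                # Euclidean distance d between the two feature vectors
                d = face_recognition.face_distance([known_encoding], encoding)[0]
                if d < TOLERANCE:
                    recognized = True
            # Control action on the output port
            GPIO.output(LED_PIN, GPIO.HIGH if recognized else GPIO.LOW)
    finally:
        camera.release()
        GPIO.cleanup()

Note that face_recognition.face_distance computes exactly the Euclidean distance d over the 128 encoding components described earlier, so the tolerance plays the role of the threshold in the comparison stage.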
After successful testing of the system, a case was made for more convenient and safer use (Fig. 5). In the future, this stand can be used to teach students how to work with machine vision and face recognition: students will be able to familiarize themselves with the principle of operation thanks to a detailed article written on this topic and hands-on work with the stand.

Practical implementation:

• Access automation: creation of access control systems at enterprises or offices using facial recognition.
• Development of IoT solutions: integrating cameras, microcontrollers, and machine learning algorithms to create smart solutions for home and business.
• Working with data and security: use of encryption and data validation methods to ensure information security in access systems.

Figure 2: Laboratory model of the machine vision system

5. Discussion and prospects

This article combines two directions in IT: image recognition using neural networks and Internet of Things technology for managing technological objects. Internet of Things technology has become widespread in many industries. The article by Berestov, D. et al. [22] analyzes the application of big data in agriculture based on Internet of Things (IoT) technology. An IoT information collection platform for agricultural land is proposed, which can help forecast and analyze the weather for agriculture. The research conducted in this article can be further used in our research on the automation of agricultural production: for plant productivity management [23], phytomonitoring [4], and in robotic complexes, including horticulture and greenhouse farming [24].

Figure 3: Algorithm flowchart (after data entry the counter i is set to 0 and the camera is initialized; each cycle forms a camera frame and performs face recognition; depending on whether the face is recognized, the data are recorded and the control action is formed on the output ports; the data are output, i is incremented, and the loop repeats while i ≤ 1000)

Figure 4: Appearance of the interface

Figure 5: Laboratory stand for teaching students how to work with machine vision and face recognition

6. Conclusions

A mobile machine vision system based on a set of Raspberry technical tools has been developed. Face recognition during the operation of the machine vision system is performed using the histogram of oriented gradients method. Using the neural network tool, training was performed to recognize the stored face specified in the database. The mobile machine vision system recognizes a face and compares it with the one stored in the database, after which the control element forms a control action on the output ports. This technical implementation allows the mobile machine vision system to be used in automated control systems and Internet of Things technology systems.

7. References

[1] J. Nayak and S. B. Kaje, "Fast Image Convolution and Pattern Recognition using Vedic Mathematics on Field Programmable Gate Arrays (FPGAs)," 2022 OITS International Conference on Information Technology (OCIT), Bhubaneswar, India, 2022, pp. 569-573, doi: 10.1109/OCIT56763.2022.00111.
[2] F. Ren, X. Zhang and L. Wang, "A new method of the image pattern recognition based on neural networks," Proceedings of 2011 International Conference on Electronic & Mechanical Engineering and Information Technology, Harbin, China, 2011, pp. 3840-3843, doi: 10.1109/EMEIT.2011.6023833.
[3] Weiming Hu, Xuejuan Xiao, Zhouyu Fu, D. Xie, Tieniu Tan and S. Maybank, "A system for learning statistical motion patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 9, pp. 1450-1464, Sept. 2006, doi: 10.1109/TPAMI.2006.176.
[4] Lysenko, V., Zhyltsov, A., Bolbot, I., Lendiel, T., & Nalyvaiko, V. (2020). Phytomonitoring in the phytometrics of the plants. In E3S Web of Conferences (Vol. 154, p. 07012). EDP Sciences.
[5] V. Martsenyuk, O. Bychkov, K. Merkulova and Y. Zhabska, "Exploring Image Unified Space for Improving Information Technology for Person Identification," IEEE Access, vol. 11, pp. 76347-76358, 2023, doi: 10.1109/ACCESS.2023.3297488.
[6] Bychkov, O., Zhabska, Y., Merkulova, K., Merkulov, M. Research and Comparative Analysis of Person Identification Information Technology. CEUR Workshop Proceedings, 2023, 3538, pp. 54-64. https://ceur-ws.org/Vol-3538/Paper_6.pdf
[7] Korshun, N., Myshko, I., Tkachenko, O. Automation and Management in Operating Systems: The Role of Artificial Intelligence and Machine Learning. CEUR Workshop Proceedings, 2023, 3687, pp. 59-68. https://ceur-ws.org/Vol-3687/Paper_6.pdf
[8] Alonazi, M.; Alshahrani, H.J.; Alotaibi, F.A.; Maray, M.; Alghamdi, M.; Sayed, A.
Automated Facial Emotion Recognition Using the Pelican Optimization Algorithm with a Deep Convolutional Neural Network. Electronics 2023, 12, 4608. https://doi.org/10.3390/electronics12224608
[9] Zheng, Y.; Blasch, E. Facial Micro-Expression Recognition Enhanced by Score Fusion and a Hybrid Model from Convolutional LSTM and Vision Transformer. Sensors 2023, 23, 5650. https://doi.org/10.3390/s23125650
[10] Raccanello, D.; Burro, R. Development of a Drawing Set for the Achievement Emotions Adjective List (DS-AEAL): Preliminary Data on a Pictorial Instrument for Children. Educ. Sci. 2024, 14, 756. https://doi.org/10.3390/educsci14070756
[11] Toliupa, S., Pylypenko, A., Tymchuk, O., Kohut, O. Simulated Datasets Generator for Testing Data Analytics Methods. CEUR Workshop Proceedings, 2023, 3687, pp. 11-24. https://ceur-ws.org/Vol-3687/Paper_2.pdf
[12] M. J. Al Hamdi, W. Mumtaz, N. Albar, T. A. Mifta, J. Ridha and K. Muchtar, "A Low-cost Raspberry Pi and Deep Learning System for Customer Analysis," 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE), Nara, Japan, 2023, pp. 328-329, doi: 10.1109/GCCE59613.2023.10315605.
[13] D. Guinard and V. Trifa, Building the Web of Things: With Examples in Node.js and Raspberry Pi, Manning, 2016.
[14] Waveshare camera with optical stabilizer OV5647-70 5MP OIS Camera. URL: https://evo.net.ua/kamera-waveshare-z-optychnym-stabilizatorom-ov5647-70-5mp-ois-camera-22980/ (accessed 01.07.2024).
[15] A. Ade-Ibijola and K. Aruleba, "Automatic Attendance Capturing Using Histogram of Oriented Gradients on Facial Images," 2018 IST-Africa Week Conference (IST-Africa), Gaborone, Botswana, 2018, pp. 1-8.
[16] C.-P. Huang, C.-H. Hsieh, K.-T. Lai and W.-Y. Huang, "Human Action Recognition Using Histogram of Oriented Gradient of Motion History Image," 2011 First International Conference on Instrumentation, Measurement, Computer, Communication and Control, Beijing, China, 2011, pp. 353-356, doi: 10.1109/IMCCC.2011.95.
[17] C.-S. Fahn, C.-P. Lee and Y.-S. Yeh, "A real-time pedestrian legs detection and tracking system used for autonomous mobile robots," 2017 International Conference on Applied System Innovation (ICASI), Sapporo, Japan, 2017, pp. 1122-1125, doi: 10.1109/ICASI.2017.7988208.
[18] Dalal, N., & Triggs, B. (2005, June). Histograms of oriented gradients for human detection. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (Vol. 1, pp. 886-893). IEEE.
[19] D. Ciregan, U. Meier and J. Schmidhuber, "Multi-column deep neural networks for image classification," 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 2012, pp. 3642-3649, doi: 10.1109/CVPR.2012.6248110.
[20] Y. Shi and Z. Yu, "Multi-Column Convolution Neural Network Model Based on Adaptive Enhancement," 2018 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS), Xiamen, China, 2018, pp. 681-685, doi: 10.1109/ICITBS.2018.00177.
[21] Dan Cireşan, Ueli Meier, Jonathan Masci, Jürgen Schmidhuber, Multi-column deep neural network for traffic sign classification, Neural Networks, Volume 32, 2012, Pages 333-338, ISSN 0893-6080, https://doi.org/10.1016/j.neunet.2012.02.023.
[22] Berestov, D., Kurchenko, O., Zubyk, L., Kulibaba, S., Mazur, N. Assessment of Weather Risks for Agriculture using Big Data and Industrial Internet of Things Technologies. CEUR Workshop Proceedings, 2023, 3550, pp. 1-13.
https://ceur-ws.org/Vol-3550/paper1.pdf
[23] Nykyforova, L., Kiktev, N., Lendiel, T., Pavlov, S., & Mazurchuk, P. (2023). Computer-integrated control system for electrophysical methods of increasing plant productivity. Machinery & Energetics, 14(2), 34-45. doi: 10.31548/machinery/2.2023.34.
[24] Khort, D.; Kutyrev, A.; Kiktev, N.; Hutsol, T.; Glowacki, S.; Kuboń, M.; Nurek, T.; Rud, A.; Gródek-Szostak, Z. Automated Mobile Hot Mist Generator: A Quest for Effectiveness in Fruit Horticulture. Sensors 2022, 22, 3164. https://doi.org/10.3390/s22093164

8. Appendix

ESP8266 program code:

    #include <ESP8266WiFi.h>

    // Enter your Wi-Fi SSID and password
    const char* ssid = "TP-Link_F43C";
    const char* password = "80818458";

    // Specify the GPIO pins for the LEDs (e.g. D7 and D5)
    const int ledPin = D7;
    const int ledPin_0 = D5;

    // A variable to store the current brightness value
    int brightness = 0;

    WiFiServer server(80);

    void setup() {
      Serial.begin(115200);
      delay(10);

      // Set the pin mode for the LEDs
      pinMode(ledPin, OUTPUT);
      pinMode(ledPin_0, OUTPUT);
      analogWrite(ledPin, brightness);

      // Connecting to a Wi-Fi network
      Serial.println();
      Serial.print("Connecting to ");
      Serial.println(ssid);
      WiFi.begin(ssid, password);
      while (WiFi.status() != WL_CONNECTED) {
        delay(500);
        Serial.print(".");
      }
      Serial.println("");
      Serial.println("WiFi connected");
      Serial.println("IP address: ");
      Serial.println(WiFi.localIP());

      // Starting the web server
      server.begin();
      Serial.println("HTTP server started");
    }

    void loop() {
      WiFiClient client = server.available();
      if (!client) {
        return;
      }
      Serial.println("New client");
      while (!client.available()) {
        delay(1);
      }
      String request = client.readStringUntil('\r');
      Serial.println(request);
      client.flush();

      // Checking requests to change the brightness of the LED
      if (request.indexOf("/slider") != -1) {
        brightness = request.substring(request.indexOf("=") + 1).toInt();
        brightness = constrain(brightness, 0, 1023);  // limit the brightness value to 0..1023
        analogWrite(ledPin, brightness);
      }

      // Checking requests to control the LED
      if (request.indexOf("/LED=ON") != -1) {
        digitalWrite(ledPin_0, HIGH);
      }
      if (request.indexOf("/LED=OFF") != -1) {
        digitalWrite(ledPin_0, LOW);
      }

      if (request.indexOf("/getBrightness") != -1) {
        float sl = analogRead(A0);
        client.println("HTTP/1.1 200 OK");
        client.println("Content-Type: application/json");
        client.println("");
        client.print("{\"value\": ");
        client.print(String(sl));
        client.println("}");
      }

      delay(1);
      Serial.println("Client disconnected");
      Serial.println("");
    }
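Since the appendix code exposes plain HTTP endpoints (/LED=ON, /LED=OFF, /slider, /getBrightness), the Raspberry Pi side can drive the ESP8266 with a few HTTP requests. The following is a minimal sketch, not part of the paper: the requests library, the IP address, and the value parameter name are assumptions (the ESP8266 code only parses the text after the first '=' in the request line, so any parameter name works).

    import requests  # assumed HTTP client library (pip install requests)

    ESP8266_URL = "http://192.168.0.50"  # illustrative address; use the IP printed by the ESP8266

    def _fire(path):
        # The ESP8266 sketch sends no HTTP response for /LED and /slider,
        # so the connection closing without a reply is expected; ignore it.
        try:
            requests.get(ESP8266_URL + path, timeout=2)
        except requests.exceptions.ConnectionError:
            pass

    def set_led(on: bool):
        """Turn the LED on ledPin_0 on or off via the /LED endpoint."""
        _fire("/LED=ON" if on else "/LED=OFF")

    def set_brightness(value: int):
        """Set the PWM brightness of ledPin (0..1023) via the /slider endpoint."""
        _fire(f"/slider?value={max(0, min(1023, value))}")

    def read_brightness() -> float:
        """Read the analog value that /getBrightness returns as JSON."""
        return requests.get(ESP8266_URL + "/getBrightness", timeout=2).json()["value"]

    if __name__ == "__main__":
        set_led(True)        # e.g. after a successful face recognition
        set_brightness(512)  # half brightness
        print(read_brightness())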