Hybrid face recognition solution for security

Y Donon1

1 Samara National Research University, Moskovskoe Shosse 34, Samara, Russia, 443086

Abstract. This article introduces a design that aims, through the combination of open-source and closed-source technologies, to provide a simple-to-implement, low-cost and high-performing face recognition solution. The solution provides identification, recognition of emotions and facial features, and spotting of dangerous objects. This article presents the concept of the solution, explains its importance on the market and provides details of a proof-of-concept prototype.

1. Introduction

The market for face and image recognition technologies is booming and is forecast to have a bright future. Although the technology appears more and more often in specialised magazines and is promoted by the giants of information technology, many smaller actors are left behind because they perceive it as inaccessible or too expensive.

Much research on these systems has been carried out in recent years, and during that time computer science has evolved considerably. But what has really changed over the last few years is the cameras. What makes this field of research more prolific than ever today is that we all carry phones whose sensors average 14 megapixels, and that we can buy full HD webcams for less than a hundred dollars. Fifteen years ago, a digital camera's resolution would have been a fifth of what a webcam has now, at ten times the price. [1]

Although face recognition attempts have been around for more than 50 years, the technology still appears new to most people. Technologies able to perform these tasks did exist back in the sixties, but pictures had to be taken according to very precise specifications. Attempts multiplied and the topic became a trend in the nineties; some artefacts from that time, such as the ORL Database of Faces from Cambridge, are still in use today.
In the early 2000s, an international contest was even held on the subject of face recognition. [2] Yet despite all of that, it is only now and in the coming years that we really can and will see ground-breaking advances in these technologies. [3]

Nowadays we have the tools and the sensors necessary for efficient recognition, and new actors are emerging on this market every day. These solutions are, of course, a trend on the security market: they make it possible to recognise not only people but also specific objects, and to track them if necessary. Industry is also starting to use emotion recognition systems to better understand its customers. [4]

In this paper, I introduce a solution to exploit this new market and make it accessible to everyone through a low-cost, high-performing face recognition solution for security: a design that is easy to deploy and does not require high computing capabilities. The goal of this paper is for everyone to understand the stakes of this market, how accessible it now is and how it can be used in everyday life.

IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018) Image Processing and Earth Remote Sensing Y Donon

2. Market and projections

2.1. Hybrid

As the market is still emerging but has been around for a long time, both open-source and closed-source solutions exist. Open-source solutions are efficient at spotting faces and can differentiate them, making authentication possible. Those solutions, however, fall short when it comes to analysing a picture's details, such as emotions, facial details or objects. Closed-source image recognition providers, on the other hand, are usually specialised and therefore extremely good at identifying those details. [4] The design presented in this paper tries to take advantage of this reality.
Combining open-source and closed-source technologies, taking from both worlds what they are good at, makes it possible to perform a first analysis on a local computer, even one with low computing capabilities, and then, over the internet, to use the services provided by the major image recognition providers to analyse pictures in depth, beyond the capacities of open-source solutions.

2.2. Projection

As mentioned, the face recognition market is still emerging. It is expected to be worth between 7.5 and 10 billion dollars by 2022, two to three times more than it was worth in 2016. The year before that, the main client of these systems was US Homeland Security. By now the use of such solutions for security has already spread to several countries, and they are used by actors such as the British police. Since its beginnings this technology has been viewed as a major asset in security systems. [4]

Open-source solutions are forecast to improve their algorithms in 2D and thermal face recognition, while online services are believed likely to keep the specialised market (complex emotions, facial feature details, 3D modelling, etc.); open-source alternatives exist and will also improve, but not with the same precision. [5]

The main uses between 2017 and 2022 are forecast to be emotion recognition, tracking and monitoring, access control and law enforcement. [4] The design suggested in this paper therefore fits the market's need for an affordable solution that uses the full capacities offered by the different actors of face recognition. It is also appropriate because this trend is forecast to remain stable over at least the next four years. By making a multi-billion-dollar digital market profitable for SMEs (small and medium-sized enterprises), which represent 98% of the economic environment, the design presented here is a breakthrough for face recognition, as it makes it an accessible tool.

3. Results

3.1. Functioning

In this design, if the picture is of sufficient quality for an optimal analysis [6], the system first queries a micro database of a handful of the most recent faces, loaded in the computer's RAM (1). This reduces the load on the disk database and speeds up the program, since it is usually the same faces that show up from one frame to the next. If the face is not recognised in the first database, a query is sent to the second one, which can store up to a thousand faces depending on the capacities of the computer (2). This database is typically designed to store the faces of all the employees of a company and to manage access control. If no match is found in the second database (the confidence of the comparison between the shown face and the existing ones is too low), the system queries online services, which can analyse the picture, confirm via an online database that the individual is unknown (3), and identify the person's emotions and facial features as well as analyse the environment, detecting immediate threats such as weapons. Finally, the result of this detection is added to the RAM-loaded database to avoid detecting and analysing the same face again (4). Each query to an online service, of course, has a cost.

3.2. Results

The performance reached by the test program met all of our expectations. Although the description sometimes suffers from small imprecisions, the program offers real-time identification on video at 5 frames per second while spotting several objects simultaneously [7], which is more than enough for a security camera and even gives an impression of relative fluidity in the capture.
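The lookup order described in section 3.1 (steps 1 to 4) can be illustrated with a minimal Python sketch. It is not the C# prototype: a face is represented here by a hypothetical feature vector, the Fisherfaces comparison is replaced by a toy distance-based confidence, and the online service is a stub callable. The names `TieredRecognizer`, `similarity` and `best_match`, the threshold of 0.9 and the cache size of 5 are all assumptions made for illustration. The sketch also assumes that faces recognised in the local database are added to the RAM cache, which the text implies but does not state.

```python
import math
from collections import deque
from typing import Callable, Deque, Iterable, List, Optional, Tuple

Face = Tuple[float, ...]   # hypothetical feature vector standing in for a face image
Entry = Tuple[Face, str]   # (stored face, label)


def similarity(a: Face, b: Face) -> float:
    """Toy confidence in (0, 1]: 1.0 for identical vectors, lower as they differ."""
    return 1.0 / (1.0 + math.dist(a, b))


def best_match(face: Face, entries: Iterable[Entry], threshold: float) -> Optional[str]:
    """Return the label of the closest entry, or None if the confidence is too low."""
    best_label, best_conf = None, 0.0
    for stored, label in entries:
        conf = similarity(face, stored)
        if conf > best_conf:
            best_label, best_conf = label, conf
    return best_label if best_conf >= threshold else None


class TieredRecognizer:
    def __init__(self, employees: List[Entry], online: Callable[[Face], str],
                 threshold: float = 0.9, recent_size: int = 5):
        self.recent: Deque[Entry] = deque(maxlen=recent_size)  # (1) RAM micro database
        self.employees = employees                              # (2) local database
        self.online = online                                    # (3) paid online service (stub)
        self.threshold = threshold                              # strict, to avoid false positives
        self.online_calls = 0                                   # each call has a cost

    def identify(self, face: Face) -> str:
        # (1) Try the handful of most recent faces first.
        label = best_match(face, self.recent, self.threshold)
        if label is not None:
            return label
        # (2) Fall back to the local database of known people.
        label = best_match(face, self.employees, self.threshold)
        if label is None:
            # (3) Confidence too low locally: ask the online service.
            self.online_calls += 1
            label = self.online(face)
        # (4) Remember the result so the same face is not analysed again.
        self.recent.append((face, label))
        return label
```

In this sketch, a face that appears in consecutive frames triggers at most one online call, because the second lookup is answered by the RAM cache.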
With identified facial features such as hair colour and emotions, a very precise recognition that differentiates identical twins without hesitation, and the ability to detect specific objects such as weapons, we can say that, from a technical point of view, the performance test of the design is a complete success.

Figure 1. Design's business process.

Figure 2. This image illustrates the analysis of a person caught on camera. Some of the information, such as the approximation of the age, is not correct; however, the capture allows a clear identification (the program selects the frame having the "best quality of face").

Figure 3. Shows that although the quality of an image might be poor, the program is able to extrapolate correct information.

Table 1. Features analysis for figures 2 and 3.

| Feature | Value (figure 2) | Precision, or appreciation if data unavailable (figure 2) | Value (figure 3) | Precision, or appreciation if data unavailable (figure 3) |
|---|---|---|---|---|
| Facial | 13.8 | 72% | 20 | 97% |
| Smile | 0.0% | Correct | 0.0% | Correct |
| Emotion | Neutral 99.7% | Correct | Neutral 98.7% | Correct |
| Glasses | No glasses | Correct | Reading glasses | Correct |
| Hair | Bald 33% | 15% | Bald 33% | 15% |
| Hair | Black 99% | 85% | Black 100% | Error |
| Hair | Blond 84% | 10% | Blond 53% | Correct |
| Hair | - | - | Brown 42% | Correct |
| Hair | Other 70% | - | Other 38% | - |
| Description | A woman standing in a room | Correct | A woman in a blue shirt | Correct |
| Object | Knife | Correct | Gun | Correct |

Although some obvious progress remains to be made on hair colour detection, the features calculated are generally close to reality and, most importantly, allow a human to identify a person even without the picture.

3.3. Technical specifications

A piece of software has been developed as a proof of concept of the design. It was developed in C#, using OpenCvSharp (an OpenCV wrapper for this platform), the Fisherfaces recognition algorithm and Microsoft's Face and Vision APIs.
Fisherfaces was preferred over other methods for its search for discrimination criteria, which makes it more reliable at excluding possible face matches and thereby enhances the security offered by the solution. We strongly prefer a false negative, which leads to a check on the server that the person truly is not identified in our database, to a false positive, which would allow an intruder to get through the system. [8-11]

Microsoft's cognitive services were chosen because they fitted the technical needs of the environment, offered good transparency, and return details from the analysis of the image, such as face coordinates, allowing further extrapolation. The other providers considered, whose billing systems were suited to this design, were Google Cloud Platform and IBM Watson. The goal being to make the market as accessible as possible, it was important to reduce every source of cost.

The system has been tested on several Microsoft Windows platforms (Windows 7 and later). It runs and manages real-time recognition on computers with 4 GB of RAM, a dual-core processor and a 64 GB SSD; lower configurations have not been tested.

3.4. Costs

The design described here is of course flexible, meaning that any online service could be used in place of Microsoft's. The cost of such an access control system was calculated for an arbitrary company size of a hundred workers (a big company in the SME environment). Considering that each employee enters the company's building twice a day every working day, this amounts to 4000 controls a month (which corresponds to about 20 working days a month). If those faces are all stored locally, they should be recognised and therefore not generate any cost. If fifty unknown persons come into the building every day, this makes about a thousand controls a month that are not resolved by the local recognition system.
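The monthly volumes quoted above follow from simple arithmetic, sketched below in Python. The figure of 20 working days per month is an assumption implied by the 4000 controls quoted rather than a number stated in the text, and the function and parameter names are illustrative only. The sketch assumes one online call per unknown visitor.

```python
WORKING_DAYS_PER_MONTH = 20  # assumption: implied by 4000 = 100 * 2 * 20


def monthly_controls(employees: int, entries_per_day: int,
                     unknown_visitors_per_day: int,
                     working_days: int = WORKING_DAYS_PER_MONTH):
    """Return (controls resolved locally, controls sent online) per month."""
    local = employees * entries_per_day * working_days   # known faces, no online cost
    online = unknown_visitors_per_day * working_days     # unknown faces, billable calls
    return local, online


local, online = monthly_controls(employees=100, entries_per_day=2,
                                 unknown_visitors_per_day=50)
print(f"local controls per month:  {local}")
print(f"online controls per month: {online}")
```

Only the second figure matters for the bill, since controls resolved by the local databases do not reach the online service.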
Those numbers all fall under the "free calls pool" of a Microsoft Azure subscription, even considering that some analysis queries must be done in several steps, generating as many calls. However, this represents a laboratory situation, which always differs from the "field". For the same number of people in a production environment, the price of the online analysis has been calculated to be about 10 to 15 dollars a month, taking into account all the frequent errors of the software. [12]

As a one-time cost, a small computer and a webcam are needed to run the software. Multiple devices have been assessed for that purpose, all in a price range of 250 to 350 dollars for the computer and 50 to 80 dollars for the camera, for a total cost of 300 to 430 dollars per door. Counting the cost of electricity to power the system, the total cost of the installation is estimated at 1500 dollars over a period of 5 years (total cost of ownership).

4. Reliability

The test program written to prove this design is able to distinguish similar faces, such as those of twins, easily. The confidence criterion has been set strictly to make sure the local recognition system would not give any false positive. This confidence level was set according to previous research. [13] Tests have been repeated several times on thousands of frames without any mistake from the software.

To assess the efficiency of the software, some further tests and comparisons were conducted. The computer was presented with pictures of five pairs of twins identified in the database and two pairs of pictures of the same person, and had to differentiate them. Humans, on the other hand, were presented with a similar set of pictures and were simply asked, with two seconds for each picture, to tell which subjects were twins and which were not. [14] The precision of the software could not be assessed accurately since, so far, the program has not been successfully fooled.
Whenever the confidence of the local face analyser is too low, faces are sent online for analysis. Since the program reached its final stage of development, the success rate has been one hundred percent. Therefore, the following paragraph, which assesses the reliability of such systems, is based on external information and other systems.

Figure 4. Twins differentiation.

Figure 5. The common test sample between the computer and humans for twins' assessment.

Table 2. Software control.

| Feature | Value |
|---|---|
| Couples tested | 7 |
| Total frames | 2500+ |
| Accuracy | 100% |

Table 3. Human control.

| Feature | Value |
|---|---|
| Human subjects | 18 |
| Couples tested | 6 |
| Total frames | 72 |
| Accuracy | 61% |

When unveiling its latest iPhone, Apple claimed that its face recognition system has a reliability of one in a million, meaning that once in a million times two faces would be confused and recognised as being the same; this is the closest available comparison to the online services used. [14] To put this number in perspective, we can take the PIN code of a credit card in Russia (4 digits, or 10'000 possibilities), fingerprints (reputedly unreliable once in 50'000 samples) or an average home key (6 tumblers, 7 heights), which makes about 120'000 possibilities (7^6 = 117'649). Whether the reliability of the system is comparable to Apple's claim about its own is debatable; nevertheless, the laboratory tests support assigning a very high level of reliability to comparable face recognition systems.

5. Conclusion

As underlined in this presentation, face recognition is currently a fast-developing market: much has already been done but much is left to be built, and this design has a place in the development of the market.
Every major actor involved in security should now consider gaining access to this kind of technology, especially now that it is more accessible than ever and that the market trend makes it very profitable.

In the future, detection will be improved by assessing the liveness of faces, that is, checking that the camera is presented with a genuine face and not a picture of one. This can be done by different methods, but the best suited to a system of this size is the analysis of the micro-behaviour of the eyes. [16-19] The RAM-based identification system will also be compared with a YOLO (You Only Look Once) system in order to assess their respective efficiency and choose the most appropriate technology to keep a target acquired and analyse it only once.

This kind of system could also be used on security cameras to capture frames at a higher resolution and filter them through an artificial intelligence able to determine, by analysing the pictures, which frames are relevant. Selecting only relevant frames for storage makes it possible to significantly increase the quality of the camera's sensors without running into the problem of storage space saturation.

Emotion recognition, and this design specifically, can be adapted to numerous other uses such as home automation, alarms, the search for wanted persons and many others that have not been mentioned in this article. It is up to everyone on this new market to develop their own ideas.

Of course, this paper was not about a purely technical breakthrough; however, I hope that the reader now better understands the face recognition market and how to use it efficiently and profitably, in particular with the design offered. This kind of design will make the difference between an emerging market and a fully grown and accessible one, bringing a new technology to the consumer.
In other words, I want everyone to understand that face recognition systems are now within their reach.

6. References

[1] Digital Photography Review (Access mode: https://www.dpreview.com/articles/5778663183/ten-unique-cameras-from-the-dawn-of-consumer-digital-photography) (20.8.2013)
[2] Phillips P J, Flynn P J, Scruggs T, Bowyer K W, Chang J, Hoffman K, Marques J, Min J and Worek W 2005 Overview of the face recognition grand challenge IEEE Computer Society Conference on Computer Vision and Pattern Recognition DOI: 10.1109/CVPR.2005.268
[3] Zhao W, Chellappa R, Phillips P J and Rosenfeld A 2003 Face recognition: A literature survey ACM Computing Surveys 35 399-458
[4] Gates K A 2011 Our biometric future: facial recognition technology and the culture of surveillance (New York University Press) p 263
[5] Rybintsev A V, Konushin V S and Konushin A S 2015 Consecutive gender and age classification from facial images based on ranked local binary patterns Computer Optics 39(5) 762-769 DOI: 10.18287/0134-2452-2015-39-5-762-769
[6] Nikitin M Yu, Konushin V S and Konushin A S 2017 Neural network model for video-based face recognition with frames quality assessment Computer Optics 41(5) 732-742 DOI: 10.18287/2412-6179-2017-41-5-732-742
[7] Protsenko V I, Kazanskiy N L and Serafimovich P G 2015 Real-time analysis of parameters of multiple object detection systems Computer Optics 39(4) 582-591 DOI: 10.18287/0134-2452-2015-39-4-582-591
[8] Jaiswal S, Bhadauria S S and Jadon R S 2011 Comparison between face recognition algorithm Eigenfaces, Fisherfaces and Elastic Bunch Graph Matching Journal of Global Research in Computer Science 2(7) 187-193
[9] Yang M-H 2002 Kernel Eigenfaces vs. Kernel Fisherfaces: Face Recognition Using Kernel Methods Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition 215-220
[10] Turk M A and Pentland A O 2002 Face recognition using Eigenface (The Media Laboratory MIT)
[11] Kalinovskii I A and Spitsyn V G 2017 Review and testing of frontal face detectors Computer Optics 40(1) 99-111 DOI: 10.18287/2412-6179-2016-40-1-99-111
[12] Microsoft's Computer Vision API Version 2.0 documentation, Microsoft (Access mode: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home) (22.8.2018)
[13] Vizilter Yu V, Gorbatsevich V S, Vorotnikov A V and Kostromov N A 2017 Real-time face identification via CNN and boosted hashing forest Computer Optics 41(2) 254-265 DOI: 10.18287/2412-6179-2017-41-2-254-265
[14] How secure is Face ID? (Access mode: https://www.macworld.co.uk/feature/iphone/how-secure-is-face-id-3663992/) (01.11.2018)
[15] Caplova Z, Obertov Z, Gibelli D M, Mazzarelli D, Fracasso T, Vanezis P, Sforza C and Cattaneo C 2017 The Reliability of Facial Recognition of Deceased Persons on Photographs Journal of Forensic Sciences 62 1286-1291
[16] Pan G, Wu Z and Sun L 2008 Liveness detection for face recognition, recent advances in face recognition IntechOpen 9 DOI: 10.5772/6397
[17] Blanz V and Vetter T 2003 Face recognition based on fitting a 3D morphable model IEEE Transactions on Pattern Analysis and Machine Intelligence 25(9)
[18] Kosinski M and Wang Y 2018 Deep neural networks are more accurate than humans at detecting sexual orientation from facial images Journal of Personality and Social Psychology 114(2) 246-257
[19] Pan G, Sun L and Wu Z 2017 Eyeblink-based Anti-Spoofing in Face Recognition from a Generic Webcamera IEEE 11th International Conference on Computer Vision DOI: 10.1109/ICCV.2007.4409068