Digital Object Detection of Construction Site Based on Building Information Modeling and Artificial Intelligence Systems

Denys Chernyshev, Serhii Dolhopolov, Tetyana Honcharenko, Viktor Sapaiev and Maksym Delembovskyi

Kyiv National University of Construction and Architecture, 31, Povitroflotsky Avenue, Kyiv, 03037, Ukraine

Abstract
This study is devoted to the problem of digital object detection on a construction site based on Building Information Modeling (BIM) technology and Artificial Intelligence (AI) systems. To detect, classify and evaluate individual components and objects on a construction site, an artificial intelligence system is proposed that combines Convolutional Neural Network (CNN) and Feed-Forward Neural Network (FFNN) architectures. The authors propose using Internet of Things (IoT) and Big Data technologies for real-time recognition of building objects on a construction site. The effectiveness of identifying a set of data attributes from a BIM model to create a digital twin of a construction site has been confirmed. The CNN model recognized objects with a mean average precision of 90.4%, and the FFNN model confirmed that these results correspond to the attributes of the reference BIM model. The results of this study can be used to further improve the concept of creating digital twins of construction objects throughout the entire life cycle and to monitor real estate objects at the operational stage.

Keywords
Construction site, digital twin, artificial intelligence, Building Information Modeling, Convolutional Neural Network, Feed-Forward Neural Network

1. Introduction

Building Information Modeling (BIM) technology is now used in the construction industry for a wide range of operations and life cycle processes, including environmental monitoring, management of facilities and their processes, energy efficiency assessment, and control of construction processes.
The rapid development of information technologies, in particular artificial intelligence systems, IoT, and Big Data, opens up considerable prospects for BIM technologies, including the introduction of a digital twin of the construction space. This research provides a scientific substantiation for combining these information technologies, in particular an artificial intelligence system that uses two conceptually different architectures at different stages: a CNN, which uses the YOLOv3 model to detect a set of objects in an image in real time, and an FFNN, which performs multi-label classification by comparing the received indicators with the reference indicators of the BIM design. Equally important within the study is the inclusion of IoT and Big Data technologies in the system, which together provide large arrays of useful data that, in combination with the mechanisms of the artificial intelligence system, can be used to create a plausible digital twin of the construction site. The developed digital twin can be used to systematize and assess the quality of all stages of the construction life cycle.

ITTAP’2022: 2nd International Workshop on Information Technologies: Theoretical and Applied Problems, November 22–24, 2022, Ternopil, Ukraine
EMAIL: chernyshev.do@knuba.edu.ua (A. 1); dolhopolov@icloud.com (A. 2); goncharenko.ta@knuba.edu.ua (A. 3); sapaiev.viktor@gmail.com (A. 4); delembovsky.mm@knuba.edu.ua (A. 5)
ORCID: 0000-0002-1946-9242 (A. 1); 0000-0001-9418-0943 (A. 2); 0000-0003-2577-6916 (A. 3); 0000-0002-7978-7226 (A. 4); 0000-0002-6543-0701 (A. 5)
2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

Min Deng, Carol Menassa, and Vineet Kamat of the University of Michigan carried out a theoretical analysis of the evolution of BIM technologies and of the information technologies that contributed to the creation of digital twins. In [1], they propose a five-level structure of BIM evolution, in which the digital twin (level 5) is reached by progressively combining static 3D visualization with BIM (level 1), analysis and simulation with BIM models (level 2), integration of BIM and IoT methods (level 3), and integration of BIM and artificial intelligence methods (level 4) into a single whole. The papers [2] – [4] consider BIM from other key positions: the design of engineering networks and the review of BIM models of different levels across the life cycle of a construction object. Together, this evolutionary approach and the additional information about BIM models suggest a digital twin that not only visualizes the site and supports real-time decision-making through prediction but also implements automatic feedback and control of the construction environment based on optimized results and management strategies.

Hao Wu, Lingbo Liu, Yang Yu, Zhenghong Peng, Hongzan Jiao, and Qiang Niu study the effectiveness of Big Data technology in urban studies. The paper [5] presents a set of simulations based on public spatial Big Data, which it identifies as the best source of data for understanding human mobility. These findings point to considerable prospects for modeling and forecasting the human factors that affect the efficiency of construction projects.

In [6], Shu Tang, Dennis R. Shelden, Charles M. Eastman, Pardis Pishdad-Bozorgi, and Xinghua Gao describe the processes and trends in the integration of BIM and IoT, which has considerable potential to improve efficiency throughout the construction life cycle. Receiving many real-time data streams from an extensive network of IoT sensors creates opportunities to improve construction operation and monitoring, health and safety management, logistics, and facilities management.

Liquan Zhao and Shuaiyang Li work on improving object detection algorithms, in particular the YOLOv3 model. In [7], they note the considerable popularity of YOLOv3 among object recognition methods based on deep learning. In [8], G. Sreenu and M. A. Saleem Durai study object detection in intelligent video surveillance in more detail. YOLOv3 can therefore be applied to intelligent video surveillance, where it processes the Big Data that enters the system through numerous video streams from surveillance cameras.

Multi-label classification is studied by a team of Ukrainian scientists: S. Dolhopolov, T. Honcharenko, S. A. Dolhopolova, O. Riabchun, M. Delembovskyi, and O. Omelianenko. The paper [9] describes a systematic approach to implementing multi-label classification for problems with a large number of input and output classes. It establishes that each multi-label classification task has its own architecture, which consists of several parameters and quality indicators, must manage a data set with several parameters, performs regression and training over several parameters, and can represent a parameter space.
This study aims to develop an information system that integrates BIM technologies, CNN and FFNN artificial intelligence systems, IoT, and Big Data technologies to create a digital twin of the construction site. To achieve this goal, it is necessary to characterize the aspects of BIM design that allow IoT and Big Data to collect data in real time for intelligent video surveillance with YOLOv3 and for multi-label classification of the BIM model attributes of the future digital twin.

2. Main research

Having analyzed the works of Ukrainian and foreign scientists, we developed a model that illustrates a natural path to a high-quality digital twin. Fig. 1 shows the developed “Key to Digital Twin” model.

Figure 1: The “Key to Digital Twin” model

According to the “Key to Digital Twin” model, raw data is collected on the construction site by the IoT complex, which is supplemented by Big Data and provides the flow of unprocessed but most valuable information. In parallel and asynchronously, specialists from different fields form a BIM “Reference Model” of the construction object. Its attributes are entered explicitly or implicitly into the BIM Data Storage and remain on standby until YOLOv3 has classified the actual data from the construction site. Once both parts are ready, the FFNN model matches the attributes, i.e. performs multi-label classification of the matches. As a result, the artificial intelligence system reports, for multiple parameters, the percentage correspondence between the actual results of the construction processes and the state of the construction site on the one hand and the reference object designed with BIM technologies on the other. Based on these data, a decision is made to create a digital twin in CAD systems.
In the future, the developed system can support the digital twin by processing information in real time, which makes operational adjustments possible.

The study uses YOLOv3, a modified version of the general YOLO architecture that consists of 106 layers in a fully convolutional network and is capable of recognizing fairly small objects. The main advantage of YOLOv3 is that it has 3 output layers, each of which detects objects of a different size [10]. As part of the object search, the system performs several stages: data acquisition, image selection, data separation, training, and model output [11] – [14].

To search for a set of BIM objects on the construction site, we form a small test dataset of 5 classes with 100 images per class. The trained neural network model can thus distinguish 5 different objects on the construction site, namely: BIM1 is iron reinforcement; BIM2 is worker (man); BIM3 is protective helmet; BIM4 is vegetation; BIM5 is earth embankments. In actual BIM design, there are many more attributes that can be recognized and classified with YOLOv3. The YOLOv3 model is trained with 80% of the images used for training and 20% for testing.

The following libraries and software tools were used to test the proposed approach: Python >= 3.6, numpy, torch >= 1.0.0, opencv, json, keras, pandas, matplotlib, tensorboardX, and CUDA.

To further verify that the YOLOv3 model is trained correctly, it is necessary to determine the average precision per object and the mean average precision of the entire model.
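The 80%/20% division of the five-class dataset described above can be sketched as follows. This is only an illustration under stated assumptions: the file-name pattern, the per-class (stratified) split, and the fixed random seed are hypothetical and are not taken from the study.

```python
import random

# Class labels follow the text; the file names below are hypothetical.
CLASSES = ["BIM1", "BIM2", "BIM3", "BIM4", "BIM5"]
IMAGES_PER_CLASS = 100
TRAIN_FRACTION = 0.8


def split_dataset(classes, images_per_class, train_fraction, seed=0):
    """Return (train, test) lists of (file_name, label) pairs, split per class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in classes:
        files = [f"{label}_{i:03d}.jpg" for i in range(images_per_class)]
        rng.shuffle(files)
        cut = round(len(files) * train_fraction)
        train += [(name, label) for name in files[:cut]]
        test += [(name, label) for name in files[cut:]]
    return train, test


train_set, test_set = split_dataset(CLASSES, IMAGES_PER_CLASS, TRAIN_FRACTION)
print(len(train_set), len(test_set))  # 400 100
```

Splitting within each class keeps 80 training and 20 test images per class, so no class is under-represented in the test set.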
The average precision (AP) of an object class is determined by the equation:

$$AP = \sum_{n} (r_{n+1} - r_n)\, p_{interp}(r_{n+1}), \qquad p_{interp}(r_{n+1}) = \max_{\tilde{r} \ge r_{n+1}} p(\tilde{r}) \tag{1}$$

where $AP$ is Average Precision; $r_n$ is the n-th classification (recall) value of the object; $p(\tilde{r})$ is the precision at recall $\tilde{r}$ in the series of values of the corresponding class; $p_{interp}(r_{n+1})$ is the interpolation function that takes the largest value in the series at recall levels not lower than $r_{n+1}$.

The mean average precision of the entire model (mAP) is determined by the equation:

$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{2}$$

where $AP_i$ is the Average Precision of the i-th class; $mAP$ is mean Average Precision; $N$ is the number of classes to recognize.

Note: (1) is needed to compute (2). It is evaluated on the series of the class being recognized, for each of the classes available in the model, which in this case are BIM1, BIM2, BIM3, BIM4, and BIM5. The model thus selects the maximum value from the series, which allows us to conclude which class the object in each box (of each cell) matches best and to extract the probability that the box contains a specific class.

After training the YOLOv3 model, we check part of the test material manually to verify that the neural network works correctly. Fig. 2 presents one of the test images, which demonstrates the proper operation of the model by identifying 3 of the available classes, namely: BIM1 – iron reinforcement; BIM2 – worker (man); BIM3 – protective helmet.

The next step is to compare the obtained indicators with the reference ones using the FFNN model, which receives and processes the input data with the ReLU activation function and produces the output data with the Softplus activation function [15]. In this way, multi-label classification of the reference and actual indicators on the construction site is realized.
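The per-class average precision and its mean over classes, which (1) and (2) describe, can be illustrated with a short sketch. It assumes that (1) denotes the standard interpolated AP over a precision-recall series, as used in Pascal VOC-style evaluation; the function names and the toy precision-recall values are hypothetical and are not results of the study.

```python
import numpy as np


def average_precision(recall, precision):
    """Interpolated AP: sum over n of (r_{n+1} - r_n) * p_interp(r_{n+1}).

    `recall` must be sorted in ascending order, with `precision` aligned to it.
    """
    # Sentinel points so the curve starts at recall 0 and ends at recall 1.
    r = np.concatenate(([0.0], np.asarray(recall, dtype=float), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precision, dtype=float), [0.0]))
    # p_interp(r) is the largest precision at any recall >= r
    # (a running maximum taken from the right).
    p_interp = np.maximum.accumulate(p[::-1])[::-1]
    # Sum the rectangle areas between consecutive recall values.
    return float(np.sum((r[1:] - r[:-1]) * p_interp[1:]))


def mean_average_precision(ap_per_class):
    """mAP: arithmetic mean of the per-class AP values."""
    return float(np.mean(ap_per_class))


# Toy precision-recall series for one class (illustrative values only).
toy_recall = [0.2, 0.4, 0.6, 0.8, 1.0]
toy_precision = [1.0, 0.8, 0.9, 0.7, 0.6]
print(round(average_precision(toy_recall, toy_precision), 2))  # 0.82
```

The running maximum replaces the dip to 0.8 at recall 0.4 with the later value 0.9, which is the "largest value in the series" behavior that the interpolation function provides.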
Figure 2: The result of the YOLOv3 model on the example of one of the test materials

For the operation of the FFNN model, reference indicators are defined in advance for the specified classes using the analytic hierarchy process, namely: BIM1 – 70%, BIM2 – 100%, BIM3 – 100%, BIM4 – 91%, BIM5 – 84%. In the last stage, (1) and (2) are used to determine the mAP of the YOLOv3 model, the mAP of the FFNN model, and the overall mAP indicator of the system.

3. Results

The training and validation results for the specified set of classes are shown in Fig. 3. Training continued for 25 epochs, over which both training and testing errors tended to decrease.

Figure 3: Diagram of errors during training and testing

Fig. 4 shows the distribution of the average precision of the classes according to the results of the YOLOv3 model: BIM1 – 81%, BIM2 – 96%, BIM3 – 97%, BIM4 – 90%, and BIM5 – 88%. Thus, according to (2), the mean average precision of the model is mAP = (81% + 96% + 97% + 90% + 88%) / 5 = 90.4%.

Figure 4: Diagram of the distribution of the average precision of classes and the YOLOv3 model

The average precision of the FFNN model was then determined for each class classified by YOLOv3. Table 1 summarizes all the data used to calculate the mean average precision of the system and to evaluate its suitability for approval in the digital twin project.

Table 1
Generalized summary of all output data of the information system

Object number   Class name   AP YOLOv3   Etalon BIM   AP FFNN
1               BIM1         81%         70%          100%
2               BIM2         96%         100%         96%
3               BIM3         97%         100%         97%
4               BIM4         90%         91%          98.9%
5               BIM5         88%         84%          100%

The Etalon BIM was set manually for each class as the accuracy limit required within the project. AP FFNN expresses the compliance of the obtained accuracy results with the Etalon BIM.
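The values in Table 1 can be checked with a few lines of arithmetic. The sketch below is not the FFNN itself: it assumes that AP FFNN equals the ratio of AP YOLOv3 to Etalon BIM capped at 100%, a rule inferred here from the table values rather than stated in the study, and the function and variable names are hypothetical.

```python
# Per-class values from Table 1, in percent.
ap_yolo = {"BIM1": 81.0, "BIM2": 96.0, "BIM3": 97.0, "BIM4": 90.0, "BIM5": 88.0}
etalon = {"BIM1": 70.0, "BIM2": 100.0, "BIM3": 100.0, "BIM4": 91.0, "BIM5": 84.0}


def compliance(ap, reference):
    """Assumed rule: achieved AP relative to the reference, capped at 100%."""
    return min(ap / reference, 1.0) * 100.0


ap_ffnn = {name: compliance(ap_yolo[name], etalon[name]) for name in ap_yolo}

map_yolo = sum(ap_yolo.values()) / len(ap_yolo)
map_ffnn = sum(ap_ffnn.values()) / len(ap_ffnn)
mean_etalon = sum(etalon.values()) / len(etalon)
# Relative margin of the mean compliance over the mean reference value.
relative_margin = (map_ffnn / mean_etalon - 1.0) * 100.0

print(round(map_yolo, 2))         # 90.4
print(round(map_ffnn, 2))         # 98.38
print(round(mean_etalon, 2))      # 89.0
print(round(relative_margin, 2))  # 10.54
```

Under this assumption the capped ratio reproduces every AP FFNN entry of the table (for example, 90 / 91 gives about 98.9% for BIM4), and the relative margin of the mean AP FFNN over the mean Etalon BIM comes out at about 10.54%.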
The mean average precision of the YOLOv3 model (AP YOLOv3) over all recognized classes is 90.4%, and the mean average precision of the FFNN model, computed from the ratio of actual to reference data, is 98.38%. The FFNN value exceeds the mean Etalon BIM (89%) by 10.54% in relative terms, and the YOLOv3 mAP of 90.4% is also above the mean Etalon BIM. The specified objects of the construction site are therefore successfully identified and can be represented in a digital twin. Based on the study, we recommend the developed information system, which combines artificial intelligence systems, Big Data, IoT, and BIM technologies and makes it possible to systematize and evaluate the large set of parameters used in BIM design and other stages of the life cycle of the construction space.

4. Conclusion

The study develops and presents the “Key to Digital Twin” model, which helps to systematize and understand the evolutionary path of BIM and combines two artificial intelligence systems, CNN and FFNN, to recognize arrays of objects in real time and compare them using multi-label classification. The developed software can integrate IoT and Big Data technologies, which make it possible to accumulate and process a large amount of data related to the construction site. The results of the YOLOv3 artificial intelligence system demonstrate its effectiveness in recognizing objects that are characterized in advance as BIM classes during design: the mean average precision (mAP) of this system is 90.4%. The artificial intelligence system based on the FFNN model showed a higher level of agreement with the BIM benchmarks, namely 98.38%.
Therefore, the FFNN result exceeds the mean Etalon BIM by 10.54% in relative terms, which allows us to conclude that the information system works properly and can provide correct data for further coordination and design of digital twins.

5. References

[1] M. Deng, C. Menassa, and V. Kamat, “From BIM to digital twins: A systematic review of the evolution of intelligent building representations in the AEC-FM industry,” Journal of Information Technology in Construction, vol. 26, pp. 58-83, March 2021. https://doi.org/10.36680/j.itcon.2021.005
[2] T. Honcharenko, K. Kyivska, O. Serpinska, V. Savenko, D. Kysliuk, and Y. Orlyk, “Digital transformation of the construction design based on the Building Information Modeling and Internet of Things,” CEUR Workshop Proceedings, ITTAP, vol. 3039, pp. 267-279, November 2021. URL: https://cutt.ly/UCA8A7Z
[3] T. Honcharenko, O. Terentyev, O. Malykhina, I. Druzhynina, and I. Gorbatyuk, “BIM-Concept for Design of Engineering Networks at the Stage of Urban Planning,” International Journal on Advanced Science, Engineering and Information Technology, vol. 11, no. 5, pp. 1728-1735, 2021 [Online]. https://doi.org/10.18517/ijaseit.11.5.13687
[4] R. Akselrod, A. Shpakov, G. Ryzhakova, I. Chupryna, and H. Shpakova, “Integration of Data Flows of the Construction Project Life Cycle to Create a Digital Enterprise Based on Building Information Modeling,” International Journal of Emerging Technology and Advanced Engineering, 2022, 1, pp. 40-50. https://doi.org/10.46338/IJETAE0122_05
[5] H. Wu, L. Liu, Y. Yu, Z. Peng, H. Jiao, and Q. Niu, “An Agent-based Model Simulation of Human Mobility Based on Mobile Phone Data: How Commuting Relates to Congestion,” ISPRS International Journal of Geo-Information, vol. 8, no. 7, p. 313, July 2019. https://doi.org/10.3390/ijgi8070313
[6] S. Tang, D. R. Shelden, C. M. Eastman, P. Pishdad-Bozorgi, and X. Gao, “A review of building information modeling (BIM) and the internet of things (IoT) devices integration: Present status and future trends,” Automation in Construction, vol. 101, pp. 127-139, May 2019. https://doi.org/10.1016/j.autcon.2019.01.020
[7] L. Zhao and S. Li, “Object Detection Algorithm Based on Improved YOLOv3,” Electronics, vol. 9, no. 3, p. 537, March 2020. https://doi.org/10.3390/electronics9030537
[8] G. Sreenu and M. A. Saleem Durai, “Intelligent video surveillance: a review through deep learning techniques for crowd analysis,” J. Big Data, vol. 6, pp. 48-75, June 2019. https://doi.org/10.1186/s40537-019-0212-5
[9] S. Dolhopolov, T. Honcharenko, S. A. Dolhopolova, O. Riabchun, M. Delembovskyi, and O. Omelianenko, “Use of Artificial Intelligence Systems for Determining the Career Guidance of Future University Student,” 2022 IEEE International Conference on Smart Information Systems and Technologies (SIST), pp. 1-6, 2022. To appear.
[10] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv, pp. 1-6, April 2018. https://doi.org/10.48550/arXiv.1804.02767
[11] C. Kumar B., R. Punitha, and Mohana, “YOLOv3 and YOLOv4: Multiple Object Detection for Surveillance Applications,” Third International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 1316-1321, October 2020. https://doi.org/10.1109/icssit48917.2020.9214094
[12] J. R. Macalisang, A. S. Alon, M. F. Jardiniano, D. C. P. Evangelista, J. C. Castro, and M. L. Tria, “Drive-Awake: A YOLOv3 Machine Vision Inference Approach of Eyes Closure for Drowsy Driving Detection,” 2021 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), pp. 1-5, October 2021. https://doi.org/10.1109/IICAIET51634.2021.9573811
[13] R. V. Sevilla, A. S. Alon, M. P. Melegrito, R. C. Reyes, B. M. Bastes, and R. P. Cimagala, “Mask-Vision: A Machine Vision-Based Inference System of Face Mask Detection for Monitoring Health Protocol Safety,” 2021 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), pp. 1-5, October 2021. https://doi.org/10.1109/IICAIET51634.2021.9573664
[14] T. Honcharenko, V. Mihaylenko, Y. Borodavka, E. Dolya, and V. Savenko, “Information tools for project management of the building territory at the stage of urban planning,” CEUR Workshop Proceedings, vol. 2851, pp. 22-33, 2021. URL: https://cutt.ly/bI3fAEo
[15] P. Szymański and T. Kajdanowicz, “Scikit-multilearn: a scikit-based Python environment for performing multi-label classification,” The Journal of Machine Learning Research, vol. 20, no. 6, pp. 209-230, December 2019. URL: https://cutt.ly/vCOGhw