Digital Object Detection of Construction Site Based on Building
Information Modeling and Artificial Intelligence Systems
Denys Chernyshev, Serhii Dolhopolov, Tetyana Honcharenko, Viktor Sapaiev and
Maksym Delembovskyi
    Kyiv National University of Construction and Architecture, 31, Povitroflotsky Avenue, Kyiv, 03037, Ukraine

                 Abstract
                 This study addresses the digital detection of objects on a construction site based on
                 Building Information Modeling (BIM) technology and Artificial Intelligence (AI) systems.
                 To detect, classify and evaluate individual components and objects on a construction site, an
                 artificial intelligence system is proposed that combines Convolutional Neural Network
                 (CNN) and Feed-Forward Neural Network (FFNN) architectures. The authors propose
                 using Internet of Things (IoT) and Big Data technologies for real-time recognition of
                 building objects on a construction site. The effectiveness of identifying a set of data attributes
                 from a BIM model to create a digital twin of a construction site has been confirmed.
                 Object recognition by the CNN model reached a mean average precision of 90.4%,
                 which the FFNN model confirms as consistent with the attributes of the reference BIM
                 model. The results of this study can be used to further develop the concept of creating digital
                 twins of construction objects throughout the entire life cycle and to monitor real estate
                 objects at the operational stage.

                 Keywords
                 Construction site, digital twin, artificial intelligence, Building Information Modeling,
                 Convolutional Neural Network, Feed-Forward Neural Network

1. Introduction
    Nowadays, Building Information Modeling (BIM) technology is used in the construction industry
for a wide range of operations and life cycle processes, including environmental monitoring,
management of facilities and their processes, energy efficiency assessment, and control of
construction processes. The rapid development of information technologies, in particular artificial
intelligence systems, IoT, and Big Data, opens up considerable prospects for BIM technologies,
including the introduction of a digital twin of the construction space.
    This research provides a scientific substantiation for combining several information
technologies, in particular an artificial intelligence system that uses two conceptually different
architectures at different stages: a CNN, which uses the YOLOv3 model to detect a set of objects in
an image in real time, and an FFNN, which performs multi-label classification by comparing the
obtained indicators with the reference indicators of the BIM design. No less important within the
study is the inclusion of IoT and Big Data technologies in the system, which together supply large
arrays of useful data that, in combination with the artificial intelligence system, can produce a
plausible digital twin of the construction site. The developed digital twin can be used to
systematize and assess the quality of all construction life cycles.

ITTAP’2022: 2nd International Workshop on Information Technologies: Theoretical and Applied Problems, November 22–24, 2022,
Ternopil, Ukraine
EMAIL: chernyshev.do@knuba.edu.ua (A. 1); dolhopolov@icloud.com (A. 2); goncharenko.ta@knuba.edu.ua (A. 3);
sapaiev.viktor@gmail.com(A. 4); delembovsky.mm@knuba.edu.ua (A. 5)
ORCID: 0000-0002-1946-9242 (A. 1); 0000-0001-9418-0943 (A. 2); 0000-0003-2577-6916 (A. 3); 0000-0002-7978-7226 (A. 4); 0000-
0002-6543-0701 (A. 5)
                 2021 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)
    The evolution of BIM technologies and of the information technologies that contributed to the
creation of digital twins has been analyzed theoretically by Min Deng, Carol Menassa, and Vineet
Kamat of the University of Michigan. In their work [1], they propose a five-level structure of BIM
evolution, in which the digital twin (level 5) results from the step-by-step transformation of BIM as
a static 3D visualization tool (level 1) through analysis and simulation with BIM models (level 2),
integration of BIM and IoT methods (level 3), and integration of BIM methods and artificial
intelligence (level 4) into a single whole.
    In papers [2]–[4], the authors consider BIM from other key positions, covering the design of
engineering networks and the review of BIM models of different levels across the life cycle of a
construction object. Together, this evolutionary approach and the additional information about BIM
models suggest a digital twin that will not only provide real-time visualization and prediction for
decision-making but also implement automatic feedback support and control of the construction
environment based on optimized results and management strategies.
    A group of Chinese scientists consisting of Hao Wu, Lingbo Liu, Yang Yu, Zhenghong Peng,
Hongzan Jiao, and Qiang Niu studies the effectiveness of Big Data technology in urban studies. The
paper [5] considers a set of simulations using public spatial Big Data, which is identified as the
best source of data for understanding human mobility. These properties of urban Big Data open up
considerable prospects for modeling and forecasting the human factors that affect the efficiency of
construction projects.
    In the article [6], Shu Tang, Dennis R. Shelden, Charles M. Eastman, Pardis Pishdad-Bozorgi, and
Xinghua Gao define the processes and trends in the integration of BIM and IoT, which has
considerable potential for improving efficiency throughout the construction life cycle. Receiving
many real-time data streams from an extensive network of IoT sensors provides opportunities to
increase efficiency in construction operation and monitoring, health and safety management,
logistics, and facilities management.
    Liquan Zhao and Shuaiyang Li work on improving object detection algorithms, in particular with
the help of the YOLOv3 model. In the article [7], they note the considerable popularity of YOLOv3
among other methods of object recognition based on deep learning.
    In the publication [8], G. Sreenu and M. A. Saleem Durai focus on a more detailed study of object
detection in intelligent video surveillance. YOLOv3 can be used in such surveillance to process the
Big Data that enters the system from the numerous video streams broadcast by surveillance cameras.
    Multi-label classification is studied by a team of Ukrainian scientists consisting of
S. Dolhopolov, T. Honcharenko, S. A. Dolhopolova, O. Riabchun, M. Delembovskyi, and
O. Omelianenko. The paper [9] describes a systematic approach to implementing multi-label
classification for problems with a large number of input and output classes. It establishes that each
multi-label classification task has its own architecture, which consists of several parameters and
quality indicators and, at the same time, requires management of a data set with several parameters,
performs regression on several parameters, carries out training on several parameters, and can
represent a space of parameters.
    This study aims to develop an information system that integrates BIM technologies, CNN and
FFNN artificial intelligence systems, and IoT and Big Data technologies, and that allows the creation
of a digital twin of the construction site. To achieve this goal, it is necessary to characterize the
aspects of BIM design that allow real-time data collection by IoT and Big Data for intelligent video
surveillance using the YOLOv3 method, and for multi-label classification of the BIM attributes of the
model of the future digital twin.

2. Main research
    Having analyzed the works of Ukrainian and foreign scientists, we have developed a model
illustrating the natural way to achieve a high-quality digital twin. Fig. 1 shows the developed “Key to
Digital Twin” model.
Figure 1: The “Key to Digital Twin” model

   According to the developed “Key to Digital Twin” model, raw data is collected on the construction
site by the IoT complex and supplemented by Big Data, forming a flow of unprocessed but highly
valuable information. In parallel and asynchronously, specialists from different fields build a BIM
“Reference Model” of the construction object, whose attributes are explicitly or implicitly entered
into the BIM Data Storage and remain on standby until YOLOv3 has classified the actual data from
the construction site. Once both parts are ready, the FFNN model matches the attributes, i.e.
performs multi-label classification of the matches. As a result, the artificial intelligence system
reports, for multiple parameters, the percentage correspondence between the actual results of
construction processes and the state of the construction site, on the one hand, and the reference
object designed with BIM technologies, on the other. Based on these data, a decision is made to
create a digital twin in CAD systems from the defined data. In the future, the developed system can
support the digital twin by processing information in real time, which makes operational
adjustments possible.
    The study uses YOLOv3, a modified version of the general YOLO architecture: a fully
convolutional network of 106 layers capable of recognizing fairly small objects. The main advantage
of YOLOv3 is that it has 3 output layers, each of which filters objects of different sizes [10]. As
part of the object search, the system performs several stages: data acquisition, image selection, data
separation, training, and model output [11]–[14].
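The three output layers can be illustrated with a short shape calculation. This is a sketch under assumptions: the 416x416 input size, the strides 32, 16 and 8, and the 3 anchors per scale are the defaults described in the YOLOv3 paper [10], while the input size actually used in this study is not stated. The function name `yolo_head_shapes` is hypothetical.

```python
# Sketch: output tensor shapes of the three YOLOv3 detection heads.
# Assumption: default YOLOv3 settings from [10] (416x416 input, 3 anchors per scale).

def yolo_head_shapes(input_size, num_classes, anchors_per_scale=3):
    """Return (grid, grid, channels) for each of the three detection scales."""
    strides = (32, 16, 8)  # coarse to fine; finer grids respond to smaller objects
    # Each anchor predicts 4 box coordinates, 1 objectness score and the class scores.
    channels = anchors_per_scale * (4 + 1 + num_classes)
    return [(input_size // s, input_size // s, channels) for s in strides]


shapes = yolo_head_shapes(input_size=416, num_classes=5)
for stride, shape in zip((32, 16, 8), shapes):
    print(f"stride {stride:2d}: {shape}")
```

With 5 classes, each head carries 3 x (4 + 1 + 5) = 30 channels, and the finest grid (52 x 52) is the one that mostly handles small objects such as helmets.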
    To search for a set of BIM objects on the construction site, we form a small test dataset of 5
different classes with 100 images per class. The trained neural network model can thus distinguish
between 5 different objects at the construction site: BIM1 is iron reinforcement; BIM2 is worker
(man); BIM3 is protective helmet; BIM4 is vegetation; BIM5 is earth embankments. In actual BIM
design, there are many more attributes that can be recognized and classified with YOLOv3. The
YOLOv3 model is trained with 80% of the images used for training and 20% used for testing.
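The 80/20 split can be sketched as follows. The file names, the fixed random seed, and the choice to split each class separately (so that every class contributes 80 training and 20 test images) are illustrative assumptions; the source does not say how the split was drawn.

```python
import random

CLASSES = ["BIM1", "BIM2", "BIM3", "BIM4", "BIM5"]
IMAGES_PER_CLASS = 100
TRAIN_FRACTION = 0.8


def split_dataset(seed=0):
    """Split every class 80/20 so each class is equally represented in both sets."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for cls in CLASSES:
        # Hypothetical file layout: one folder per class, numbered images.
        files = [f"{cls}/{i:03d}.jpg" for i in range(IMAGES_PER_CLASS)]
        rng.shuffle(files)
        cut = round(TRAIN_FRACTION * len(files))
        train.extend(files[:cut])
        test.extend(files[cut:])
    return train, test


train_files, test_files = split_dataset()
print(len(train_files), len(test_files))
```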
    The following libraries and software tools were used to test the proposed approach: python >= 3.6,
numpy, torch >= 1.0.0, opencv, json, keras, pandas, matplotlib, tensorboardX, CUDA.
    To verify that the YOLOv3 model has been trained correctly, the average precision per object and
the average precision of the entire model must be determined. The average precision (AP) of an
object is determined by the equation:

\[
AP = \sum_{i} (r_{i+1} - r_i) \, p_{interp}(r_{i+1}), \qquad
p_{interp}(r_{i+1}) = \max_{\tilde{r} \ge r_{i+1}} p(\tilde{r}), \tag{1}
\]

where $AP$ is Average Precision; $r_i$ is the i-object classification value; $r_{i+1}$ is the
definition of the time series of the corresponding class; $p_{interp}(r_{i+1})$ is the interpolation
function that determines the largest value in the time series.
   The value of the average precision of the entire model (mAP) is determined by the equation:

\[
mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i, \tag{2}
\]

where $AP_i$ is the Average Precision of the i-th class; $mAP$ is mean Average Precision; $N$ is the
number of defined classes to recognize.
    Note: (1) is necessary to derive (2). In the general case, (1) is evaluated over the time series of
the class being recognized among all classes available in the model, in this case BIM1, BIM2, BIM3,
BIM4, and BIM5. The model selects the maximum value from the time series, which identifies the
class that each box (of each cell) matches most closely and yields the probability that the box
contains that class.
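A minimal sketch of the AP and mAP calculations, assuming (1) is the standard interpolated average precision, AP = Σ (r_{i+1} − r_i) · p_interp(r_{i+1}) with p_interp taking the largest precision at any later point of the series, and (2) is the plain mean over classes. The function names and the toy input values are illustrative and are not taken from the study.

```python
def average_precision(recalls, precisions):
    """Interpolated AP for one class; `recalls` must be sorted in ascending order."""
    # Sentinel points so the curve starts at recall 0 and ends at recall 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # p_interp: replace each precision by the largest precision to its right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum of (r_{i+1} - r_i) * p_interp(r_{i+1}).
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """mAP: arithmetic mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Toy values for two hypothetical classes.
ap_a = average_precision([0.5, 1.0], [1.0, 0.5])
ap_b = average_precision([0.25, 0.5, 0.75], [1.0, 0.5, 0.6])
print(ap_a, ap_b, mean_average_precision([ap_a, ap_b]))
```

In the second toy class the dip to 0.5 is replaced by 0.6 during interpolation, which is the "largest value in the series" behavior that the note describes.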
    After training the YOLOv3 model, we manually check part of the test material to verify that the
neural network works correctly. Fig. 2 presents one of the test images, in which the model
correctly identifies 3 of the available classes, namely: BIM1 – iron reinforcement; BIM2 – worker
(man); BIM3 – protective helmet.
    The next step is to compare the obtained indicators with the reference ones using the FFNN model,
which processes the input data using the ReLU activation function and produces the output data
using the Softplus activation function [15]. In this way, multi-label classification of the reference
and actual indicators on the construction site is realized.
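The activation scheme can be illustrated with a minimal forward pass in plain Python. The layer sizes, weights, and input values below are placeholders chosen only to make the example run; the source does not report the FFNN's dimensions or trained parameters, and the study itself lists torch and keras rather than hand-written layers.

```python
import math


def relu(x):
    """ReLU: pass positive values, zero out negative ones."""
    return max(0.0, x)


def softplus(x):
    """Softplus, log(1 + exp(x)), written in a numerically stable form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def ffnn_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Hidden layer with ReLU, output layer with Softplus."""
    hidden = dense(x, hidden_w, hidden_b, relu)
    return dense(hidden, out_w, out_b, softplus)


# Placeholder parameters (not trained values).
hidden_w = [[1.0, -1.0], [-1.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 1.0]]
out_b = [0.0]

# Example input: an actual indicator and a reference indicator for one class.
output = ffnn_forward([0.81, 0.70], hidden_w, hidden_b, out_w, out_b)
print(output)
```

Softplus keeps every output strictly positive, which suits indicators that are read as non-negative scores.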
Figure 2: The result of the YOLOv3 model on the example of one of the test materials

   For the operation of the FFNN model, reference indicators are pre-defined for the specified
classes using the analytic hierarchy process, namely: BIM1 – 70%, BIM2 – 100%, BIM3 – 100%,
BIM4 – 91%, BIM5 – 84%.
   In the last stage, with the help of (1) and (2), the mAP of the YOLOv3 model and the FFNN
model, and the overall mAP indicator of the system are determined.

3. Results
   The training and validation results for the specified set of classes are shown in Fig. 3. Training
continued for 25 epochs, over which both training and testing errors tended to decrease.


Figure 3: Diagram of errors during training and testing

  Fig. 4 shows the distribution of the average precision of the classes according to the results of
the YOLOv3 model: BIM1 – 81%, BIM2 – 96%, BIM3 – 97%, BIM4 – 90%, and BIM5 – 88%. Thus,
according to (2), the mean average precision of the model is:

\[
mAP = \frac{81\% + 96\% + 97\% + 90\% + 88\%}{5} = 90.4\%.
\]
Figure 4: Diagram of the distribution of the average precision of classes and the YOLOv3 model

   According to the results of the study, the average precision of the FFNN model was determined
for each class classified by YOLOv3. Table 1 presents a generalized set of all the data used in
calculating the mean average precision of the system and in evaluating its suitability for approval
in the project of a digital twin.

Table 1
Generalized summary of all output data of the information system
  Object number         Class name             AP YOLOv3         Etalon BIM            AP FFNN
         1                  BIM1                  81%                70%                100%
         2                  BIM2                  96%               100%                 96%
         3                  BIM3                  97%               100%                 97%
         4                  BIM4                  90%                91%                98.9%
         5                  BIM5                  88%                84%                100%

   The Etalon BIM was set manually for each class as the accuracy limit required within the
project. AP FFNN expresses how well the obtained accuracy results comply with the Etalon BIM.
   The mean average precision of the YOLOv3 model (AP YOLOv3) over all recognized classes is
90.4%, and the mean average precision of the FFNN model, based on the ratio of actual to reference
data, is 98.38%. According to the obtained results, the mean average precision of the FFNN
indicates that the mean average precision of YOLOv3 exceeds the mean Etalon BIM by 10.54%, so
the specified objects of the construction site are successfully identified and can be represented in
a digital twin.
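The figures in Table 1 can be reproduced with a few lines of arithmetic. The compliance rule used here, the ratio of AP YOLOv3 to Etalon BIM capped at 100%, is an assumption inferred from the table values and not a formula stated in the text; the trained FFNN itself is not modeled. Under the same assumption, the 10.54% figure coincides with the ratio of the mean AP FFNN (98.38%) to the mean Etalon BIM (89%).

```python
# Values from Table 1, in percent.
AP_YOLO = {"BIM1": 81, "BIM2": 96, "BIM3": 97, "BIM4": 90, "BIM5": 88}
ETALON = {"BIM1": 70, "BIM2": 100, "BIM3": 100, "BIM4": 91, "BIM5": 84}


def compliance(ap, etalon):
    """Assumed rule: actual-to-reference ratio, capped at 100%."""
    return min(ap / etalon, 1.0) * 100.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


ap_ffnn = {cls: compliance(AP_YOLO[cls], ETALON[cls]) for cls in AP_YOLO}
map_yolo = mean(AP_YOLO.values())
map_ffnn = mean(ap_ffnn.values())
mean_etalon = mean(ETALON.values())
relative_excess = (map_ffnn / mean_etalon - 1.0) * 100.0

for cls, value in ap_ffnn.items():
    print(f"{cls}: {value:.1f}%")
print(f"mAP YOLOv3 = {map_yolo:.1f}%, mean AP FFNN = {map_ffnn:.2f}%")
print(f"mean Etalon BIM = {mean_etalon:.1f}%, relative excess = {relative_excess:.2f}%")
```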
   Based on this study, we recommend the developed information system, which combines artificial
intelligence systems, Big Data, IoT, and BIM technologies and makes it possible to systematize and
evaluate the large set of parameters used in BIM design and at other stages of the life cycle of the
construction space.

4. Conclusion
   As part of the study, the “Key to Digital Twin” model was developed and presented. It helps to
systematize and understand the evolutionary path of BIM and combines two artificial intelligence
systems, CNN and FFNN, which recognize arrays of objects in real time and compare them using
multi-label classification. The developed software can integrate recent information technologies,
namely IoT and Big Data, which make it possible to accumulate and process a large amount of data
related to the construction site. The results of the YOLOv3 artificial intelligence system establish
its effectiveness in recognizing objects that are pre-characterized as BIM classes during design: the
mean average precision (mAP) of this system is 90.4%. At the same time, the artificial intelligence
system based on the FFNN model provided more accurate indicators against the BIM benchmarks,
namely 98.38%. Therefore, according to the obtained results, the mean average precision of the
FFNN indicates that the mean average precision of YOLOv3 exceeds the mean Etalon BIM by
10.54%, which allows us to conclude that the information system works properly and can provide
correct data for further coordination and design of digital twins.

5. References
[1] M. Deng, C. Menassa, and V. Kamat, “From BIM to digital twins: A systematic review of the
     evolution of intelligent building representations in the AEC-FM industry,” Journal of
     Information Technology in Construction, vol. 26, pp. 58-83, March 2021.
     https://doi.org/10.36680/j.itcon.2021.005
[2] T. Honcharenko, K. Kyivska, O. Serpinska, V. Savenko, D. Kysliuk, and Y. Orlyk, “Digital
     transformation of the construction design based on the Building Information Modeling and
     Internet of Things,” CEUR Workshop Proceedings, ITTAP, vol. 3039, pp. 267–279, November
     2021. URL: https://cutt.ly/UCA8A7Z
[3] T. Honcharenko, O. Terentyev, O. Malykhina, I. Druzhynina, and I. Gorbatyuk, “BIM-Concept
     for Design of Engineering Networks at the Stage of Urban Planning,” International Journal on
     Advanced Science, Engineering and Information Technology, vol. 11, no. 5, pp. 1728-1735,
     2021 [Online]. https://doi.org/10.18517/ijaseit.11.5.13687
[4] R. Akselrod, A. Shpakov, G. Ryzhakova, I. Chupryna, and H. Shpakova, “Integration of Data
     Flows of the Construction Project Life Cycle to Create a Digital Enterprise Based on Building
     Information Modeling,” International Journal of Emerging Technology and Advanced
     Engineering, no. 1, pp. 40-50, 2022. https://doi.org/10.46338/IJETAE0122_05
[5] H. Wu, L. Liu, Y. Yu, Z. Peng, H. Jiao, and Q. Niu, “An Agent-based Model
     Simulation of Human Mobility Based on Mobile Phone Data: How Commuting Relates to
     Congestion,” ISPRS International Journal of Geo-Information, vol. 8, no. 7, p. 313, July 2019.
     https://doi.org/10.3390/ijgi8070313
[6] S. Tang, D. R. Shelden, C. M. Eastman, P. Pishdad-Bozorgi, and X. Gao, “A review of building
     information modeling (BIM) and the internet of things (IoT) devices integration: Present status
     and future trends,” Automation in Construction, vol. 101, pp. 127-139, May 2019.
     https://doi.org/10.1016/j.autcon.2019.01.020
[7] L. Zhao and S. Li, “Object Detection Algorithm Based on Improved YOLOv3,”
     Electronics, vol. 9, no. 3, p. 537, March 2020. https://doi.org/10.3390/electronics9030537
[8] G. Sreenu, and M. A. Saleem Durai, “Intelligent video surveillance: a review through deep
     learning techniques for crowd analysis,” J. Big Data, vol. 6, pp. 48-75, June 2019.
     https://doi.org/10.1186/s40537-019-0212-5
[9] S. Dolhopolov, T. Honcharenko, S. A. Dolhopolova, O. Riabchun, M. Delembovskyi, and O.
     Omelianenko, “Use of Artificial Intelligence Systems for Determining the Career Guidance of
     Future University Student,” 2022 IEEE International Conference on Smart Information Systems
     and Technologies (SIST), pp. 1-6, 2022. To appear.
[10] J. Redmon, and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv, pp. 1-6, April
     2018. https://doi.org/10.48550/arXiv.1804.02767
[11] C. Kumar B., R. Punitha, and Mohana, “YOLOv3 and YOLOv4: Multiple Object Detection for
     Surveillance Applications,” Third International Conference on Smart Systems and Inventive
     Technology (ICSSIT), pp. 1316-1321, October 2020.
     https://doi.org/10.1109/icssit48917.2020.9214094
[12] J. R. Macalisang, A. S. Alon, M. F. Jardiniano, D. C. P. Evangelista, J. C. Castro, and M. L. Tria,
     “Drive-Awake: A YOLOv3 Machine Vision Inference Approach of Eyes Closure for Drowsy
     Driving Detection,” 2021 IEEE International Conference on Artificial Intelligence in
     Engineering and Technology (IICAIET), pp. 1-5, October 2021.
     https://doi.org/10.1109/IICAIET51634.2021.9573811
[13] R. V. Sevilla, A. S. Alon, M. P. Melegrito, R. C. Reyes, B. M. Bastes, and R. P. Cimagala,
     “Mask-Vision: A Machine Vision-Based Inference System of Face Mask Detection for
     Monitoring Health Protocol Safety,” 2021 IEEE International Conference on Artificial
     Intelligence in Engineering and Technology (IICAIET), pp. 1-5, October 2021.
     https://doi.org/10.1109/IICAIET51634.2021.9573664
[14] T. Honcharenko, V. Mihaylenko, Y. Borodavka, E. Dolya, and V. Savenko, “Information tools
     for project management of the building territory at the stage of urban planning,” CEUR
     Workshop Proceedings, 2851, pp. 22-33, 2021. URL: https://cutt.ly/bI3fAEo
[15] P. Szymański, and T. Kajdanowicz, “Scikit-multilearn: a scikit-based Python environment for
     performing multi-label classification,” The Journal of Machine Learning Research, vol. 20, no. 6,
     pp. 209-230, December 2019. URL: https://cutt.ly/vCOGhw