Design of a Data Warehouse for a Dynamic Greenhouse Control System Nikolay Kiktev 1,2, Maryna Lendiel 1 and Taras Lendiel 1 1 National University of Life and Environmental Sciences of Ukraine, Heroiv Oborony str., Kyiv, 03041, Ukraine 2 Taras Shevchenko National University of Kyiv, 64/13, Volodymyrs’ka str., Kyiv, 01601, Ukraine Abstract In the process of developing the system, the system was modeled using the time series algorithm and the structure of the dynamic database was developed. Data input, storage and analysis modules were developed. During the analysis, the application of OLAP and Data Mining technologies was proposed for intellectual analysis of large volumes of information. The obtained results of the system can be used in the process of forming management decisions of the greenhouse economy. This will allow you to direct the strategy for managing individual business processes in such a way as to increase the yield in greenhouses and, accordingly, the profitability of the farm in general. Keywords 1 dynamic system; greenhouse; database, monitoring, Data Mining, data storage, multidimensional cube. 1. Introduction During cultivation, an important stage is the study and analysis of all the conditions that are necessary for the normal growth and development of the plant. In the process of plant growth, it is important to take into account the optimal indicators of the microclimate of the greenhouse to increase the efficiency of the use of resources for crop cultivation [1]. Looking at the mentioned features, it is useful to use software tools for monitoring, saving and analyzing indicators that are important for increasing the efficiency of cultivation. Such developed systems provide constant control of indicators in structures of closed soil, reporting on the current state in real time and conducting analysis based on available data. Based on all available data in the system, the manufacturer can analyze all key indicators, their changes and impact over time and make appropriate decisions for their enterprise. However, the created systems expand over time, and accordingly, the information in them too, so it becomes inconvenient and inefficient to analyze previously entered data. In this case, there is a need to create a system that will analyze indicators based on accumulated data. Therefore, it is proposed to carry out analysis using OLAP and Data Mining technologies [2]. The purpose of the research is the implementation of a data warehouse using OLAP and Data Mining technologies to increase the efficiency of growing vegetables and fruits in closed soil structures. 2. Literature review. The problem of creating a dynamic data warehouse, including for storing and processing measurement results from sensors in a greenhouse, has been addressed by many researchers in Ukraine and around the world. The article by M. Brazhenenko et al. [3] examines trends in the integration of modern CAD systems into enterprise integration, outlines the consequences of the industry-wide transition of CAD systems Dynamical System Modeling and Stability Investigation (DSMSI-2023), December 19-21, 2023, Kyiv, Ukraine EMAIL: nkiktev@ukr.net (N. Kiktev); marynalendel@gmail.com (M. Lendiel); taraslendel@gmail.com (T. Lendiel) ORCID: 0000-0001-7682-280X (N. Kiktev); 0000-0002-6356-1230 (T. Lendiel) ©️ 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) CEUR Workshop ceur-ws.org ISSN 1613-0073 69 Proceedings to the cloud and the creation of a roadmap for cross-cloud integration of enterprise systems. Building redundant data warehouses is one of the possible ways to increase the reliability of corporate software systems, productivity and reduce the cost of maintaining software systems. The article by V. A. Porev and G. V. Porev [4] presents the results of an experimental determination of the lower limit of the temperature range of a television pyrometer. The study by D. Berestov et al. [5] analyzes the use of big data in agriculture based on Internet of Things (IoT) technology. An IoT platform for collecting information about agricultural land is proposed. The latest addition is IoT connected sensors installed by individual farmers in their fields. All of this provides enough input to teach algorithms to discern cloud patterns, recognize the effects of minute changes in cloud temperature and humidity, and identify potential hazards based on changes in wind direction that weather fronts from other areas may cause. The three main technologies that will contribute to the development of intelligent weather monitoring in agriculture are smart IoT sensors for data collection and analysis, satellites and weather stations, as well as artificial intelligence systems and weather forecasting learning machines. Mexican authors Méndez-Guzmán, H.A. et al. [6] developed a monitoring system in an aeroponics greenhouse based on the Internet of Things. The system provides the greenhouse manager with the information necessary to make decisions in search of the optimal harvest, namely the state of climate variables and the appearance of the crop. The system also controls the timing of watering and the frequency of visual inspection using a developed application for Android mobile devices called Aeroponics Monitor. At the cloud level, One of the database servers analyzes information about the variables monitored in the greenhouse using its IoT analytical tools to create historical data and visualize their behavior, as well as analyze the operating state of the system [6]. Another server is used as a database to store the results of processing images taken in the fog layer to observe leaves and roots. The Aeroponics Monitor contains eight main windows: the main window [6], which allows you to quickly view the status of environmental variables of the cultural system, access through the menu section to reports, records, configurations, manual control, manual control. measurements, manual adjustments and image analysis. Greek researchers Elvanidi, A. and Katsoulas, N. [7] solved the problem of crop stress detection based on machine learning in greenhouses. To record data on physiological parameters and microclimate, a multi-sensor tower was built, which consists of two air temperature sensors Thygro SDI-12, Symmetron and relative humidity Thygro SDI-12, solar radiation sensor SP-SS, leaf temperature sensors - T-type thermocouples and a sensor PRI type SRS-PRI. The authors evaluated two different algorithms: Gradient Boosting (GB) and MultiLayer Perceptron (MLP). Both models can be incorporated into existing greenhouse climate monitoring and control systems, replacing human expertise in detecting stress conditions for greenhouse crops. Researchers from Cranfield University (UK) Faniyi, B. and Luo, Z. A [8] performed physical modeling and control of air temperature in a greenhouse system using Internet of Things technology. The authors developed a PID algorithm in the Arduino IDE and, after modifying it using the Arduino Create web editor, uploaded the sketch to the Arduino microcontroller directly from the Internet. The model was validated using measured data from a cloud-based IoT management system platform deployed in a greenhouse. The modified model fitted using the optimization-based model fitting method had a maximum deviation of 2 °C between the simulated and actual measured indoor air temperature. Internet of Things approaches for monitoring and managing smart greenhouses in Industry 4.0 were summarized in a review by Italian researchers Bersani, C. et al. [9]. Monitoring and control of hydroponic systems based on the Internet of Things are described in an article by Cypriot researchers Tatas, K. et al. [10]. The system is based on three types of sensor nodes: the main one is responsible for controlling the pump, monitoring water quality in the greenhouse, aggregating and transmitting data from slave nodes. Slave nodes monitor environmental conditions in the greenhouse and transmit data to the master node. The system monitors water quality, temperature and humidity in the greenhouse, ensuring crops are growing in optimal conditions according to hydroponic guidelines. Remote monitoring for greenhouse owners is made easier by tracking these parameters through a website connection. The system measures the quality of circulating water using four sensors: temperature, pH, electrical conductivity (EC) and dissolved oxygen (DO). Sensors from Atlas Scientific were used, and Arduino Mega 2560 was used as the SCU processor [10]. The ubidots platform was used to load the data, and 70 the application was built using an API to transfer and display data in .csv format, as well as using javascript, css and html. The user can control several greenhouses at once. It provides a dashboard for monitoring water quality and a sidebar for alerts and alerts such as sensor and SD card errors. The user can create graphs of monitored parameters. Greek authors C. Maraveas and T. Bartzanas [11] compiled a review regarding the application of the Internet of Things (IoT) to optimize greenhouse environments. The article by Italian scientists Pisanu, T. et al. [12] presents the development of an electronic platform for monitoring the greenhouse environment in real time: a view of agriculture 4.0. The authors designed and implemented a prototype of an electronic platform for environmental monitoring in a greenhouse. The electronic board consists of main board, Green House core, Wi-Fi module, RS485 module, A/D converter module and USB module. The system allows you to collect data using external sensors, process and send it to external devices: laptop, smartphone and Internet gateway using a wired and wireless connection. These data relate to the main greenhouse environmental parameters: air temperature, humidity, solar radiation, air speed and CO2 concentration. A web application has been implemented that allows users to obtain information about the state of the environment in greenhouses A web application (a client-server computer program that the client runs in a web browser) is developed using a Java servlet and Java server pages (JSP). Java Servlet, it allows users to synchronize the board over the Internet, receive information about the data collected by the board, store this data in a special online database and display it through a web browser. The article by D. Khort et al. [13] presents a control system for agricultural technology in horticulture with an automated meteorological complex. The server module is the core of the system, containing all the business logic of the software package and the data visualization system. The weather module is a complex consisting of weather sensors and a GSM modem that provides remote transmission of data from the sensors to the server. The mobile application collects data about gardens with photographic recording and location of objects and transfers them to the server. The system proposed by the authors provides rapid processing of information flows that determine the characteristics of the growth and condition of plants in critical phases of their development, from modern recording instruments (weather stations, samplers, analyzers). Based on the reviewed articles, we can conclude that there remains an undisclosed section on organizing a data warehouse for storing information obtained from sensors of greenhouse parameters. This is what our research will be devoted to. 3. Research materials and methods. To take into account the peculiarities of the change in air temperature in the greenhouse, a mathematical model was refined that took into account the temperature change and made it possible to calculate the air temperature depending on the influence of external disturbances [14]. At the same time, the space of the greenhouse is conditionally divided into temperature blocks according to the width of the greenhouse, taking into account the design features of the block greenhouse. It was assumed that each block affects the temperature balance of the greenhouse, and the amount of heat for the i-th block Qi depends on the amount of heat Qi+1 and Qi-1, given or received from neighboring blocks (i+1, i-1). Thus, you can write: Q Q Q Q Q Q Q (1) i t ,і s ,і sr,i v,i i 1 i 1 where i = 1...n; n is number of blocks; Qt is the amount of heat coming from the heating system; Qs is amount of heat received from the sun, J; Qv is heat loss through the greenhouse roof and end walls, J; Qі+1, Q і-1 – amount of heat coming from neighboring temperature blocks, J; Qsr,i is the amount of heat absorbed by plants in the i-th block, J. Taking into account the geometric dimensions of the temperature blocks of the greenhouse, its thermophysical characteristics (heat transfer coefficients from water to the pipe wall, pipe wall to the air of the greenhouse, from the air of the greenhouse to the glass of the greenhouse wall, glass to the air 71 of the external environment) the heat balance equation for the ith block of the greenhouse will have appearance: ) (2) 644341 ⋅ 𝛩𝑖 = 1657,92 ⋅ (𝛩𝑤,𝑖 − 𝛩𝑖 ) + 3 ⋅ 𝑆𝑘,і + 4 ⋅ (𝑆𝑏,𝑖 − 𝑆𝑘,𝑖 )(𝛩𝑖 − 20 + 0,026 ⋅ 𝑆𝑖−1,𝑖 (𝛩𝑖−1 − 𝛩𝑖 ) + 0,026 ⋅ 𝑆𝑖+1,𝑖 (𝛩𝑖+1 − 𝛩𝑖 ) − 3,3 ⋅ 𝑆𝑘,і .) where Θi is the air temperature in the i-th block; Θ w,i is the temperature of the coolant in the i-th unit; Sb,i is the area of the side surface in the i-th block; S k,i is roof area in the i-th block. After creating a mathematical model of the greenhouse based on the contour of the air temperature, the data storage for the decision support system was created. For a qualitative analysis of the subject area and requirements in the design process, a precedent diagram and a deployment diagram were created to display the architecture of the decision support system (Fig. 1). Greenhouse Database server workstation Data storage server 1 1 Data entry module Database of greenhouses 1 1..* Data storage 1..* 1..* Temperature 1..* sensor Control unit Analyst workstation 1 ADC Data analytics module Humidity sensor 1..* Figure 1: System architecture The main physical nodes in the system are the greenhouse workstation, the database server, the data warehouse server, and the analytics workstation. 4. Research results. For the greenhouse workstation, a monitoring subsystem was developed that works with data in an operational database, as well as hardware using temperature and humidity sensors that are connected to a Raspberry Pi single-board computer. A DHT22 air temperature and humidity sensor was used as a sensing element for measurement. The monitoring subsystem is implemented using the Python programming language, which ensures the transfer of data on the measured temperature and humidity indicators to the operational database. The mnemonic diagram of the subsystem for monitoring technological parameters of cultivation is shown in Fig. 2. A data warehouse server and an analyst workstation are used to perform data analysis. The data warehouse will allow analysis in different sections of the input data. The signified is provided by the presence of measurements, which are a set of reference information about the measured event. In the 72 context of data warehouses, events are facts that describe the results of a certain business process. The structure of the data warehouse is shown in Fig. 3. The developed data warehouse was used to deploy a multidimensional cube using SQL Server Analysis Services (SSAS). For direct data analysis, filling of the data storage (SD) was implemented, which takes place on the basis of data from the operational database. The data transfer process was implemented using SQL Server Integration Services. Figure 2: Mnemonic diagram of the monitoring subsystem of technological parameters of cultivation A single-chamber Raspberry Pi computer is running on the basis of the pre-installed Raspbian operating system. Raspbian OS is a free operating system based on the Debian platform, optimized for Raspberry hardware, which has a wide range of basic programs and utilities. According to the given connection of the receiving element, the Raspberry Pi single-chamber computer polls the information provider and forms the measured data into a specified format for recording in the database. Data exchange between the Raspberry Pi and the DHT22 information transmitter is performed using a serial data transmission protocol - one and zero are coded by different pulse lengths. The data consists of a decimal and an integer part. A full data transfer is 40 bits, and the DHT22 sends the higher data bit first. Raspberry Pi acts as a bus master and is responsible for initiating communication (reading data). The DHT22 is always the slave. When transferring data from the DHT 22 to the Raspberry Pi, the humidity is transferred, followed by the temperature and the checksum. The checksum is 8 bits and contains the sum of all temperature and humidity data in case of successful data transfer. After collecting data, the DHT22 goes into low power mode until it receives a start signal from the controller again. In the process of analyzing the received data, the key performance indicator was calculated [15, 16], which was implemented using the MDX query language for accessing multidimensional data structures (Fig. 4). Based on the calculated data, it can be concluded that the average temperature and humidity by varieties is within the normal range, but has decreased compared to last year. Such results give reason to check the microclimate support system in greenhouses and review the microclimate regulation process. In such cases, incorrect operation of some executive mechanisms or unsatisfactory condition of greenhouses is possible. If this trend continues in the future, it will affect the future yield, considering 73 that the recorded indicators may go beyond the optimal limits. This condition will affect the growth of plants grown on the farm. The SQL Server Data Tools-Business Intelligence (SSDT-BI) tool was used for intellectual analysis, which includes technologies for business analysis: creation of Analysis Services (AS) data models, Integration Services (IS) packages, and Reporting Services (RS) reports. Using the expanded structure [17], a network of dependences of the name of the crop and variety on the given ranges of temperature and humidity was built (Fig. 5). Figure 3: Showcase of the greenhouse economy Figure 4: Results of the calculation of the key performance indicator 74 Figure 5: The network of dependencies based on the implemented algorithm of associative rules 75 Expanding the structure of the analysis by the clustering algorithm, the model shown in Fig. 6 was obtained. As a result of the expanded structure, it can be concluded that the data is homogeneous and ordered. This is especially noticeable on the diagram of the network of connections, the clusters are almost not connected to each other. Since most of the data is concentrated in cluster 3, the average value of the maximum humidity is within 85%, the maximum temperature is about 24.5℃. The minimum humidity is about 70% and the minimum temperature is about 18℃, which is most suitable for growing pepper culture. The output of the results of the system using the time series algorithm is shown in Fig. 7. Based on the graph obtained, it can be seen that starting from the beginning of 2021, the total yield in greenhouses has decreased. For example, in the winter of 2021, the yield reached its lowest value, but the predicted value is significantly different from the actual one. Figure 6: Clustering of existing facts in the data warehouse Figure 7: Formed graph using the time series algorithm 5. Discussion and prospects This work is a development of the project described in article [18], which describes a hardware and software implementation of an information management system applicable to various biotechnical 76 objects (including greenhouses) based on an integrated Arduino board and the LabView visual programming environment. In addition to reading information from the sensors, an operator interface in the form of a web page was developed and the measured values were recorded in a database for further processing. Data is stored on a storage device in the form of tables unified with data processing programs, or in the cloud with the ability to remotely control technological processes. The program provides targeted polling of sensors, which makes it possible to evaluate changes in the controlled parameter at the location of the sensor. This project can be used in robotic systems in greenhouses and gardens [19-21] for storing dynamic information and remotely controlling the robot. In our further research, we plan to develop the project in the following directions: assess the quality of the data model, assess the risks of information security of the control system, optimize big data, develop correlations between the time of ripening and storage of fruits and temperature. The definition of data model quality indicators for evaluating the Data Vault 2.0 model is shown in [22]. An assessment of the information security risks of an information system can be carried out using the methodology described in article [23]. A model for increasing the execution time of unstructured big data storage, including dynamic data, is described in article [24]. Mathematical models of the time correlation coefficients of temperature and storage time in various planes, as well as an analysis of the effect of changing storage conditions on temperature correlation are described in article [25]. 6. Conclusions In the process of developing the system, the structure of the operational database was presented. Data input, storage and analysis modules were developed. During the analysis, the application of OLAP and Data Mining technologies was proposed for intellectual analysis of large volumes of information. The obtained results of the system can be used in the process of forming management decisions of the greenhouse economy. This will allow you to direct the strategy for managing individual business processes in such a way as to increase the yield in greenhouses and, accordingly, the profitability of the farm in general. 7. References [1] V. Lysenko, I. Bolbot, T. Lendiel, V. Koval and I. Nakonechnyy, "Genetic Algorithm in Optimization Problems for Greenhouse Facilities," 2022 IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT), Lviv, Ukraine, 2022, pp. 185-188, doi: 10.1109/CSIT56902.2022.10000750. [2] Silberschatz, Abraham; Korth, Henry F.; Sudarshan, S. (2011). Database system concepts (V. 6). New York: McGraw-Hill. ISBN 978-0-07-352332-3. OCLC 436031093. [3] M. Brazhenenko, V. Petrivskyi, O. Bychkov, I. Sinitcyn and V. Shevchenko, "Enabling Big Data Query with Modern CAD Systems Redundant Data Stores," 2021 IEEE 16th International Conference on the Experience of Designing and Application of CAD Systems (CADSM), Lviv, Ukraine, 2021, pp. 5-9, doi: 10.1109/CADSM52681.2021.9385265 [4] V. A. Porev and G. V. Porev , "Experimental determination of the temperature range of a television pyrometer," J. Opt. Technol. 71, 62-62 (2004) [5] Berestov, D., Kurchenko, O., Zubyk, L., Kulibaba, S., Mazur, N. Assessment of Weather Risks for Agriculture using Big Data and Industrial Internet of Things Technologies. CEUR Workshop Proceedings, 2023, vol. 3550, pp. 1–13 [6] Méndez-Guzmán, H.A.; Padilla-Medina, J.A.; Martínez-Nolasco, C.; Martinez-Nolasco, J.J.; Barranco-Gutiérrez, A.I.; Contreras-Medina, L.M.; Leon-Rodriguez, M. IoT-Based Monitoring System Applied to Aeroponics Greenhouse. Sensors 2022, 22, 5646. https://doi.org/10.3390/s22155646 [7] Elvanidi, A.; Katsoulas, N. Machine Learning-Based Crop Stress Detection in Greenhouses. Plants 2023, 12, 52. https://doi.org/10.3390/plants12010052 77 [8] Faniyi, B.; Luo, Z. A Physics-Based Modelling and Control of Greenhouse System Air Temperature Aided by IoT Technology. Energies 2023, 16, 2708. https://doi.org/10.3390/en16062708 [9] Bersani, C.; Ruggiero, C.; Sacile, R.; Soussi, A.; Zero, E. Internet of Things Approaches for Monitoring and Control of Smart Greenhouses in Industry 4.0. Energies 2022, 15, 3834. https://doi.org/10.3390/en15103834 [10] Tatas, K.; Al-Zoubi, A.; Christofides, N.; Zannettis, C.; Chrysostomou, M.; Panteli, S.; Antoniou, A. Reliable IoT-Based Monitoring and Control of Hydroponic Systems. Technologies 2022, 10, 26. https://doi.org/10.3390/technologies10010026 [11] Maraveas, C.; Bartzanas, T. Application of Internet of Things (IoT) for Optimized Greenhouse Environments. AgriEngineering 2021, 3, 954-970. https://doi.org/10.3390/agriengineering3040060 [12] Pisanu, T.; Garau, S.; Ortu, P.; Schirru, L.; Macciò, C. Prototype of a Low-Cost Electronic Platform for Real Time Greenhouse Environment Monitoring: An Agriculture 4.0 Perspective. Electronics 2020, 9, 726. https://doi.org/10.3390/electronics9050726 [13] D. Khort, I. Smirnov, A. Kutyrev. Development of an automated weather complex for managing agricultural technologies in horticulture. E3S Web of Conferences 193, 01049 (2020) ICMTMTE 2020. https://doi.org/10.1051/e3sconf/202019301049 [14] V. Lysenko, T. Lendiel and D. Komarchuk, "Phytomonitoring in a Greenhouse Based on Arduino Hardware," 2018 International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PIC S&T), Kharkiv, Ukraine, 2018, pp. 365-368, doi: 10.1109/INFOCOMMST.2018.8632030. [15] S. V. Tsiucsyura, Key performance indicators. Principles of development of key indicators for the budget sphere / S. V. Tsyuksyura, O Kryvoruchko, V. V., M. I. Tsyuksyura // Management of the development of complex systems. - 2012. - Issue 10. - P. 87-91. - Access mode:http://nbuv.gov.ua/UJRN/Urss_2012_10_17. (In Ukranian) [16] Key Performance Indicators (KPIs) in Multidimensional Models [Електронний ресурс] – Режим доступу до сайту: https://docs.microsoft.com/en-us/analysis-services/multidimensional- models/key-performance-indicators-kpis-in-multidimensional-models?view=asallproducts- allversions [17] Kroenke, D. M., Auer, D. J., Vandenberg, S. L., & Yoder, R. C. (2010). Database concepts (pp. 1480-1486). Upper Saddle River, NJ: Prentice Hall. [18] N. Kiktev, T. Lendiel, N. Pasichnyk, D. Khort and A. Kutyrev, "Using IoT Technology to Automate Complex Biotechnical Objects," 2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&T), Kharkiv, Ukraine, 2021, pp. 17-22, doi: 10.1109/PICST54195.2021.9772218. [19] Cañadas-Aránega, F.; Blanco-Claraco, J.L.; Moreno, J.C.; Rodriguez-Diaz, F. Multimodal Mobile Robotic Dataset for a Typical Mediterranean Greenhouse: The GREENBOT Dataset. Sensors 2024, 24, 1874. https://doi.org/10.3390/s24061874 [20] Seo, D.; Cho, B.-H.; Kim, K.-C. Development of Monitoring Robot System for Tomato Fruits in Hydroponic Greenhouses. Agronomy 2021, 11, 2211. https://doi.org/10.3390/agronomy11112211 [21] Kutyrev, A.; Kiktev, N.; Jewiarz, M.; Khort, D.; Smirnov, I.; Zubina, V.; Hutsol, T.; Tomasik, M.; Biliuk, M. Robotic Platform for Horticulture: Assessment Methodology and Increasing the Level of Autonomy. Sensors 2022, 22, 8901. https://doi.org/10.3390/s22228901 [22] Helskyaho, H.; Ruotsalainen, L.; Männistö, T. Defining Data Model Quality Metrics for Data Vault 2.0 Model Evaluation. Inventions 2024, 9, 21. https://doi.org/10.3390/inventions9010021 [23] Palko, D.; Babenko, T.; Bigdan, A.; Kiktev, N.; Hutsol, T.; Kuboń, M.; Hnatiienko, H.; Tabor, S.; Gorbovy, O.; Borusiewicz, A. Cyber Security Risk Modeling in Distributed Information Systems. Appl. Sci. 2023, 13, 2393. https://doi.org/10.3390/app13042393 [24] Farhan, M.S.; Youssef, A.; Abdelhamid, L. A Model for Enhancing Unstructured Big Data Warehouse Execution Time. Big Data Cogn. Comput. 2024, 8, 17. https://doi.org/10.3390/bdcc8020017 [25] Cui, H.; Zhang, Q.; Wu, W.; Zhang, H.; Ji, J.; Ma, H. Modeling and Application of Temporal Correlation of Grain Temperature during Grain Storage. Agriculture 2022, 12, 1883. https://doi.org/10.3390/agriculture12111883 78