<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Formation of Reports for Model-Driven Data Consolidation System</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Aleksei</forename><surname>Korobko</surname></persName>
							<email>agok@icm.krasn.ru</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Institute of Computational Modeling</orgName>
								<orgName type="department" key="dep2">Siberian Branch</orgName>
								<orgName type="institution">Russian Academy of Sciences</orgName>
								<address>
									<addrLine>50/44 Akademgorodok</addrLine>
									<postCode>660036</postCode>
									<settlement>Krasnoyarsk</settlement>
									<country key="RU">Russia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Formation of Reports for Model-Driven Data Consolidation System</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">07346C77F64AB974B01A22A9271B0996</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T19:56+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Web-system</term>
					<term>Ad-hoc Data Consolidation</term>
					<term>Model-driven Development</term>
					<term>Dynamic User Interface</term>
					<term>Metadata</term>
					<term>Report</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>A platform for constructing model-driven systems was developed by the staff of ICM SB RAS to support the consolidation, storage and analytical processing of monitoring data for the state of natural and technogenic objects, medical data, scientific research data, etc. The basis for the creation of the platform was an original approach to the construction of model-driven systems, a meta-meta model and a set of algorithms. Algorithms are responsible for navigating between models, creating structures in the database, building a user interface, etc. One of the problems that have not been solved within the framework of the platform for building data consolidation systems is the automation of the creation of analytical reports. The existing solution makes it difficult for users to independently conduct analytical experiments, as it involves contacting the administrator or mastering the skills of working with SQL queries. The task of developing tools for the native formation of analytical queries to data is urgent. The article proposes an algorithm for generating queries and a constructor of research queries that implements this algorithm.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The staff of ICM SB RAS have developed many systems focused on data collection. The system for consolidating monitoring data of the regional center for monitoring and forecasting emergencies makes it possible to observe the state of natural and technogenic objects, for example, the water level in rivers and the number of road accidents <ref type="bibr" target="#b0">[1]</ref>. A consolidation system developed for scientific organizations records the results of scientific work: publication activity of employees, patent activity, etc. <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. Much work has been done to facilitate the collection and storage of research results <ref type="bibr" target="#b3">[4]</ref>. Systems have also been developed for collecting medical questionnaire information, environmental monitoring data of the Krasnoyarsk reservoir <ref type="bibr" target="#b4">[5]</ref> and data for assessing the ecological state of soils in the Krasnoyarsk Region.</p><p>The main feature of the constructed data consolidation systems is that they are developed with an original modification of the model-driven approach (MDD, model-driven development) <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7]</ref>. MDD uses models as a central component of the software development process. All models are divided into levels of abstraction, and the transition from the models of one level to the models of another is regulated. The result of the development is the program code of the system. The modification consists in using the tools of the data consolidation system to edit the control model and instantly change the behavior of the system in response to the change in the model <ref type="bibr" target="#b7">[8]</ref>. 
The author's implementation of the model-driven approach allows the created data consolidation systems to be evolved dynamically by the users themselves.</p><p>More than five years of experience in creating model-driven systems for consolidating and storing data, together with the systematization of the accumulated knowledge, made it possible to develop a universal meta-metamodel of the data collection process. The meta-metamodel describes the main classes of entities and relationships involved in the data collection process. They are used to build control models for specific application systems. In addition, a set of algorithms for automatically generating the applied models and the system interface from the control model was created and tested. Together with the original approach to the construction of model-driven systems, the meta-metamodel and the set of algorithms made it possible to create a platform for building data consolidation systems <ref type="bibr" target="#b8">[9]</ref>. The use of the software platform ensures that the consolidated data are systematized and stored consistently. Moreover, the control model formed during the construction and development of a system is, in fact, a description of the subject area and can be reused.</p><p>One of the problems that have not been solved within the framework of the platform for building data consolidation systems is the automation of the creation of analytical reports. The existing solution is not optimal: the user defines the composition of the report and, together with the administrator, builds a view in the database. The metadata of the created view is saved using the tools built into the database and then read by the analytical module. The user can then view the data included in the report in the form of tables, graphs, and cartograms. 
The existing solution makes it difficult for users to conduct analytical experiments independently, since it requires either contacting the administrator or having skills in writing SQL queries.</p><p>Using the control model allows us to create a tool that greatly facilitates the formation of analytical reports. The analytical reporting tool, like the entire model-driven system, should be based on specific models. The CWM (Common Warehouse Metamodel) specification <ref type="bibr" target="#b9">[10]</ref> is the most widely used in the field of analytical data processing. The analytical model of the CWM specification is based on the fact-dimension model proposed by <ref type="bibr">Matteo Golfarelli et al.</ref> in 1998 <ref type="bibr" target="#b10">[11]</ref>.</p><p>According to this model, data are divided into aspects of analysis (dimensions) and aggregated numerical characteristics (measures). In turn, the measures are combined into facts, called cubes in the CWM terminology. An algorithm for constructing an analytical model (dimensions, cubes, and connections between them) based on the data of the control model was developed and presented in <ref type="bibr" target="#b3">[4]</ref>.</p><p>An urgent task is to create a visual designer of exploration queries to the accumulated data using the resulting model. The following sections describe the algorithm for forming a user request to the database based on the analytical model and show the interface for its formation.</p></div>
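<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration only, the fact-dimension model described above can be sketched as a small data structure. The sketch is written in Python, which is an assumption: the implementation language of the platform is not stated here. The class names Cube and AnalyticalModel, their fields, and the link between the two example cubes are hypothetical and are not taken from the CWM specification or from the platform itself.</p>
<p><code lang="python"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Cube:
    """A fact of the fact-dimension model (a cube in CWM terminology)."""
    name: str
    dimension_attributes: list = field(default_factory=list)  # aspects of analysis
    measures: list = field(default_factory=list)  # aggregated numerical characteristics


@dataclass
class AnalyticalModel:
    """Cubes plus the undirected connections between them (hypothetical layout)."""
    cubes: dict = field(default_factory=dict)
    links: set = field(default_factory=set)

    def add_cube(self, cube):
        self.cubes[cube.name] = cube

    def connect(self, first, second):
        # Refuse links to cubes that are not part of the model.
        for name in (first, second):
            if name not in self.cubes:
                raise KeyError(name)
        self.links.add(frozenset((first, second)))

    def neighbours(self, name):
        """Names of the cubes directly connected to the given cube."""
        return {
            other
            for pair in self.links
            if name in pair
            for other in pair
            if other != name
        }


# Example data: cube and attribute names follow the soil example of Section 2.2;
# the link between the two cubes is assumed for illustration.
model = AnalyticalModel()
model.add_cube(Cube("Sampling location", ["Settlement", "Description"]))
model.add_cube(Cube("Humus state", measures=["Humus, %"]))
model.connect("Sampling location", "Humus state")
```
]]></code></p>
<p>Storing the connections as unordered pairs reflects the assumption that the cube graph is undirected, which is sufficient for deciding which cubes belong to the same branch.</p></div>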
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Exploration Query Builder</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Algorithm for Generating Exploration Queries</head><p>The existing analytical module of the platform for building data consolidation systems has been tested on several large tasks, and about a hundred reports have been created in it.</p><p>Redoing the existing reports and the implemented primary analysis tools would be very costly. Therefore, users need a tool that lets them create reports intuitively in the format already accepted in the system. In addition, the description of the report (the analytical model) must be saved in the form of metadata, so that reports can be edited and supplemented in the future. Without the constructor, creating a report consists of several simple steps, as described above. First, the user defines the classes and attributes involved in building the report and the aggregation function, and then the administrator writes the query. The exploration query builder must facilitate the selection of classes and attributes and automate the construction of a view in the database.</p><p>In terms of the CWM specification, the exploration query generation algorithm is shown in Fig. <ref type="figure" target="#fig_0">1</ref>. According to this algorithm, at the first stage, the system generates a connected graph of cubes and displays it to the user. Showing only the cubes significantly reduces the amount of displayed information. As cubes are selected, unrelated branches are hidden, which eliminates the possibility of building an incorrect query. In the next step, the user sees the measures and dimension attributes of the selected cubes. Using the graphical interface, the user selects the aspects of analysis of interest in the graph and defines the aggregation functions for the selected measures. 
Next, based on this selection, an expanded table with the data is built. The user can assess the completeness of the sample and its compliance with the research objectives. The result of this work is a customized analytical model. The system saves the model to the database and automatically builds a view based on it. The view conforms to the metadata accepted by the analytical module, so the resulting report can be seen in this module immediately after it is created or modified.</p></div>
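<div xmlns="http://www.tei-c.org/ns/1.0"><p>The step in which unrelated branches are hidden can be sketched as a search for the connected component of the first selected cube. The Python fragment below is a minimal sketch under that assumption and is not the platform's implementation; the function names reachable_cubes and visible_cubes and the representation of connections as pairs of cube names are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
from collections import deque


def reachable_cubes(links, start):
    """Return the set of cubes connected to `start`, i.e. its branch of the graph."""
    adjacency = {}
    for first, second in links:
        adjacency.setdefault(first, set()).add(second)
        adjacency.setdefault(second, set()).add(first)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first traversal of the undirected cube graph
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def visible_cubes(all_cubes, links, selected):
    """Cubes to keep on screen.

    Before the first selection every cube is shown; afterwards only the
    branch that contains the first selected cube remains.
    """
    if not selected:
        return set(all_cubes)
    return reachable_cubes(links, selected[0]) & set(all_cubes)
```
]]></code></p>
<p>Because every cube that remains visible is connected to the first selected one, any further selection stays within a single branch, which is the property the algorithm relies on to rule out incorrect queries.</p></div>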
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Exploration Query Builder Interface</head><p>The exploration query constructor works according to the above algorithm. Let us consider its operation using the example of the Soil Condition Research System. Initially, the system included the results of field studies, namely the assessment of the ecological state of the soil in the settlements of the Krasnoyarsk Region. At each of the research sites, full-profile soil sections were made, binding and description of the soils based on morphological features were performed, the full name of the soil was given, and soil samples were taken at different depths. To store this information, the following classes were created: "Sampling Place", "Soil section", "Soil type", "Section horizon", etc. The next step was to study the chemical composition of the natural samples, the content of mobile nitrogen and phosphorus, and the humus state. To store this information, classes of the same names were created. Later, to study the dependence of biotesting results on the storage duration of the soil samples and the conditions of sample preparation (work carried out by a group of students), a separate branch was created. The branch contained the classes "Biotesting of natural samples", "Biotesting of reference samples", etc. Unfortunately, the unsuccessful numbering of the natural samples did not allow them to be unambiguously compared with the data already available. The comparison is still underway.</p><p>To continue the study of natural samples, samples of technogenic surface formations near the industrial enterprises of the city of Krasnoyarsk were collected and analyzed. The samples were analyzed for the concentration of lead, arsenic and fluorine. Agrochemical indicators were also studied. 
The control model of the system was supplemented with the appropriate classes.</p><p>Last year, within the framework of a grant, the Krasnoyarsk reference center conducted an experiment with reference soils to calibrate and determine the limits of effectiveness of bioluminescent enzymatic analysis as a biotesting method for assessing the degree of soil degradation, as well as anthropogenic and technogenic pressure on soils. The studies were carried out with various enzyme systems, using standard methods of reference samples for common pollutants. The following classes were added to the system: "Analysis for pesticides", "Analysis for copper", "Physicochemical properties of the sample", etc.</p><p>Thus, at present, the Soil Condition Research System contains data from five different experiments. An analytical model containing information about all the cubes created during the development of the system can be seen in Fig. <ref type="figure" target="#fig_1">2</ref>. At the first stage of constructing a research query, the user can select any of the proposed cubes, but after the first cube is selected, the unrelated branches disappear from the graph. For example, if the cube "Sampling location" is selected, only the "Field studies" branch associated with this cube remains available (Fig. <ref type="figure" target="#fig_2">3</ref>). In Figure <ref type="figure" target="#fig_2">3</ref>, we can also see that the user has selected the cubes "Humus state" and "Soil type". The cubes selected by the user display their related elements: the dimension attributes and measures. With simple mouse manipulations, the user can select the attributes and indicators of interest. The cube "Sampling location" has two attributes, "Settlement" and "Description", of which "Settlement" is selected. "Soil type" has four attributes, of which "Department" and "Name" have been selected. The "Humus state" cube has four indicators. 
The following have been selected: "Humus, %", "Maximum depth, cm", "Minimum depth, cm", "Residual luminescence (T), %" and "Soil sample".</p><p>After selecting the aspects of analysis of interest, the user can view a detailed description of the selected attributes and indicators, start generating a database query based on them and examine the resulting data in detail (Fig. <ref type="figure" target="#fig_3">4</ref>). In Figure <ref type="figure" target="#fig_3">4</ref>, we can see that almost all the indicators are filled in, but some of the data are missing. For example, the section of the soil type is empty, and the locality is filled with service information. If necessary, these data can be added or edited in the data entry module. If the user is not satisfied with the composition of the report, they can return to the previous stage of selecting attributes and measures. If the data are selected correctly, the user proceeds to select the aggregation functions for the indicators, in our case "Humus, %", "Maximum depth, cm", "Minimum depth, cm", "Residual luminescence (T), %" and "Soil sample". After selecting the aggregation functions, the user starts the save process. As a result, the analytical model is saved in the database and a report is created. The report is available for viewing in the analysis module.</p></div>
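<div xmlns="http://www.tei-c.org/ns/1.0"><p>The final step, in which a view is built from the selected attributes and aggregation functions, can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the platform: the function build_view_sql, the list of allowed aggregation functions, the view name, and the single source relation "field_studies" are all hypothetical, and the joins between the tables of the selected cubes are omitted. The actual format of the generated view is not described in this paper.</p>
<p><code lang="python"><![CDATA[
```python
ALLOWED_AGGREGATES = {"AVG", "SUM", "MIN", "MAX", "COUNT"}


def quote_identifier(name):
    """Quote an identifier in the SQL-standard way, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_view_sql(view_name, source, group_by, measures):
    """Build a CREATE VIEW statement for one report.

    view_name -- name of the view to create (hypothetical)
    source    -- relation holding the selected columns (hypothetical)
    group_by  -- dimension attributes selected by the user
    measures  -- mapping from a measure to its aggregation function
    """
    select_items = [quote_identifier(attribute) for attribute in group_by]
    for measure, function in measures.items():
        function = function.upper()
        if function not in ALLOWED_AGGREGATES:
            raise ValueError("unsupported aggregation function: " + function)
        alias = quote_identifier(function.lower() + "_" + measure)
        select_items.append(function + "(" + quote_identifier(measure) + ") AS " + alias)
    sql = (
        "CREATE VIEW " + quote_identifier(view_name)
        + " AS SELECT " + ", ".join(select_items)
        + " FROM " + quote_identifier(source)
    )
    if group_by:
        # Group by every selected dimension attribute.
        sql += " GROUP BY " + ", ".join(quote_identifier(a) for a in group_by)
    return sql


# Example call with names from the soil example; "avg" is an arbitrary choice.
example_sql = build_view_sql(
    "humus_by_settlement", "field_studies", ["Settlement"], {"Humus, %": "avg"}
)
```
]]></code></p>
<p>Checking the aggregation function against a fixed list and quoting every identifier keeps user-chosen names such as "Humus, %" from breaking the generated statement.</p></div>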
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Conclusion</head><p>The platform for building data consolidation systems is used in many subject areas to support the collection, storage, and analytical processing of data. The proposed algorithm for the formation of exploration queries and the query designer built on its basis will facilitate the process of data analysis by the users and attract new researchers to use the platform.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Algorithm for generating a report by the user.</figDesc><graphic coords="4,185.16,147.48,224.40,171.00" type="vector_box" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Graph of the cubes.</figDesc><graphic coords="5,136.08,147.48,324.72,204.24" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Graph with the measures and dimensions.</figDesc><graphic coords="6,124.80,147.48,345.84,207.84" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Table with the data.</figDesc><graphic coords="7,127.80,183.48,339.84,203.16" type="bitmap" /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgments. The study was carried out with the financial support of RFBR and the Government of Krasnoyarsk region, research project №18-47-240005.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Dynamic Generating User Interface of the Data Consolidation Web-system for Emergency Monitoring</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Nicheporchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nozhenkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Informatization and communication</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="59" to="64" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note>in Russian</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Applied model of the scientific activity accounting system</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Scientific and Practical Conference on the Actual Problems of Mathematical Modeling and Information Technologies</title>
				<meeting>the International Scientific and Practical Conference on the Actual Problems of Mathematical Modeling and Information Technologies<address><addrLine>Sochi</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="71" to="75" />
		</imprint>
	</monogr>
	<note>in Russian</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Certificate of state registration of a computer program</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Karepova</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
			<publisher>FRC KSC SB RAS</publisher>
			<biblScope unit="page">22</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Constructing the analytical model for specialized model-driven system of scientific data consolidation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">2534</biblScope>
			<biblScope unit="page" from="377" to="383" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Information Modeling of Temporal Spatial Data for Ecological Monitoring of the Krasnoyarsk Reservoir</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CEUR Workshop Proceedings</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">2033</biblScope>
			<biblScope unit="page" from="319" to="323" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Object Management Group (OMG): Model Driven Architecture (MDA)</title>
		<idno>ormsc/2014-06-01</idno>
	</analytic>
	<monogr>
		<title level="m">MDA Guide Revision</title>
				<imprint>
			<date type="published" when="2014-06">June 2014</date>
			<biblScope unit="volume">2</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">What models mean</title>
		<author>
			<persName><forename type="first">E</forename><surname>Seidewitz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Softw</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="26" to="32" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">An original approach to the construction of a model-driven data consolidation system</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Informatization and Communication</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="232" to="238" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note>in Russian</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A software platform for constructing model-driven systems for primary data consolidation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Korobko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the VII International Conference &quot;Knowledge-Ontology-Theories</title>
				<meeting><address><addrLine>Novosibirsk</addrLine></address></meeting>
		<imprint>
			<publisher>ZONT</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="203" to="212" />
		</imprint>
	</monogr>
	<note>in Russian</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Common Warehouse Metamodel</title>
		<author>
			<persName><forename type="first">L</forename><surname>Peyton</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-0-387-39940-9_900</idno>
		<ptr target="https://doi.org/10.1007/978-0-387-39940-9_900" />
	</analytic>
	<monogr>
		<title level="m">Encyclopedia of Database Systems</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Liu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Özsu</surname></persName>
		</editor>
		<meeting><address><addrLine>Boston, MA</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">The Dimensional Fact Model: A Conceptual Model for Data Warehouses</title>
		<author>
			<persName><forename type="first">M</forename><surname>Golfarelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Maio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rizzi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Int. J. Cooperative Inf. Syst</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="215" to="247" />
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
