<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Use of Ontologies for Metadata Records Analysis in Big Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julia Rogushina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anatoly Gladun</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Serhii Pryima</string-name>
          <email>pryima.serhii@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Software Systems of National Academy of Sciences of Ukraine</institution>
          ,
          <addr-line>Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Special Communication and Information Protection of National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"</institution>
          ,
          <addr-line>Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Tavria State Agrotechnological University</institution>
          ,
          <addr-line>Melitopol</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>46</fpage>
      <lpage>63</lpage>
      <abstract>
        <p>Big Data deals with sets of information (structured, unstructured, or semi-structured) so large that traditional approaches (based on business intelligence solutions and database management systems) cannot be applied to them. Big Data is characterized by a phenomenal acceleration of data accumulation and by growing data complexity. In different contexts Big Data often means both data of large volume and a set of tools and methods for its processing. Big Data sets are accompanied by metadata, which contains a large amount of information about the data, including significant descriptive text whose understanding by machines leads to better results of Big Data processing. Methods of artificial intelligence and intelligent Web technologies improve the efficiency of all stages of Big Data processing. Most often this integration concerns the use of machine learning, which provides knowledge acquisition from Big Data, and ontological analysis, which formalizes domain knowledge for Big Data analysis. In the paper, the authors present a method for analyzing Big Data metadata which allows selecting, among heterogeneous sources and data repositories, those blocks of information that are pertinent to solving the customer task. Much attention is paid to matching the text part of the metadata (metadata annotations) with the text describing the task. For these purposes we suggest using the methods and instruments of natural language analysis and the Big Data ontology, which contains knowledge about the specifics of this domain.</p>
      </abstract>
      <kwd-group>
        <kwd>Big Data</kwd>
        <kwd>metadata</kwd>
        <kwd>domain ontology</kwd>
        <kwd>thesaurus</kwd>
        <kwd>natural language text</kwd>
        <kwd>homonymy</kwd>
        <kwd>multimedia data</kwd>
        <kwd>standard</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The term "Big Data" refers to a group of technologies and methods oriented toward
the analysis and processing of large amounts of data (both structured and unstructured)
that cannot be processed by traditional methods; they serve to obtain qualitatively
new knowledge. The relevance of this IT field is determined by the exponential
growth in the amount of data generated in electronic form and stored in data banks for
future use. The analysis of large data sets is an interdisciplinary task that combines
mathematics, statistics, computer science and special knowledge of the domain.</p>
      <p>To use Big Data effectively in practice, we need to analyze it at the semantic
level using domain knowledge.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Big Data Definition</title>
      <p>A particular set of data should be considered Big Data if it has one or more of
the following features, known as the 5Vs:
- Volume refers to the vast amounts of data generated every second that
require specialized processing facilities;
- Velocity refers to the speed at which new data is generated and the speed at
which data moves around;
- Variety refers to the different formats and types of data, which make its
integration difficult;
- Veracity refers to the messiness or trustworthiness of the data: data that cannot be
converted into information has no value;
- Value means that only part of the data may be useful for users.</p>
      <p>The main types of Big Data are: structured data (SQL databases); semi-structured
data (information security instructions, customer profile data, Web server logs,
websites, emails, etc.); and unstructured data (audio files, video files, images,
information cubes, etc.) that can be stored in NoSQL databases. Big Data links
geographically distributed data sets, taking into account operations such as
replication and sharding (splitting into fragments).</p>
      <p>Moreover, Big Data combines various unrelated data sets and involves processing large
volumes of unstructured data (such data makes up the largest share of the total amount
of Big Data).</p>
      <p>Today mankind generates ever larger volumes of Big Data. However, this
information has no direct value: value is obtained only as a result of data processing and
analysis. Due to the enormous volumes and the velocity at which information arrives, such
processing can be performed only automatically. The knowledge obtained by
processing may have practical value of two types:
- rules built by means of machine learning;
- results of applying these rules to the analysis of new data.</p>
      <p>Examples of knowledge of the first type are a decision tree for the task of medical
diagnostics or a multilayer neural network that identifies people by their
photographs from a social network. Examples of knowledge of the second type are the
diagnosis for a particular patient based on the decision tree and the recognition by the
neural network of the person whose image was received by a social network user.</p>
      <p>
        Obtaining this knowledge from Big Data is based on statistical processing and
machine learning (ML) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Without examining the methods and possibilities of ML in detail, we note
that machine learning synthesizes the experience of a system, stored electronically,
in order to improve the further behavior of this system and make it more
effective.
      </p>
      <p>ML results are probabilistic and statistical; their quality depends on how close the
processed data is to the data encountered in practice. Thus, the pressing problem is to find
exactly those arrays of Big Data that are pertinent to a specific user's task
(implicitly containing the necessary knowledge), reliable, up to date and of high quality. These
Big Data parameters are not evaluated directly, but through an analysis of the
metadata.</p>
    </sec>
    <sec id="sec-3">
      <title>3 State of the Art in Big Data</title>
      <p>
        The major problems that exist today in Big Data technology and require a solution
are defined in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]:
      </p>
      <p>1. The problem of data integration, which can be presented as a combined problem
that requires: (1) determining the problem to be solved with the help of Big Data; (2)
detection (search) of relevant parts of the data in Big Data repositories and sources;
(3) executing ETL (extraction, transformation, loading) into appropriate formats and
storing data for further processing; (4) removal of data ambiguity (for example,
homonymy); (5) data processing for solving the problem.</p>
      <p>2. The problem of overcoming heterogeneity between different sets of Big Data.
Semantics can be considered as a means of creating a bridge between heterogeneous
data.</p>
      <p>3. The problem of open data linking.</p>
      <p>4. The problem of using semantics for data integration and for the development of
database management systems. Moreover, semantics can be used in an existing
system to identify data inconsistencies, to generate new knowledge using a logical
inference engine, or to link more precisely specific data that are not related to
machine learning methods.</p>
      <p>
        As the analysis of scientific publications [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] shows, the problems of
metadata used in Big Data are now more acute than ever. More and more organizations
realize that their business efficiency depends on robust (workable) Big Data
metadata that captures the necessary context and the origin of key data assets.
      </p>
      <p>Although metadata management has been known for decades, new strategies
and approaches are being developed today:
- support for the continuous development of the data processing environment;
- search for more effective ways of business management with metadata.</p>
      <p>Therefore, we need to review the strategies and techniques of working with
metadata that are available to a modern organization, and to find out how to build
successful strategies for the adoption and use of metadata.</p>
      <p>Organizations and companies are interested in two types of Big Data: (1) data
created by people, which is mainly distributed through the Web (social networks,
cookies, emails, online television, online broadcasting, etc.); (2) heterogeneous data
generated by various electronic devices. For example, such technologies as Internet of
Human and Internet of Things generate mixed Big Data traffic that is used
jointly for predictive analysis and knowledge acquisition in order to understand,
plan and anticipate the actions of these systems. At the same time, the question
of data quality is urgent: since Big Data is characterized by large volumes, it is
"raw" by nature. Therefore, a solution to this problem is required.</p>
    </sec>
    <sec id="sec-4">
      <title>4 Problem Definition</title>
      <p>It is necessary to develop a method of Big Data metadata analysis which allows
selecting, among heterogeneous Big Data sources and data
warehouses, those blocks of data that are pertinent to solving the customer’s problem. It should
be borne in mind that both the task definition and the annotations of Big Data are natural
language (NL) unstructured or semi-structured texts. Therefore, their matching can be
based on methods of NL analysis supplemented by the Big Data ontology, which contains
knowledge about the specifics of this domain and allows semantic processing of
other elements of Big Data metadata (to match the parameters of the metadata
structure with the domain concepts). Creating a prototype of such an ontology is also
part of this work.</p>
    </sec>
    <sec id="sec-5">
      <title>5 Directions of Integration of Intelligent Web Technologies with Big Data Processing</title>
      <p>- Use of background knowledge to improve machine learning results –
appraisal of the challenges and the role of background knowledge and ontologies in
improving ML results, and the requirements for ontologies used in ML.
- Use of ontologies in logical reasoning and vice versa – the reasoning
techniques and mechanisms oriented toward ontological knowledge
representation in various forms.</p>
      <p>Ontological analysis and logical inference in Big Data processing by means of ML
make it possible to use background knowledge to prepare data for training and testing
(reducing large, noisy data sets to manageable ones) and to eliminate the ambiguity of
terms.</p>
      <p>The learning phase of ML requires the definition of:
- the task that is solved by the computer system;
- the direction of improvement of the system’s behavior (for example, increasing the
recognition accuracy, expanding the number of identified persons, accelerating
recognition);
- the sources of data that contain the information required for analysis (from the
experience of the interaction of this system with a specific user or with the
entire community of users, from external sources, from similar systems,
etc.);
- the means of integrating the obtained results with the system knowledge.</p>
      <p>If we need to use the external experience presented in Big Data, then we have to
find relevant Big Data sources. This search uses the metadata that accompanies Big
Data and analyzes its semantics. The part of the metadata that is generated automatically
does not contain enough information about the content. The possibility of obtaining the
necessary knowledge from Big Data is determined by semantic analysis of its annotations.</p>
      <p>Such annotations are created when Big Data is stored in the
corresponding repositories. They can be considered as unstructured or semi-structured
NL texts, and we can apply to them standard tools of NL analysis similar to those of Web
search. Unfortunately, in the general case this problem is not solved effectively, and
therefore it is advisable to apply additional a priori knowledge about the Big Data
domain.</p>
      <p>Despite the high interest in Big Data and the variety of technological means for its
processing, there are no metadata standards specific to Big Data. The reason for this is
the complexity and variety of Big Data.</p>
      <p>The available metadata is technical information that characterizes the time of
content creation, its volume, formats, etc., but does not relate to the information
content of the data. This makes it impossible to provide a uniform description of the
data semantics. However, a large part of Big Data is accompanied by annotations or
explanations, usually provided in natural language. Therefore, matching
annotations with the task definition determines the pertinence of certain arrays of Big
Data to this task.</p>
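      <p>The idea of matching annotations with a task definition can be illustrated by a minimal sketch. It is not the method developed in this paper: it assumes simple lower-case tokenization, a hypothetical stop-word list and Jaccard overlap of word sets as the similarity measure, and the function names, data set names and texts are invented for illustration.</p>
      <preformat>
```python
import re

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def tokenize(text):
    """Split text into a set of lower-case word tokens without stop words."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a.intersection(b)) / len(union) if union else 0.0


def rank_annotations(task, annotations):
    """Return (score, dataset_name) pairs sorted from most to least similar."""
    task_tokens = tokenize(task)
    scored = [(jaccard(task_tokens, tokenize(text)), name)
              for name, text in annotations.items()]
    return sorted(scored, reverse=True)


# Invented task description and annotations of two data sets.
task = "analysis of job vacancies in agriculture"
annotations = {
    "tv_guide": "schedule of television programs for the week",
    "vacancies": "job vacancies in agriculture and food industry",
}
ranking = rank_annotations(task, annotations)
best = ranking[0][1]  # name of the most pertinent data set
```
      </preformat>
      <p>Pure word overlap ignores synonymy and homonymy, which is where the additional domain knowledge discussed in this section becomes necessary.</p>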
      <p>If an organization analyzes the Big Data that accumulates in the course of its own
operation, there is no need for such a comparison. But quite often Big Data for
analysis is obtained from various external sources. Big Data analysis is based on ML
methods whose speed depends on the amount of information being processed. Prior
filtering of content decreases the time of its analysis. For example, when analyzing
television streams it is better not to process all of them but first to select the
programs pertinent to the user’s problem. The source of annotations for such Big
Data is a TV guide.</p>
      <p>Another example of Big Data derived from various sources and annotated only by
NL descriptions is the information resources on job vacancies
offered by the European Employment Services (EURES), which brings together about
400 "euro-advisers" from national employment services, associations of employers,
trade unions, local and regional authorities and higher education institutions; these
resources are actively used by ESCO (European Skills, Competences, Qualifications and
Occupations), the multilingual classifier of European skills, competences,
qualifications and occupations. Their annotations can be filtered with the help of
competence descriptions that are semantically marked up with domain ontology concepts.</p>
    </sec>
    <sec id="sec-7">
      <title>6 Metadata for Big Data</title>
      <p>Metadata is structured, coded data that describes the characteristics of media objects
and facilitates the identification, detection, evaluation and management of these
objects. Metadata is used to describe the meaning and properties of information in
order to better understand, classify, manage and exploit the data.</p>
      <p>
        Metadata for Big Data is a data block physically joined to Big Data in its storage.
This metadata provides information on the characteristics and structure of a Big Data
set: name; the origin of the data and information about the data source; XML tags
indicating the author and date of the document creation; attributes indicating the size
and format; control total; number of dataset records; resolution of image; a brief
description of the data, etc. [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ].
      </p>
      <p>The properties of metadata, its composition and functions depend considerably on
the technological realization, on the features of the resources they describe, as well as
on the scope and specificity of applications.</p>
      <p>A vast number of publications are devoted to metadata; however, a settled
interpretation of the term "metadata" has not yet been formed. Metadata is
a special kind of information resource: its creation often requires considerable
effort and substantial costs, but it significantly increases the value of the data and
provides extended opportunities for its use.</p>
      <p>
        Specialists now use many definitions of metadata. We have chosen the most
significant ones: metadata is data about data [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]; metadata is information that makes
the data useful [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]; metadata is machine-processed data that describes some resources,
both digital and non-digital [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]; metadata is information that implies its computer
processing and interpretation of digital and non-digital objects by people [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ];
metadata is structured information that describes, explains, indicates location and,
thus, facilitates the retrieval, use and management of information resources [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ];
metadata on the Web is semi-structured data, usually conforming to the corresponding
models that provide operational interoperability in a heterogeneous environment [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>Metadata contributes to improving the quality of data, which is determined by the
following characteristics: consistency (whether the representation is homogeneous, and
whether there is duplicate data that is overlapping or conflicting); completeness
(whether all data is available); accuracy (agreement of saved and actual values);
timeliness (whether the current saved value is still relevant). Metadata also
improves data analysis (OLAP, OLTP, Data Mining), where it is necessary to
understand the domain of the data source in order to ensure adequate computation and
interpretation of results. Metadata supports the use of common terminology and a common
language of interaction within the organization, eliminating ambiguity and ensuring
the consistency of conclusions within the company.</p>
      <p>Big Data processing is closely linked to its metadata, especially for semi-structured
and unstructured data. It is important to note that every change in the state of Big Data
triggers a change in the information about its origin, which is immediately recorded as
metadata. The goal of capturing the origin and the life cycle of the data is to make
analytical results justifiable: as in scientific research, if the results cannot be justified and
repeated, they do not deserve trust.</p>
      <p>Thus, effective processing of Big Data and acquisition of valuable knowledge
demand a flexible framework for managing its metadata-based processing. Such a
framework makes it possible to create a universal environment for the interoperability of
heterogeneous data blocks, to standardize the processing stages and to develop processing platforms.</p>
    </sec>
    <sec id="sec-8">
      <title>7 Metadata Standards Applicable to Big Data</title>
      <p>Matching of the Big Data annotation from the metadata with the user's task description is
carried out at the stage of data retrieval and selection, because direct comparison of Big
Data content with this description is inappropriate due to the extremely large volume
and lack of structure.</p>
      <p>In the standards of the ISO/IEC 11179 series, metadata is defined as data that
defines and describes other data. This means that metadata is data, and data
becomes metadata when it is used in this way. This occurs in specific
circumstances, for specific purposes and with certain perspectives, without which data is
not metadata. The set of circumstances, purposes or perspectives for which some data is used
as metadata is called a context. Thus, metadata is data in a given context.</p>
      <p>Metadata can be stored in a database and organized with the use of any model.
The model describing metadata is called a meta-model. For example, the conceptual
model presented in ISO/IEC 11179-3 is a meta-model in this sense.</p>
      <p>Given the lack of metadata standards specific to Big Data, it is
reasonable to analyze the existing metadata standards used for information that can
have the 5V properties and that are able to represent the content semantics.</p>
      <p>A significant part of Big Data is multimedia information. Many formats for
multimedia representation have been developed by different software and hardware
manufacturers, but there is no single standard common to all, because each
manufacturer develops its own convenient approach that can subsequently be
disseminated. Existing formats for saving multimedia in electronic form (GIF, TIFF,
PIC, PCX, JPEG, PNG, etc.) differ in methods of information compression, types of
encoding, purpose, etc.</p>
      <p>
        The Moving Picture Experts Group of the Joint Standardization Committee
has proposed the MPEG family of multimedia standards [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Some of them (MPEG-1
(ISO/IEC 11172) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], MPEG-2 (ISO/IEC 13818) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], MPEG-4 (ISO/IEC 14496))
deal only with compression of multimedia information [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Others describe the
semantics of multimedia content.
      </p>
      <p>
        MPEG-7 ("Multimedia Content Description Interface" ISO/IEC) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is a semantic
multimedia-oriented standard. It allows different levels of detail in
its descriptions. MPEG-7 comprises Description Tools (DT), the
Description Definition Language (DDL) and System Tools. It
defines a standard set of descriptors for different types of information and standardizes
the way its descriptors and their interconnections are defined. DT contains two
components: descriptors, which define the syntax and semantics of each property
(metadata element), and description schemes, which set the structure and semantics
of the relationships between their components, which can be either descriptors or
description schemes.
      </p>
      <p>Since descriptive features have to be interpreted unambiguously and completely
in the application context, they can differ for different user domains and
different applications; that is, the same material can be described through different
types of properties that are relevant to the scope of use and the application.
For example, at the lowest level of abstraction a graphic image can be described by
form, size, texture, colour, palette, trajectory and position, and audio can be described
through tone, tempo change and position in the soundtrack, while the upper level will
contain semantic information such as "This is a scene with a green car going along the road
on the left and a dog crossing the road on the right, accompanied by a background
sound of rain". There may also be intermediate levels of abstraction. The level of
abstraction is related to the way the information is obtained: many low-level properties
can be extracted automatically, while high-level properties require human
participation. In many cases multimedia resources are described by unstructured or
semi-structured NL text. However, such descriptions depend on the
particular NL used. This is especially important for the processing of names, titles,
places, etc. The Description Tools of MPEG-7 make it possible to create content descriptions
(that is, a set of description schemes DS and the corresponding descriptors D)
containing information on the creation and use of the content, the reality displayed in the
content, the set of objects, etc.</p>
      <p>
        MPEG-21 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] is a "Multimedia Framework" standard designed to create a content
management infrastructure in a distributed environment for semantic search. It defines
the basic syntax and semantics of multimedia elements, the dependencies between them
and the operations that they support. It serves to establish interoperability between
multimedia information resources.
      </p>
      <p>Such metadata can be used to represent the semantics of multimedia Big Data.
NL annotations of the semantic content of the material should be included in these
meta-descriptions.</p>
      <p>RDF (Resource Description Framework), created within the Semantic Web
project, is another promising approach to creating semantic metadata for various types
of information. RDF is intended to standardize the definition and use of metadata for Web
resources, but it is also applicable to the description of Big Data. RDF uses the base
data model "object – attribute – value". RDF Schema makes it possible to define a
specific vocabulary for RDF data and to specify the types of objects to which these
attributes can be applied; that is, the RDF Schema mechanism provides a basic type
system for RDF models.</p>
      <p>An important feature of the RDF standard is extensibility: RDF makes it possible
to specify the structure of the source description by using and extending the built-in
concepts of RDF schemes, such as classes, properties, types and collections. The RDF
model scheme includes inheritance of classes and properties.</p>
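      <p>The "object – attribute – value" model and class inheritance can be illustrated by a minimal sketch that stores statements as plain Python tuples instead of using an RDF library. The terms rdf:type and rdfs:subClassOf come from the RDF and RDF Schema vocabularies and dc:description from Dublin Core; all identifiers with the ex: prefix, as well as the function names, are hypothetical.</p>
      <preformat>
```python
# Each statement is an (object, attribute, value) triple, as in the RDF model.
# All identifiers with the ex: prefix are hypothetical examples.
TRIPLES = {
    ("ex:VideoStream", "rdfs:subClassOf", "ex:MultimediaData"),
    ("ex:MultimediaData", "rdfs:subClassOf", "ex:BigDataSet"),
    ("ex:tvChannel1", "rdf:type", "ex:VideoStream"),
    ("ex:tvChannel1", "dc:description", "News and weather programs"),
}


def superclasses(cls, triples):
    """Collect all direct and inherited superclasses of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(resource, triples):
    """Return the declared types of a resource plus the inherited ones."""
    declared = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    inherited = set()
    for cls in declared:
        inherited |= superclasses(cls, triples)
    return declared | inherited


all_types = types_of("ex:tvChannel1", TRIPLES)
```
      </preformat>
      <p>In this sketch the resource ex:tvChannel1 is declared only as a video stream, yet through class inheritance it is also recognized as multimedia data and as a Big Data set.</p>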
      <p>
        Certain patterns and standards for describing typical resources are provided to
users to simplify and unify the creation of resource meta-descriptions. The most
thoroughly developed set of elements for metadata creation is "Dublin Core Metadata
Elements" [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
    </sec>
    <sec id="sec-9">
      <title>8 Lifecycle of Big Data Analysis</title>
      <p>Big Data analysis differs from traditional data analysis primarily in the
characteristics of the processed data, such as volume, velocity and variety. In order
to meet the various requirements for implementing Big Data analysis, a step-by-step
methodology is required for organizing activities and tasks related to the acquisition,
processing, analysis and reuse of data. The traditional life cycle of Big Data
analysis can be divided into the following stages, as shown in Fig. 1: 1) Assessment
of the task whose solution requires the results of Big Data analysis; 2) Identification of data
(internal, external, location); 3) Data collection and cleaning; 4) Data extraction
(receiving, forwarding, entry into the data bank); 5) Data inspection and clearing; 6)
Data aggregation and submission for analysis; 7) Data analysis; 8) Data visualization;
9) Use of the analysis results.</p>
      <p>We changed some stages of the traditional life cycle of Big Data analysis by
adding Semantic Web elements, in particular ontological modeling of
domain knowledge, to certain stages.</p>
      <p>At the stage of data identification, the data sets necessary for carrying out analytical
projects (tasks) and their sources are defined. A domain ontology helps to find appropriate
data sources, for example, NoSQL data warehouses with characteristics (geographic,
temporal, domain-specific) and data types relevant to the task. Detecting a wider
range of relevant data sources may increase the likelihood of detecting hidden
regularities and correlations in Big Data.</p>
      <p>At the stage of data collection and cleaning, the final formation of Big Data
packages for the purposes of the task is accomplished by semantic analysis of
metadata text annotations and by selection of the data sets relevant to solving the
problem. A semantic approach is used to select the Big Data sets that are relevant to
the user’s task.
</p>
      <p>[Fig. 1: stages of the life cycle of Big Data analysis. Labels in the figure:
1. Assessment of the problem; 2. Identification of data; 3. Data collection and cleaning;
4. Data extracting; 5. Data inspection and clearing; 6. Data aggregation;
7. Data analysis; 8. Data visualization; 9. Use of the analysis results;
Big Data; Ontologies.]</p>
      <sec id="sec-9-1">
        <p>Some data identified as input data for analysis may come in formats that are
incompatible with the Big Data application. This is especially true for data from
external sources. The data extraction stage of the life cycle is designed to extract
incompatible data and convert it into a format that the underlying Big Data software can use
to analyze the data.</p>
        <p>The stage of data inspection and clearing is designed to create complex rules for
checking and deleting any known inaccurate data (duplicate data, missing data,
excess data, etc.). For batch analysis, data inspection and clearing can be
performed using an offline ETL operation. For real-time analysis, a more complex
in-memory system is needed to validate and clear data as it flows from the
source.</p>
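        <p>A batch-style inspection and clearing step can be sketched as follows. This is only an illustration under simple assumptions: records are flat dictionaries, the only rules are "drop exact duplicates" and "drop records with an empty required field", and the function name and sample records are hypothetical.</p>
        <preformat>
```python
def clean_records(records, required_fields):
    """Drop exact duplicates and records lacking any required field."""
    seen = set()
    cleaned = []
    for record in records:
        # A missing or empty required value counts as missing data.
        if any(record.get(field) in (None, "") for field in required_fields):
            continue
        # A hashable fingerprint of the whole record detects duplicates.
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


# Invented sample records.
raw = [
    {"id": 1, "title": "Evening news"},
    {"id": 1, "title": "Evening news"},   # duplicate
    {"id": 2, "title": ""},               # missing title
    {"id": 3, "title": "Weather"},
]
cleaned = clean_records(raw, required_fields=("id", "title"))
```
        </preformat>
        <p>For real-time analysis the same rules would have to be applied to records one by one as they arrive, with the set of already seen records kept in memory.</p>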
        <p>The stage of data aggregation and presentation serves to consolidate data
that may be spread across multiple data sets, joining them through common fields such as
date or identifier (ID). In other cases, the same data fields can appear in
multiple data sets. In either case, a method of data reconciliation is needed, or the
data set that holds the correct value has to be designated. Completion of this stage can be
complicated by differences in: data structure – although the data formats may be
the same, the data structure model may differ; semantics – a value labeled
differently in two different data sets may mean the same thing, for example,
"surname" and "last name". At this stage a domain ontology can be used for matching
the various names of the same concepts and for determining the relations between them
(hierarchy, synonymy, semantic closeness, etc.).</p>
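        <p>The use of ontology knowledge at the aggregation stage can be sketched as follows. The sketch assumes that the domain ontology has already been reduced to a table of synonymous field names per concept; the concept names, field names and records are hypothetical, and later values simply overwrite earlier ones when sources disagree.</p>
        <preformat>
```python
# Hypothetical fragment of a domain ontology: each concept lists the
# field names that different sources use for it.
CONCEPT_SYNONYMS = {
    "family_name": {"surname", "last name", "family name"},
    "person_id": {"id", "identifier"},
}

# Reverse index: field name -> concept.
FIELD_TO_CONCEPT = {
    name: concept
    for concept, names in CONCEPT_SYNONYMS.items()
    for name in names
}


def normalize(record):
    """Rename fields to ontology concepts; unknown fields are kept as is."""
    return {FIELD_TO_CONCEPT.get(k.lower(), k): v for k, v in record.items()}


def aggregate(*datasets):
    """Merge normalized records from several data sets by person_id."""
    merged = {}
    for dataset in datasets:
        for record in dataset:
            row = normalize(record)
            # Later sources overwrite earlier values for the same field.
            merged.setdefault(row["person_id"], {}).update(row)
    return merged


# Two invented sources that name the same fields differently.
source_a = [{"ID": 7, "Surname": "Shevchenko"}]
source_b = [{"identifier": 7, "last name": "Shevchenko", "city": "Kyiv"}]
result = aggregate(source_a, source_b)
```
        </preformat>
        <p>Both sources end up in a single record keyed by the shared identifier, with "Surname" and "last name" mapped to the same concept.</p>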
        <p>The stage of data analysis uses Data Mining and ML techniques to generate
new knowledge by Big Data processing. Ontologies are used at this stage to
integrate the new rules with the concepts of the user task domain.</p>
        <p>It is necessary to note the importance of the first two stages of this life cycle:
setting the task for which the Big Data analysis is carried out, and selecting the
Big Data set that is pertinent to this task. If these steps are unsuccessful,
then, however sophisticated and effective the data analysis methods are, the
results will not meet the user's needs.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>9 Ontologies and Big Data</title>
      <p>
        In knowledge engineering, an ontology is understood as a detailed description of some
problem area that is used for the formal and declarative definition of its
conceptualization [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. An ontology is often called a knowledge base of a special type
that can be shared, detached and used independently within the
considered domain [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The use of ontologies as an adequate means for describing
different domains is now generally accepted, and the wide range of ontologies
available through the Web confirms the popularity of this approach among various
groups of developers and users of Web applications, including applications that work with Big
Data.
      </p>
      <p>
        Such ontologies are described in various languages and are associated with a wide
variety of domains. They differ in volume, expressive means, purpose, degree of
knowledge formalization, etc. Classifications of ontologies differ in their
classification parameters and, in general, can be divided into two groups: semantic and
pragmatic. Semantic classifications group ontologies according to parameters
connected with the content of information: the domain, the degree of formality of the
knowledge presented, the level of expressiveness and the level of detail of the
description [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Pragmatic classifications group ontologies according to the purposes
of their development and their sphere of use.
      </p>
      <p>
        A domain ontology is the part of the domain knowledge that constrains the meaning of the
domain terms and does not depend on the other (changing) part of the knowledge of this domain.
Such a domain ontology can be considered as a set of agreements about the domain, while
the rest of the domain knowledge is a set of empirical and other laws of this area. Thus,
the ontology determines the degree of agreement on terms among the specialists in this
domain [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
      <p>Different sources offer different formal models for ontology representation.
However, each of them contains: a set of terms (notions, concepts) that can be divided
into a set of classes and a set of instances; a set of relations between concepts, in which some
groups of relations ("class-subclass" relations, hierarchical and taxonomic relations, and
synonymy relations) can be clearly distinguished, as well as functions, a special
case of relations in which the n-th element of the relation is uniquely determined by
the preceding n-1 elements; and axioms and functions for interpreting concepts and relations.</p>
      <p>To build an ontological Big Data model, it is necessary to separate the set of
classes from the set of class instances. It is also advisable to separate object relations
between instances of different classes from data relations, namely the relations
between instances of attributes and their values. To describe the Big Data ontologies
we use the following formal model:</p>
      <p>O  X, R, F, T, M
(1),
that contains the following elements:
- X  Xcl  Xind – the set of ontology concepts, where Xcl is the set of classes,
Xind is the set of class instances, such where a  Xind A  Xcl , a  A ;
- R  rier _ cl  {ri}  rier _ prop  {p j}  pier _ prop is the set of relations between
elements of ontology, where</p>
      <p>- rier _ cl is the hierarchical relation between the classes of ontology – they are
structures of partial ordering with the upper element Thing that can be established
between classes of ontology and are characterized by such properties as antisymmetry
and transitivity, rier _ cl : Xcl  Xcl ;</p>
      <p>- {ri } is the set of object properties that establish the relationship between instances
of classes: ri a, a  Xind   b, b  Xind ; ri : Xind  Xind ;</p>
      <p>- rier _ prop is hierarchical relations between the object properties of the ontology
classes;</p>
      <p>- {p j} is a set of data properties that establish the relations between instances of
classes and values with T: pi a, a  Xind   t, t  T , pi : Xind  T ;</p>
      <p>- pier _ prop is hierarchical relations between the properties of these instances of
ontology classes;</p>
      <p>- F  {Fcl  Fprop } is a set of characteristics that can be used for logical inference
above the ontology;</p>
      <p>- T is a set of data types (for example, a line, a whole) for values of data
properties of ontology classes;
- M is the set of non-logical rules of SA.</p>
      <p>Such a Big Data ontology contains classes for the information objects typical of Big Data
(video, audio, streaming video, semi-structured data from sensors)
with sets of relevant semantic properties:
- the formats of the devices generating Big Data;
- the purpose of these devices;
- geographical location;
- time characteristics;
- reliability of the source;
- conditions of access;
- volume and update rate.</p>
      <p>Big Data can be created by humans or generated by electronic devices; they
can come from different sources and be presented in different formats or types.
Therefore, the Big Data ontology represents the typical sources of Big Data: the
activities of people (both individuals and organizations) carried out through information and
communication equipment (social networks, smart phones, computers, cash
registers, ATMs, etc.) and automated devices (sensors, sensor networks,
camcorders, GPS, Internet of Things devices, automated production lines, drones).</p>
      <p>The ontology can also record the quality parameters of Big Data: noise, accuracy, degree
of trust in the source, signal quality, completeness, etc.</p>
      <p>The ontology makes it possible to represent the semantics of links between individual Big Data
fragments: temporal, geographic, communicational (for example, information from
smart phones that were in conversation mode), by device identifier, by subject,
by purpose, etc. Examples of Big Data ontology elements that correspond to
the elements of the ontological model (1) are given below (Fig. 2):</p>
      <p>Xcl  {"Big _ Data _ resource","s tan dard","type","format","metadata _ format",...}
Xind  {XXX101,...,MPEG7,...,JPG,...} ;
"metadata_ fornat"rier _ cl"format" ;
{ri}  {" has _ type","has _ resource","based _ on",...} ;
{p j}  {"annotation","size","date",...}.</p>
      <p>The elements of this ontology are matched with the ontology of the user's task to
search for pertinent sources of Big Data.</p>
    </sec>
    <sec id="sec-11">
      <title>10 Comparison of Natural Language Texts</title>
      <p>To use ontological knowledge for comparing such information objects as annotations
(unstructured NL texts), it is necessary to provide mechanisms for linking elements
of their content with ontology terms. Such a mechanism can be based on the task
thesaurus, which represents the user's needs on the basis of the domain ontology.</p>
      <p>
        In general, a thesaurus is a dictionary of the basic concepts of a language, expressed by
separate words or phrases, with certain semantic connections between them [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
A thesaurus can be considered as a special case of an ontology [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. A task thesaurus is a
set of concepts necessary to describe and solve the problem for which the user is trying
to find information by analyzing some Big Data set. The weight of each
thesaurus concept characterizes the importance and pertinence of this concept for the
current user task. Thesaurus concepts can be imported from the domain ontology.
Thesauri are used in the semantic markup of NL texts [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. The similarity of two NL
texts is estimated by the semantic proximity function between their thesauri.
      </p>
      <p>To search for pertinent Big Data sets, the user's task thesaurus $Th_{task}$,
$Th_{task} = \{\langle t_m, w_m \rangle\},\ m = \overline{1, q}$, is compared with the thesauri of the Big Data
annotations from the set $I$, $I = annot(Big\_Data\_resource_j),\ j = \overline{1, n}$, and the
coefficients of their proximity $K_j$ are calculated:</p>
      <p>\[ K_j = \sum_{m=1}^{q} f(t_m) \cdot w_m, \quad m = \overline{1, q}, \]
where $f(t_m) = 0$ if $t_m \notin annot(Big\_Data\_resource_j)$ and
$f(t_m) = 1$ if $t_m \in annot(Big\_Data\_resource_j)$.</p>
      <p>It is assumed that $t_m \in annot(Big\_Data\_resource_j)$ if the annotation of the
resource $annot(Big\_Data\_resource_j)$ contains a text fragment that, according to
the lexical knowledge base, correlates with the thesaurus term $t_m$. The processed
resources are ordered by the values of $K_j$, and the user receives for further
analysis those Big Data sets for which the value of the semantic proximity function is
higher than the given threshold value $K$.</p>
    </sec>
    <sec id="sec-12">
      <title>11 Solution of Homonymy in Big Data Annotations</title>
      <p>The ambiguity of NL causes different interpretations of words. One commonly
encountered problem concerns homonyms. Examples of homonyms: “hyperbole”
as a stylistic figure in which an attribute is exaggerated and “hyperbola” as a plane
curve (one and the same word in Ukrainian), “bow” of a ship and “bow” and arrow,
“row” (line) and “row” (quarrel).</p>
      <p>Recognition is also applied to various types of lexical homonyms: homophones
(words pronounced alike but different in meaning: too and two, here and hear, meat and
meet, see and sea); homoforms (words that sound the same only in a
certain grammatical form); and homographs (words spelled alike but different in
meaning).</p>
      <p>
        If a word in an NL text has several possible meanings, then it is necessary
to choose the proper one from the context (with the help of knowledge from the domain
ontology). All recognized examples are gathered into a set of precedents that can be
processed by an ML algorithm to generate rules for resolving
homonymy (e.g. a decision tree). These rules usually depend on the particular NL.
Information for recognition can be acquired from semantic or non-semantic
Wiki resources, as well as from dictionaries of homonyms of various natural languages. If
we use a Wiki, then the content of the word is defined as the text of the relevant Wiki
article, from which links to other articles are obtained [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. We can use a Wiki for
recognition of new terminology or of new meanings of existing terms because Wiki
resources are much more dynamic and up-to-date than traditional vocabularies.
      </p>
      <p>The procedure for homonym recognition consists of the following steps:
1. The text for recognition is received as input.
2. Pre-processing of the text (splitting the text into sentences and words).
3. Normalization of the text (conversion of words into their base forms, change of
endings, etc.).
4. Comparison of every word with the homonym database.
5. If a word matches a homonym from the database, the algorithm of homonym
recognition is executed.</p>
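      <p>The steps above can be sketched as a small Python pipeline. Everything specific in it is an assumption made for illustration: the homonym database is a hypothetical dictionary with hand-picked context cue words, normalization is a toy rule that only strips a plural ending, and step 5 is replaced by a simple count of shared cue words instead of a decision tree.</p>
      <preformat>
```python
import re

# Hypothetical homonym database: base form -> senses with context cue words.
HOMONYMS = {
    "bow": {
        "ship part": {"ship", "deck", "stern"},
        "weapon": {"arrow", "archer"},
    },
}


def normalize(word):
    """Toy normalization: lowercase and strip a plural 's' (not real lemmatization)."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def recognize_homonyms(text):
    """Return (word, chosen sense) pairs for every homonym found in the text."""
    results = []
    for sentence in re.split(r"[.!?]+", text):           # step 2: sentences
        words = [normalize(w) for w in re.findall(r"[A-Za-z]+", sentence)]  # steps 2-3
        context = set(words)
        for word in words:
            if word in HOMONYMS:                          # step 4: database lookup
                senses = HOMONYMS[word]
                # step 5 (simplified): sense sharing the most cue words with the sentence
                best = max(senses, key=lambda s: len(senses[s].intersection(context)))
                results.append((word, best))
    return results


found = recognize_homonyms("The bow of the ship was damaged. He took a bow and arrows.")
```
</preformat>
      <p>In the first sentence the cue word "ship" selects the nautical sense, and in the second the normalized word "arrow" selects the weapon sense.</p>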
      <p>The algorithm of homonym recognition is based on a decision tree. A decision
tree (also called a classification tree) is used in Data Mining for building predictive
models. The structure of this tree contains the following elements: "leaves" and
"branches". The branches correspond to the values of attributes (the parameters that
define the value of the target function), and the leaves correspond to the values of the
result (the target function). To recognize (classify) a new case, we go down from the root of
the tree to a leaf according to the attribute values of the case.</p>
      <p>The top-down construction of the tree is an example of a "greedy"
algorithm, and today it is one of the most widespread strategies.</p>
      <sec id="sec-10-1">
        <title>System of Big Data resources selection</title>
        <p>[Fig. 4, image not reproduced. Labels recoverable from the figure: Web; Semantic Wiki; Ontologies of SA; Description of the task; Preprocessor of NL-description of the task in thesaurus; Block of machine learning; Generation of training sample; Construction of the decision tree; Interpretation of rules; Rules of homonymy solution; Big Data thesauruses; Block of thesauruses comparison; Search for synonyms; Homonymy solution; Evaluation of semantic proximity of Big Data resources to the task description; c1,…,cn; Big Data Resources; B1,…,Bn; Preprocessor of annotations in thesauruses; Linguistic knowledge base; Knowledge base of NL synonyms; Big Data Repositories; Big Data Annotations.]</p>
        <p>The general recursive scheme for constructing a decision tree from the learning
sample is as follows:</p>
        <p>Step 1. If the learning sample contains examples with different results, go to step 2. Otherwise,
attach to the last branch a leaf corresponding to this result and stop.</p>
        <p>Step 2. Select one of the m attributes of the learning sample (various algorithms differ
from one another in the criterion of this selection) and attach to the last branch a node
corresponding to this attribute Ai, which has qAi values.</p>
        <p>Step 3. Divide the learning sample into qAi subsets with m-1 attributes, link the
attribute node with branches corresponding to all these values, and execute step 1 for
each of these subsets of the learning sample with m-1 attributes.</p>
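        <p>The recursive scheme above can be sketched in Python as follows. The sketch makes several assumptions that the scheme leaves open: the attribute selection criterion is pluggable and defaults to taking the first remaining attribute, a majority leaf is produced when no attributes are left, and the tiny learning sample with the attributes "ship" and "arrow" is invented for illustration.</p>
        <preformat>
```python
from collections import Counter


def build_tree(examples, attributes, choose=lambda exs, attrs: attrs[0]):
    """examples: list of (features dict, result); attributes: list of attribute names."""
    results = [result for _, result in examples]
    # Step 1: one result only (or nothing left to split on) gives a leaf.
    if len(set(results)) == 1 or not attributes:
        return Counter(results).most_common(1)[0][0]
    # Step 2: select one attribute; the selection criterion is pluggable.
    attr = choose(examples, attributes)
    rest = [a for a in attributes if a != attr]
    # Step 3: one branch per observed value, then recurse with m-1 attributes.
    branches = {}
    for value in {features[attr] for features, _ in examples}:
        subset = [(f, r) for f, r in examples if f[attr] == value]
        branches[value] = build_tree(subset, rest, choose)
    return (attr, branches)


def classify(tree, features):
    """Go down from the root to a leaf following the attribute values of the case."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[features[attr]]
    return tree


# Invented learning sample: presence of context terms near the homonym "bow".
sample = [
    ({"ship": True, "arrow": False}, "ship part"),
    ({"ship": False, "arrow": True}, "weapon"),
    ({"ship": True, "arrow": False}, "ship part"),
]
tree = build_tree(sample, ["ship", "arrow"])
```
</preformat>
        <p>Passing a different choose function, for example one based on information gain, changes only step 2 and leaves the recursion untouched, which reflects the remark that algorithms differ in the selection criterion.</p>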
        <p>In this case, attributes correspond to the presence or absence of various terms (their
forms, synonyms, etc.) in the context of the descriptions of the various meanings of the homonym
in the learning sample. Such a learning sample is generated from Wikipedia, other Wiki
resources, vocabularies and definitions. The size and pertinence of the learning sample
determine the quality of recognition.</p>
        <p>The decision rule based on the decision tree is: "if a word from {c1, c2, ..., cn} is combined
with the word D in the text, then the word D is selected". The intelligent system for
annotation comparison (Fig. 4) has to perform these actions.</p>
      </sec>
    </sec>
    <sec id="sec-13">
      <title>12 Conclusions</title>
      <p>The results of analyzing the existing means of Big Data description show the lack
of generally accepted standards for such metadata representation. Therefore, the
proposed methods for the analysis of natural language annotations of Big Data are to
date the most adequate means of comparing the semantics of Big Data sets with
particular user tasks in order to select the pertinent information for analysis.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>T.</given-names>
            <surname>Erl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Khattak</surname>
          </string-name>
          , &amp; P.
          <string-name>
            <surname>Buhler</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Big Data Fundamentals</article-title>
          . Prentice Hall: Upper Saddle River, NJ, USA.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>N.</given-names>
            <surname>Marz</surname>
          </string-name>
          &amp; J.
          <string-name>
            <surname>Warren</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Big Data: Principles and best practices of scalable real-time data systems</article-title>
          . New York; Manning Publications Co.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.L.</given-names>
            <surname>Brodie</surname>
          </string-name>
          ,
          <string-name>
            <surname>O. Erling,</surname>
          </string-name>
          <article-title>The meaningful use of Big Data: four perspectives - four challenges</article-title>
          ,
          <source>SIGMOD Rec</source>
          .
          <volume>40</volume>
          (
          <issue>4</issue>
          ) (
          <year>2012</year>
          )
          <fpage>56</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>H.</given-names>
            <surname>Abbes</surname>
          </string-name>
          ,
          <string-name>
            <surname>F.</surname>
          </string-name>
          <article-title>Gargouri: M2Onto: an approach and a tool to learn OWL ontology from MongoDB database /</article-title>
          / Madureira,
          <string-name>
            <given-names>A.M.</given-names>
            ,
            <surname>Abraham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Gamboa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Novais</surname>
          </string-name>
          , P. (eds.)
          <article-title>ISDA 2016</article-title>
          .
          <article-title>AISC</article-title>
          , vol.
          <volume>557</volume>
          ,
          <year>2017</year>
          . - Pp.
          <fpage>612</fpage>
          -
          <lpage>621</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>319</fpage>
          -53480-0_
          <fpage>60</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>K.</given-names>
            <surname>Baclawski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Berg-Cross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fritzsche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          <article-title>Westerinen Ontology Summit 2017 communiqué-AI, learning, reasoning and ontologies</article-title>
          .
          <source>Applied Ontology</source>
          ,
          <year>2018</year>
          , P.
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          . - http://www.ccs.neu.edu/home/kenb/ pub/2017/09/public.pdf .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>K.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Seligman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          , Ch. Kurcz,
          <string-name>
            <given-names>M.</given-names>
            <surname>Greer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macheret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sexton</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          <article-title>Eckstein “Big Metadata”: The Need for Principled Metadata Management in Big Data Ecosystems /</article-title>
          / Proceedings of the Company DanaC@SIGMOD, Snowbird,
          <string-name>
            <surname>UT</surname>
          </string-name>
          , USA,
          <year>2014</year>
          . - P.
          <fpage>46</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Chinchwadkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fekete</surname>
          </string-name>
          , Ramachandran K.
          <article-title>Metadata-as-a-</article-title>
          <source>Service //in Proceedings of the 31st IEEE International Conference on Data Engineering Workshops (ICDEW)</source>
          ,
          <year>2015</year>
          . - P.
          <fpage>6</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>A</article-title>
          . Jeusfeld Metadata // Encyclopedia of Database Systems, Springer,
          <year>2009</year>
          . - P.
          <fpage>1723</fpage>
          -
          <lpage>1724</lpage>
          . - http://www.springerlink.com/content/h241167167r35055/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>M.</given-names>
            <surname>Grotschel</surname>
          </string-name>
          , J.
          <source>Lugger Scientific Information System and Metadata. Konrad-Zuse-Zentrum für Informationstechnik</source>
          , Berlin. - http://www.zib.de/ groetschel/pubnew/paper/groetschelluegger
          <year>1999</year>
          .pdf
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>B.</given-names>
            <surname>Haslhofer</surname>
          </string-name>
          ,
          <string-name>
            <surname>W. Klas</surname>
          </string-name>
          <article-title>A Survey of Techniques for Achieving Metadata Interoperability // ACM Computing Surveys</article-title>
          , Vol.
          <volume>42</volume>
          , No.
          <volume>2</volume>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <article-title>Metadata Standards and Applications</article-title>
          . Introduction: Background, Goals, and
          <string-name>
            <given-names>Course</given-names>
            <surname>Outline. ALCTS</surname>
          </string-name>
          . - http://www.loc.gov/catworkshop/courses/ metadatastandards/pdf/MSA Instructor Manual.pdf
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Uniform Resource Identifier (URI): Generic Syntax</surname>
          </string-name>
          . - http://tools.ietf.org/html/rfc3986 .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lagoze</surname>
            <given-names>C</given-names>
          </string-name>
          .
          <article-title>Metadata for the Web</article-title>
          . Cornell University.
          <source>CS 431 - March 2</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. MPEG-21
          <string-name>
            <given-names>Multimedia</given-names>
            <surname>Framework</surname>
          </string-name>
          , Introduction, ISO/IEC, http://mpeg.telecomitalialab.com/standards/mpeg-21/mpeg-21.htm .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. MPEG-1, ISO/IEC,
          <year>1996</year>
          . - http://mpeg.telecomitalialab.com/standards/mpeg-1/mpeg1.htm
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. MPEG-2, ISO/IEC,
          <year>2000</year>
          . - http://mpeg.telecomitalialab.com/standards/mpeg-2/mpeg2.htm
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <article-title>Overview of the MPEG-4 Standard</article-title>
          , ISO/IEC,
          <year>2002</year>
          . - http://mpeg.telecomitalialab.com/standards/mpeg-4/mpeg-4.htm
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18. MPEG-7
          <string-name>
            <surname>Overview</surname>
          </string-name>
          , ISO/IEC,
          <year>2002</year>
          . - http://mpeg.telecomitalialab.com/standards/mpeg7/mpeg-7.htm
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. MPEG-21 Overview v.
          <volume>4</volume>
          ,
          <year>2002</year>
          . - http://mpeg.telecomitalialab.com/standards/mpeg21/mpeg-21.htm.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>20. Dublin Core Metadata Elements http://www.faqs.org/rfcs/rfc2413.html .</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>T. Gruber</surname>
          </string-name>
          <article-title>What is an Ontology?</article-title>
          - http://www-ksl.stanford.edu/kst/what-is-anontology.html.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22. N. Guarino Formal Ontology in Information Systems // Formal Ontology in
          <source>Information Systems. Proc. of FOIS'98</source>
          ,
          <year>1998</year>
          . - P.
          <fpage>3</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23. L.
          <string-name>
            <surname>Obrst</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <string-name>
            <surname>Ceusters</surname>
            , I. Mani,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Ray</surname>
            ,
            <given-names>B. Smith</given-names>
          </string-name>
          <article-title>The evaluation</article-title>
          of ontologies // Semantic Web,
          <string-name>
            <surname>Springer</surname>
            <given-names>US</given-names>
          </string-name>
          ,
          <year>2007</year>
          . - P.
          <fpage>139</fpage>
          -
          <lpage>158</lpage>
          . - http://philpapers.org/archive/OBRTEO-6.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>A.Y. Gladun</surname>
            ,
            <given-names>J.V.</given-names>
          </string-name>
          <string-name>
            <surname>Rogushina</surname>
          </string-name>
          <article-title>Semantic technologies: principles and practice</article-title>
          . - K.:
          <string-name>
            <surname>ADEF-Ukraine</surname>
          </string-name>
          ,
          <year>2016</year>
          . -
          <fpage>308</fpage>
          p. [in Ukrainian]
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>ISO</surname>
          </string-name>
          25964-1:
          <fpage>2011</fpage>
          ,
          <article-title>Thesauri and interoperability with other vocabularies. Part 1: Thesauri for information retrieval</article-title>
          / Geneva: International Organization for Standards,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>A.</given-names>
            <surname>Gladun</surname>
          </string-name>
          , &amp; J.
          <string-name>
            <surname>Rogushina</surname>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Use of semantic web technologies and multilinguistic thesauri for knowledge-based access to biomedical resources</article-title>
          .
          <source>International Journal of Intelligent Systems and Applications</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <fpage>11</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>A.</given-names>
            <surname>Gladun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rogushina</surname>
          </string-name>
          ,
          <string-name>
            <surname>Valencia-García</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Béjar</surname>
            ,
            <given-names>R. M.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Semantics-driven modelling of user preferences for information retrieval in the biomedical domain</article-title>
          .
          <source>Informatics for health and social care</source>
          ,
          <volume>38</volume>
          (
          <issue>2</issue>
          ),
          <fpage>150</fpage>
          -
          <lpage>170</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>J. Rogushina</surname>
          </string-name>
          <article-title>Semantic Wiki resources and their use for the construction of personalized ontologies /</article-title>
          / CEUR Workshop Proceedings 1631 ,
          <year>2016</year>
          . - P.
          <fpage>188</fpage>
          -
          <lpage>195</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>