<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Curation Approach to Management of Research Data. Use Cases for an Upgrade of the Thermophysical Database THERMAL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrey Kosinov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adilbek Erkimbaev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Geirgy Kobzev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vladimir Zitserman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Joint Institute for High Temperatures, Russian Academy of Sciences</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>325</fpage>
      <lpage>335</lpage>
      <abstract>
        <p>Procedures are considered to support extensive archives of digital data called “data storage”. Particular attention is paid to the support of scientific data. It is shown that the activities aimed at updating the thermophysical THERMAL database correspond to the approaches provided by the “data curation”. A communication system for metadata with external ontology is proposed. The new version of metadata provides the possibility of multilateral assessment of the origin, quality and status of scientific data. It is shown that the use of new metadata provides a significant increase in the value of these studies.</p>
      </abstract>
      <kwd-group>
        <kwd>Research Data</kwd>
        <kwd>Data Curation</kwd>
        <kwd>Data Quality</kwd>
        <kwd>Thermophysical Database</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Digital data uploaded to repositories or databases require ongoing procedures that
guarantee safety, top-level quality and enduring access to the data. The set of such
procedures is the subject of a particular data management activity called data
curation. This term is rarely used in the Russian literature, although all the actions
necessary for data integrity and management are of course performed to support digital
repositories or databases. The meaning and content of the concept of data curation
can be revealed by referring to the history of its appearance. The concept originates
from museum practice, which is traditionally based on the curator's work on the
preservation, renovation and description of exhibits.</p>
      <p>
        The term “data curation” apparently first appeared in an article by Diana
Zorich [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which pointed to the common problems facing libraries, museums and
research centers involved in supporting digital collections. According to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], digital
archives, as well as their supporting tools (vocabularies, thesauri, metadata),
should be regularly monitored and updated to keep the data consistent and to maintain its
quality, availability, etc.; activities in this direction are the essence of the curation process.
      </p>
      <p>Oddly enough, digital data is subject to erosion, as are physical artifacts,
manuscripts or museum exhibits. This erosion can be related to the use of outdated metadata,
terminology, dictionaries, formats and software, as well as to the absence of references to more
relevant documents or external resources. By analogy with engineering, it can be
considered as technological obsolescence (or deterioration) of the data structure, file
formats, software, etc. In parallel with content obsolescence, an unrecoverable failure of
the storage media (bit rot in IT slang) is possible during data storage.</p>
      <p>
        The history of the “digital curation” concept and its gradual adoption in the data
management community is considered in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In particular, the concept of “data curation” is there
clearly separated from the narrower, service-oriented concepts of “archiving” and
“preserving”. Unlike the latter, curation involves not only the preservation and
maintenance of digital storage, but also its indispensable enrichment by expanding its
functions and content. For example, a common and effective way of enriching content
is to place it in a wider context by linking the data set to thematically related
resources, so-called contextualizing.
      </p>
      <p>
        The main objectives of data curation usually include
storage, description, safety measures, so-called cleaning (that is, monitoring and
restoring quality), as well as a number of other measures. The expanded definition of
the Digital Curation Centre [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] covers all activities related to data management,
from data creation, digitization and documentation to ensuring accessibility and future reuse.
The detailing of these processes, carried out in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], made it possible to identify about 50
curation practices, most of which fall into such categories as data preservation,
data cleaning and, finally, description in terms of a complex metadata structure.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Curation of Research Data</title>
      <p>
        The purpose of this report is to discuss the specific measures provided by
digital curation as applied to the project for updating the
Thermophysical Database THERMAL. The project, presented in a report at the previous
conference [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], includes: a significant expansion of the database volume by lifting
the restrictions adopted when the database was created in the 1970s;
creation of tools for flexible variation of the data structure, reflecting the uniqueness of
objects and their characteristics; and transition to a new platform that can store
and process data of various structures and formats. A significant part of the activities
performed during the project in fact amounts to data curation, as it
comes down to checking and correcting old documents in accordance with the newly
adopted format, vocabulary and requirements for data completeness and quality.
      </p>
      <p>In general, curation refers to digital objects of arbitrary origin and kind. Therefore,
numerous data preservation measures, such as regular backup, defect detection at
the bit level, and overcoming the technological obsolescence of hardware or file formats, are
applicable in all cases regardless of the content. In solving scientific problems, the
curation process provides not just conservation, but also confirmation and reliable
expansion of previous experimental or simulation data. Conversely, the absence or
poor quality of the curation process inevitably leads to the loss, distortion or
misinterpretation of data.</p>
      <p>
        A brief list of features and capabilities of “data curation” as applied to e-Science
(or data-intensive science) was given by the Digital Curation Centre [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In this list,
the specificity of scientific data is, to a certain extent, taken into account at all stages
of the curation process. For example, long-term data storage may require the replacement
of obsolete storage devices, a problem already encountered in astronomy.
At the stage of data cleaning (that is, data correction and updating), it is important to
establish links between different versions of evolving datasets or between primary and
secondary data. However, the specificity of scientific data curation is most noticeable
in data description, that is, in the composition and structure of metadata.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Metadata (Update and Extension)</title>
      <p>
        In general terms, metadata document the context and record information about how
the data were obtained and what processing and verification procedures were
performed during the retention period. There is an extensive literature on scientific
metadata and their use in various disciplines [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7–9</xref>
        ]. Metadata accompanying subject
information make it possible to identify a dataset and its position in the repository, define
access rules, describe the logical structure and data formats, and ensure the operation of
various data analysis tools. Metadata standards for different disciplines and types of
documents are collected in the catalog (rd-alliance.github.io/metadata-directory/) and
the “Disciplinary metadata” section of the Digital Curation Centre
(www.dcc.ac.uk/resources/metadata-standards). Both sources also contain
references to domain-agnostic standards for the formal description of digital resources
(e.g. the Dublin Core metadata set) or for the identification and citation of digital
resources (e.g. DataCite Metadata Store). Metadata for thermophysical properties
(ThermoML), characteristics of ordinary materials (MatML) and nanomaterials are
described in detail in [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref7">7, 10–12</xref>
        ].
      </p>
      <p>Regardless of the subject area, scientific metadata must satisfy a number of
requirements that guarantee sufficient completeness and accuracy in the presentation of
each data set. The relevant elements should provide: unambiguous identification of the
object of study; information about the source and the data acquisition
method (research method, equipment, program code, etc.); uncertainty and data quality
information; linking with controlled vocabularies or ontologies; and the possibility of
flexible adjustment to the features of the object and its characteristics. The expansion
of metadata carried out while updating the THERMAL database implements
each of these requirements, see Table 1.</p>
      <p>First of all, curation expands the possibilities for identifying objects,
which can now include, along with inorganic substances, complex
organics, natural and industrial materials, and so on. The “Identification” metadata
element provides, along with a stoichiometric formula, for the use of several common names
(synonyms), as well as links to publicly available databases. The pointer to the
database and the corresponding identifier uniquely identify the object and, in
addition, give access to information that complements the information stored in THERMAL.
As an example, Fig. 1 shows the identification of the compound called epoxyethane
(oxirane, ethylene oxide) in the old and new versions of the database.</p>
      <table-wrap id="table-1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr>
              <th>Metadata element</th>
              <th>Content</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <th colspan="2">Old metadata structure</th>
            </tr>
            <tr>
              <td>Stoichiometric formula</td>
              <td />
            </tr>
            <tr>
              <td>Substance class</td>
              <td />
            </tr>
            <tr>
              <td>Properties</td>
              <td />
            </tr>
            <tr>
              <td>Properties type</td>
              <td />
            </tr>
            <tr>
              <td>Phase</td>
              <td />
            </tr>
            <tr>
              <td>Phase transition</td>
              <td />
            </tr>
            <tr>
              <th colspan="2">New metadata structure</th>
            </tr>
            <tr>
              <td>Unique record ID</td>
              <td />
            </tr>
            <tr>
              <td>Data type</td>
              <td>[bibl, full-text, factual]</td>
            </tr>
            <tr>
              <td>Data status</td>
              <td>[experiment or simulation, predicted, critically evaluated, recommended, stale]</td>
            </tr>
            <tr>
              <td>Research type</td>
              <td>[experimental, theory, simulation, review]</td>
            </tr>
            <tr>
              <td>Source</td>
              <td>Provenance [bibl, database, external agency]; Data origin [method, equipment, software]</td>
            </tr>
            <tr>
              <td>Identification</td>
              <td>[common names, stoichiometric formula, public database ID]</td>
            </tr>
            <tr>
              <td>Linking to ontology classes</td>
              <td>[Linking to sub-classes of the Chemical_entity]</td>
            </tr>
            <tr>
              <td>Linking to ontology classes</td>
              <td>[Linking to sub-classes of the Quality]</td>
            </tr>
            <tr>
              <td>Linking to ontology classes</td>
              <td>[Linking to sub-classes of the State_of_matter]</td>
            </tr>
            <tr>
              <td>Linking to ontology classes</td>
              <td>[Linking to sub-classes of the Transition]</td>
            </tr>
            <tr>
              <td>Data Quality</td>
              <td>Uncertainty [type, value]; Data quality attributes (timeliness, reliability, currency, completeness, etc.)</td>
            </tr>
            <tr>
              <td>Data Features</td>
              <td>[SubstanceFeatures, Sample, Influence Factors]</td>
            </tr>
            <tr>
              <td>References</td>
              <td>Full-text; Tables or equations; External documents (from Web or Server)</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-3-13">
        <title>Linking to ontology classes</title>
        <p>Fig. 1. Identification of the substance “ethylene oxide” in the old (top) and new (bottom)
versions of the database.</p>
        <p>Another example is the natural mineral mullite, where variations of the elemental
composition allow the use of several stoichiometric formulas (for instance,
Al6Si2O13 and Al4SiO8), and the exact identification is provided by the record (URL:
www.webmineral.com/data/Mullite.shtml#.WaMcG_hJaUk) in the mineralogical
database WEBMINERAL.</p>
        <p>A more complete identification is provided by linking metadata with ontology
classes, which include entities that reflect the types of objects, their states and
properties, Fig. 2.</p>
        <p>In particular, Chemical_entity identifies types of substances or materials, focusing
on the systematics adopted in chemistry (elements, oxides, acids ...), as well as on
categories determined by properties or by application (polymer, solution, mixture,
refrigerant, fuel ...). A pointer to subclasses of the class mixture makes it possible to
identify binary and multicomponent alloys and solutions, for example, such
objects relevant to thermophysics as air, humid air, combustion products, etc. The
State_of_matter class makes it possible to detail the phase and the type of crystal lattice
based on an extensive hierarchy of child classes.</p>
        <p>
          Similarly, linking to classes that inherit from the Quality and Transition classes reflects
the rich variety of physical properties inherent in an object and the phase transitions
that occur in it. It is essential that linking metadata to the ontology during curation
provides an unambiguous interpretation of terms and concepts and, through editing of the
ontology, the possibility of flexible adjustment as
new objects and concepts emerge. For example, concepts such as second critical point [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ],
topological insulator [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] or previously unknown allotropic forms of carbon
(metallic carbon, T-carbon) have recently been included.
        </p>
        <p>
          New curatorial perspectives have emerged with the inclusion of “Data Quality” and
“Data Features” in the metadata set, see Table 1. Alternative approaches to evaluating
the quality of scientific data are discussed in detail in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. It was shown that the
best way to assess research data quality is to combine the traditional uncertainty
assessment with a certification procedure based on several quality attributes, each of
which represents some aspect of quality. It is necessary to select a specific metric for
the multidimensional certification of the dataset and a data compliance indicator for
each of the quality attributes. In many cases, it is useful to apply domain-agnostic
metrics reflecting quality factors that are important to data consumers, for example
accuracy, timeliness, reliability, completeness, relevancy, interpretability, etc. Such an
approach to quality certification is most justified in interdisciplinary projects, for
example, when integrating thermal data with the performance characteristics of
structural materials.
        </p>
        <p>
          However, when updating the thermophysical database THERMAL, the number of quality
attributes can be reduced. The data certification proposed by the authors [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]
for physico-chemical properties is based on three attributes:
completeness of information about the state and preparation of the sample in the experiment;
completeness of the description of the method and measuring instruments, or of the codes and data
processing in the case of simulation; and consistency of the numerical data with
fundamental laws and regularities as well as with previous, fairly reliable measurements.
Combined with uncertainty assessment, certification covers three main aspects of
data quality: accuracy, completeness, and consistency. The technique provides a
generalized evaluation of the data set by assigning each attribute a quality
level (high, medium or low), focusing on compliance with the requirements of
completeness and consistency. Based on this data curation
technique [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], the expert can select top-quality data sets and assign them the special
Recommended status (using the Data Status element). The information needed
to assess data quality is available only for two types of data
(Full-text and Factual) specified in the Data Type element, Table 1. Moreover, for
unstructured data in the form of the text of an article (full-text data), it is justified to
conduct only certification with an indication of quality attributes, without a separate
assessment of uncertainty.
        </p>
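        <p>As an illustration only, the certification scheme described above can be sketched in Python. The field names, the function suggest_status and, in particular, the rule that proposes the Recommended status (all three attributes rated high, plus an uncertainty value for factual data) are assumptions made for this sketch; the text does not prescribe a formal rule, and the final decision rests with the expert.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional

LEVELS = ("low", "medium", "high")


@dataclass
class QualityCertificate:
    """Three quality attributes plus an optional uncertainty value (hypothetical layout)."""
    sample_completeness: str   # state and preparation of the sample
    method_completeness: str   # method and instruments, or simulation code
    consistency: str           # agreement with known regularities and earlier data
    uncertainty: Optional[float] = None  # left empty for full-text data

    def __post_init__(self):
        for name in ("sample_completeness", "method_completeness", "consistency"):
            if getattr(self, name) not in LEVELS:
                raise ValueError(f"{name} must be one of {LEVELS}")


def suggest_status(cert, data_type):
    """Suggest 'recommended' or return None; the rule itself is an assumption."""
    if data_type not in ("full-text", "factual"):
        return None  # bibliographic records carry too little information
    ratings = (cert.sample_completeness, cert.method_completeness, cert.consistency)
    if any(level != "high" for level in ratings):
        return None
    if data_type == "factual" and cert.uncertainty is None:
        return None  # factual data also need an uncertainty assessment
    return "recommended"
```
]]></preformat>
        <p>In this sketch a full-text record can be suggested for the Recommended status without an uncertainty value, which mirrors the remark that full-text data are certified by quality attributes only.</p>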
        <table-wrap id="table-2">
          <label>Table 2</label>
          <table>
            <thead>
              <tr>
                <th>Data Set Title</th>
                <th>Data Feature</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Amorphous polymeric nitrogen - toward an equation of state</td>
                <td>SubstanceFeatures [amorphous, polymeric]</td>
              </tr>
              <tr>
                <td>Melting point of high-purity germanium stable isotopes</td>
                <td>SubstanceFeatures [stable isotopes]</td>
              </tr>
              <tr>
                <td>Relationship between changes in the crystal lattice strain and thermal conductivity of high burnup UO2 pellets</td>
                <td>Sample [pellet]; Influence Factors [high burnup]</td>
              </tr>
              <tr>
                <td>Study of near-critical states of liquid - vapor phase transition of metals by isentropic expansion method of shock-compressed porous sample</td>
                <td>Sample [porous]</td>
              </tr>
              <tr>
                <td>Thermophysical properties of liquid Co measured by electromagnetic levitation technique in a static magnetic field</td>
                <td>Influence factors [field]</td>
              </tr>
              <tr>
                <td>Phase diagram of water under an applied electric field</td>
                <td>Influence factors [field]</td>
              </tr>
              <tr>
                <td>Shock compression of preheated molybdenum</td>
                <td>Influence factors [prehistory]</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-23">
        <title>Data Features</title>
        <p>The concept of Data Features in the metadata set is based on a different data evaluation
criterion. It makes it possible to select those data sets where there is some deviation from the
standard (i.e. an anomaly) in the characterization of the object or its properties. The
well-defined specificity (features) that distinguishes one dataset from another helps
to overcome the inevitable contradiction between structured data and poorly
formalized information hidden in context. As can be seen from Table 1, the “Data
Features” element includes three groups of features: SubstanceFeatures, Sample and
Influence Factors. The first extends the traditional substance
identification by indicating the isomeric form, nonstoichiometry, isotopic composition, etc. The
Sample group includes pointers to the features of the sample: shape, size, surface
condition, prehistory, etc. Finally, the Influence Factors group includes pointers to
external factors that determine the experiment and the properties of the substance: external
field, mechanical load, environment, radiation, etc. Some examples of non-standard
data sets in Table 2 illustrate the features that define the specifics of a substance and its
properties. Such specificity can be attributed to a data set of any of the three
types indicated in the Data Type element (Table 1), in contrast to the quality
assessment, which depends on the type of data.</p>
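        <p>A minimal Python sketch of how the Data Features element might be used to pick out non-standard data sets is given below. The first two records follow rows of Table 2; the dictionary layout, the helper has_feature and the third record without features are hypothetical and serve only as an illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Records modeled on Table 2; key names are assumptions made for this sketch.
records = [
    {
        "title": "Relationship between changes in the crystal lattice strain "
                 "and thermal conductivity of high burnup UO2 pellets",
        "data_features": {"Sample": ["pellet"], "Influence Factors": ["high burnup"]},
    },
    {
        "title": "Shock compression of preheated molybdenum",
        "data_features": {"Influence Factors": ["prehistory"]},
    },
    {
        # Hypothetical record with no deviation from the standard.
        "title": "A data set without non-standard features",
        "data_features": {},
    },
]


def has_feature(record, group, value=None):
    """True if the record has any feature in the group, or the given value in it."""
    values = record.get("data_features", {}).get(group, [])
    return bool(values) if value is None else value in values


# Data sets with at least one feature in any of the three groups.
non_standard = [r["title"] for r in records if any(r["data_features"].values())]

# Data sets whose sample has a declared peculiarity.
with_sample_features = [r["title"] for r in records if has_feature(r, "Sample")]
```
]]></preformat>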
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Data Cleaning and Preservation</title>
      <p>
        The update procedure of the THERMAL database includes (along with the volume
expansion) a revision of old records based on the new metadata system. As a result, the
data curator needs to check the completed conversion. This activity, called “data
cleaning” (or cleansing), involves the detection and correction of “dirty”, that is,
distorted or incomplete, data [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Data pollution occurs for various reasons, among
which may be distortions of old records (input errors, duplication, incorrect
distribution of data among fields) and errors in using the new metadata to define the object
and properties, see Table 1.
      </p>
      <p>Previous data were entered without the use of controlled dictionaries, so the most
important task of cleaning is to eliminate ambiguity in terms and concepts by mapping
them to ontology classes. For example, a whole set of lexical elements
(including English and Russian terms) used to be associated with the concept of dynamic
compressibility, namely: Hugoniot, Hugoniot data, Hugoniot adiabat, shock Hugoniot,
shock adiabat, shock compression, release isentrope, etc. Linking this whole set of
terms with a single ontology class (Dynamic Compressibility) eliminates synonymy
and dramatically facilitates semantic search. The same procedure requires that the
different substance names used in different records now appear in each record as a set of
common names, which eliminates search losses.</p>
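      <p>The term unification described above can be sketched in Python as a simple lookup table. The list of lexical variants and the class name are taken from the text; the functions normalize_term and clean_record and the record layout are hypothetical. The legacy keywords are kept in the record, and the ontology class is added alongside them.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Lexical variants listed in the text, all mapped to one ontology class.
DYNAMIC_COMPRESSIBILITY = "Dynamic Compressibility"

TERM_TO_CLASS = {
    term.lower(): DYNAMIC_COMPRESSIBILITY
    for term in (
        "Hugoniot", "Hugoniot data", "Hugoniot adiabat", "shock Hugoniot",
        "shock adiabat", "shock compression", "release isentrope",
    )
}


def normalize_term(raw):
    """Return the ontology class for a legacy keyword, or None if it is unmapped."""
    key = " ".join(raw.split()).lower()  # collapse whitespace, ignore case
    return TERM_TO_CLASS.get(key)


def clean_record(record):
    """Return a copy of the record with a 'property_class' list added."""
    classes = sorted({c for c in map(normalize_term, record["keywords"]) if c})
    return {**record, "property_class": classes}
```
]]></preformat>
      <p>A search by the single class name then finds every record regardless of which synonym was originally entered.</p>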
      <p>In addition to linking and unifying terms, data cleaning also provides for the
correction of content, first of all, the correction (or flagging) of obviously obsolete
numerical data. At the same time, old records with correctly filled fields have
historical value, even with outdated data. Therefore, the proposed cleaning measures
exclude the physical removal of a record. One of them is to add to the old
record an indication of the unreliability of the data by linking it with later records (or network
resources) that contain reliable data. Another measure is to mark the obsolete data
as “low quality” using the attribute “consistency” (see “Data Quality” above).
Finally, the status “Stale” or “Recommended” can be assigned, which makes it possible
to immediately separate high-quality data from clearly obsolete data during the
search.</p>
      <p>Regular data cleaning inevitably requires placing the data in a new context,
explaining concepts, offering an introduction to the available handbooks, databases,
manuals, and so on. In the list of data curation practices this activity is called
Contextualizing, i.e. “Use metadata to link the data set to related publications,
dissertations, and/or projects that provide added context for how the data were generated
and why”. The new set of metadata (Table 1) allows linking with external resources
through two elements, “Identification” and “References”. The first uses a link to
public databases to accurately identify a substance and to give access to additional data.</p>
      <p>For example, by including the reference CSID: 6114 (the identifier in the
ChemSpider database) in the “Identification” field, we can select “Ethylene oxide” from the group of
substances with the same formula C2H4O, gaining access to the reference data,
Fig. 3.</p>
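      <p>The disambiguation by a public database identifier can be sketched as follows. The common names, the formula C2H4O and the ChemSpider identifier 6114 for ethylene oxide come from the text; the key names, the helper functions and the second record (acetaldehyde, which has the same formula and is given here without an external identifier) are assumptions added for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# "Identification" elements for two substances sharing one stoichiometric formula.
ethylene_oxide = {
    "common_names": ["epoxyethane", "oxirane", "ethylene oxide"],
    "stoichiometric_formula": "C2H4O",
    "public_database_id": {"database": "ChemSpider", "id": "6114"},
}
acetaldehyde = {
    "common_names": ["acetaldehyde", "ethanal"],
    "stoichiometric_formula": "C2H4O",
    "public_database_id": None,  # identifier deliberately left unfilled
}
substances = [ethylene_oxide, acetaldehyde]


def find_by_formula(records, formula):
    """The formula alone is ambiguous: it may match several substances."""
    return [r for r in records if r["stoichiometric_formula"] == formula]


def find_by_public_id(records, database, identifier):
    """A database name plus an identifier selects exactly one substance."""
    return [
        r for r in records
        if r["public_database_id"]
        and r["public_database_id"]["database"] == database
        and r["public_database_id"]["id"] == identifier
    ]
```
]]></preformat>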
      <p>The References element (Table 1) provides a link to thematically related
resources, but without requiring exact identification of the object. An example is the
link to the SACADA database [Samara Carbon Allotrope Database,
http://sacada.sctms.ru/], which expands the set of information on carbon allotropes
presented in the THERMAL database.</p>
      <p>Obviously, contextualization as a data curation practice requires the participation
of human experts rather than programmers. Therefore, the term “cleaning” within the
framework of data curation means not only the rejection of dirty data, but also data
analysis and decision making. Thus, contextualization as an element of data curation
fully corresponds to the expression “added value to digital research data
throughout its lifecycle”, given in response to the question “What is digital curation?”</p>
      <sec id="sec-4-1">
        <title>Preservation</title>
        <p>Long-term storage is, to a large extent, a purely technological problem related to the service
life of memory devices (a maximum of 100 years), the solution of which requires
significant funding. The required measures include regular backup and failure
detection at the bit level. Particular attention should be paid to the physical protection of
the data storage, as the frequency of bit rot increases significantly due to
pollution, thermal and radiation exposure and other external agents. Some
protection measures are particularly suited to research data. Among them are format migration
(i.e. consistent conversion in line with technological changes) and emulation, which recreates
outdated hardware and software on a modern platform.</p>
        <p>
          When transferring the THERMAL database to the Big Data platform, the obsolete
ISO-2709 format, adopted in the 1960s [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] as an ISO standard for
bibliographic description, is discarded. In the new version, documents in ISO format are
converted to structured text in JSON format, one of the most convenient formats for
exchanging data and metadata [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The advantages of a text document are
simple reading and editing, accessibility for human perception, and a convenient form of
storage and exchange of arbitrarily structured information. The JSON format (unlike ISO)
is convenient for storing factual information in the form of tables and nested
structures, as well as numerous links to files of different formats (images, presentation
files, Web pages, etc.), which is especially important when expanding the functions of
the THERMAL database. It is also important that JSON is a working
format for some platforms, in particular for Apache Spark, which supports the exchange,
storage and querying of distributed data. There is already experience of using
structured text as a means of thermal properties data interchange [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
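        <p>The conversion step can be illustrated with a short Python sketch that uses only the standard json module. It assumes that a legacy record has already been parsed from the ISO-2709 file into a flat dictionary; parsing the ISO format itself is outside the scope of the sketch. The function legacy_to_json, all key names and the sample values are hypothetical and do not reproduce the actual THERMAL schema.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import json


def legacy_to_json(legacy):
    """Map a flat legacy record to nested JSON text (hypothetical layout)."""
    document = {
        "record_id": legacy["id"],
        "identification": {
            "stoichiometric_formula": legacy["formula"],
            "common_names": legacy.get("names", []),
        },
        "properties": legacy.get("properties", []),
        "phase": legacy.get("phase"),
        "references": legacy.get("links", []),
    }
    # ensure_ascii=False keeps non-Latin (e.g. Russian) terms readable in the file.
    return json.dumps(document, ensure_ascii=False, indent=2, sort_keys=True)


# Illustrative input; the values are invented for the example.
legacy_record = {
    "id": "example-0001",
    "formula": "C2H4O",
    "names": ["ethylene oxide", "oxirane"],
    "properties": ["heat capacity"],
    "phase": "gas",
}

text = legacy_to_json(legacy_record)
restored = json.loads(text)  # the text form round-trips without loss
```
]]></preformat>
        <p>Because the result is plain text, it can be read and edited by a curator and loaded by any platform that accepts JSON documents.</p>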
        <p>
          Along with the obsolescence of formats, long-term storage is affected by outdated software that is
incompatible with more modern platforms. This fully applies to the
THERMAL database, built on the documentary Database Management
System (DBMS) CDS/ISIS [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], which has a fairly limited scope, mainly the storage of
catalogue cards. The transition to the Apache Spark platform
(http://spark.apache.org/docs/) in combination with ontology-based data management
opens up much greater opportunities for storing and integrating data of arbitrary
format converted into JSON. In turn, the ontology supports a single vocabulary of
all concepts, extended by logical connections and axioms. An ontology encoded as an
OWL file becomes a control superstructure capable of semantic integration of
heterogeneous data.
        </p>
        <p>The arsenal of tools proposed in the project (JSON, Apache Spark, SPARQL)
meets modern standards for storing and processing heterogeneous data and is capable
of supporting interaction with many types of storage. Thereby, long-term data storage
will be provided with the possibility of painless migration to subsequent versions of
the software.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Zorich</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          :
          <article-title>Data management: Managing electronic information: Data curation in museums</article-title>
          .
          <source>Museum Management and Curatorship</source>
          <volume>14</volume>
          (
          <issue>4</issue>
          ),
          <fpage>431</fpage>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Beagrie</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Digital curation for science, digital libraries, and individuals</article-title>
          .
          <source>The International Journal of Digital Curation</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ),
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Abbott</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>"What is Digital Curation?". DCC Briefing Papers: Introduction to Curation. Edinburgh: Digital Curation Centre</article-title>
          . Handle:
          1842/3362 (
          <year>2008</year>
          ). Available online: http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>L.R.</given-names>
          </string-name>
          , et al.:
          <article-title>How important are data curation activities to researchers? Gaps and opportunities for academic libraries</article-title>
          .
          <source>Journal of Librarianship and Scholarly Communication</source>
          ,
          <volume>6</volume>
          (General Issue),
          <elocation-id>eP2198</elocation-id>
          (
          <year>2018</year>
          ). Available online: https://doi.org/10.7710/2162-3309.2198
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kosinov</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erkimbaev</surname>
            ,
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitserman</surname>
            ,
            <given-names>V.Yu.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobzev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>Ontology-based methods of thermophysical data integration</article-title>
          .
          <source>In: XV Russian Conference (with international participation) on Thermophysical Properties of Substances (RCTP-15)</source>
          ,
          <fpage>103</fpage>
          -
          <lpage>104</lpage>
          . Book of Abstracts. Moscow, Russia, (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pennock</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Curating e-Science Data. DCC Briefing Papers: Introduction to Curation. Edinburgh: Digital Curation Centre</article-title>
          . Handle: 1842/3330 (
          <year>2006</year>
          ). Available online: http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Yerkimbaev</surname>
            ,
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitserman</surname>
            ,
            <given-names>V.Yu.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kobzev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>The role of metadata in the creation and application of information resources on the properties of substances and materials</article-title>
          .
          <source>Sci. Tech. Inf. Process</source>
          <volume>35</volume>
          (
          <issue>6</issue>
          ),
          <fpage>247</fpage>
          -
          <lpage>255</lpage>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Davenhall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Scientific Metadata</article-title>
          .
          <source>DCC Digital Curation Manual</source>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davidson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Day</surname>
          </string-name>
          (eds), (
          <year>2011</year>
          ). Available online: http://www.dcc.ac.uk/resources/curation-reference-manual/scientific-metadata
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Willis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>White</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Analysis and synthesis of metadata goals for scientific data</article-title>
          .
          <source>J. American Soc. for Information Science and Technology</source>
          <volume>63</volume>
          (
          <issue>8</issue>
          ),
          <fpage>1505</fpage>
          -
          <lpage>1520</lpage>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Erkimbaev</surname>
            ,
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitserman</surname>
            ,
            <given-names>V.Yu.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobzev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Fokin</surname>
            ,
            <given-names>L.R.</given-names>
          </string-name>
          :
          <article-title>The logical structure of physicochemical data: problems of numerical data standardization and exchange</article-title>
          .
          <source>Russian Journal of Physical Chemistry A</source>
          .
          <volume>82</volume>
          (
          <issue>1</issue>
          ),
          <fpage>15</fpage>
          -
          <lpage>25</lpage>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Erkimbaev</surname>
            ,
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitserman</surname>
            ,
            <given-names>V.Yu.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobzev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Trakhtengerts</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          :
          <article-title>A universal metadata system for the characterization of nanomaterials</article-title>
          .
          <source>Sci. Tech. Inf. Process</source>
          <volume>42</volume>
          (
          <issue>4</issue>
          ),
          <fpage>211</fpage>
          -
          <lpage>222</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Erkimbaev</surname>
            ,
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitserman</surname>
            ,
            <given-names>V.Yu.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kobzev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>The intensive use of digital data in modern natural science</article-title>
          .
          <source>Automatic Documentation and Mathematical Linguistics</source>
          <volume>51</volume>
          (
          <issue>5</issue>
          ),
          <fpage>201</fpage>
          -
          <lpage>213</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <article-title>Water Structure and Science</article-title>
          . P7.
          <article-title>Supercooled water has two phases and a second critical point</article-title>
          . Available online: http://www1.lsbu.ac.uk/water/phase_anomalies.html
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kane</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Topological insulators</article-title>
          .
          <source>Physics World</source>
          ,
          <fpage>32</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Eletskii</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erkimbaev</surname>
            ,
            <given-names>A.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobzev</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trachtengerts</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Zitserman</surname>
            ,
            <given-names>V.Y.</given-names>
          </string-name>
          :
          <article-title>Properties of nanostructures: data acquisition, categorization, and evaluation</article-title>
          .
          <source>Data Science Journal</source>
          <volume>11</volume>
          ,
          <fpage>126</fpage>
          -
          <lpage>139</lpage>
          (
          <year>2012</year>
          ). Available online: https://www.jstage.jst.go.jp/browse/dsj
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Rahm</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Do</surname>
            ,
            <given-names>H.H.</given-names>
          </string-name>
          :
          <article-title>Data cleaning: problems and current approaches</article-title>
          .
          <source>IEEE Data Eng. Bull</source>
          .
          <volume>23</volume>
          (
          <issue>4</issue>
          ),
          <fpage>3</fpage>
          -
          <lpage>13</lpage>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17. CDS/ISIS for Windows:
          <source>Reference Manual (Version 1.31)</source>
          . Paris: UNESCO (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>