<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Developing a Machine-readable Catalog of Computer Programs and Tools for Extracting and Analyzing Contextual Knowledge</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>ITMO University</institution>
          ,
          <addr-line>49 Kronverksky, 197101 St Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>St. Petersburg State University</institution>
          ,
          <addr-line>7/9 Universitetskaya emb., 199034 St Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>With the rapid development and pervasive use of information and communication technologies, researchers face the problem of choosing effective tools for their research. Despite the huge amount of existing software, current classifications of software and information systems do not take classes of research tasks into account. The project implemented by the authors aims to develop an approach to researching the evolution of the thematic and terminological apparatus of interdisciplinary scientific fields. Methods for the search, extraction, clarification, explication, analysis, and presentation of contextual knowledge with software and information systems were considered and applied. The specifics of the research limit the software and information systems considered to those that process contextual scientific knowledge. The main types of software and information systems used for these purposes were analyzed, and their main functional characteristics were identified. Based on the typology of contexts and the groups of characteristics identified, an approach is proposed to develop a catalog of software and information systems for the analysis of contextual knowledge with the functions of extraction, classification, and explication of scientific content. A Dublin Core metadata model is proposed to describe the software and information systems in the catalog. This model makes it possible not only to describe and structure the main characteristics of software and information systems, but also to present the catalog in a machine-readable form, so that new records can be added, the software and information systems needed for particular research tasks can be found efficiently, and the catalog can be integrated into the scientific information network based on open science principles. A palliative solution for testing the correctness of Dublin Core metadata presentation and metadata exchange via the OAI-PMH protocol is presented.</p>
      </abstract>
      <kwd-group>
        <kwd>Software</kwd>
        <kwd>Classification</kwd>
        <kwd>Catalog</kwd>
        <kwd>Contextual Knowledge</kwd>
        <kwd>Dublin Core</kwd>
        <kwd>OAI-PMH</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>The variety of software, application environments, and web-oriented services for
various purposes makes it difficult for modern researchers to choose tools that can be
effectively used in scientific research. They have to rely on the existing approaches
to software classification. Both general approaches that distinguish the common
software classes and special approaches that focus on a more detailed description of
software subclasses for various applications have been developed. The most common
classifications include, for example, the Classifier of programs for computers and databases
used by government agencies in Russia [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. One general approach to classifying
business intelligence (BI) software is set out in the annually updated analytical report
"Infrastructure and Applications Worldwide Software Market Definitions" by Gartner
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which reflects the general approach of Gartner to assess the software market
development [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. This approach is followed by the International Data Corporation (IDC),
one of the world's largest consulting companies, a leading provider of information and
consulting services, and an organizer of events in the information technology,
telecommunications, and consumer technology markets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The Computing Classification
System (developed by the Association for Computing Machinery; the latest version was
introduced in 2012) can also be used to classify software and information systems. The
system is presented as a single source of categories and concepts that reflect the current
state of the fields related to computational engineering, computer science, and
information and communication technologies [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Other approaches to classification, which
have historically been formed from the logic of computer technology and software
development, are also widespread [
        <xref ref-type="bibr" rid="ref15 ref16 ref22">15, 16, 22</xref>
        ].
      </p>
      <p>The analysis of approaches to software classification reveals the
problem of choosing a specific tool with which researchers, as users of software products,
can solve research and analytical tasks. This problem arises because software
developers try to integrate the maximum set of functions into their systems so that
a whole range of tasks can be performed. Unfortunately, the results obtained across these
tasks are often not of equal quality. A specific computer application has its own "specialization", that is, a
minimal set of functions that covers a narrower range of tasks but
performs them best. There are usually many analogs, and a researcher chooses among them
based on compliance with certain initial requirements for the system (for example,
support for a particular language or implementation of certain methods in the software). Some
information systems were not designed by their developers to process scientific
information or to analyze information for scientific purposes, but they can be efficient for
scientific purposes if introductory or explanatory information or a developed method is
available. Even with detailed information about a specific computer program or
information system, it is not always possible to categorize it with the existing classifications.
Therefore, both classification and identification of specific functionality can be made or
refined only when the software is applied to specific research tasks. The
presence of analogs motivates the creation of software catalogs focused on a certain class of
research tasks and based on the classification developed.</p>
    </sec>
    <sec id="sec-2">
      <title>Approach to software search and selection</title>
      <p>As part of the ongoing project to develop an approach (synthetic method) to
researching the evolution of the thematic and terminological apparatus of interdisciplinary
scientific fields, the independence of the synthetic method
from specific tools was chosen as the main principle.</p>
      <p>The research specifics led to the development of a typology of contextual knowledge
of textual modality, which must be considered when choosing appropriate software
for processing contexts of certain types (corpus, fragment, paragraph, sentence,
term-concept, thesaurus, meta-description, semantic group, thematic collection). The
typology allows software to be applied to multi-level structural analysis, which makes
research tasks more effective. With respect to the methods used in the study (search,
extraction, explication, analysis, and representation of contextual knowledge), the general
classifications lack a clear division into classes that correspond to these methods as
enlarged functions. Nor do these classifications make it possible to determine the types of
contexts being processed.</p>
      <p>For example, Gartner identifies the following market segments:
─ Data Warehouse;
─ On-Line Analytical Processing, OLAP;
─ Enterprise Information Systems, EIS and Decision Support Systems, DSS;
─ Data Mining;
─ Query and Reporting Tools.</p>
      <p>In the Computing Classification System, the following subclasses correspond to the
systems considered:
─ Specialized information retrieval – Structure and multilingual text search;
─ Document management and text processing – Document capture – Document
searching; Document analysis.</p>
      <p>In the Russian Classifier, the following subclasses of software application can be
associated with the programs under consideration:
─ Search engines – software systems that search for text, graphics, and other
information in local, corporate, and other repositories, including consulting and
information systems for searching and viewing information in specialized multi-industry
databases;
─ Linguistic software – parsers and semantic analyzers/systems for natural language
text analysis with the selection of syntactic sentence structures or semantic relations
between text elements and general text meaning;
─ Systems for collecting, storing, processing, analyzing, modeling, and
visualizing data sets – business analysis systems (BI)/programs focused on big unstructured data
processing to facilitate their interpretation, including tools for data extraction and
transformation (ETL), subject-oriented information databases (EDW), tools for
real-time analytical processing (OLAP), data mining, generating reports, graphs, charts
and other visual forms, decision support (DSS).</p>
      <p>
        To search for and identify software with the functions of extracting, classifying, and
explicating scientific content to support scientific research, both open network sources
and scientific publications describing similar software were used. The analysis of the
identified systems showed that the vast majority are used as linguistic text analysis tools
in linguistics [
        <xref ref-type="bibr" rid="ref13 ref33">13, 33</xref>
        ], sociology [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ], cybersecurity [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], as well as in interdisciplinary
fields [
        <xref ref-type="bibr" rid="ref10 ref18 ref4">4, 10, 18</xref>
        ].
      </p>
      <p>
        Some linguistic software catalogs are available on the Internet [
        <xref ref-type="bibr" rid="ref1 ref14 ref18 ref22 ref28 ref29 ref30">1, 14, 18, 22, 28, 29,
30</xref>
        ]. This software can be used for research purposes.
      </p>
      <p>Catalog analysis makes it possible to identify common areas of software used for the tasks
listed above:
─ Text Mining;
─ Text Analytics/Analysis;
─ Information Retrieval/Extraction;
─ Text Comparison;
─ Topic Clustering/Modelling;
─ Text Visualization.</p>
      <p>Such catalogs do not have classifiers that consider the main functional purpose of the
software and the types of contexts being processed. Therefore, they do not support the
selection of effective tools for research. For the same reasons, it was decided not to use
other software classifications widely available on the Internet, including those based on the criteria
"scope of application" and "system functionality". Such
classifications are intended for the business community, operate with business concepts, and focus
on the range of business tasks of an enterprise or organization, or on market analysis.</p>
      <p>In this regard, the purpose of this study is to develop a structured description of
software using its main characteristics: software class, main functions, and types of
processed contexts. A software catalog is then developed from this structured description,
which allows researchers to select software effectively for their research aims and
objectives. The catalog thus solves the researcher’s pragmatic task of making an informed
choice of the required set of software.</p>
    </sec>
    <sec id="sec-3">
      <title>Development of a catalog of software with functions and services for extracting and analyzing contextual knowledge for scientific research</title>
      <sec id="sec-3-1">
        <title>Defining the main software classes</title>
        <p>When forming the structure of a software description for the developed catalog, the
general classification was chosen based on the applicability of the software to the
analysis of contextual knowledge with the functions of extracting, classifying, and
explicating scientific content to support scientific research.</p>
        <p>The target group of catalog users includes scientists and teachers specializing in
interdisciplinary research and working with various sources of information and big data.
The following specific task classes for the interdisciplinary studies were set:
─ Neural Network;
─ Machine Learning;
─ Natural Language Processing;
─ Information Extraction;
─ Ontology;
─ Forecasting Systems;
─ Creation and Use of Thesauri;
─ Topic Clustering/Modelling;
─ Full-text Databases;
─ Abstract Databases (metadata only).</p>
        <p>In the catalog, these classes of tasks correspond to the "type of software" characteristic.
The last two types cover only full-text and abstract databases that have
their own engines for searching, selecting, and analyzing information.</p>
        <p>The integrated types of software do not always reflect the diversity of their functional
capabilities. Therefore, for a more complete understanding of their capabilities and
rational choice for specific research purposes, the main functions of the software are
grouped separately. The following main functions of the software are distinguished:
─ Classification;
─ Forecasting;
─ Contextual analysis;
─ Selection of data according to various criteria (smart search);
─ Automated data/metadata exchange;
─ Visualization.</p>
        <p>This is a set of basic functions. However, when a specific software application is
included in the catalog, its analysis may reveal other specific functions. Therefore, the
classification of the functions is extensible.</p>
        <p>The proposed classification does not consider some important classes of tasks, such
as "scientific communication" and "issues of science management and research
coordination", as they go beyond research tasks.</p>
      </sec>
      <sec id="sec-3-2">
        <title>A typology of contexts for software features</title>
        <p>
          The choice of specific software for research purposes is related to its ability to process
certain types of contexts. Within the framework of this research, the concept of context
is understood as an independent conceptual unit of the thesauri, used as a basis for
classifying scientific texts, as well as for visualizing hierarchical and associative
relations between terms. The explication and analysis of contextual knowledge resulted in
a typology of contextual knowledge developed in this project [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>This classification can be further specified for the study of more specific subject areas.
Correlating software with the types of processed contexts also allows
researchers to choose software more rationally. Therefore, when classifying software, it is
proposed to use the type of processed contexts as an essential characteristic.</p>
        <p>Based on the types of stored, extracted, and processed (analyzed) contexts and their
specification, the software for the catalog can be divided into the following enlarged
categories:
─ An information search system that processes a large number of unstructured texts
and multimedia information, with limitations in the user dialogue with the system,
as well as limited graphematic analysis and low reliability of link detection (Yandex,
Google);
─ Information systems that represent text databases, digital online archives of
scientific publications and abstract databases of multidisciplinary areas, that significantly
differ in the content analysis functionality (eLibrary, T-Libra, Science Direct,
Scopus and WoS);
─ Information and analytical systems that have various degrees of completeness of fact
detection and self-learning, levels of semantic hierarchy and automatic logical
analysis of factual information. These systems also process a large volume of
unformalized texts and multimedia information (Mallet, AskNet, Voyant-Tools, Tropes,
Sketch Engine, CLAVIRE, RCO (Russian Context Optimizer));
─ Multifunctional mixed-type information systems that have the advantages and
disadvantages of the information systems described above: the ability to process a large
volume of unformalized texts and multimedia information, reliability of identifying
links, a wide range of document formats. These systems have limitations in
automatic semantic analysis of various levels, which is a software developer task
(ABBYY Intelligent Tagger SDK, ABBYY Smart Classifier SDK, PROMT
Analyser).</p>
        <p>These enlarged categories are used in the catalog to group software.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Machine-readable view of the catalog</title>
        <p>Existing descriptions of software for contextual knowledge processing were analyzed.
The analysis shows that such catalogs are mostly static lists or tables, where
the software is either grouped by a certain attribute (for example, freely distributed or
commercial; belonging to enlarged functional categories) or presented in an
unstructured form with a brief description of features and links to relevant sites on the Internet.
This presentation of information makes it difficult to quickly find and effectively
select the software needed for research on the basis of its main features,
functionality, and the types and formats of the processed contexts.</p>
        <p>
          For a structured representation of information about software, it is suggested to
describe it with metadata, in the same way that documents are described. For example,
González and van der Meer considered standard metadata
representation formats (Dublin Core, EAD, ISAD(G), and MARC) and suggested the
Extended Dublin Core for Software Components (XDC-SC), an extension of the Dublin Core
scheme that allows information about software to be extracted with standard search
engine tools or XML tools [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. They also suggested that this approach may encourage
the creation of environments for presenting information about software. Other researchers
suggest not focusing on one standard but developing a Semantic Master Metadata Catalog
(SMMC) to ensure interaction between the existing metadata models (such as Dublin
Core, UNIMARC, MARC21, RDF/RDA, and BIBFRAME) based on the ontology
mapping model [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. That work proposes an approach to developing a Semantic Metadata
Enrichment Software Ecosystem (SMESE), designed to support
specific distributed content management applications. However, the implementation of
such a solution is a complex task that can only be solved at the level of a large consortium.
        </p>
        <p>
          Based on the generally accepted approaches to software description, it is proposed
to use the Dublin Core metadata representation specification in the catalog. The main
characteristics of the presented software are described by the corresponding elements
of the main metadata set (Dublin Core Metadata Element Set, DCMES) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. In this
approach, the combination of element values sufficiently describes the software
presented in the catalog, in line with the general use of this specification
to describe various entities [
          <xref ref-type="bibr" rid="ref23 ref27 ref9">9, 23, 27</xref>
          ]. The proposed approach also makes it
possible to present the catalog in a machine-readable form in the network information
systems with free access for both researchers and automated search and identification
by search engines. In this case, the users can search for various metadata elements:
classes, software functions, or types of the processed contexts.
        </p>
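        <p>Searching by metadata elements can be sketched as follows. This is a minimal illustration rather than the catalog's implementation: the records, their values, and the helper names <monospace>matches</monospace> and <monospace>search</monospace> are hypothetical, and the field names follow the metadata set proposed for catalog records in this section.</p>
        <preformat><![CDATA[
```python
# Hypothetical catalog records; titles and values are illustrative only.
RECORDS = [
    {
        "dc.title": "Example Topic Modeller",
        "dc.type": ["Topic Clustering/Modelling"],
        "dc.subject.classification": ["Classification", "Visualization"],
        "dc.subject.other": ["corpus", "thematic collection"],
    },
    {
        "dc.title": "Example Concordancer",
        "dc.type": ["Natural Language Processing"],
        "dc.subject.classification": ["Contextual analysis"],
        "dc.subject.other": ["sentence", "paragraph"],
    },
]


def matches(record, criteria):
    """Return True if the record holds every wanted value for every field."""
    for field, wanted in criteria.items():
        have = {value.lower() for value in record.get(field, [])}
        if not {value.lower() for value in wanted} <= have:
            return False
    return True


def search(records, criteria):
    """Return the titles of all records that satisfy the criteria."""
    return [r["dc.title"] for r in records if matches(r, criteria)]


if __name__ == "__main__":
    # Find software that processes contexts of the "corpus" type.
    print(search(RECORDS, {"dc.subject.other": ["corpus"]}))
```
]]></preformat>
        <p>Matching is case-insensitive and requires all requested values to be present, so a query can combine a software class, a function, and a context type in one call.</p>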
        <p>
          This approach is usually applied to text objects: articles, books, library
catalog cards, archive materials, and semantic models [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. Existing metadata
representation schemes for software description do not consider its specifics,
which are important when choosing software (for example, the types of contextual
knowledge being processed) [
          <xref ref-type="bibr" rid="ref11 ref3">3, 11</xref>
          ]. Therefore, given the specifics of
software description, qualifiers were used in addition to the main metadata elements.
Qualifiers form the second level of metadata and refine the
elements [
          <xref ref-type="bibr" rid="ref20 ref7">7, 20</xref>
          ].
        </p>
        <p>
          The most suitable software platform was also selected to present the catalog in a
machine-readable form. The following basic principles were
considered in the choice:
─ availability – open source or non-commercial software;
─ popularity – the most well-known and widespread solution;
─ flexibility – ability to adjust the metadata description to the task of catalog creation.
Among the systems considered, the most popular at present is the DSpace software
platform (https://duraspace.org/dspace/). According to ROAR, the most authoritative
aggregator [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], of the 4,725 open access repositories registered, 1,965 use DSpace.
EPrints takes second place (679). DSpace meets all the above criteria, so this platform
was chosen for this study. After reviewing the documentation on Dublin Core metadata
presentation in DSpace [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], and considering the specifics of
qualifier application, the following set of metadata is proposed for each catalog record:
dc.title – software title;
dc.creator – developer;
dc.subject.classification – main functions (can be added after analyzing
the corresponding software);
dc.subject.other – type of context to process;
dc.description.abstract – software description;
dc.publisher – vendor (copyright holder);
dc.contributor – contributor (people or organizations who also participated in
the software development);
dc.date.issued – last release date (year);
dc.type – categories (software classes according to the developed classification);
dc.format.mimetype – formats of the processed files;
dc.identifier.uri – identifier (link on the Internet to the developer's site);
dc.source.uri – source (link to the web application);
dc.language – languages of documents to be processed;
dc.relation.isreferencedby – relations (list of publications on the software);
dc.coverage – supported operating systems;
dc.rights.license – license type.
        </p>
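        <p>The proposed metadata set can be illustrated with a short sketch that serializes one record to unqualified Dublin Core XML. The record values and the function name <monospace>to_oai_dc</monospace> are hypothetical. Dropping the qualifier (for example, mapping dc.subject.classification to dc:subject) is a simplifying assumption made here; the crosswalk actually applied by a given platform may differ.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Namespaces of the oai_dc container and of the DCMES elements.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# The fifteen elements of the Dublin Core Metadata Element Set.
DCMES = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# Hypothetical catalog record; all values are illustrative only.
RECORD = {
    "dc.title": ["Example Text Analyzer"],
    "dc.creator": ["Example Lab"],
    "dc.subject.classification": ["Contextual analysis", "Visualization"],
    "dc.subject.other": ["corpus", "sentence"],
    "dc.type": ["Natural Language Processing"],
    "dc.date.issued": ["2020"],
    "dc.rights.license": ["CC BY 4.0"],
}


def to_oai_dc(record):
    """Build an oai_dc:dc element, dropping qualifiers (an assumption)."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, values in record.items():
        parts = field.split(".")
        if len(parts) < 2 or parts[0] != "dc" or parts[1] not in DCMES:
            raise ValueError(f"not a Dublin Core field: {field}")
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{parts[1]}")
            child.text = value
    return root


if __name__ == "__main__":
    print(ET.tostring(to_oai_dc(RECORD), encoding="unicode"))
```
]]></preformat>
        <p>Both qualified subject fields end up as repeated dc:subject elements, which shows why a harvester that sees only unqualified Dublin Core cannot tell main functions from context types unless the values themselves are distinguishable.</p>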
        <p>Based on the proposed approach, more than 50 software items were described. A
visual representation of a catalog record by the Dublin Core specification is shown in
Figure 1.</p>
        <p>Figure 1. Visual representation of a catalog record by the Dublin Core specification (fields dc.title through dc.rights.license). Field values recoverable from the figure: dc.relation.isreferencedby: Rambsy, Kenton (2016). Text-Mining Short Fiction by Zora Neale Hurston and Richard Wright. Using Voyant Tools // CLA Journal. № 59 (3): 251–258; Priestley, Alexis. Voyant Tools: A Tutorial for Text Analysis: https://medium.com/@priestleyal/voyant-tools-a-tutorial-for-textanalysis-df265d85d214; dc.coverage: Multisystem; dc.rights.license: Creative Commons Attribution 4.0 International (CC BY 4.0).</p>
      </sec>
      <sec id="sec-3-4">
        <title>The implementation and use of the catalog</title>
        <p>Although the DSpace software platform was chosen for the machine implementation,
installing and configuring this system is not a trivial task, so the solution could not be
implemented immediately. Therefore, the free and open-source Open
Journal Systems (OJS, https://pkp.sfu.ca/ojs/) was chosen as a palliative solution for
initial testing. This system is a full-cycle platform for publishing electronic
journals. It has already been applied to the machine-readable representation of
the thesaurus in the framework of the ongoing project. OJS is easier to install and
configure and works on most virtual hosts. It has all the functionality
required: it supports the Dublin Core metadata format, allows searching by metadata,
provides open access to information, and acts as a data provider for the OAI-PMH protocol. For
experimental purposes, descriptions of several of the selected software units were entered
in the installed OJS system (http://ojs.iculture.spb.ru/index.php/thesauri).
An Open Harvester Systems (OHS, https://pkp.sfu.ca/ohs/) installation, which is an OAI-PMH
metadata aggregator, was used to check that the metadata are displayed correctly
and to verify the operation of the OAI-PMH protocol. The proposed approach to presenting
the software catalog in machine-readable form also allows metadata to be exported to
other presentation formats for integration
into various information systems and metadata aggregators. Using the OAI-PMH
metadata exchange protocol makes it possible to integrate the catalog into the
information space of scientific research. Researchers can not only search the catalog but
also create their own information systems and aggregate information from the catalog. A
platform like DSpace can also be used as an aggregator to collect information
about the application of the software presented in the catalog from various resources over the
OAI-PMH protocol. For example, these may be scientific publications that consider
particular software for specific scientific purposes.</p>
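        <p>Aggregating catalog records over OAI-PMH can be sketched as follows. This is a minimal illustration under stated assumptions: the base URL passed to <monospace>list_records_url</monospace> is hypothetical, the sample response is hand-written rather than harvested from the catalog, and the sketch does not handle resumption tokens or protocol errors. The <monospace>ListRecords</monospace> verb and the <monospace>oai_dc</monospace> metadata prefix are part of the OAI-PMH specification.</p>
        <preformat><![CDATA[
```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Turn a ListRecords response into a list of {element: [values]} dicts."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # e.g. deleted records carry no metadata
            continue
        fields = {}
        for child in dc:
            name = child.tag.split("}", 1)[1]  # strip the namespace
            fields.setdefault(name, []).append(child.text or "")
        records.append(fields)
    return records


# Hand-written sample response (illustrative values, not real catalog data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:article/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Text Analyzer</dc:title>
          <dc:type>Natural Language Processing</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/index.php/catalog/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```
]]></preformat>
        <p>In a real harvest, the response would be fetched from the URL built by <monospace>list_records_url</monospace>, and the parsed dictionaries could then be filtered by class, function, or context type in the researcher's own information system.</p>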
        <p>The implementation of such a distributed information environment allows the users
to present not only software descriptions but also information about its application in
one information space. This provides researchers with methodological support for
choosing tools more rationally and using them more efficiently for their scientific purposes.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>The research has shown that there is no common approach to classifying software
designed for analyzing contextual knowledge with the functions of extracting,
classifying, and explicating scientific content that considers the types of processed
contexts. It was also found that there are no developments in the representation of software
catalogs in machine-readable form that are based on a metadata format and consider the
specifics of software for contextual knowledge analysis.</p>
      <p>The developed approach to presenting a catalog of software designed for contextual
knowledge analysis with the functions of extracting, classifying, and explicating
scientific content, based on Dublin Core, provides:
─ integration of the developed context typology into the catalog, which is an essential
characteristic and the basis for choosing software for specific research;
─ creation of a machine-readable catalog with standard free software (for
example, OJS, DSpace);
─ efficient search and selection of specific software for research purposes by the main
characteristics described in Dublin Core tags, using standard search engines;
─ open access to catalog records for both users and automated indexing;
─ automated exchange over the OAI-PMH protocol for aggregation of catalog
meta-descriptions in other information systems.</p>
      <p>The proposed palliative solution (OJS) is expected to be replaced in the future with an
information system based on the free DSpace platform. In parallel, work will continue on
filling the catalog with descriptions of software that can be used for contextual knowledge
analysis with the functions of extracting, classifying, and explicating scientific content.
Scientific publications describing research results obtained with the software presented in
the catalog will also be selected.</p>
      <p>The catalog also has great potential for use in teaching:
master's students of the "Digital smart city technologies" educational program, majoring
in "Applied Informatics", will use it to select the technological tools for their research
projects. The catalog is one of the components of the educational and methodological
complex "Technologies of data extraction and mining in scientific research", aimed at
forming research and analytical competencies of undergraduate students. The course
"Information technologies in science" will be modified based on the developed
educational and methodological complex.</p>
      <p>Acknowledgement. This work was supported by the Russian Foundation for Basic Research
(project #18-011-00923-a) and the Vladimir Potanin Foundation (project GK200000654).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <article-title>4 Free and Open Source Text Analysis Software</article-title>
          , https://www.softwareadvice.com/resources/easiest-to-use-free-and-open-source-text-analysis-software, last accessed 2020/02/17.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Andsbjerg</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vesset</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>IDC's Worldwide Software Taxonomy, 2018: Update</article-title>
          , https://www.idc.com/getdoc.jsp?containerId=US44835319, last accessed 2020/02/17.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Brisebois</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abran</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nadembega</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A Semantic Metadata Enrichment Software Ecosystem (SMESE) Based on a Multi-Platform Metadata Model for Digital Libraries</article-title>
          .
          <source>Journal of Software Engineering and Applications</source>
          .
          <volume>10</volume>
          ,
          <fpage>370</fpage>
          -
          <lpage>405</lpage>
          (
          <year>2017</year>
          ). DOI:
          <volume>10</volume>
          .4236/jsea.
          <year>2017</year>
          .
          <volume>104022</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chugunov</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kabanov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>“Electronic Governance” As an Interdisciplinary Scientific Field: Scientometrics Analysis</article-title>
          . In:
          <source>The State and Citizens in the Electronic Environment</source>
          . Vol.
          <volume>3</volume>
          .
          <source>Proceedings of the XXII International Joint Scientific Conference «Internet and Modern Society», IMS-2019</source>
          , St. Petersburg, June 19-22,
          <year>2019</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>24</lpage>
          (
          <year>2019</year>
          ). DOI: 10.17586/2541-979X-2019-3-11-24 [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <article-title>Computing Classification System</article-title>
          , https://dl.acm.org/ccs, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <article-title>DCMI Metadata Terms</article-title>
          . Dublin Core Metadata Initiative, https://www.dublincore.org/specifications/dublin-core/dcmi-terms, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <article-title>DCMI Qualifiers</article-title>
          . Dublin Core Metadata Initiative, https://www.dublincore.org/specifications/dublin-core/dcmes-qualifiers, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. DSpace/dublin-core-types.xml at master DSpace. DSpace. GitHub, https://github.com/DSpace/DSpace/blob/master/dspace/config/registries/dublin-core-types.xml, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Fedotov</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leonova</surname>
            ,
            <given-names>Y.V.</given-names>
          </string-name>
          :
          <article-title>Requirements for the prototype of the information resources management system in distributed information systems for the support of scientific research</article-title>
          .
          <source>Computational technologies</source>
          .
          <volume>23</volume>
          (
          <issue>5</issue>
          ),
          <fpage>82</fpage>
          -
          <lpage>109</lpage>
          (
          <year>2018</year>
          ). DOI: 10.25743/ICT.2018.23.5.008 [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Geger</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tchupakhina</surname>
            ,
            <given-names>Y.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geger</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          :
          <article-title>Computers programs for the qualitative and mixed data analysis</article-title>
          .
          <source>St. Petersburg Sociology Today</source>
          .
          <volume>6</volume>
          ,
          <fpage>374</fpage>
          -
          <lpage>388</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>González</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Der Meer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Standard Metadata Applied to Software Retrieval</article-title>
          .
          <source>Journal of Information Science</source>
          .
          <volume>30</volume>
          (
          <issue>4</issue>
          ),
          <fpage>300</fpage>
          -
          <lpage>309</lpage>
          (
          <year>2004</year>
          ). DOI: 10.1177/0165551504045850.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <article-title>Infrastructure and Applications Worldwide Software Market Definitions</article-title>
          . Gartner Dataquest Guide (
          <year>2002</year>
          ), http://smartshore.us/Infrastructure_Market_trends_2003.pdf, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ivanova</surname>
            ,
            <given-names>A.A.</given-names>
          </string-name>
          :
          <article-title>Rhetoric of wargames (results of the content analysis)</article-title>
          .
          In:
          <source>Computer Linguistics and Computing Ontologies</source>
          . Vol.
          <volume>3</volume>
          (Proceedings of the XXII International Joint Scientific Conference «Internet and Modern Society», IMS-2019, St. Petersburg, June 19-22,
          <year>2019</year>
          ), pp.
          <fpage>266</fpage>
          -
          <lpage>278</lpage>
          . ITMO University, St. Petersburg (
          <year>2019</year>
          ). DOI: 10.17586/2541-9781-2019-3-266-278 [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <article-title>Catalog of linguistic programs and resources on the Web</article-title>
          . Compiled by
          <string-name>
            <given-names>S.V.</given-names>
            <surname>Logichev</surname>
          </string-name>
          . (
          <year>2006</year>
          ), https://rvb.ru/soft/catalogue/index.html, last accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <article-title>Classification of computer software</article-title>
          . In:
          <string-name>
            <surname>Samsonova</surname>
            ,
            <given-names>O.V.</given-names>
          </string-name>
          <source>Informatika: uchebnoe posobie</source>
          , http://tpt.tom.ru/umk/informat/uchebnik/klass.htm, last accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <article-title>Classification of software</article-title>
          . In:
          <string-name>
            <surname>Alekseev</surname>
            ,
            <given-names>E.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bogatyrev</surname>
            ,
            <given-names>S.D.</given-names>
          </string-name>
          <source>Informatika. Mul'timediynyy elektronnyy uchebnik</source>
          , http://inf.e-alekseev.ru/text/Klassif_po.html, last accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Kononova</surname>
            ,
            <given-names>O.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prokudin</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          :
          <article-title>An approach to extraction, explication and presentation of contextual knowledge in the study of developing interdisciplinary research areas</article-title>
          .
          <source>International Journal of Open Information Technologies</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>90</fpage>
          -
          <lpage>101</lpage>
          (
          <year>2020</year>
          ). [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kravchenko</surname>
            ,
            <given-names>Yu.A.</given-names>
          </string-name>
          :
          <article-title>Information's semantic search, classification, structuring and integration objectives in the knowledge management context problems</article-title>
          .
          <source>Izvestiya SFedU. Engineering Sciences</source>
          <volume>7</volume>
          (
          <issue>180</issue>
          ),
          <fpage>5</fpage>
          -
          <lpage>18</lpage>
          (
          <year>2016</year>
          ). DOI: 10.18522/2311-3103-2016-7-518 [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>K.I.</given-names>
          </string-name>
          :
          <article-title>Overview of systems for extracting data from unstructured texts</article-title>
          (
          <year>2013</year>
          ), http://www.pullenti.ru/(X(1)S(ngdeikpifqat0ccmnoqanfz3))/CompetitorPage.aspx?AspxAutoDetectCookieSupport=1, last accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20. Dublin Core Qualifiers. RUSMARC, Russian version of UNIMARC. National Library of Russia, http://www.rusmarc.info/soft/dcq.html, last accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Lavrent'ev</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smirnov</surname>
            ,
            <given-names>I.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Solov'ev</surname>
            ,
            <given-names>F.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suvorova</surname>
            ,
            <given-names>M.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokina</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chepovskiy</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          :
          <article-title>Analysis of corpus of extremist texts and unlawful texts</article-title>
          .
          <source>Voprosy kiberbezopasnosti</source>
          <volume>4</volume>
          (
          <issue>32</issue>
          ),
          <fpage>54</fpage>
          -
          <lpage>60</lpage>
          (
          <year>2019</year>
          ). DOI: 10.21681/2311-3456-2019-4-54-60.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Morozevich</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          et al.:
          <source>Fundamentals of Informatics and Computer Engineering: A Study Guide</source>
          .
          <string-name>
            <surname>Morozevich</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          (ed.). BGEU, Minsk (
          <year>2005</year>
          ). [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Noor</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adil</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gohar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saman</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jamil</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qayum</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Modeling and representation of built cultural heritage data using semantic web technologies and building information model</article-title>
          .
          <source>Computational and Mathematical Organization Theory</source>
          <volume>25</volume>
          ,
          <fpage>247</fpage>
          -
          <lpage>270</lpage>
          (
          <year>2019</year>
          ). DOI:
          <volume>10</volume>
          .1007/s10588-018-09285-y.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <article-title>Order of the Ministry of Telecom and Mass Communications of the Russian Federation of December 31, 2015 N 621 "On approval of the classifier of programs for electronic computers and databases"</article-title>
          (ed. Order of the Ministry of Telecom and Mass Communications of 01.04.2016 N 134, of 30.07.2019 N 422), https://normativ.kontur.ru/document?moduleId=1&amp;documentId=345157#h74, last accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <article-title>Programs for linguistic analysis and text processing</article-title>
          , http://asknet.ru/analytics/programms.htm, accessed
          <year>2020</year>
          /02/17. [in Russian].
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26. Registry of Open Access Repositories, http://roar.eprints.org, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27. SUNScholar/Metadata/By Function. Libopedia, https://wiki.lib.sun.ac.za/index.php/SUNScholar/Metadata/By_Function, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <article-title>Text Analysis, Text Mining, and Information Retrieval Software</article-title>
          , https://www.kdnuggets.com/software/text.html, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29. Text Mining Software, https://www.capterra.com/text-mining-software, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <article-title>Text mining, text analytics &amp; content analysis with free open source software</article-title>
          , https://www.opensemanticsearch.org/doc/analytics/textmining, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Vidiasova</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tensina</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Results of the Semantic Analysis of Texts in Mass Media on the Development of «Smart Cities» in Russia</article-title>
          . In:
          <source>The State and Citizens in the Electronic Environment</source>
          . Vol.
          <volume>2</volume>
          (Proceedings of the XXI International Joint Scientific Conference «Internet and Modern Society», IMS-2018, St. Petersburg, May 20 - June 2,
          <year>2018</year>
          ), pp.
          <fpage>112</fpage>
          -
          <lpage>117</lpage>
          . ITMO University, St. Petersburg (
          <year>2018</year>
          ). DOI: 10.17586/2541-979X-2018-2-112-117.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Woodward</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Biscotti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Contu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hare</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhullar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dayley</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swinehart</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dsilva</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wurster</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poulter</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palanca</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deshpande</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abbabatulla</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warrilow</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dharmasthira</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kostoulas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <source>Market Definitions and Methodology: Software</source>
          (
          <year>2019</year>
          ), https://www.gartner.com/en/documents/3906823/market-definitions-and-methodology-software, accessed
          <year>2020</year>
          /02/17.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharov</surname>
            ,
            <given-names>V.P.</given-names>
          </string-name>
          :
          <article-title>Computerized visualization of the Russian language picture of the world</article-title>
          . In:
          <source>Computer Linguistics and Computing Ontologies</source>
          . Vol.
          <volume>3</volume>
          .
          <source>Proceedings of the XXII International Joint Scientific Conference «Internet and Modern Society», IMS-2019</source>
          , St. Petersburg, June 19-22,
          <year>2019</year>
          , pp.
          <fpage>92</fpage>
          -
          <lpage>105</lpage>
          (
          <year>2019</year>
          ). DOI: 10.17586/2541-9781-2019-3-92-105.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>