Automation of the Export Data from Open Journal Systems to the Russian Science Citation Index

               Serhiy O. Semerikov1[0000-0003-0789-0272], Vladyslav S. Pototskyi1,
        Kateryna I. Slovak2[0000-0003-4012-8386], Svitlana M. Hryshchenko2[0000-0003-4957-0904]
                                        and Arnold E. Kiv3
        Kryvyi Rih State Pedagogical University, 54, Gagarina Ave., Kryvyi Rih, 50086, Ukraine
                            {semerikov, pototskiyvl}@gmail.com
                 State Institution of Higher Education “Kryvyi Rih National University”,
                          11, Vitali Matusevich St., Kryvyi Rih, 50027, Ukraine
                 slovak@fsgd.ccjournals.eu, s-grischenko@ukr.net
                          Ben-Gurion University of the Negev, Beer Sheba, Israel

           Abstract. Calculating the scientometric indicators of a scientist and of a
           scientific journal remains a topical problem. The leading scientometric
           databases can collect metadata automatically from a journal's website when
           the journal uses a specialized electronic document management system, in
           particular Open Journal Systems. Open Journal Systems successfully exports
           article metadata from scientific journals to the scientometric databases
           Scopus, Web of Science and Google Scholar. However, there is no standard
           method of export from Open Journal Systems to such scientometric databases
           as the Russian Science Citation Index and Index Copernicus, which determined
           the need for this research. The aim of the study is to develop a plug-in for
           Open Journal Systems that exports data from this system to the scientometric
           database Russian Science Citation Index. As a result of the study, an
           infological model for exporting metadata from Open Journal Systems to the
           Russian Science Citation Index was proposed. The SirenExpo plug-in was
           developed to export data from Open Journal Systems to the Russian Science
           Citation Index by means of the Articulus release preparation system.

           Keywords: Scientometric Indicators, Scientometric Databases, Specialized
           Systems for Electronic Workflow Support, SirenExpo plug-in.

1          Introduction

The formalized accounting of a scientist's productivity based on published results is
an important component in evaluating the activity of individual scientists and of
scientific institutions, and it is carried out with the help of scientometric
databases. Today, the availability of a scientific publication on the Internet is one
of the top-priority requirements for its inclusion in any scientometric database.
   The main source of information about a publication is its abstract and other
metadata posted on the website of the scientific journal. The use of standard protocols
for metadata exchange promotes a more accurate calculation of the scientometric
indicators not only of the scientist but also of the scientific journal itself
(primarily its impact factor) [3; 5; 6; 8; 16].
   Unfortunately, not all leading scientometric databases can collect metadata
automatically from a scientific journal's site, which motivated this study.

2      Literature Review and Problem Statement

The need for qualitative and quantitative evaluation of published scientific results
gave rise to scientometrics. Scientometrics assesses the quality of scientific works
and of a scientist's work by analyzing those works against certain criteria.
    One of the founders of scientometrics is John Desmond Bernal, who in his 1939 work
«The Social Function of Science» [1] described the laws of the functioning and
development of science, the structure and dynamics of scientific activity, the
interaction of science with the material and spiritual spheres of society, and the
role of scientometrics in the social process.
    After World War II, Derek John de Solla Price made a significant contribution to
the development of the field. A mathematician and physicist by training, he defended
his second thesis on the history of science. D. J. de Solla Price used quantitative
methods to study science [13].
    The term «scientometrics» was first used by V. V. Nalimov and Z. M. Mulchenko
in the monograph «Scientometrics. Study of science as an information process»,
published in 1969. The authors define scientometrics as one of the branches of the
science of science, in which «science is viewed as a system that self-organizes and
directs its own information flows» [10, p. 6]. «While studying science as an
information process, it turns out to be possible to apply quantitative (statistical)
research methods... It seems natural to call this direction of research
scientometrics» [10, p. 9].
    A great contribution to scientometrics was made by Eugene Eli Garfield [6], who
in 1960 founded the Institute for Scientific Information. In 1964, E. Garfield
launched the Science Citation Index [5], which became a powerful tool of
scientometrics and the basis of the scientometric database Web of Science.
    The main scientometric indicators are the Science Citation Index, h-index,
g-index, i10-index and impact factor [14; 15].
    Science Citation Index (SCI) is a measure of the influence of an author or a
scientific work on the development of science (see Fig. 1, Fig. 2). SCI reflects the
total number of references to a particular scientific work or author in other
scientific articles. A drawback of scientometric research based on SCI is that this
index does not take into account how an article's influence on science is distributed
over time. For example, an author whose poor-quality article was published about 20
years ago and has been cited about once a year receives the same citation index as
the author of a good work that received 20 citations over 20 years.
   Also, the citation index does not characterize the scientific potential of a
scientist. That is, a scientist who has written a single work that gained a certain
popularity, and nothing since, can have the same index value as a scientist who has
many scientific works. This and other shortcomings of the citation index prompted
scientists to create new methods for assessing scientific performance.

                  Fig. 1. Indexed articles from SCI (according to [9, p. 3])

                 Fig. 2. The list of references to SCI (according to [9, p. 3])

The h-index was developed by Jorge E. Hirsch, a professor of physics at the University
of California, San Diego, who proposed it in 2005 [8] and described the algorithm for
calculating the index, as well as the advantages and disadvantages of alternative
methods (Table 1). According to J. Hirsch, the relationship between the h-index and
the total number of citations can be described by the formula

                                   $N_{c,tot} = a h^{2}$.                                   (1)
J. Hirsch found empirically that the coefficient a ranges between 3 and 5.

   Table 1. Traditional methods for assessing the performance of a scientist according to [8]
 No      Method                    Advantage                   Disadvantage
 (i)     total number of           measures productivity       does not measure importance or
         papers (Np)                                           impact of papers
 (ii)    total number of           measures total impact       – hard to find and may be inflated
         citations (Nc,tot)                                    by a small number of “big hits”,
                                                               which may not be representative of
                                                               the individual if he or she is a
                                                               coauthor with many others on those
                                                               papers. In such cases, the relation
                                                               in Eq. 1 will imply a very atypical
                                                               value of a > 5;
                                                               – gives undue weight to highly
                                                               cited review articles versus
                                                               original research contributions
 (iii)   citations per paper       allows comparison of        hard to find, rewards low
         (Nc,tot/Np)               scientists of different     productivity, and penalizes high
                                   ages                        productivity
 (iv)    number of “significant    eliminates the              y is arbitrary and will randomly
         papers”, defined as the   disadvantages of criteria   favor or disfavor individuals, and
         number of papers with     (i), (ii) and (iii) and     y needs to be adjusted for
         more than y citations     gives an idea of broad      different levels of seniority
                                   and sustained impact
 (v)     number of citations to    overcomes many of the       it is not a single number, making
         each of the q most-cited  disadvantages of the        it more difficult to obtain and
         papers                    criteria above              compare; also, q is arbitrary and
                                                               will randomly favor and disfavor
                                                               individuals

   The h-index is a scientometric indicator that quantitatively characterizes the
performance of a scientist, a group of scientists or a country. According to J. Hirsch,
a scientist has index h if h of his Np articles have been cited at least h times each,
while the remaining (Np − h) articles have been cited no more than h times each.
Scientific works that do not satisfy this condition do not count toward the index.
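
The definition can be made concrete with a short sketch. It is only an illustration: the citation counts are invented, and the helper `hirsch_a` is a hypothetical name that simply rearranges Eq. 1 to estimate the coefficient a for one scientist.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def hirsch_a(citations):
    """Estimate a = N_c,tot / h^2 (Eq. 1); return None when h is zero."""
    h = h_index(citations)
    if h == 0:
        return None
    return sum(citations) / h ** 2


# Hypothetical citation counts for seven papers of one scientist.
papers = [25, 8, 5, 3, 3, 1, 0]
print(h_index(papers))   # h-index of the sample
print(hirsch_a(papers))  # coefficient a implied by Eq. 1 for the sample
```

For this invented sample three papers have at least three citations each, so h = 3, and the 45 citations in total give a = 45 / 9 = 5, the upper end of the range reported by J. Hirsch.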
   The h-index reflects the results of scientific work well when the productivity of
scientists within one field of activity is compared. Its disadvantage is that it is
static with respect to the scientist's current activity: if a scientist ceases to
engage in scientific work, his index will remain the same as before or, at best, grow
until it equals the number of his articles.
    Leo Egghe, a Belgian scientist from Universiteit Hasselt, attempted to solve the
problem of the static nature of the h-index by proposing the g-index [4]. For a set of
papers by a scholar ranked in decreasing order of the number of citations, the g-index
is the largest number g such that the g most cited articles together received at
least $g^{2}$ citations (see Fig. 3).

                    Fig. 3. The graph of the g-index (according to [14])
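
A minimal sketch of the g-index calculation follows, assuming that g cannot exceed the number of papers (some variants of the index relax this restriction). The citation counts are invented for illustration.

```python
from itertools import accumulate


def g_index(citations):
    """Return the largest g such that the top g papers have >= g^2 citations in total."""
    ranked = sorted(citations, reverse=True)
    g = 0
    # accumulate() yields the running total of citations of the top 1, 2, ... papers.
    for rank, total in enumerate(accumulate(ranked), start=1):
        if total >= rank ** 2:
            g = rank
    return g


# Hypothetical citation counts for seven papers of one scientist.
papers = [25, 8, 5, 3, 3, 1, 0]
print(g_index(papers))
```

For this sample the six most cited papers have 45 citations in total, which is at least 36, while all seven have 45, which is less than 49, so g = 6. A single highly cited paper raises the g-index more than it would raise an index that counts only papers above a threshold.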

The i10-index is the number of publications that have been cited at least 10 times [7].
The i10-index was introduced by Google in 2011. This indicator depends predominantly
on the age of the researcher and tends to grow steadily. The five-year i10-index makes
it possible to assess a scientist's current performance without taking past successes
into account, while the overall i10-index reflects the impact of the scientist's work
on modern science [8].
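
The two variants can be sketched as follows. The sketch assumes that "five-year" means citations received during the five most recent calendar years, including the current one; the exact window used by Google Scholar is an assumption here, and the per-year citation data are invented.

```python
def i10_index(citation_counts, threshold=10):
    """Count the papers that have at least `threshold` citations."""
    return sum(1 for count in citation_counts if count >= threshold)


def recent_i10_index(citations_by_year, current_year, window=5):
    """i10-index over citations received in the last `window` years only.

    `citations_by_year` holds one dict per paper: {year: citations in that year}.
    """
    first_year = current_year - window + 1
    recent = [
        sum(n for year, n in per_paper.items() if first_year <= year <= current_year)
        for per_paper in citations_by_year
    ]
    return i10_index(recent)


# Hypothetical data: three papers with citations grouped by year.
papers = [{2005: 30, 2017: 2}, {2015: 6, 2016: 7}, {2010: 12}]
totals = [sum(per_paper.values()) for per_paper in papers]
print(i10_index(totals))               # overall i10-index
print(recent_i10_index(papers, 2018))  # i10-index for 2014 to 2018
```

In this sample all three papers pass the threshold overall, but only one of them collected ten or more citations within the recent window.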
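   Impact factor (IF) is the ratio of the number of citations of a journal's articles
to the total number of articles published in that journal. For a given year, the
impact factor of a journal is the number of citations received in that year by the
articles published in the journal during the previous two years, divided by the total
number of articles published in the journal during those two years [2] (Eq. 2).

                          $IF_{y} = \frac{C_{y}}{N_{y-1} + N_{y-2}}$,                          (2)
where $C_{y}$ is the number of citations received in year y by the articles that the
journal published in years y − 1 and y − 2, and $N_{y-1}$ and $N_{y-2}$ are the numbers
of articles it published in those two years.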
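
As a worked illustration of the two-year calculation described above, the following sketch computes the impact factor from three counts. The function name, parameter names and the numbers are all hypothetical.

```python
def impact_factor(citations_this_year, articles_prev_year, articles_two_years_ago):
    """Two-year impact factor: citations this year to articles of the two previous
    years, divided by the number of articles published in those two years."""
    published = articles_prev_year + articles_two_years_ago
    if published == 0:
        raise ValueError("the journal published no articles in the two previous years")
    return citations_this_year / published


# Hypothetical journal: 150 citations in 2018 to the 40 + 60 articles of 2017 and 2016.
print(impact_factor(150, 40, 60))
```

With these invented numbers the journal's impact factor for 2018 would be 150 / 100 = 1.5.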

The scientometric databases are the basis for analyzing the quantity and quality of
the above indicators. They contain bibliographic, abstract or full-text material on
scientific publications, as well as tools for tracking citations of articles, internal
search, etc. Scientometric databases are divided into commercial and free ones. The
most popular commercial scientometric databases are Scopus and Web of Science.
Free ones include Google Scholar, Russian Science Citation Index, DOAJ, WorldCat and
Index Copernicus. The analysis of the leading scientometric databases has made it
possible to identify two main categories: 1) databases that index article metadata
automatically (Scopus, Web of Science), and 2) databases into which article metadata
must be entered manually (Russian Science Citation Index, Google Scholar and Index
Copernicus).
   Specialized electronic document management systems make it possible to reduce the
costs of supporting the work of the editorial board by allowing its members to work
remotely, to increase the efficiency of editorial and publishing processes, to improve
scientometric indicators, etc.
   The study identified the four most popular systems, which differ in their
functionality for the publication of scientific papers: Open Journal Systems, DSpace,
Koha and EPrints. The most extensive support for the editorial staff of a journal is
provided by Open Journal Systems (OJS) [12], the latest version of which (3.1) is only
partially documented and still under development. OJS is free software developed by
the nonprofit Public Knowledge Project. The system has a wide range of tools for
editors of scientific journals. If some functionality is missing, it can be added
using plug-ins. The functionality and low system requirements of OJS have made it the
standard for supporting the work of editorial boards of scientific journals. OJS
successfully exports metadata about articles from scientific journals to such
well-known scientometric databases as Scopus, Web of Science and Google Scholar, but
there is no standard export method from OJS to such scientometric databases as the
Russian Science Citation Index and Index Copernicus. Data are submitted to the Russian
Science Citation Index through Articulus, a system developed by eLibrary. Manual data
entry into Articulus duplicates the work of preparing article descriptions that has
already been done in OJS, and it is therefore important to automate this process in
order to reduce the unproductive time spent by members of the journal's editorial
board.

3      The Aim and Objectives of the Study

The aim of the study is to develop a plug-in for Open Journal Systems that exports
data from this system to the scientometric database Russian Science Citation Index.
   To accomplish this goal, the following tasks had to be solved:
1. to develop an infological model for the metadata export from Open Journal Sys-
   tems to the Russian Science Citation Index;
2. to develop and test the plug-in to the Open Journal Systems for the metadata export
   to the Russian Science Citation Index.

4      Simulation and Development of Software for Export Automation from Open Journal Systems to Russian Science Citation Index

OJS has a number of plug-ins for exporting data in popular formats, as well as to the
DOAJ open access directory. Unfortunately, since the transition to the new (third)
version of OJS, the documentation for plug-in developers has remained out of date.
   In addition to the undocumented structure of a plug-in, there is another problem:
the metadata required by the Russian Science Citation Index is insufficiently
specified. The infological model for the export of metadata from OJS to the Russian
Science Citation Index (Table 2) was therefore developed by analyzing the results of
numerous experiments on data export to and from the Russian Science Citation Index.
As a result, the XML structures used for import into the Russian Science Citation
Index were established.

      Table 2. Model of Metadata Export from OJS to Russian Science Citation Index
 RSCI tag               Description
 OperCard               a tag describing user information in the Articulus system
                        (automatically filled in by the system when creating or
                        importing a journal)
 Titleid                journal title identifier
 ISSN                   an international standard serial number that identifies
                        a printed periodical
 EISSN                  an international standard serial number that identifies
                        an electronic periodical
 JournalInfo            the block in which Title can be specified
 Title (JournalInfo)    a child of the JournalInfo tag, in which the name of the
                        journal can be specified in different languages using the
                        language attribute (lang="UKR", lang="ENG", etc.)
 Issue                  the main tag, which describes all data of a journal issue
 Volume                 journal volume
 Number                 issue number
 AltNumber              end-to-end issue number
 Part                   part of issue
 DateUni                date in YYYYMM format
 IssTitle               volume name
 Pages                  number of pages in a volume
 Articles               the main block, which contains a description of all the articles
 Article                the block of an article, which describes all of the article's
                        metadata
 ArtType                type of article
 Authors                the main block of the article's authors
 Author                 a block that describes a single author using the tags surname,
                        initials, orgName (organization name), email, otherInfo
                        (other information)
 ArtTitles              the block describing the article title. The block may include
                        titles in different languages, which are specified when
                        describing the title of an article
 Text                   text of the article
 Codes                  bibliographic description of the article, e.g. UDC, Dublin
                        Core, etc.
 KeyWords               a block that describes the article keywords using the keyword
                        tag
 References             references to other articles
 Files                  files that belong to the article
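
How the tags of Table 2 might nest in an export file can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration only, not the code of the plug-in (which is written in PHP): the root element name `journal`, the child tag `ArtTitle`, the exact nesting and all sample values are assumptions, and the real import format may differ.

```python
import xml.etree.ElementTree as ET


def sub(parent, tag, text=None, **attrs):
    """Append a child element with optional text and attributes."""
    node = ET.SubElement(parent, tag, attrs)
    if text is not None:
        node.text = str(text)
    return node


def build_issue_xml(journal, issue, articles):
    """Build an XML string from plain dicts, using tag names from Table 2."""
    root = ET.Element("journal")  # hypothetical root element name
    sub(root, "Titleid", journal["titleid"])
    sub(root, "ISSN", journal["issn"])
    info = sub(root, "JournalInfo")
    for lang, title in journal["titles"].items():
        sub(info, "Title", title, lang=lang)

    issue_node = sub(root, "Issue")
    sub(issue_node, "Volume", issue["volume"])
    sub(issue_node, "Number", issue["number"])
    sub(issue_node, "DateUni", issue["date"])  # YYYYMM

    articles_node = sub(issue_node, "Articles")
    for art in articles:
        art_node = sub(articles_node, "Article")
        sub(art_node, "ArtType", art["type"])
        authors_node = sub(art_node, "Authors")
        for author in art["authors"]:
            author_node = sub(authors_node, "Author")
            sub(author_node, "surname", author["surname"])
            sub(author_node, "initials", author["initials"])
        titles_node = sub(art_node, "ArtTitles")
        for lang, title in art["titles"].items():
            sub(titles_node, "ArtTitle", title, lang=lang)  # hypothetical child tag
        keywords_node = sub(art_node, "KeyWords")
        for word in art["keywords"]:
            sub(keywords_node, "keyword", word)
    return ET.tostring(root, encoding="unicode")


# Invented sample data, for illustration only.
journal = {"titleid": 1, "issn": "0000-0000", "titles": {"ENG": "Sample Journal"}}
issue = {"volume": 1, "number": 2, "date": "201806"}
articles = [{
    "type": "RAR",  # placeholder value; the real ArtType codes are not given here
    "authors": [{"surname": "Doe", "initials": "J."}],
    "titles": {"ENG": "A sample article"},
    "keywords": ["scientometrics", "export"],
}]
print(build_issue_xml(journal, issue, articles))
```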

   The development of the export model allowed us to move on to the next task:
designing and developing a plug-in for export.
   The plug-in was developed in the PHP programming language; the LAMP-stack server
ran Ubuntu Server 17.04. The following main tools were used to analyze the code and
write the plug-in: PHPStorm, HeidiSQL, Git.
   The OJS documentation that describes the rules and requirements for writing
plug-ins has a number of shortcomings. It is important to note, however, that the
program code written by the OJS authors is well commented and that the comments follow
the Doxygen format [17]. While searching for documentation, we found the automatically
generated Doxygen documentation for the OJS components.
   To generate the XML file, it was necessary to understand the structure of the file
imported into the Russian Science Citation Index. For this purpose, a project was
exported from the Articulus system and the XML tags needed for export were identified.
Based on experiments with Articulus exports, the mandatory and optional fields among
the metadata described in Table 2 were identified and their association with the OJS
metadata was established.
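
The association between OJS metadata and the tags of the Russian Science Citation Index, together with the check of mandatory fields, can be sketched as a simple lookup table. The sketch below is hypothetical throughout: the OJS field names, the choice of which tags are treated as mandatory and the sample record are invented for illustration and are not taken from the plug-in.

```python
# Hypothetical mapping: RSCI tag -> (OJS field name, is the field mandatory?)
FIELD_MAP = {
    "ISSN": ("printIssn", True),
    "EISSN": ("onlineIssn", False),
    "Volume": ("volume", True),
    "Number": ("number", True),
    "DateUni": ("datePublished", True),
    "IssTitle": ("title", False),
}


def map_record(ojs_record):
    """Translate an OJS record (a dict) into a dict keyed by RSCI tags.

    Optional fields that are empty are skipped; empty mandatory fields
    raise ValueError with the list of missing tags.
    """
    rsci, missing = {}, []
    for tag, (field, mandatory) in FIELD_MAP.items():
        value = ojs_record.get(field)
        if value in (None, ""):
            if mandatory:
                missing.append(tag)
            continue
        rsci[tag] = value
    if missing:
        raise ValueError("missing mandatory RSCI fields: " + ", ".join(missing))
    return rsci


# Invented sample record.
record = {"printIssn": "0000-0000", "volume": 1, "number": 2, "datePublished": "201806"}
print(map_record(record))
```

Keeping the mapping in one table makes it easy to see which fields an export depends on and to report all missing mandatory fields at once rather than one at a time.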
   The general scheme of the plug-in operation (see Fig. 4) is fairly transparent,
which made rapid prototyping and development possible.

                  Fig. 4. General scheme of work of the developed plug-in

The created plug-in is called SirenExpo (https://github.com/Ladone/SirenExpo) (see
Fig. 5). It is installed in the OJS system by copying the plug-in directory to the
appropriate directory.
                           Fig. 5. SirenExpo plug-in interface

When using the plug-in, the user receives a list of the journal issues that are
available to him for export. To download an issue, the user selects it and clicks the
“Export Issues” button. The program generates an archive and returns it to the user.
An example of a generated archive contains two PDF files and one XML file (see Fig. 6).

                               Fig. 6. Generated issue files
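
Packing the generated XML file together with the article files can be sketched as follows. The sketch assumes that the archive is a ZIP file, which the description above does not state, and uses invented file names and contents; the plug-in itself is written in PHP.

```python
import io
import zipfile


def build_archive(xml_text, pdf_files, xml_name="issue.xml"):
    """Pack one XML description and the article PDFs into a ZIP archive.

    `pdf_files` maps a file name to the file content as bytes.
    The archive is returned as bytes, ready to be sent to the user.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(xml_name, xml_text)
        for name, data in pdf_files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Invented contents: two placeholder PDF files and a minimal XML description.
payload = build_archive(
    "<journal></journal>",
    {"article1.pdf": b"%PDF-1.4 placeholder", "article2.pdf": b"%PDF-1.4 placeholder"},
)
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    print(sorted(archive.namelist()))
```

Building the archive in memory avoids temporary files on the server, at the cost of holding the whole issue in memory while it is being packed.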

After logging in to the Articulus system, the user receives a list of issues that can
be transferred for indexing in the Russian Science Citation Index. In the menu, the
user clicks the “Restore Project” button and is taken to the project recovery page,
where the archive generated by the SirenExpo plug-in is uploaded.
   After uploading the file, the user receives a brief description of the journal
issue that was restored using the function of import into the Russian Science Citation
Index (see Fig. 7).
   In Articulus, clicking the “Restore Project” button opens a dialog box in which
the archive file generated by the SirenExpo plug-in is selected. After the archive is
uploaded, all metadata required for the Russian Science Citation Index is imported.
The restored project can also be opened using the “Open Project” link, which leads to
a window for editing the metadata of the journal. Editing of the restored project is
depicted in Fig. 8.
   Thus, the SirenExpo plug-in makes it possible to export journal issues from Open
Journal Systems to the Russian Science Citation Index. The resulting archive is
uploaded into Articulus, where a new project with the metadata imported from Open
Journal Systems is created.
Fig. 7. Restoring of the project in Articulus

    Fig. 8. Editing the restored project

5       Conclusions

As a result of the research, a new plug-in for OJS was created that exports data to
the scientometric database Russian Science Citation Index. The work describes the
plug-in structure, and the source code shows where each piece of metadata comes from
and how it is processed. This development and the research results can serve as a
basis for creating new OJS plug-ins for export to other scientometric databases, such
as Index Copernicus.

