<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>NaïveRole: Author-Contribution Extraction and Parsing from Biomedical Manuscripts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dominika Tkaczyk</string-name>
          <email>d.tkaczyk@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrew Collins</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joeran Beel</string-name>
          <email>joeran.beel@scss.tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Trinity College Dublin, ADAPT Centre, School of Computer Science and Statistics</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
        <p>Information about the contributions of individual authors to scientific publications is important for assessing authors' achievements. Some biomedical publications have a short section that describes authors' roles and contributions. It is usually written in natural language, and hence author contributions cannot be trivially extracted in a machine-readable format. In this paper, we present: 1) a statistical analysis of the roles found in author contributions sections, and 2) NaïveRole, a novel approach for extracting structured author roles from contributions sections. For the first part, we used co-clustering techniques, as well as Open Information Extraction, to semi-automatically discover the popular roles within a corpus of 2,000 contributions sections from PubMed Central. The discovered roles were used to automatically build a training set for NaïveRole, our role extractor, which is based on Naïve Bayes. NaïveRole extracts roles with a micro-averaged precision of 0.68, recall of 0.48 and F1 of 0.57. It is, to the best of our knowledge, the first attempt to automatically extract author roles from research papers. This paper is an extended version of a previous poster published at JCDL 2018.</p>
      </abstract>
      <kwd-group>
        <kwd>document analysis</kwd>
        <kwd>author contributions</kwd>
        <kwd>semantic publishing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Authorship is an important concept in scholarly communication. It allows people to
properly credit those who contributed to scientific discoveries and is widely used to
assess people’s scientific achievements. However, to fully evaluate a researcher’s
achievements, it is useful to know the precise nature of their contributions to authored
publications. In some biomedical journals, a submitting author must provide
information about each author’s individual contributions. This information is then attached
to the manuscript as a short section entitled e.g. “Authors’ Contributions” (Fig. 1).
Examples of contributor roles include the preparation of data, designing experiments,
programming software, or writing and editing the manuscript.</p>
      <p>
        These sections are usually written in natural language, are unstructured, and are
intended for humans to read rather than machines. Contribution taxonomies and
machine-readable formats are being introduced slowly; meanwhile, digital libraries contain
documents that were published in previous decades. Contribution information
in such documents will not conform to the new standards and will remain in an unstructured
format. Consequently, analysis of author contribution information requires
time-consuming manual work, which makes processing large collections of documents in
digital libraries impractical. We address these issues by proposing:
1. a method for semi-automatically discovering which roles are common in a corpus of
sections of interest,
2. a scalable approach for annotating a ground-truth role dataset, and
3. a supervised algorithm for the automatic extraction of roles from unstructured text.
This paper is an extended version of a poster published at the Joint Conference on
Digital Libraries [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This extended version contains further descriptions of our study and
proposed approaches, a comparison of our results to an existing contributor role
taxonomy, and an error analysis of the proposed automatic role extractor.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        General information extraction from scientific literature is a popular research area,
resulting over the years in many approaches and tools, including CERMINE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
GROBID [3], PDFX [
        <xref ref-type="bibr" rid="ref3">4</xref>
        ], ParsCit [
        <xref ref-type="bibr" rid="ref4">5</xref>
        ], Science Parse1 and Docear PDF Inspector [
        <xref ref-type="bibr" rid="ref5">6</xref>
        ].
However, none of these systems extracts information related to the contributions of
individual authors directly from the content of the paper.
      </p>
      <p>The scientific community has thus far not agreed on standard author contributions
or even standard criteria for authorship. Nevertheless, some initiatives have been
undertaken to increase the level of consistency between journals. For example, the
International Committee of Medical Journal Editors published guidelines that suggest
minimum requirements for authorship, and the use of these guidelines is now encouraged
by some medical journals.</p>
      <p>
        CRediT2 is an example of a contribution taxonomy that defines a standard for
contributors’ roles. CRediT is composed of 14 roles and was created based on
free-form contributions and acknowledgements sections. Journals are increasingly adopting
taxonomies like CRediT to consistently describe author contributions [
        <xref ref-type="bibr" rid="ref7">8</xref>
        ]. Our study
does not assume any input taxonomy but aims at discovering popular roles within a
corpus of contribution descriptions in an unsupervised way.
      </p>
      <p>
        Some journals, such as PLOS ONE or Annals of Internal Medicine, publish author
contribution information in a machine-readable form. Several studies have examined
author contributions using this data, for example, comparing author orderings to
contributions [
        <xref ref-type="bibr" rid="ref8 ref9">9, 10</xref>
        ]. Typically, however, author contribution information has an
unstructured, natural-language form and cannot be trivially examined in this fashion.
1 https://github.com/allenai/science-parse (we used version 1, as the currently
released version 2 was not available, or we were at least not aware of it, at the time of our analysis)
2 Contributor Roles Taxonomy: https://casrai.org/credit/
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <sec id="sec-3-1">
        <title>Roles Discovery</title>
        <p>The first stage of our workflow (Fig. 2) is to discover common roles appearing in the
corpus. Our analysis was composed of the following steps:
- Data preparation, where we gathered a corpus of contributions sections.
- Data preprocessing, where role mentions were extracted and cleaned.
- Clustering, where abstract role concepts were discovered.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Data Preparation</title>
        <p>We use the PubMed Central Open Access Subset as data for our work. This is a subset
of the total collection of articles in PMC, published under open licenses. We
downloaded the corpus of 1.6 million documents in machine-readable JATS format3. From
each document we extracted any section whose normalized (lowercased and with all
non-letters removed) title equals “authorscontributions”. We found these sections in
186,874 documents, constituting ~12% of the corpus. For performance reasons, we use
a random subset of 2,000 sections only. All sections are written in English.</p>
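The title-matching step above can be sketched in Python; this is a minimal illustration, and the exact normalization used in our pipeline may differ in details:

```python
import re

def normalize_title(title: str) -> str:
    # Lowercase and strip everything that is not a letter, so that
    # variants such as "Authors' Contributions" and "AUTHORS CONTRIBUTIONS"
    # collapse to the same comparable key.
    return re.sub(r"[^a-z]", "", title.lower())

def is_contributions_section(title: str) -> bool:
    # We match sections whose normalized title equals "authorscontributions".
    return normalize_title(title) == "authorscontributions"
```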
      </sec>
      <sec id="sec-3-3">
        <title>Preprocessing</title>
        <p>
          Authors’ contribution sections typically mention the roles of several individual authors.
We refer here to a natural language expression of the role an author plays as a role
mention. The same role (e.g. data analysis) can be expressed by many forms of role
mention (e.g. “X analyzed microarray sequences”, “X was involved in data analysis”).
We represent a role mention as a 3-element tuple containing: 1) subject: “who”, usually
an author name or initials, 2) action: the activity, often a verb phrase, and 3) object: “what the
action was applied to”, typically a noun phrase (Fig. 3).
3 https://jats.nlm.nih.gov/
We use the Stanford Open Information Extraction tool4 to extract role mentions from
the text. OpenIE [
          <xref ref-type="bibr" rid="ref10">11</xref>
          ] is an information extraction paradigm, in which it is possible to
extract relations in the form of tuples (relation plus its two arguments) from the text, in
an unsupervised way. The output corresponds to 3-element role mentions, where action
is the relation expression and subject and object are its two arguments.
        </p>
        <p>As a result of applying OpenIE to our sections corpus, for every section we obtained
a bag of role mentions, where a mention is a tuple of three text fragments from the
original text. For example, from the sentence “AWL did the literature search and
participated in the writing of the manuscript.” we got the following tuples: (“AWL”, “did”,
“literature search”) and (“AWL”, “participated in”, “writing of manuscript”).</p>
        <p>OpenIE tools tend to output tuples that are redundant. For example, from the same
sentence we might get both (“authors”, “read”, “final manuscript”) and (“authors”,
“read”, “manuscript”) tuples. We analyze all pairs of tuples and consider one tuple in a
pair redundant if the following conditions were met: 1) their subjects are exactly the
same, 2) the action of one tuple contains all the words of the other action in the same
order, and 3) the object of one tuple contains all the words of the other object in the
same order. We remove such redundant tuples.</p>
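The redundancy check described above (identical subjects, plus in-order word containment for both actions and objects) can be sketched as follows; the function names are illustrative, not those of our implementation:

```python
def contains_in_order(longer, shorter):
    # True if all words of `shorter` appear in `longer` in the same order
    # (a subsequence test over word lists).
    it = iter(longer)
    return all(word in it for word in shorter)

def is_redundant(t_short, t_long):
    # A tuple (subject, action, object) is redundant w.r.t. another if
    # 1) their subjects are exactly the same, and 2)+3) the other tuple's
    # action and object contain all of its action/object words in order.
    s1, a1, o1 = t_short
    s2, a2, o2 = t_long
    return (s1 == s2
            and contains_in_order(a2.split(), a1.split())
            and contains_in_order(o2.split(), o1.split()))

def remove_redundant(tuples):
    keep = []
    for t in tuples:
        # Drop t if some other tuple subsumes it.
        if not any(u != t and is_redundant(t, u) for u in tuples):
            keep.append(t)
    return keep
```

For the example in the text, ("authors", "read", "manuscript") is dropped because ("authors", "read", "the final manuscript") subsumes it.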
        <p>The roles in role mentions are expressed by action-object pairs, and the subject
refers only to the author. At the beginning, our corpus of 2,000 sections contained 6,924
distinct action-object pairs, many of which expressed the same roles.</p>
        <p>To merge some mentions and reduce the number of distinct action-object pairs, we
applied cleaning and normalizing to actions and objects of role mentions. First, we
stemmed words within actions and objects, and removed stopwords. For stemming we
used R’s SnowballC library, and the stopwords list was downloaded from an online
source5. This reduced the number of distinct roles to 6,289. We also remove rare role
mentions, that is, mentions appearing fewer than five times in the corpus. This leaves 434
distinct action-object pairs while keeping 55% of role mentions.</p>
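The rare-mention cut-off can be sketched as follows. Stemming and stopword removal, done in our pipeline with R's SnowballC, are assumed to have already been applied; this Python sketch shows only the frequency filter:

```python
from collections import Counter

def filter_rare(mentions, min_count=5):
    # mentions: list of (subject, action, object) tuples whose actions and
    # objects have already been stemmed and stopword-filtered.
    # Count distinct action-object pairs and keep only mentions whose
    # pair appears at least `min_count` times in the corpus.
    counts = Counter((a, o) for _, a, o in mentions)
    return [m for m in mentions if counts[(m[1], m[2])] >= min_count]
```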
        <p>Finally, we observed that due to splitting role mentions into action and object, we
still have distinct mentions that obviously refer to the same role, such as (“analys”,
“data”) and (“perform”, “the analys of the data”). We wanted to normalize this, while at the
same time keeping the tuple-based structure of the mentions. To achieve this, we
extracted the most common terms from both the actions and the objects of the mentions
(terms appearing at least 20 times in the corpus), and then each term was labeled as an
“action keyword” or an “object keyword”, based on whether it is more common among
actions or objects.
4 https://nlp.stanford.edu/software/openie.html
5 http://www.ranks.nl/stopwords</p>
        <p>Table 1 lists extracted action and object keywords. Each role mention in the corpus
was then transformed in the following way: 1) the subject was left intact, 2) all action
keywords found in the entire original mention formed the new action, and 3) all object
keywords found in the entire original mention formed the new object. In addition, if the
new action turned out to be empty, we added a single “perform” keyword to it.
This operation moved words between actions and objects so that action keywords are
always in the actions of the mentions and object keywords are in their objects. For
example, since “perform” is an action keyword, and “analys” and “data” are object
keywords, both mentions (“analys”, “data”) and (“perform”, “the analys of the data”)
became (“perform”, “data analys”). This process left us with 285 distinct role mentions.</p>
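The keyword-based normalization can be sketched as follows; note that, as a simplification, this sketch keeps keywords in their original order within the mention rather than a canonical order:

```python
def normalize_mention(mention, action_kw, object_kw):
    # mention: (subject, action, object) with stemmed, stopword-free terms.
    # Collect all terms of the original action and object, then rebuild:
    # action keywords found anywhere in the mention form the new action,
    # object keywords the new object; an empty action defaults to "perform".
    subject, action, obj = mention
    terms = action.split() + obj.split()
    new_action = [t for t in terms if t in action_kw]
    new_object = [t for t in terms if t in object_kw]
    if not new_action:
        new_action = ["perform"]
    return (subject, " ".join(new_action), " ".join(new_object))
```

With "perform" as an action keyword and "analys"/"data" as object keywords, both variant mentions from the text collapse to the same tuple: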
      </sec>
      <sec id="sec-3-4">
        <title>Finding Roles</title>
        <p>
          In this phase, we detect roles in our collection of role mentions. We adopted an
unsupervised machine learning technique (clustering) for this task. This is similar to a
standard ontology learning approach [
          <xref ref-type="bibr" rid="ref11">12</xref>
          ]. At the end of clustering, all mentions that refer to
the same role should belong to the same cluster. For example, (“performed”, “data
analysis”) and (“was involved in”, “analyzing data”) should be clustered together. After
preprocessing, our set contained 9,709 role mentions represented by cleaned
subject-action-object tuples. We were interested in co-clustering the actions and the objects
separately yet simultaneously, which in turn would define a third clustering based on
the combinations of actions and objects.
        </p>
        <p>More formally, let M = {m_1, …, m_n} be the input mention set, and A and O the set
of action clusters and the set of object clusters, respectively. We define an action
clustering as a function f_A : M → A, which maps mentions to their action clusters. Similarly,
let f_O : M → O be the mapping function which defines the object-based clustering. This lets
us define a role set R as the set containing all combinations of action and object
concepts that share some mentions: R = {(a, o) ∈ A × O | f_A⁻¹(a) ∩ f_O⁻¹(o) ≠ ∅}. The
final combined clustering is f : M → R such that ∀m ∈ M, f(m) = (f_A(m), f_O(m)).</p>
        <p>Set R defines a binary relation between action and object clusters. We can define
the weight of this relation as the number of mentions that the clusters share:
∀(a, o) ∈ R, w(a, o) = |{m ∈ M | f(m) = (a, o)}| = |f⁻¹(a, o)|. Intuitively, if an action
concept and an object concept appear in many role mentions together, they form a
common role, and the weight of the role is large. This defines a graph structure among the
clusters, with action and object concepts as nodes and weighted edges representing
relation strength.</p>
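Under the definitions above, the role set, the combined clustering and the edge weights can be computed directly. A minimal sketch, with mentions represented by ids and the cluster mappings f_A and f_O by dictionaries:

```python
from collections import Counter

def build_roles(mentions, f_a, f_o):
    # f_a maps each mention to its action cluster, f_o to its object cluster.
    # The combined clustering assigns each mention the pair of its clusters;
    # the role set is every (action, object) cluster pair that shares at
    # least one mention, weighted by how many mentions it covers.
    combined = {m: (f_a[m], f_o[m]) for m in mentions}
    weights = Counter(combined.values())   # w(a, o) = |f^-1(a, o)|
    return set(weights), weights, combined
```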
        <p>Finally, during our analysis we used the idea of a cluster label, defined as the bag of
terms of the cluster’s most frequent member.</p>
        <p>We use bottom-up clustering, where we start with initial action and object clusters,
and in several phases we merge clusters together. Initially, the clusters are defined as
distinct normalized actions and objects. In other words, two mentions are in the same
action/object cluster if their normalized actions/objects are identical. Each round of
clustering is composed of two stages. The first one is based purely on cluster term
labels. The second one uses the graph structure defined previously. Algorithm 1 presents
the pseudocode of the role mentions clustering.</p>
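The similarity-based merging stage can be sketched roughly as follows. This is a simplification of Algorithm 1 under stated assumptions: the threshold value is illustrative, only action clusters are merged here (objects are handled analogously), and object weights are not recomputed after each merge:

```python
def cluster_actions(roles, weights, threshold=0.5):
    # roles: set of (action, object) cluster pairs; weights: object weights.
    # Greedily merge the most similar pair of action clusters until the
    # best similarity drops below the threshold.
    def sim(a1, a2):
        # Similarity = total weight of the objects the two actions share.
        o1 = {o for a, o in roles if a == a1}
        o2 = {o for a, o in roles if a == a2}
        return sum(weights[o] for o in o1 & o2)

    while True:
        actions = sorted({a for a, _ in roles})
        best = max(((sim(a1, a2), a1, a2)
                    for i, a1 in enumerate(actions)
                    for a2 in actions[i + 1:]),
                   default=(0.0, None, None))
        score, a1, a2 = best
        if a1 is None or score < threshold:
            return roles
        # Merge a2 into a1: relabel all of a2's role pairs.
        roles = {(a1 if a == a2 else a, o) for a, o in roles}
```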
        <p>The first stage of the clustering is based on the action/object label terms of the
current role clusters. We examine pairs of role clusters and merge them if action and object
terms of one of them contain the other cluster’s terms. The new cluster is always given
a label equal to the label of the bigger cluster from the examined pair.</p>
        <p>The main clustering stage is based on the weighted graph relations between action
and object clusters. First, we identify an action or object cluster pair that is most similar
to each other, then their clusters are merged. When the highest similarity drops below
a predefined threshold, the clustering procedure terminates. We will only explain how
the similarity between two action clusters is defined. The similarity between object
clusters is defined analogously.</p>
        <p>The main observation used for calculating the similarity between two action clusters
is that two actions related to many common objects are more similar to each other.
However, this assumption breaks down for generic objects, where there are simply many
different ways to act on the same object (for example, the manuscript can be read, written,
reviewed, etc.). In such cases we would like the overall similarity to be lower.
[Algorithm 1: Role mentions clustering]
To reflect these observations, we introduce an object weight, which is the reciprocal of
the number of distinct actions it is related to: ∀o ∈ O, w(o) = 1 / |{a ∈ A | (a, o) ∈ R}|.
Intuitively, an object with a small weight (such as “manuscript”) interacts with many
different actions; in other words, there are many actions that can be applied to it.</p>
        <p>We define the similarity between two actions as the sum of the weights of all the
objects they share: ∀a_1, a_2 ∈ A, sim(a_1, a_2) = Σ_{o ∈ O : (a_1, o) ∈ R ∧ (a_2, o) ∈ R} w(o). Intuitively, two
actions will have high similarity if: 1) they share a lot of objects, and 2) the objects they
share are “specific” (few distinct actions apply to them). An object that interacts with
many actions will not contribute much to the action similarity.</p>
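The object weights and action similarity defined above can be sketched as:

```python
from collections import defaultdict

def object_weights(roles):
    # roles: set of (action, object) cluster pairs. An object's weight is
    # the reciprocal of the number of distinct actions it co-occurs with,
    # so a generic object like "manuscript" counts for little.
    actions_per_object = defaultdict(set)
    for a, o in roles:
        actions_per_object[o].add(a)
    return {o: 1.0 / len(acts) for o, acts in actions_per_object.items()}

def action_similarity(a1, a2, roles, w):
    # Sum the weights of all objects the two actions share.
    shared = {o for a, o in roles if a == a1} & {o for a, o in roles if a == a2}
    return sum(w[o] for o in shared)
```

For instance, with three actions applied to "manuscript" its weight is 1/3, so sharing it contributes less to similarity than sharing a specific object such as "data".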
        <p>Examples of merged clusters include: “particip” and “perform”, “contribut” and
“perform”, “assist” and “perform”, “manuscript” and “paper”, “carri” and “perform”,
“experi” and “study”, “perform” and “undertook”, “manuscript” and “articl”.</p>
        <p>The procedure made a few errors, merging for example: “approv” and “read”,
“perform” and “supervis”. The final graph is shown in Fig. 4.</p>
        <p>To reduce the number of errors from automatic clustering, we manually inspected 63
clusters. This included removing some clusters and merging others. We also assigned
role names to the clusters. The entire procedure resulted in 13 roles. The final set of 13
roles, as well as the fractions of mentions for every role, are presented in Fig. 5.</p>
        <p>We annotated the dataset of role mentions. More specifically, the dataset contains role
mentions labelled with abstract roles. For example, the dataset might contain the entry:
((“participated in”, “the analysis of microarray data”), data analysis). The resulting dataset
is composed of the role mentions from the clusters, and the label for each role mention
is the role name assigned to the mention’s cluster. This annotation approach differs
from the typical approach, in which we would manually label each role mention in the
dataset. Even though our approach still requires manual work, it was performed on the
clusters, not each individual role mention. Since the clusters are much less numerous
than the role mentions, our proposed approach is less labor intensive.
</p>
      </sec>
      <sec id="sec-3-5">
        <title>Roles Extraction from the Text</title>
        <p>This section describes our prototype of an automated extractor of authors’ roles from
text. The extractor takes a contributions section as input and outputs a set of extracted
roles. We used the previously developed preprocessing pipeline and discovered roles
for this task. The extraction algorithm is composed of the following steps:
- First, a set of role mentions is extracted from the text of the section. If the section is
written in a natural language, this is done using OpenIE. In some rare cases we came
across, the contributions section was not written in natural language, but rather
contained a list of contributions in the following format (or a variation of it): “author1:
role1, role2; author2: role3; ...”. In such cases we extract role mentions using regular
expressions. Redundant mentions are then removed.
- Next, each mention is represented as a feature vector. We use a binary bag-of-words
representation, with 64 words corresponding to the object/action keywords (Table
1). Only the keywords that remained after manual cluster removal are used.
- Finally, each mention is classified by a supervised Naïve Bayes model trained on the
mention set generated previously. The final output is a set of author-role pairs.</p>
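The fallback for the structured “author1: role1, role2; author2: role3” format can be sketched as follows; this is an illustrative pattern, not the exact expression used in our implementation:

```python
def parse_structured_section(text):
    # Handles the rare non-natural-language format
    # "author1: role1, role2; author2: role3; ..." (or minor variations).
    # Returns a list of (author, role) pairs.
    mentions = []
    for part in text.strip().rstrip(";.").split(";"):
        if ":" not in part:
            continue
        author, roles = part.split(":", 1)
        for role in roles.split(","):
            if role.strip():
                mentions.append((author.strip(), role.strip()))
    return mentions
```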
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <sec id="sec-4-1">
        <title>Roles Discovery</title>
        <p>We compared the roles discovered in our study to the CRediT taxonomy; in general, the
results are similar (Table 2). Five roles appear in both our clusters and
CRediT. Our study resulted in four roles related to preparing the manuscript itself,
while CRediT has only two such roles. Three roles discovered in our study (paper
reading, literature review and interpretation) are not included in CRediT.</p>
        <p>To evaluate our role extractor, we manually annotated a test set of 100 contributions
sections. At this point, we observed three new roles that were not discovered in our
study: paper approving, supervision and funding acquisition. Since the classifier does
not have any training data for these roles, they are never assigned.</p>
        <p>During the evaluation, for every document we compared the extracted author-role
pairs to the ground truth pairs. A pair was marked as correctly extracted if identical to
any pair in the ground truth. We obtained the following micro-averaged results:
precision 0.68, recall 0.48, F1 0.57. Table 3 presents the results for individual roles.</p>
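The micro-averaged evaluation over author-role pairs, as described above, can be sketched as:

```python
def micro_prf(extracted_per_doc, truth_per_doc):
    # Each argument is a list of sets of (author, role) pairs, one set per
    # document. A pair counts as correct only if it exactly matches a
    # ground-truth pair of the same document.
    tp = fp = fn = 0
    for extracted, truth in zip(extracted_per_doc, truth_per_doc):
        tp += len(extracted & truth)
        fp += len(extracted - truth)
        fn += len(truth - extracted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```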
        <p>
We manually analyzed mistakes made by the extractor in the test set, and found two
types: false positives that lower precision (a subject-role pair incorrectly present in the
extracted output), and false negatives that lower recall (a correct subject-role
pair missing from the extracted output). We identified three sources of errors (Fig. 6):
- Errors related to mention extraction from the text. That is, an incorrect mention is
extracted, or a certain role mention is missing. These errors are responsible for 26%
of false positives and 73% of false negatives.
- Errors appearing during role discovery, related to incorrect cluster merging. These
errors result in the lack of the roles paper approving, supervision and funding
acquisition in the extractor’s output and are responsible for 21% of false negatives.
- Classification errors, resulting in assigning an incorrect role to the tuple. These errors
are responsible for 74% of false positives and 6% of false negatives.</p>
        <p>The quality of the mention extraction has the biggest impact on the overall results, in
particular recall. In a typical scenario, some mentions are missing from OpenIE output,
which makes it impossible to extract specific subject-role pairs.</p>
        <p>Incomplete OpenIE output also contributes to the second source of errors. For example, we observed
that in many cases, Stanford’s OpenIE tool extracts only one tuple from typical
sentences similar to “All authors read and approve the final manuscript”: (“all authors”,
“read”, “the final manuscript”). In this case, the missing mention related to approving
the manuscript resulted in the failure to discover this role in the corpus.</p>
        <p>Finally, we observed that in some cases the classifier made the decision based on a
single term such as “make”, which does not carry enough information for a correct
classification decision. Additional feature selection procedures for the classifier might
result in better classification performance.</p>
        <p>In this paper, we presented a study of author contributions sections obtained from
publications in biomedical disciplines. The results of our study include: 1) a set of roles
discovered in the data in an unsupervised manner, and 2) a first prototype of a tool able
to automatically extract the roles from the contributions section.</p>
        <p>We semi-automatically discovered the following roles: experimenting, analysis,
study design, interpretation, conceptualization, paper reading, paper writing, paper
review, paper drafting, coordination, data collection, paper revision and literature
review. Three discovered roles (paper reading, literature review and interpretation) are
not included in the existing contributor roles taxonomy CRediT. The proposed
automated role extractor is able to extract roles directly from the text with micro-averaged
precision 0.68, recall 0.48 and F1 0.57.</p>
        <p>
          Our plans for future work include: testing alternative mention extraction approaches
and tools; testing alternative classification algorithms; and examining the relationships
between author orderings, H-index and the nature of contributions in a larger corpus
than used in previous analyses [
          <xref ref-type="bibr" rid="ref8 ref9">9, 10</xref>
          ].
        </p>
        <p>This research was conducted with the financial support of Enterprise Ireland and the
European Regional Development Fund (ERDF) under Ireland’s European Structural
and Investment Funds Programme 2014-2020 under Grant Agreement No.
CF/2017/0808-I at the ADAPT SFI Research Centre at Trinity College Dublin. The
ADAPT SFI Centre for Digital Media Technology is funded by Science Foundation
Ireland through the SFI Research Centres Programme and is co-funded under the
European Regional Development Fund (ERDF) through Grant # 13/RC/2106.
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Tkaczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Collins</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Beel</surname>
          </string-name>
          ,
          <article-title>"Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes,"</article-title>
          <source>in JCDL</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Tkaczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Szostek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fedoryszak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dendek</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Bolikowski</surname>
          </string-name>
          ,
          <article-title>"CERMINE: automatic extraction of structured metadata from scientific literature,"</article-title>
          <source>International Journal on Document Analysis and Recognition</source>
          , vol.
          <volume>18</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>317</fpage>
          -
          <lpage>335</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Constantin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pettifer</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Voronkov</surname>
          </string-name>
          ,
          <article-title>"PDFX: fully-automated pdf-to-xml conversion of scientific literature,"</article-title>
          <source>DocEng</source>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>180</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I.</given-names>
            <surname>Councill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Giles</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.-Y.</given-names>
            <surname>Kan</surname>
          </string-name>
          ,
          <article-title>"ParsCit: an open-source CRF reference string parsing package,"</article-title>
          <source>in International Conference on Language Resources and Evaluation</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Beel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Langer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Genzmehr</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Mueller</surname>
          </string-name>
          ,
          <article-title>"Docear's PDF Inspector: Title Extraction from PDF Files,"</article-title>
          <source>in Joint Conference on Digital Libraries</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ilakovac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Fister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marusic</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Marusic</surname>
          </string-name>
          ,
          <article-title>"Reliability of disclosure forms of authors' contributions,"</article-title>
          <source>Canadian Medical Association Journal</source>
          , vol.
          <volume>176</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>46</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. K.</given-names>
            <surname>McNutt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bradford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Drazen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hanson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Jamieson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kiermer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Marcus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. K.</given-names>
            <surname>Pope</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schekman</surname>
          </string-name>
          and others,
          <article-title>"Transparency in authors' contributions and responsibilities to promote integrity in scientific publication,"</article-title>
          <source>Proceedings of the National Academy of Sciences</source>
          , vol.
          <volume>115</volume>
          , no.
          <issue>11</issue>
          , pp.
          <fpage>2557</fpage>
          -
          <lpage>2560</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Corrêa Jr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. N.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. d. F.</given-names>
            <surname>Costa</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Amancio</surname>
          </string-name>
          ,
          <article-title>"Patterns of authors contribution in scientific manuscripts,"</article-title>
          <source>Journal of Informetrics</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>498</fpage>
          -
          <lpage>510</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T. V.</given-names>
            <surname>Perneger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poncet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Carpentier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Agoritsas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Combescure</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Gayet-Ageron</surname>
          </string-name>
          ,
          <article-title>"Thinker, Soldier, Scribe: cross-sectional study of researchers' roles and author order in the Annals of Internal Medicine,"</article-title>
          <source>BMJ open</source>
          , vol.
          <volume>7</volume>
          , no.
          <issue>6</issue>
          , p.
          <fpage>e013898</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Angeli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J. J.</given-names>
            <surname>Premkumar</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>"Leveraging Linguistic Structure For Open Domain Information Extraction,"</article-title>
          <source>in ACL</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>W.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Bennamoun</surname>
          </string-name>
          ,
          <article-title>"Ontology learning from text: A look back and into the future,"</article-title>
          <source>ACM Comput. Surv.</source>
          , vol.
          <volume>44</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>20:1</fpage>
          -
          <lpage>20:36</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>