<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Building and Exploiting a Web of Machine-Readable Scientific Facts to Make Discoveries</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicola Rafaele Di Matteo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Schimmenti</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Vitali</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>James Blustein</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dalhousie University</institution>
          ,
          <addr-line>6299 South St, Halifax, NS</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università di Bologna</institution>
          ,
          <addr-line>Via Zamboni, Bologna</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We propose a method, and the motivation, for constructing a Web of machine-readable scientific claims that computers can use for comparisons and for making new discoveries, through the creation of a public structure of nanopublications representing claims extracted from published papers. A scientific claim can be represented by a semantic predication, a triple in the form (subject, predicate, object). Information such as provenance can be associated with a nanopublication, making it a valid, atomic, scientific publication. By connecting these claims, discoveries can be made, and machines do this automatically with our structure. As a result, scientific articles published in hypertext are linked, increasing the objective knowledge of the world with valuable new hypotheses. While semantic predications can be created from previously published scientific papers using NLP tools, humans can extract them from new articles: authors, publishers, and readers can accurately identify core statements, and they are valuable resources that should be encouraged to create or improve nanopublications. After introducing how discoveries can be made by connecting claims in the literature, and delineating the structure that makes them readable by machines, we introduce DesX as a conceptual method that enables authors to extract semantic predications from their published hypertext manuscripts and contribute to the global collection of nanopublications efficiently and accurately. We also discuss DesX's envisioned potential to linearize RDF into quasi-plain text for effective human readability, highlighting its future possibilities in knowledge presentation.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Web</kwd>
        <kwd>RDF to text</kwd>
        <kwd>text to RDF</kwd>
        <kwd>nanopublications</kwd>
        <kwd>Literature-Based Discoveries</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Scientific literature is vast and grows rapidly. More than 50 million scientific papers have
been published, and around 2.5 million see the light of day every year [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In recent decades,
we have seen a substantial increase in publications. This is an extremely fragmented but highly
reliable body of knowledge, thanks to the activity of more than 28,000 peer-reviewed journals [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
and innumerable authors who make their work accessible in different formats, particularly
hypertext in recent times. Such literature contains assertions that describe the world and all of
our knowledge of it as a collection of (often unrelated, fragmented, and subjective) assertions
that are the consequences of the experiments reported and reviewed by peers, and represent
objective knowledge built interactively by researchers. A scientific paper in literature is an
autonomous part of knowledge that brings elements, assertions, fragments of a puzzle that form
a pattern that oftentimes may reveal new hypotheses [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        With its load of facts and, in general, assertions that can validate new hypotheses, theories
to connect, and experiment results, the scientific literature is itself a lode to dig and explore
so as to extend the scientific discovery process. Following Karl Popper [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], theories originate
from free creations of the mind and speculations of which scientists should imagine and test
counter-evidence to assess their robustness. Thus scientific theories are scholarly conjectures
that need to be thoroughly tested: those that can be tested in the real world are stronger, and
the ones that pass more validations positively are better theories than those that have failed the
tests or that were not fully and independently validated. In short, published arguments and
public criticism are crucial to the advancement of science.
      </p>
      <p>In Popper’s view, there are three worlds of reality: physical, objective, and subjective. The
objective world is the collection of theories with their validations performed by humans; on the
other hand, the subjective world is what is known and experienced by individuals. The scientific
literature is objective knowledge: it is created by humans and contains theories, experiments to
verify them, and inventions. However, what is contained in the literature and the part that each of
us knows of it is different. There is no certain method to retrieve all the (published) knowledge:
there will always be something to discover for individuals, including the novel problems brought
on by new theories and inventions. In conclusion, in the literature exists “undiscovered public
knowledge” [3, p. 108]. It is sometimes possible for crucial pieces of information to remain
undiscovered, e.g., that not all swans are white: a researcher might be unaware of the existence
of black swans if no trustworthy source has confirmed their presence. Similarly, a researcher
may not be able to establish the hypothesis that A causes C if they know of a paper reporting
that A causes B, but miss a connection to another paper affirming that B causes C.</p>
      <p>
        Objective knowledge is a key to finding answers to scientific questions. To achieve it,
researchers need not only hypotheses and potential refutations, but also the necessary connections
between sources of information. Researchers can arrive at objective knowledge not only via
scientific experiments, but also and ever more frequently by exploring and connecting various
sources of information, and reach conclusive answers through information networks. Although
no system can be created to retrieve all the knowledge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], any step to building better tools
to recognize and combine hypotheses is a step that increases the subjective and objective
knowledge of the world and, therefore, Science.
      </p>
      <p>
        Studying methods to improve such discoveries and build efficient systems is the scope of
the field of Literature-Based Discovery (LBD), a research area born when Swanson [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] showed
that connecting results from independently published articles can reveal new and unexpected
hypotheses. Navigating objective, published knowledge is possible and beneficial, and studies
have proposed diferent methods and tools to do it automatically. Systems based on the recently
proposed relation/predicate-based approach that reason on semantic predications can suggest
new hypotheses based on well-defined logical consequences [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and show the associations
that connect assertions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] efficiently [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. An example of a discovery deduced from claims
in different papers is reported in Figure 1. Here, the claims EPA (Fish Oil) STIMULATES
Epoprostenol, Epoprostenol TREATS RD/RP (Raynaud Disease), are premises for an interesting
syllogism that concludes that Fish Oil is highly beneficial to treat Raynaud Disease, a medical
condition of reduced blood flow in extremities such as fingers and toes. Notably, the hypothesis
was not stated in any published articles [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Claims could also have originated from papers
belonging to different scientific domains, making them likely unknown to researchers
in either field. As shown in Figure 1, other claims found in different articles suggest another
interesting hypothesis: because both Epoprostenol and EPA disrupt platelet aggregation, platelet
aggregation could cause Raynaud Disease. These consequences could also explain the biological
mechanisms. In summary, claims represented as triples, together with ontologies that associate
terms such as Epoprostenol and Prostaglandin, make the deductions that suggest hypotheses possible.
      </p>
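      <p>The syllogism above can be sketched in a few lines of code (an illustrative toy of the chaining step, not any of the cited LBD tools): two predications extracted from different papers are combined to produce a hypothesis stated in neither.</p>

```python
# Toy illustration of the Figure 1 syllogism: chaining two semantic
# predications from different papers yields an unpublished hypothesis.
claims = [
    ("EPA (Fish Oil)", "STIMULATES", "Epoprostenol"),  # from one paper
    ("Epoprostenol", "TREATS", "Raynaud Disease"),     # from another paper
]

def derive_hypotheses(triples):
    """If A STIMULATES B and B TREATS C, hypothesize A TREATS C."""
    return {
        (s1, "TREATS", o2)
        for (s1, p1, o1) in triples
        for (s2, p2, o2) in triples
        if p1 == "STIMULATES" and p2 == "TREATS" and o1 == s2
    }

print(derive_hypotheses(claims))
# {('EPA (Fish Oil)', 'TREATS', 'Raynaud Disease')}
```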
      <p>
        Despite the effectiveness of making discoveries by reasoning over semantic predications
as reported in several studies (e.g., [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]), almost all the tools presented in these studies are
not available anymore. This suggests difficulties both in using them and in sustaining an active
community that maintains and upgrades them. Indeed, the problem of making discoveries
could be addressed at a higher level through an efficient tool architecture that encourages a
divide-and-conquer approach. Such tools can create a heuristic space where interconnected
and accountable information can show new questions and unseen correlations for researchers
to study.
      </p>
      <p>
        In the systems proposed in the literature, almost all the LBD components are implemented
from scratch: from the analysis of a collection of scientific papers to retrieve significant terms,
to the algorithms that find relationships, to the user interfaces, to the methods to rank the
results [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Using available prepackaged parts such as NLP tools and domain ontologies seems
not to be considered in these studies; this leads to considerable, sometimes fruitless, efforts to
create and recreate usable tools. Also, this approach does not encourage collaboration between
experts in different fields.
      </p>
      <p>
        Building tools that make discoveries could become straightforward by exploiting semantic
predications on the Web. Decoupling the extraction of assertions from articles from the reasoning about
them opens up the opportunity to use standard software components and allows different teams
to work together on a specific problem. Publishers and authors could produce publicly accessible
assertions [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. With an ecosystem in which articles as collections of machine-readable facts —
predications — and ontologies for the field of interest are published, the objective knowledge
represented by the literature could be extracted, analyzed, and discoveries could be made. Also,
researchers could contribute to collecting facts with new hypotheses suggested and, eventually,
validated in such an environment. The environment can be built and managed with tools defined
for this purpose, to make human knowledge accessible by machines; within the Semantic Web:
a collection of technologies and methods to define vocabularies, store, manage, share, query
data on the Web, and reason over them [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        We hereby envision a Web of Facts, a publicly accessible collection of (scientific) assertions
and hypotheses that computers use to make discoveries. We see it as a collection of
nanopublications [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], i.e. RDF Named Graphs that contain an atomic scientific statement together
with annotations and provenance. By removing rhetoric and style, a scientific article can
be represented as a collection of scientific assertions that can be published as a collection of
semantic predications &lt;subject, predicate, object&gt;, alongside provenance metadata,
ratings of certainty, and so on, to ensure the truth value of such sentences. A scientific claim
can therefore be expressed as a triple, e.g. &lt;aspirin, treats, cancer&gt;, and annotations
report the provenance and epistemic value of the claim itself.
      </p>
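      <p>As a rough sketch (our own simplification, not the official nanopublication schema; the DOI is a placeholder), such a claim-plus-annotations bundle can be modeled as:</p>

```python
from dataclasses import dataclass

# Simplified model of a nanopublication: an atomic assertion (a triple)
# plus provenance and an epistemic annotation. Field names are illustrative.
@dataclass(frozen=True)
class Nanopub:
    subject: str
    predicate: str
    obj: str
    source: str     # provenance, e.g. the DOI of the source paper
    certainty: str  # epistemic value, e.g. "hypothesis" or "fact"

claim = Nanopub("aspirin", "treats", "cancer",
                source="doi:10.1234/example", certainty="hypothesis")
assert (claim.subject, claim.predicate, claim.obj) == ("aspirin", "treats", "cancer")
```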
      <p>
        The source of the Web of Facts, the infrastructure of assertions that machines can use, is
scientific publications. NLP tools can extract assertions from published articles and represent
extracted statements as nanopublications. NLP tools combine techniques such as Transformer-based
models fine-tuned for assertion extraction, Named Entity Recognition tools like SemRep [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
developed by the National Library of Medicine (NLM), Abstract Meaning Representation, and
Relationship Extraction. In our view, the Web of Facts is a large-scale repository of
nanopublications from different scientific disciplines. In this space, scientists can find, explore, and validate
new questions. New hypotheses can be generated and, in turn, published as nanopublications.
      </p>
      <p>Machines as well as authors, publishers, and readers can participate in identifying the core
facts and generating nanopublications.</p>
      <p>We propose identifying fundamental methods of a process for making discoveries
automatically from the literature, in which the Web of Facts is the layer that permits decoupling the
effort to extract predications from papers and, therefore, encourages collaboration between
different entities. The methods involve:</p>
      <p>(i) Extracting assertions from existing papers, articles, and documents to publish them as
nanopublications; (ii) Using logic to derive new and unexpected inferences from such
nanopublications; (iii) Publishing the newly discovered knowledge in the form of new nanopublications;
(iv) Making this knowledge accessible to humans in a readable format for inspection and
evaluation of relevance by domain experts, and as seeds for further explorations and new research
opportunities.</p>
      <p>An easy-to-use tool that authors can use to feed the Web of Facts is part of our vision of the
Web of Facts Publication Cycle. Through it, publishers can offer open services that draw new
users to their premium offerings, giving access to a collection of assertions published alongside
the version for humans.</p>
      <p>Our contribution is to (i) delineate a methodology to make discoveries automatically,
connecting information expressed in scientific articles published in hypertext for humans, (ii) lay
the foundation of a Web of scientific facts, an essential part of the view that permits decoupling
the effort to extract predications from papers and, therefore, encourages collaboration between
different entities, (iii) suggest a new publishing cycle that opens opportunities for publishers
while contributing to disseminate machine-processable scientific knowledge, and (iv) propose
a method to support the creation of nanopublications manually and visualize to humans the
knowledge generally represented in RDF.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Previous works and background</title>
      <p>
        The Literature-Based Discovery (LBD) methodology, initially proposed by Swanson [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
connects information expressed in different papers to propose new hypotheses to the researcher.
Subsequently, methods to make discoveries using semantic predications automatically extracted
from papers have been proposed. In our view, discoveries emerge more efficiently if the
different phases of the process are kept well separated and dedicated tools and methodologies are used.
Such parts are:
      </p>
      <p>(i) The collection/generation of large quantities of predications representing atomic scientific
results extracted from, and deeply grounded in, the scientific articles they were first introduced
in. (ii) The adoption of meaningful and applicable relational models between concepts (e.g.,
ontologies) that supply the possibility to discover and boost unexpected and non-obvious
relationships between disconnected predications. (iii) Tools to integrate distributed collections
of predications, comparing and combining them, especially in a setting where the intellectual
property of the scientific articles containing the source facts belongs to private enterprises. (iv)
Tools to make scholars aware of the newly discovered potential predications in an operable
manner, allowing humans to receive, understand, evaluate, edit, and improve the discovered
hypotheses.</p>
      <p>Below, contributions from the literature on methods that use semantic predications to
make discoveries are presented for each aspect of interest.</p>
      <p>
        Discoveries exploiting semantic predications. Different methods to make discoveries from
the literature have been proposed. Methodologies that exploit semantic predications clearly
represent the connections that bring discoveries, leading to better accuracy [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. For instance, in
showing results to determine the most effective chemotherapy for lung cancer treatment, Li
et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] point out the convenience of using semantic predications to evaluate hypotheses
and make discoveries. Semantic predications are “basic knowledge unit” [15, p. 5] representing
the assertions in papers. They can also be filtered considering their level of certainty, including
controversial and contradictory assertions.
      </p>
      <p>
        Using predications extracted from the MEDLINE corpus and grouping the results considering
the context defined by the MeSH descriptors is the approach Obvio [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] uses. Relations between
a concept of interest and possible causes are searched, connecting predications in a graph
and considering as more relevant the shortest paths. The method can easily infer relations
not explicitly stated, and discoveries can be grouped by categories such as cellular activity,
pharmaceutical, and lipids. The authors conclude that more hypotheses could be recovered with
predications extracted from the full-text [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] rather than from abstracts and titles alone.
      </p>
      <p>Melodi [16] is another tool that analyses predications extracted from the MEDLINE corpus with
SemRep [17], a natural language processing tool developed by the National Library of Medicine
(NLM). Triples that express relations between common terms, i.e., frequently occurring in
the corpus, are not considered. On the other hand, paths that include connections between
rarer concepts, i.e., less frequent in the corpus and more often stated in the set of articles, are
considered more relevant. An improved version of the tool [18] filters the predications by the
semantic type of subjects, objects, and predicates.</p>
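      <p>Melodi's frequency heuristic can be illustrated as follows (a toy sketch under our own assumptions; the counts, cutoff, and concept names are invented, and the actual tool's scoring differs): triples whose subject and object are both very common in the corpus are filtered out, while rarer concepts are kept as more informative.</p>

```python
from collections import Counter

# Toy corpus statistics: occurrences of each concept across all predications.
concept_counts = Counter({"protein": 950, "cell": 900,
                          "sclerostin": 12, "rare_enzyme": 4})
COMMON_CUTOFF = 100  # assumed threshold, purely illustrative

def is_informative(triple, counts, cutoff=COMMON_CUTOFF):
    """Keep a triple unless both its subject and object are common terms."""
    subject, _, obj = triple
    return counts[subject] < cutoff or counts[obj] < cutoff

assert not is_informative(("protein", "INTERACTS_WITH", "cell"), concept_counts)
assert is_informative(("sclerostin", "INHIBITS", "protein"), concept_counts)
```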
      <p>Semantic Web for discoveries. Can Semantic Web triples and ontologies help? Some
scholars are positive. For instance, [19] suggests that the Semantic Web has all the resources
needed to support finding new drugs. The authors advocate a new Semantic Web Stack in which
relevant conclusions from papers and ontologies are published in RDF, knowledge graphs are
built, and discoveries are made by reasoning on them.</p>
      <p>The authors of [20] point out that because the Semantic Web enables knowledge sharing, systems
based on it can improve scientific discoveries. By sharing findings in machine-readable format,
biological models can be represented, and hypotheses can emerge through logical inferences.
With ontologies published in RDF on the Web, scientists can specify complex relations between
concepts allowing the combination of knowledge from diferent fields. Study conclusions
can then be described with RDF triples and included in articles, making the integration of
RDF statements a part of the publishing process itself. We strongly agree with the idea that
including RDF that describes relevant assertions expressed in a paper should be an intrinsic part
of the publishing process. With such a mechanism, researchers can use previous studies more
eficiently to make new discoveries and generate and validate hypotheses.</p>
      <p>With the prototype of their tool HyQue [21], the authors of [22] show that Semantic Web
technologies can be used to validate hypotheses. Using ontologies published on bio2RDF [23], the
tool validates speculations in the biology field. Also, they suggest a mechanism to associate
provenance to the assertions published on the Web that should reduce the complexity of the
nanopublication model.</p>
      <p>[24] suggests that Semantic Web technologies can satisfy the need to access knowledge
from diferent sources. In their view, a “virtual knowledge broker service” [24, p. 429] extracts
assertions and metadata from heterogeneous sources such as full-text literature and repositories
of protein sequences, genes, and phenotypes. They point out that managing complex and
voluminous data, adopting Semantic Web standards, creating simple user interfaces, and having
experts provide data are the challenges to resolve before the infrastructure they propose can,
one day, be built.</p>
      <p>Extracting semantic predications. Extracting predications from text is a complex task.
When approaching Text-to-RDF models, it is essential to consider that the text to be converted
must present a statement where a "variable" is given a "value" or, even better, something is
described. Not all text is worth converting into RDF, as some may lack meaningful information
or structured data that can be effectively represented in a knowledge graph. At an initial level,
the abstract of a paper is a good example. The text must be cleaned of uninteresting sections, and
multiple sentences must be grouped together to summarize its semantics. This brings us to
two other NLP topics: Abstract Meaning Representation and Natural Language Understanding.
Attempts to generate RDF from NL are not new [25], but to date, most approaches use neural networks (NN).
Generating good RDF from text has seen its first successful attempts from simple sentences.
AMR2FRED [26] generates RDF with NER and DBpedia. Similar tools employ both NER and
AMR to generate RDF. Still, no generally accepted solution has been adopted by the
community. Other approaches include adaptations of seq2seq models using graph embeddings,
such as CycleGT [27].</p>
      <p>Presenting discoveries: RDF-to-Text. RDF-to-Text is an essential task in natural language
generation, aimed at converting knowledge graphs into natural language. This field is
continually evolving, exploring various methodologies including rule-based, template-based, and neural
network-based approaches, each with unique strengths and challenges. No standard solution
has yet gained widespread acceptance, a situation influenced by several factors: (i) Diversity in
knowledge graphs: Varying structures and complexities in RDF graphs make it challenging to
develop a universal solution. Machine Learning and Neural Network models, though adaptable
in general contexts, struggle with fine-tuning for specific domains. (ii) Distance between RDF and
NL: The structure of RDF triples, despite having natural language-like labels, differs significantly
from natural language constructs, complicating direct verbalization ([28]). (iii) Advancements
in NL generation: The evolution of techniques, such as those introduced by GPT-3, contribute
to the lack of standardization ([29]). (iv) Differences in output requirements: The variability in
desired outputs based on application or end-user preferences, combined with the dominance of
English in these technologies, adds to the complexity ([30]). (v) Performance vs Precision: The
balance between the general applicability of ML and NN models like GPT-3 and Transformer
([29, 31]) and the precision of template-based models like RDF2PT ([32, 33, 34]) remains a key
challenge.</p>
      <p>The WebNLG challenges have significantly contributed to the growth of this field. Notably,
the 2020 Challenge highlighted the Graph2Seq approach as a promising method for bridging the
RDF-NL gap ([28]). Initial tests with GPT-4 for generating text from RDF triples show potential
but also reveal issues like the inclusion or omission of key information. Future research might
explore fine-tuning models to adhere more closely to input ontologies and graphs.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Making discoveries exploiting the Web of Facts</title>
      <p>The Web of Facts is a global collection of machine-readable scientific (nano)publications.
Objective knowledge is made accessible by machines using such a structure, similar to how it
is readable by humans through HTML pages and PDFs. Nanopublications are extracted from
papers by NLP tools and humans; they feed a collection made accessible by machines that can
generate well-founded and likely valuable hypotheses, being based on transparent deductions
from valid scientific assertions. Figure 2 shows the cycle we envision: the Web of Facts is fed
by nanopublications, machines generate syllogisms (reproducing what is suggested by the LBD
methodology), and results are presented to humans in natural language. The results themselves,
in RDF, can contribute to feeding the Web of Facts with newly generated nanopublications.
Visualization can then be used to show networks of papers that state similar things (or, by using
OWL properties, opposite or conflicting nanopublications).</p>
      <p>With a global collection accessible, tools could retrieve multiple nanopublications that represent
an identical semantic predication (i.e., the subject, predicate, and object of the triples are equal). This
fact can increase accuracy, thanks to the redundancy. Tools that summarize, reason, and make
discoveries exploit such a collection.</p>
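      <p>One simple way such tools could exploit this redundancy (a sketch of the idea, not a prescribed algorithm; the claims are invented examples) is to count how many independent nanopublications assert the identical predication and use that count as a confidence signal:</p>

```python
from collections import Counter

# Identical predications from different nanopublications corroborate each
# other; their multiplicity can filter the claims fed to the reasoner.
extracted = [
    ("aspirin", "treats", "headache"),  # asserted in one paper
    ("aspirin", "treats", "headache"),  # asserted again in another paper
    ("aspirin", "treats", "cancer"),    # asserted only once
]

support = Counter(extracted)
corroborated = {t for t, n in support.items() if n >= 2}
print(corroborated)
# {('aspirin', 'treats', 'headache')}
```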
      <p>The model is summarized in Figure 3.</p>
    </sec>
    <sec id="sec-3b">
      <title>4. Feeding the Web of Facts with author-curated predications</title>
      <p>The Web of Facts can be fed by the authors themselves. However, an easy-to-use tool is
necessary. We claim that annotating documents in HTML facilitates the extraction of semantic
predications and the creation of nanopublications. Elements to include in a nanopublication are
(i) a semantic predication representing an assertion expressed in the paper, (ii) a level of truth
of the claim, such as speculation, hypothesis, claim, fact, or observation, (iii) provenance, i.e.,
author and any reference to the scientific publication, and (iv) information about the intellectual
property.</p>
      <p>We envision a method, named DesX, to annotate natural language statements in
papers; it is complemented by a counterpart that represents the RDF output by LBD tools in
natural language. Based on the author’s input, DesX inserts annotations inside attributes of the
span elements in the HTML documents to indicate entities and predicates. An example of an
annotated sentence is:
&lt;span data-desx-tpl="wd:Q41567 wdt:P50 wd:Q692"&gt;
&lt;span data-desx-entity="wd:Q692" title="Shakespeare"&gt;The bard&lt;/span&gt;
wrote ’&lt;span data-desx-entity="wd:Q41567"&gt;Hamlet&lt;/span&gt;’
&lt;/span&gt;</p>
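      <p>The annotations above can be read back out of the HTML mechanically. The following sketch (our own illustration; DesX is a conceptual method, not a released parser) extracts the triple and the entity identifiers using only the standard library:</p>

```python
from html.parser import HTMLParser

class DesxExtractor(HTMLParser):
    """Collect data-desx-tpl triples and data-desx-entity IRIs from spans."""
    def __init__(self):
        super().__init__()
        self.triples, self.entities = [], []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "data-desx-tpl" in a:
            self.triples.append(tuple(a["data-desx-tpl"].split()))
        if "data-desx-entity" in a:
            self.entities.append(a["data-desx-entity"])

doc = ('<span data-desx-tpl="wd:Q41567 wdt:P50 wd:Q692">'
       '<span data-desx-entity="wd:Q692" title="Shakespeare">The bard</span>'
       ' wrote \'<span data-desx-entity="wd:Q41567">Hamlet</span>\'</span>')
parser = DesxExtractor()
parser.feed(doc)
assert parser.triples == [("wd:Q41567", "wdt:P50", "wd:Q692")]
assert parser.entities == ["wd:Q692", "wd:Q41567"]
```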
      <p>With such information, DesX can automatically compile the semantic predications needed to
create the nanopublication to feed the Web of Facts.</p>
      <p>Also, DesX can suggest to the author triples worthy of being extracted. In this case, the
identification is organized into three steps: (i) identifying the template/pattern; (ii) identifying
the entities mentioned in the template/pattern; (iii) identifying the role of such entities (subject or
object, depending on the template). A template is defined as a list of terms with their roles. It
can be specific to a particular field and could originate from ontologies. With this mechanism in
place, the interface helps the author(s) publish nanopublications alongside their paper. Once a
predication is created, the author revises the triple, improves the predication, and adds necessary
information. The result is a nanopublication ready to feed the collection of machine-readable
assertions — the Web of Facts. An example of a sentence that contains a claim from which a
nanopublication is extracted is shown in Figure 4. Provenance can be added automatically to
the nanopublication as a triple with the corresponding assertion in NL, such as "doi:xxxx claims
that Eicosapentaenoic acid disrupts platelet aggregation".
Apart from stating the correct provenance, the simple sentence(s) summarizes the paper’s
outcome or work. Queries can be performed to search for nanopublications with similar
subjects, objects, or both, or for papers with similar (or identical) nanopublications, to filter the
number of statements involved in reasoning.</p>
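      <p>The three identification steps can be sketched as follows (a toy implementation under our own assumptions; the entity dictionary, the ex: prefix, and the pattern are illustrative, not part of any DesX release):</p>

```python
import re

# Step (i): a field-specific template; step (ii): an entity gazetteer;
# step (iii): roles assigned via the template's named groups.
ENTITIES = {"Eicosapentaenoic acid": "ex:EPA",
            "platelet aggregation": "ex:PlateletAggregation"}
TEMPLATE = (re.compile(r"(?P<subject>.+?) disrupts (?P<object>.+)"), "ex:disrupts")

def suggest_triple(sentence):
    """Return a candidate (subject, predicate, object) or None."""
    pattern, predicate = TEMPLATE
    m = pattern.fullmatch(sentence)
    if not m:
        return None
    s = ENTITIES.get(m.group("subject"))
    o = ENTITIES.get(m.group("object"))
    return (s, predicate, o) if s and o else None

assert suggest_triple("Eicosapentaenoic acid disrupts platelet aggregation") == (
    "ex:EPA", "ex:disrupts", "ex:PlateletAggregation")
```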
      <p>This process includes identifying and reconnecting entities already in the network, such as
authors, institutions, and other relevant resources. When the new nanopublication is added to
the Web of Facts, mechanisms that leverage unique identifiers, such as DOIs, and established
ontologies, can ensure that the newly added RDF nanopublications are correctly connected to
the existing entities, preserving the integrity and consistency of the Web of Facts, especially to
avoid duplication of entities and loss of new possible discoveries.</p>
    </sec>
    <sec id="sec-4">
      <title>5. Presenting discoveries to Humans</title>
      <p>Machines make discoveries using the Web of Facts infrastructure, and the results must
then be presented to humans. A linearization of RDF into quasi-plain text becomes necessary to
represent discoveries and make them readable to a wide audience. With DesX we propose an
alternative serialization of RDF meant to be easier for humans to read. It is a template- and
rule-based model we are developing for RDF-to-Text tasks (e.g. natural language realization of
nanopublications), which leverages RDF properties to store and describe templates. A
sub-property of rdfs:label stores the generative template so that the specific property can be
verbalized at any time. Templates can be implemented directly in any knowledge base and are
highly customizable. This choice also keeps consistency between different graphs without
risking divergent interpretations of the graph, as could happen when using e.g. LLMs.</p>
      <p>A DesX template represents a pattern of one or two variables and a set of natural
language tokens; it is similar to a SPARQL basic graph pattern, mutatis mutandis. DesX
templates consist of: (i) one or two variables (subject, object), denoted by the $ prefix
($subject, $object); variable resolution is a substitution function that replaces them with the
matched triple's labels for the subject and object entities (or data values), so the variables work
as "placeholders" for the entities' labels; (ii) a set of tokens that represents the property. Any
property can be verbalized using a default template: (i) object properties: "$subject is in
relationship $propertyLabel with $object"; (ii) data properties: "$subject's relationship
$propertyLabel has value 'xsd:DataType $value'". E.g., the template dcterms:title desx:template
"$subject has the title $object" produces the realization "doi:10.1234/jneuro.2022.001 has the
title 'Gene therapy improves motor and cognitive function in a mouse model of Huntington's
disease'".</p>
      <p>The preferred output of DesX is HTML: &lt;span&gt; elements carry attributes such as
"data-desx-tpl", storing the originally extracted triple(s), and "data-desx-src", storing the
source of the triple(s), so that reverse conversion and extraction, as well as editing, can be
easily performed.</p>
      <p>5.1. From nanopublications to text</p>
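      <p>A minimal sketch of the realization step in pure Python (the helper names, the default-template key, and the serialization stored in data-desx-tpl are our assumptions):
```python
from string import Template

# "<" is written as chr(60) only so that this listing stays valid inside
# the XML article source; it is an ordinary "<" at runtime.
LT = chr(60)

TEMPLATES = {
    "dcterms:title": "$subject has the title $object",
    # assumed fallback: the default template for object properties
    None: "$subject is in relationship $property with $object",
}

def realize(s, p, o, source):
    """Fill the property's template with the triple's terms and wrap the
    result in a span carrying data-desx-tpl (the extracted triple) and
    data-desx-src (its source) for later reverse conversion."""
    tpl = TEMPLATES.get(p, TEMPLATES[None])
    text = Template(tpl).substitute(subject=s, object=o, property=p)
    triple = " ".join((s, p, o))  # assumed serialization of the triple
    return (LT + 'span data-desx-tpl="%s" data-desx-src="%s">%s'
            + LT + "/span>") % (triple, source, text)

print(realize("doi:10.1234/jneuro.2022.001", "dcterms:title",
              "Gene therapy for Huntington's disease", "med:np001"))
```
In practice the templates would be fetched from the knowledge base via the desx:template sub-property of rdfs:label, and labels would replace raw IRIs.
      </p>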
      <sec id="sec-4-1">
        <title>5.1.1. Nanopublications</title>
        <p>We show an example starting from a mock nanopublication about Huntington's disease
(we assume standard prefixes for the triples as shown).</p>
        <p>med:np001 {
med:np001 a np:Nanopublication ;
np:hasAssertion med:assertion001 ;
np:hasProvenance med:provenance001 ;
np:hasPublicationInfo med:pubinfo001 .
}
med:assertion001 {
med:therapy_genic_med001 a med:Treatment ;
med:treats med:Huntington_disease .
}
med:provenance001 {
med:study001 a prov:Entity ;
dcterms:title "Gene therapy improves motor and cognitive function in a mouse model of Huntington's disease"@en ;
dcterms:creator "Doe J., Smith A." ;
dcterms:date "2022-01-15"^^xsd:date ;
foaf:homepage &lt;http://example.com/study001&gt; ;
prov:wasDerivedFrom &lt;http://example.com/mouse_model_001&gt; .
}
med:pubinfo001 {
med:np001 dcterms:title "Gene therapy for Huntington's disease" ;
dcterms:publisher "Journal of Neuroscience" ;
dcterms:identifier "doi:10.1234/jneuro.2022.001" .
}</p>
      </sec>
      <sec id="sec-4-2">
        <title>5.1.2. Templates</title>
        <p>The properties would have templates similar to the ones in Table 1:</p>
        <p>Property: Template
med:Treatment: "$subject, a type of treatment"
med:treats: "$subject can treat $object"
med:hasMethod: "$subject is based on the method of $object"
med:hasEvidence: "$subject is supported by the evidence presented in $object"
dcterms:title: "$subject has the title $object"
dcterms:publisher: "$subject was published by $object"
dcterms:identifier: "$subject has the identifier $object"</p>
        <p>The resulting text, without much post-processing, would read: "Gene therapy, a type of
treatment, which is based on the method of gene therapy, can treat Huntington's disease, as
supported by the evidence presented in med:study001, 'Gene therapy improves motor and
cognitive function in a mouse model of Huntington's disease'. It was published by the Journal
of Neuroscience, it was created on 2022-01-15 by Doe J. and Smith A. It has the identifier
doi:10.1234/jneuro.2022.001, and can be found at http://example.com/study001. The study was
derived from the resource http://example.com/mouse_model_001."</p>
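        <p>The reverse conversion enabled by the data-desx-* attributes can be sketched with the standard library alone (the class name and the example markup are hypothetical):
```python
from html.parser import HTMLParser

# Recover the stored triple and source from the data-desx-* attributes of
# generated HTML. "<" is written as chr(60) only to keep this listing
# valid inside the XML article source.
LT = chr(60)

class DesxExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.spans = []  # (stored triple, source) pairs found in the page

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "data-desx-tpl" in attrs:
            self.spans.append((attrs["data-desx-tpl"],
                               attrs.get("data-desx-src")))

extractor = DesxExtractor()
extractor.feed(LT + 'span data-desx-tpl="med:therapy_genic_med001 '
               'med:treats med:Huntington_disease" data-desx-src="med:np001">'
               "Gene therapy can treat Huntington's disease" + LT + "/span>")
print(extractor.spans)
# -> [('med:therapy_genic_med001 med:treats med:Huntington_disease', 'med:np001')]
```
Because the provenance travels with the text, an editor can round-trip from prose back to the originating nanopublication.
        </p>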
        <p>A visualization method based on templates, while it has its drawbacks (the templates
themselves, and the difficulty of making the text seem natural), offers several advantages. First, it is
accountable, meaning the resulting text can be traced back to the RDF data and the template used
to generate it. This is important for ensuring the accuracy and reliability of the information
presented in the text. Second, the method is transparent in the way it generates the text. By
using predefined templates, the process of creating the text is clear and can be easily understood
by others. This is particularly important in cases where the text is used for decision-making
purposes, as it enables stakeholders to understand how the text was generated and assess
its validity. Third, the method relies on provenance information, which is stored in the RDF
data, to ensure that the information presented in the text is based on reliable sources. This
means the text can be trusted to reflect the underlying data accurately. Fourth, the method
is consistent across platforms. The templates are defined as platform-independent, enabling
the same template to generate text across different systems and applications. This consistency
ensures that the text remains the same, regardless of the platform used to generate it, while
being adaptable to different languages.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Conclusions</title>
      <p>More than 50 million scientific publications contain hidden, implicit hypotheses that could
accelerate science if retrieved. Although making discoveries automatically from the scientific
literature is arduous, success would have an impact on humanity significant enough to justify
the highest effort. The literature-based discovery approach generates new hypotheses by
analyzing and combining different scientific publications. Computers can execute this systematic,
pragmatic, and well-defined methodology by reasoning on semantic predications representing
article assertions. If such statements are published with attributes such as provenance, they are
valid (nano) scientific publications that can be combined to generate valuable hypotheses. From
this comes the idea: create a worldwide collection of nanopublications that machines can consume
to suggest new hypotheses. Nanopublications are extracted from hypertext articles and published
next to them: they are the machine-readable collection of the relevant assertions in the paper. Our
vision is to generate nanopublications to feed such a Web of Facts, to curate them, and to have
software that draws conclusions by exploiting the collection and generates new hypertext for
humans. Tools and methodologies exist showing that discoveries can be derived by syllogisms
over semantic predications, and we have presented examples.</p>
      <p>In this paper, we have given an overview of the model we envision and described DesX,
a relevant part of the model that facilitates feeding the Web of Facts with manually
generated nanopublications and representing the discoveries in natural language.</p>
      <p>In conclusion, reducing the sparseness of the literature and discovering new hypotheses
within it offers significant potential for improving science.
</p>
    </sec>
  </body>
  <back>
  </back>
</article>