=Paper= {{Paper |id=Vol-2976/paper-3 |storemode=property |title=Yoked Flows for Direct Representation of Scientific Research |pdfUrl=https://ceur-ws.org/Vol-2976/paper-3.pdf |volume=Vol-2976 |authors=Robert Allen |dblpUrl=https://dblp.org/rec/conf/jcdl/Allen21 }} ==Yoked Flows for Direct Representation of Scientific Research== https://ceur-ws.org/Vol-2976/paper-3.pdf
Yoked Flows for Direct Representation of Scientific Research
Robert B. Allen
New York, NY, USA
ABSTRACT
We propose developing highly structured and interlocking, or yoked, descriptions for all aspects of scientific
research reports. These structured descriptions would be based on rich standardized vocabularies. We use two
principal sets of flows to provide such structured descriptions: (a) Research Design and Procedures; and (b)
Hypotheses and Outcomes. The structured descriptions may also include the research question, threats to
validity, and implications. We propose that the best way to capture and describe the structure of scientific
research is by considering multiple flows which are yoked. The claims from the research are propositions and
they can be coordinated in a knowledgebase. As an example, we examine Pasteur’s study of germ theory and
support interaction with the structured description of the study with a prototype graphical user interface. We
also consider template structures for different parts of the research reports. Ultimately, structured research
reports could be interwoven into structured and evolving digital-library knowledgebases.
KEYWORDS
Highly Structured Digital Library, Microworld, Research Designs, Simulation Space, Specific Comparisons,
Transitional Propositions, Validity


1 INTRODUCTION

We have been exploring direct representation for scientific research reports. Direct representation
proposes that entire research reports can and should be highly structured. Moreover, we propose that
collections of research reports can be interwoven into a rich semantic knowledgebase.

1.1 Semantic Models

Causal models, whether explicit or implicit, are central to science. Scientific research articles
would benefit from using highly structured models which support state changes and causal relations.
We use “flows” as a generic term for sequences of transitions such as workflows, flowcharts, plans,
mechanisms, and other causal sequences. Potentially, flows could be circular or have feedback loops.

Recently, we have focused on the comprehensive ontology SUMO [24] as the vocabulary for such models.
One important feature of SUMO that distinguishes it from most other ontologies is the inclusion of
rules. We also propose the adoption of object-oriented modeling [7] in place of traditional approaches
to presenting and processing knowledgebases. We implement transitions between object states and apply
linguistic models of “case roles” to describe them [9].

In previous work, we have proposed a broad framework for flows that can be applied across domains
[6, 10]. We have conducted several studies describing mechanisms and systems with structured, semantic
vocabularies. Building on the modeling techniques in [9], we describe steps toward developing a rich
model-oriented knowledgebase to support science. We describe

Digital Infrastructures for Scholarly Content Objects (DISCO), September 2021
EMAIL: rba@boballen.info
ORCID: [0000-0002-4059-2587]
                               © 2021 Copyright for this paper by its authors. Use permitted under Creative
                               Commons License Attribution 4.0 International (CC BY 4.0).

                             CEUR Workshop Proceedings (CEUR-WS.org)
policies for making these simulations plausible and useful. While our current work focuses on
qualitative models, the approach should also support quantitative models.

1.2 Scientific Research Reports

There is a long tradition of research on scholarly publications (e.g., [1, 30, 32]). Structure has
increasingly been added to descriptions of scientific research. Taken to its logical conclusion, we
propose that research reports should be totally structured. Structured research reports have many
advantages. For instance, they can support interactive interfaces for visualizing and exploring the
relationships among interlocking flows. Visualization of flows is related to timeline visualizations
(e.g., [2]).

Several types of flows are already widely used in science. Workflows are used to specify experimental
procedures (e.g., [14]). Mechanisms are often central for describing complex phenomena [6, 11].
However, before our work, Research Designs (e.g., [28]) as distinct from Research Procedures had not
been explored as structured flows.

Beyond describing aspects of workflows and research phenomena directly, other parts of science
research reports make claims and generalizations about phenomena. These can be characterized as a type
of discourse [1, 2, 13, 15, 19, 25, 30]. We agree with [21] that research inferences cannot be based
simply on formal logic. Rather, they follow a preponderance of evidence and consistency with other
results.

1.3 Pasteur’s Germ Theory

Germ theory was a paradigm shift in biology. It was sparked by the development of the microscope and
the resulting ability to see microbes. Louis Pasteur was a major proponent of germ theory, which was
the notion that tiny organisms, invisible without the aid of a microscope, produced spoilage,
fermentation, and some diseases.

One early controversy was whether microbes developed only from other microbes or whether they
developed spontaneously. That is, the question was whether existing organisms are needed to propagate
new organisms and whether those existing organisms are carried by air currents. We focus on a version
of Pasteur’s classic experiments that explored spontaneous generation [23, 26]. Pasteur’s experiments
are generally regarded as pivotal in confirming the importance of microbes and how they propagate.¹

1.4 Roadmap

In [3], we used Pasteur’s germ theory experiments to illustrate the potential for applying direct
representation to scientific research reports. In this paper, we return to that example and describe
how several techniques proposed in our recent work can be implemented to produce unified scientific
research reports.

Our primary goal is the development of the underlying modeling framework for the organization and
application of scientific knowledge. These models emphasize causal relationships (rather than
classification) so we focus on what might be called transitional propositions. We also describe an
interface for interacting with the models.²

In short, we propose that the best way to capture and describe the structure of scientific research
is by considering multiple flows which are yoked. The claims from the research are propositions that
can be coordinated in a knowledgebase.

1 While Pasteur’s report was not as detailed as current research reports, it is straightforward and
useful as an exemplar.
2 At this point, we are not focused on inference or text mining.

2 STRUCTURED RESEARCH REPORTS

2.1 Models and Knowledge Structures

While science uses systematic manipulations and/or observations, it also crucially depends on models
about the phenomena under investigation. We employ two major flows to capture these two aspects. The
first describes the Research Design and Procedure while the second describes the Hypotheses (i.e., what
might happen) and Outcomes (i.e., what did happen).

Research microworlds are where the manipulations come together with the phenomenon under
investigation. States and state changes are useful (at least implicitly) for models that describe
dynamic environments. Some states are based on the properties of objects. Other states are based on
the relationships among objects (e.g., an object is “trapped”). Sealing a flask is a complex action
that achieves a state of separation. Breaking the flask is a way to unseal it and instantiate a new
state in which the external air can move into the flask. Many research activities are workflows that
involve multiple steps and interlock with other flows [9, 10].

While much of science is concerned with developing general principles, sciences such as geology and
astronomy, as well as clinical medicine, deal more with particulars. Reasoned models can be developed
for either general (abstract) principles or instances.

2.2 Creating a Research Space

Traditional research papers follow the IMRD (Introduction, Methods, Results, Discussion)³ [32]
framework. Swales [32] described the purpose of the Introduction of a research report as “creating a
research space” (CARS). This includes defining a Research Question, Research Motivation, and
Hypotheses.

Addressing the Research Question is the immediate goal of the research. Typically, it involves
determining the existence, properties, mechanisms, processes, or applications associated with an
entity or phenomenon. In some cases, the goal may simply be the replication of other research or
addressing some criticisms that were raised about prior work. In this paper, we require that Research
Questions can be answered with structured propositions.⁴

The Research Motivation might be practical (e.g., to find cures for a disease) or simply to acquire
knowledge. Either way, it is an axiom, a given representing a valuation. Additional statements may
link the Research Question to the Research Motivation.

The researcher then establishes plausible hypotheses by considering the factors potentially relevant
to the Research Question by referring to established principles and previous research.

2.3 Research Design and Procedures

Based on the hypotheses, a Strategy is determined. The Strategy consists of the Research Design and
Research Procedure. The Design is an overall framework for obtaining valid results. Independent and
dependent variables are key parts of the Design. Typically, one of the hypotheses proposes some causal
relationship between the independent and dependent variables. The independent variable may be
manipulated either directly or indirectly. In natural experiments, the researcher identifies a natural
event that creates conditions suitable for the research. These may include cases from natural science,
social science [8], and medical science (e.g., the effects of smoking on cancer). In field and
laboratory experiments, the researcher takes specific actions to manipulate the test environment.

Standard Research Designs are so entrenched in some fields that many researchers are unaware of

3 Some publications do not use the exact IMRD structure but usually follow some permutation of it.
4 See https://plato.stanford.edu/entries/questions/
them. In other fields, a variety of research paradigms is used and their merits are debated. [28] is a
well-known analysis of the issues with different research designs. It discusses a wide range of designs
and provides a notation for describing them. Moreover, it compares the possible threats to valid
inference using different research designs. While [28] is primarily based on field research with
randomization such as is common in social science, it can and should be applied more generally.

It is highly desirable to have at least two conditions for comparison [28]. This is especially true
when one group is a control group and there is randomization of participants across conditions.
However, these recommendations are not followed when a second group is difficult or impossible to
implement, or when the researcher believes that they know about and have controlled for possible
extraneous factors.

The Research Procedure is a script or plan for the researcher’s actions. It applies methods and
materials. Those are usually specific to the domain under investigation and may threaten the internal
validity of the research if applied incorrectly.

2.4 Hypotheses and Microworlds

There is considerable controversy about the role of hypotheses in scientific research. In cases such
as Pasteur’s experiment discussed below, the hypotheses are sharply drawn and are associated with a
distinct, although not necessarily fully understood, mechanism. However, in other cases, a hypothesis
may be nothing more than a hunch.⁵

Our models are typically situated in a microworld,⁶ which is a spatial region that provides the
context for the interaction of objects involved in the phenomenon under investigation [12]. The
manipulations directly or indirectly change the state of the microworld and/or its contents. In other
work (e.g., [7, 8, 10]) we allow complex microworlds; potentially, they could be subdivided and have
different levels of temporal and spatial granularity.

2.5 Outcomes, Internal Validity, and Comparisons

As the research is conducted, the raw data can be structured and stored according to the semantic
model. The data can be manipulated and workflows for data transformations and statistical analyses can
be included⁷ along with the transformed data.

Using the data, we can make comparisons across the flows. These comparisons are the basis for claims.
Claims are propositions. They have a truth value that expresses a judgment or opinion about some aspect
of the research (e.g., the causal relationship between the independent and dependent variables).

The primary comparison is set up by the Research Design. In Pasteur’s study, which we analyze below,
the comparison is relatively simple. In other cases (e.g., [33]), the comparisons may involve complex
objects and processes, and statistical tests that require additional flows.

Research must satisfy many constraints; many things can go wrong and invalidate the results. [28]
identifies two major types of validity for research: internal and external validity.⁸ Internal
validity refers to problems with the Research Procedure and Methods, and whether they implemented the
intended research conditions. The researcher may check on the effect of a novel

5 Perhaps it would be better to use the term “potential explanation” rather than hypothesis. For
example, in [4] we examined [33], a modern biology paper dealing with the protein pathway related to
Wallerian Degeneration. That paper cast a wide net and tested hypotheses which seemed unlikely to be
relevant.
6 This term is adopted from object-oriented programming. In our applications, it may be more
appropriate to call it a simulation space.
7 These could follow the scripts of any of several statistical analysis packages, although a common
interchange framework would be preferred.
8 They also mention statistical conclusion validity and construct validity.

or tricky manipulation. Such checks on the manipulation would also be described with flows.

[28] lists potential threats to validity for each research design. Structured research reports should
include specific structures for handling each of these issues. For instance, the outcome summary could
have a list of hypotheses and challenges to their validity.

2.6 External Validity, Generalizations, and Explanations

External validity refers to the ability to generalize beyond the experiment. Some generalizations may
be straightforward, but others would be based on conditions. [28] describes criteria for
generalizations. Generalization may require referring to broader issues within the research area or in
other areas.

We would like to model those broader contexts, but, in many cases, they are not currently part of any
structured model base. Eventually, such a model base could be developed; until then we can sketch a
temporary framework (see Section 3.4).

Explanations may simply state a general rule. They may also try to describe how the rule applies to a
given situation. If pressed, a mechanism to support the rule might be given. For instance, if we were
explaining why hot air balloons rise, we would assert the rule that “hot air rises” and then might go
into a discussion of the molecular dynamics of gases (see Section 4.1).

3 PASTEUR’S SPONTANEOUS GENERATION EXPERIMENT

3.1 Overview

Farmers have considerable interest in understanding and controlling fermentation. The results of
Pasteur’s studies [23, 26, 31] are of practical importance for endeavors such as dairy, wine, beer,
tofu, and soy sauce making, and for controlling infectious disease. In [3], we used Pasteur’s research
to explore the possibilities for highly structured research reports. In this paper, we take another
step toward realizing that goal. We consider one of a series of related experiments by Pasteur.
Specifically, we develop flows and an interface for presenting a structured description of one of
Pasteur’s germ theory experiments.

Pasteur put a nutrient broth in two sets of flasks. He boiled the broth and then sealed the neck of
the flasks. He observed the flasks and eventually broke the neck open on one set of them. The flasks
that remained sealed did not show microbe growth, while the flasks with the broken necks did.

We separate two main streams of activity in describing the experiments. The first is the Researcher
Activity Model, which is what the researcher does based on the Design and Procedure. The second is the
Outcomes Model, which is what happens, or could happen, in the environment under investigation.
Although we distinguish them, the two streams are closely interlinked or yoked.

We focus on modeling the microworld and frame the experiment as a research design with two
conditions. In the first condition, broth-filled flasks are sealed and then observed indefinitely. In
the second condition, the flasks are sealed but eventually broken to demonstrate that spoilage occurs
once external air reaches the broth. The critical test, between the sealed and broken-neck flasks, is
determined by the Research Design and the manipulations.⁹

By modern standards, Pasteur’s description of the research is somewhat informal. For instance,
although Pasteur mentions that he made multiple flasks, we do not know how many. For illustrative
purposes, we have inferred details as needed to complete these examples.

9 No systematic randomization was done and there was no statistically significant sample, but the
control groups suggest the comparisons that can be made and that must be explicitly represented.
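
The yoking of the two streams in the two-condition design can be sketched in a few lines of Python, the language used for the prototype interface. This is only an illustration under strong simplifying assumptions, not the model described in this paper: the names `YokedStep`, `run_condition`, and `key_comparison` are hypothetical, and the microworld is reduced to three Boolean state variables for a single flask.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical, highly simplified microworld state for one flask.
State = Dict[str, bool]


@dataclass
class YokedStep:
    """One researcher action yoked to the outcome transition it triggers."""
    action: str                        # step in the Researcher Activity Model
    outcome: Callable[[State], State]  # transition in the Outcomes Model


def boil(s: State) -> State:
    # High temperatures kill microbes.
    return {**s, "live_microbes": False}


def seal(s: State) -> State:
    # Sealing the flask neck blocks outside air.
    return {**s, "open_to_air": False}


def break_neck(s: State) -> State:
    # Breaking the flask neck allows outside air to enter.
    return {**s, "open_to_air": True}


def observe(s: State) -> State:
    # Assumed encoding of Hypothesis1: microbe-laden outside air that
    # reaches the broth introduces live microbes, which cause spoilage.
    live = s["live_microbes"] or s["open_to_air"]
    return {**s, "live_microbes": live, "spoiled": live}


def run_condition(steps: List[YokedStep]) -> State:
    """Execute the activity flow; each action fires its yoked outcome."""
    state: State = {"live_microbes": True, "open_to_air": True, "spoiled": False}
    for step in steps:
        state = step.outcome(state)
    return state


def key_comparison(sealed: State, broken: State) -> bool:
    """Hypothesis1 predicts spoilage only in the broken-neck condition."""
    return (not sealed["spoiled"]) and broken["spoiled"]


SEALED = [YokedStep("boil broth", boil), YokedStep("seal neck", seal),
          YokedStep("observe", observe)]
BROKEN = [YokedStep("boil broth", boil), YokedStep("seal neck", seal),
          YokedStep("break neck", break_neck), YokedStep("observe", observe)]

sealed_result = run_condition(SEALED)
broken_result = run_condition(BROKEN)
h1_supported = key_comparison(sealed_result, broken_result)
```

Pairing each action with its outcome transition in one object is just one way to express yoking; a fuller model would keep the two flows separate and link them by identifiers so that each can be inspected on its own.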
3.2 Prototype Interactive Interface

Figure 1 shows the Researcher Activity Model (left) and Outcomes Model (center). Each has two columns,
for each of the two conditions. Also shown (right) are Actual results and the key comparison that
indicates that H1 (Hypothesis1) is supported (lower right).

At the top of the interface, there are several options to control the features of the visualization.
These include:

    • Toggle Method Details: Presents detailed descriptions of the procedures.
    • Toggle Model Details: Shows additional details of the models. Potentially, there would be unique
      IDs for each of the model entities and transitions and the ontological parents associated with
      each could be displayed [10].
    • Threats [to validity] and Alternative Explanations
    • Inferences, Related Research, Applications, and Commentary

The interface was implemented with Python using the Tk graphics library. Development is ongoing; the
current version is tailored to the specific example and does not include all the features needed for
other research reports.

Figure 1: Screenshot of our interactive interface. The Conditions (left) follow the Research Design (blue) and
Research Procedures (maroon). The Hypothesis Models and expected results are shown in green. The main
comparisons for the hypotheses are shown (far right) in red and the conclusion in gold.
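
As a rough illustration of how the two detail toggles might be wired up with Python and Tk, the sketch below separates the display logic (`render_lines`) from the window construction (`build_window`). It is not the prototype's code: the function names, the `STEPS` data, and the detail strings are all hypothetical placeholders, and only the two toggle options named above are covered.

```python
from typing import Dict, List

# Hypothetical content for one condition; the detail strings are placeholders.
STEPS: List[Dict[str, str]] = [
    {"label": "Boil broth", "method": "(procedure text placeholder)",
     "model": "(entity ID and ontological parent placeholder)"},
    {"label": "Seal flask neck", "method": "(procedure text placeholder)",
     "model": "(entity ID and ontological parent placeholder)"},
    {"label": "Observe flask", "method": "(procedure text placeholder)",
     "model": "(entity ID and ontological parent placeholder)"},
]


def render_lines(steps: List[Dict[str, str]],
                 show_method: bool, show_model: bool) -> List[str]:
    """Return the text lines to display for the current toggle settings."""
    lines: List[str] = []
    for step in steps:
        lines.append(step["label"])
        if show_method:
            lines.append("    method: " + step["method"])
        if show_model:
            lines.append("    model: " + step["model"])
    return lines


def build_window():
    """Build a Tk window with two checkboxes that control the detail level."""
    import tkinter as tk  # imported here so render_lines works without a display

    root = tk.Tk()
    root.title("Structured research report (sketch)")
    show_method = tk.BooleanVar(master=root, value=False)
    show_model = tk.BooleanVar(master=root, value=False)
    body = tk.Label(root, justify="left", anchor="w")

    def refresh() -> None:
        text = "\n".join(render_lines(STEPS, show_method.get(), show_model.get()))
        body.config(text=text)

    tk.Checkbutton(root, text="Toggle Method Details",
                   variable=show_method, command=refresh).pack(anchor="w")
    tk.Checkbutton(root, text="Toggle Model Details",
                   variable=show_model, command=refresh).pack(anchor="w")
    body.pack(anchor="w")
    refresh()
    return root
```

Calling `build_window().mainloop()` on a machine with a display opens the window. Keeping `render_lines` free of any Tk dependency means the toggle behavior can be checked without a graphical environment.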




3.3 Hypotheses and the Microworld Model

Because of the complex interaction of entities in the Microworld, developing the full hypothesis
models required additions to our evolving ontology and modeling framework. While some air had live
microbes suspended in it, the air in the sealed flask had no live microbes. Thus, the state of the air
is correlated with its location and the history of that location.

We focus here on Hypothesis1 because it is much more specific than Hypothesis0. Hypothesis1 is
justified by several claims:

    Microbes can be carried by air currents                               (0)
    Sealing the flask neck blocks outside air                             (1)
    Breaking the flask neck allows outside air to enter                   (2)
    High temperatures kill microbes                                       (3)
    Microbes feed in a nutrient medium                                    (4)
    Microbes will reproduce given food and other suitable conditions      (5)
    Metabolism by many microbes results in spoilage                       (6)
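
One way to hold such numbered claims so that they can be coordinated in a knowledgebase is sketched below in Python. The sketch rests on assumptions that are ours alone: the `Claim` and `Hypothesis` classes, the `basis` labels, and the helper `claims_needing_tests` are hypothetical names, and the basis assigned to each claim is an illustrative reading of the text rather than part of the model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Claim:
    """A numbered proposition with a note on what currently backs it."""
    cid: int
    text: str
    basis: str  # illustrative labels: "prior research" or "common sense"


@dataclass
class Hypothesis:
    """A hypothesis together with the claims that justify it."""
    name: str
    justified_by: List[int]


CLAIMS: Dict[int, Claim] = {c.cid: c for c in [
    Claim(0, "Microbes can be carried by air currents", "prior research"),
    Claim(1, "Sealing the flask neck blocks outside air", "common sense"),
    Claim(2, "Breaking the flask neck allows outside air to enter", "common sense"),
    Claim(3, "High temperatures kill microbes", "common sense"),
    Claim(4, "Microbes feed in a nutrient medium", "common sense"),
    Claim(5, "Microbes will reproduce given food and other suitable conditions",
          "common sense"),
    Claim(6, "Metabolism by many microbes results in spoilage", "common sense"),
]}

H1 = Hypothesis("Hypothesis1", justified_by=[0, 1, 2, 3, 4, 5, 6])


def claims_needing_tests(h: Hypothesis, claims: Dict[int, Claim]) -> List[Claim]:
    """Return the supporting claims not yet backed by prior research."""
    return [claims[cid] for cid in h.justified_by
            if claims[cid].basis != "prior research"]


untested = claims_needing_tests(H1, CLAIMS)
```

Because each claim carries its own identifier, other hypotheses or later generalizations could refer to the same entries instead of restating them.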
Earlier research by Pasteur had confirmed (0). The other claims are largely consistent with common
sense, though they could be tested more systematically as needed. However, even with extended testing,
it is difficult to make an unassailable case [21].

A full executable flow model for Hypothesis1 would be analogous to the flow model in [10]. Note that a
model for Hypothesis1 would need to include models of airflow in the microworld, ad hoc subregions for
the air in the flasks, and multi-granular models that describe transitions of individual microbes as
well as collections of microbes.

Because they are yoked, any execution of the Hypothesis1 model should execute the parallel Researcher
Activity model.

3.4 Outcomes

Raw data and inferences based on those data can be collected and organized according to the models
described here. In Pasteur’s study, the key observation is whether spoilage develops once the flask
neck is broken and microbe-laden air can enter. That is, the critical test for the Pasteur study
supports Hypothesis1, that the living microbes carried by air currents lead to spoilage.¹⁰ We did not
model the Actual Outcomes in this case, but we could have because they could be different from the
predictions of either of the Hypotheses.¹¹

Based on accepting Hypothesis1, we can state two overall claims:

    Microbes do not develop spontaneously                                 (7)
    Microbes develop from other microbes                                  (8)
    Microbes develop only from other microbes of the same type            (8’)

(8’) is a stronger version of (8). Initially, we might be less willing to accept it, but there are
additional factors we might consider. For instance, flows for the reproductive processes of the
microbes would provide support.

As noted earlier, the research outcomes need to satisfy both internal and external validity. Internal
validity concerns what happened because of the experimental procedure. For instance, we could dismiss
(*9)¹² based on the experience of farmers.

    A longer time is needed for spoilage to develop than was used         (*9)

(*10) was proposed by Antoine Béchamp, one of Pasteur’s critics. The claim was that sealing the flask
prevented air with some “vital essence” from reaching the broth. In a follow-up study, Pasteur was
able to dismiss this criticism with his well-known swan-neck flask experiment [26].

    Sealed air loses its vital essence                                    (*10)

3.5 Generalizations

If we combine (6) with (8’) we obtain (11).

    Spoilage due to fermentation can be minimized by controlling the
    presence of microbes                                                  (11)

This suggests the need for cleanliness to control contamination in the preparation of fermented
products. Further, if we combine (3) with (11) we get (12), which is the basis of pasteurization.

    Spoilage due to fermentation can be controlled by heating the
    nutrient medium                                                       (12)

Joseph Lister generalized (3, 8’, 13) to bacterial infections to study and promote the need for
sterile surgery. Moreover, adding (14) yields (15).

    Bacteria are a type of microbe                                        (13)
    Antiseptics kill bacteria                                             (14)
    Bacterial infection can be minimized by antiseptics                   (15)

Given the importance of each of these inferences for humans, presumably additional work would be
done. For instance, specific microbes and the

10 We might note the initial observation, that the sealed flasks show no spoilage. For a more formal confirmation, we could conduct an additional study with a control group.
11 In [33] the results demonstrated a type of protein binding that was not predicted by the authors.
12 Following a convention in linguistics, the * indicates that the proposition is incorrect.

details of conditions for growth could be studied for each medium.

4     FUTURE WORK
4.1 Interface, Model, and Claims

The interface in Figure 1 is adequate for a straightforward experiment such as Pasteur’s. However, many modern research papers are much more complex. For instance, [33] includes a description of developing a strain of Drosophila needed for the research. It then conducts a series of overlapping studies that makes a case for its conclusions although no one study provides a definitive test. In such a set of studies, a great many flows can be identified and modeled. The interface will need to be improved to provide better support for that complexity.

The model and interface should be able to reorganize the research report flows to fit the IMRD framework (see Section 2.2). An IMRD Methods section would include the Research Design, Procedure, Methods, and Materials. Each of these components should fit sub-structures or templates and be integrated into the overall IMRD framework.

Claims must be based on clear definitions [12]. We have proposed SUMO as an ontology. SUMO bases its rules on established definitions, but even these need to be expanded and refined.

Although we have related claims to natural language propositions, our structured approach does not require natural language. Moreover, the case roles may be more exactly defined for each transitional and its interaction with various objects.

In Section 2.6, we suggested that an explanation for a claim could present a rule and an underlying mechanism. In a broader sense, explanations should engage users in a way that promotes understanding. For instance, an extension of Figure 1 could support graphical guided tours as explanations. More elaborate explanations may be tutorial and can be based on pedagogical techniques.

4.2 Knowledge Structures

Claims from research reports and general axioms could be collected into a comprehensive knowledgebase. Although comprehensive, such a knowledgebase would be fragmented and changing, and would need to represent multiple viewpoints. Even for areas where there is considerable agreement, there are internally consistent areas of knowledge (e.g., Newtonian mechanics) that may be usefully modeled separately from their connection to broader models (e.g., quantum mechanics).

Any knowledgebase of claims will need a range of structured hedges to indicate the type of claim (conceptual/logical, empirical, etc.), level of confidence in the claim, and possible criticisms of it. We would use a preponderance-of-evidence criterion for the acceptance of claims.

To the extent that we want to do inference on these propositions, we will need to support both open and closed worlds [27] and temporal reasoning in a dynamic environment [18, 22, 29].

4.3 Services for the Scientific Knowledgebase

The knowledgebase of research reports and claims can be viewed as a digital library. In addition to structured research reports, the library could also include structured surveys and reviews. Such a library could be overlaid with services like those found in a text-based digital library such as metadata harvesting and search indexing. Because the contents are structured, daemons may be able to generate text versions of the reports and to identify redundancy and inconsistencies.

We emphasize propositions that make claims about state changes such as (8). In a knowledgebase these claims should be accompanied by metadata. The metadata should include basic details such as date and creator; they should also link to related claims. If the metadata are said to provide support for claims, the details of that support should be included.
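As a rough illustration of the structured hedges in Section 4.2 and the claim metadata in Section 4.3, a claim record might be sketched as follows. This is only a minimal sketch under stated assumptions, not the representation used in the prototype: the names Claim, Evidence, and ClaimType, the numeric evidence weights, the placeholder creator and date, and the reading of "preponderance of evidence" as supporting weight exceeding opposing weight are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List


class ClaimType(Enum):
    """Type-of-claim hedge (Section 4.2)."""
    CONCEPTUAL = "conceptual/logical"
    EMPIRICAL = "empirical"


@dataclass
class Evidence:
    source: str          # pointer to a report or criticism, e.g. a citation key
    supports: bool       # True if it supports the claim, False if it counts against it
    details: str         # details of the support or criticism
    weight: float = 1.0  # hypothetical relative weight, not from the paper


@dataclass
class Claim:
    claim_id: str
    text: str
    claim_type: ClaimType
    creator: str
    created: date
    related: List[str] = field(default_factory=list)        # links to related claims
    evidence: List[Evidence] = field(default_factory=list)  # support and criticisms

    def confidence(self) -> float:
        """Share of the total evidence weight that supports the claim (0.5 if no evidence)."""
        total = sum(e.weight for e in self.evidence)
        if total == 0:
            return 0.5
        return sum(e.weight for e in self.evidence if e.supports) / total

    def accepted(self) -> bool:
        """Assumed preponderance-of-evidence test: supporting weight exceeds opposing weight."""
        return self.confidence() > 0.5


# Claim (8) with illustrative metadata; creator, date, and weights are placeholders.
claim_8 = Claim(
    claim_id="8",
    text="Microbes develop from other microbes",
    claim_type=ClaimType.EMPIRICAL,
    creator="example-curator",
    created=date(2021, 1, 1),
    related=["7", "8'"],
    evidence=[
        Evidence("[26]", True, "Swan-neck flask experiment", 1.0),
        Evidence("critics", False, "'Vital essence' objection (*10)", 0.5),
    ],
)
print(claim_8.accepted(), round(claim_8.confidence(), 2))
```

Keeping criticisms in the same evidence list as support means that a later study can change whether a claim is accepted without rewriting the claim itself, which fits a knowledgebase that is expected to be fragmented and changing.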
There could be links across structured research reports that are analogous to citations [5]. Our focus is at the level of semantics rather than the characteristics of the documentation. Thus, rather than link authors, we link functionally and semantically related flows (e.g., about methods) that are shared across research reports. In addition, measures analogous to citation metrics and altmetrics could be developed for the strength of claims and the coherence of the knowledgebase [16].

Finally, (structured) annotations and commentary could be added. In addition, administrative and editorial policies should be developed for managing the collection.

4.4 Envoi

We have proposed using yoked flows to manage the complexity of scientific research reports and have presented a prototype of a user interface for exploring those flows.

In addition, we have discussed issues for how claims from empirical scientific research can be collected and coordinated. The discovery and evaluation of causal claims are common to other scientific paradigms [20]. While those other paradigms may have different procedures from empirical research, they are also based on flows. Even if we distinguish classification (e.g., identifying different types of microbes) as a scientific research activity, flows are still used and could be modeled.

While we have pointed out some promising directions, there is still challenging work to be done in populating and organizing a large knowledgebase of credible propositions.

5     REFERENCES

1     R.B. Allen, Highly structured scientific publications, ACM/IEEE Joint Conference on Digital Libraries, 472, 2007, doi: 10.1145/1255175.1255271
2     R.B. Allen, Visualization, causation, and history, iConference, 2011, doi: 10.1145/1940761.1940835
3     R.B. Allen, Model-oriented scientific research reports, D-Lib Magazine, 2011, doi: 10.1045/may2011-allen
4     R.B. Allen, Rich linking in a digital library of full-text scientific research reports, Columbia University Research Data Symposium, 2013, RDStext.pdf
5     R.B. Allen, Rich semantic models and knowledgebases for highly-structured scientific communication, 2017, arXiv: 1708.08423
6     R.B. Allen, Issues for using semantic modeling to represent mechanisms, 2018, arXiv: 1812.11431
7     R.B. Allen, Definitions and semantic simulations based on object-oriented analysis and modeling, 2019, arXiv: 1912.13186
8     R.B. Allen, Metadata for social science datasets, In: Rich Search and Discovery for Research Datasets: Building the Next Generation of Scholarly Infrastructure, Edited by J.I. Lane, I. Mulvany, P. Nathan, Sage Publishing, 2020, PDF
9     R.B. Allen, Semantic modeling with SUMO, 2020, arXiv: 2012.15835
10    R.B. Allen, Y.M. Chu, Semantic models of pottery making, Pacific Neighborhood Consortium, 2021, in press
11    W. Bechtel, A. Abrahamsen, Explanation: A mechanist alternative, In Special Issue: Mechanisms in Biology, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 36(2), 421-41, 2005, doi: 10.1016/j.shpsc.2005.03.010
12    J.J. Cimino, Desiderata for controlled medical vocabularies in the twenty-first century, Methods of Information in Medicine, 37(4-5), 394–403, 1998, PMC: 3415631
13    T. Clark, P.N. Ciccarese, C.A. Goble, Micropublications: A semantic model for claims, evidence, arguments, and annotations in biomedical communications, Journal of Biomedical Semantics, 2014, doi: 10.1186/2041-1480-5-28
14    D. de Roure, C. Goble, R. Stevens, The design and realisation of the myexperiment virtual research environment for social sharing of workflows, Future Generation Computer Systems, 25, 561-567, 2009, doi: 10.1016/j.future.2008.06.010
15    A. deWaard, From proteins to fairytales: Directions in Semantic Publishing, Intelligent Systems, IEEE, 25(2), 83-88, 2010, doi: 10.1109/MIS.2010.49
16    M. Friedman, Explanation and scientific understanding, Journal of Philosophy, 71(1), 5-19, 1974
17    R. Furuta, F.M. Shipman, C.C. Marshall, D. Brenner, H-W. Hsieh, Hypertext paths and the world-wide web: experiences with Walden’s paths, ACM Hypertext, 1997, doi: 10.1145/267437.267455
18    A. Galton, Spatial and temporal knowledge representation, Earth Science Informatics, 2, 169–187, 2009, doi: 10.1007/s12145-009-0027-6
19    N.L. Green, Toward mining scientific discourse using argumentation schemes, Argument & Computation, 9, 121–135, 2019, doi: 10.3233/AAC-18003
20    T. Hey, S. Tansley, K. Tolle (eds.), The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research, 2009


21    I. Lakatos, A. Musgrave, Criticism and the growth of knowledge, Proceedings of the International Colloquium in the Philosophy of Science, London, V.4, Cambridge University Press, 1965
22    E.T. Mueller, Commonsense Reasoning: An Event Calculus Approach, Morgan Kaufmann/Elsevier, 2014, ISBN-13: 978-0128014165
23    L. Pasteur, Sur les corpuscules organisés qui existent dans l'atmosphère: examen de la doctrine des générations spontanées: leçon professée à la Société chimique de Paris le 19 mai 1861, par L. Pasteur.
24    A. Pease, Ontology: A Practical Guide, Articulate Software Press, Angwin, CA, 2011
25    M. Pera, The Discourses of Science, Translated: C. Botsford, University of Chicago Press, 1994
26    J.R. Porter, Louis Pasteur: Achievements and disappointments: 1861, Bacteriological Reviews, 25(4), 389-403, 1961, doi: 10.1128/br.25.4.389-403.1961
27    A. Rector, J. Rogers, A. Taweel, Models and inference methods for clinical systems: A principled approach, Studies in Health Technology and Informatics, 107(Pt 1), 79-83, 2004, PMID: 15360779
28    W.R. Shadish, T.D. Cook, D.T. Campbell, Experimental and Quasi-Experimental Designs for Generalized Causal Inference, Houghton Mifflin, Boston, 2002
29    M. Shanahan, The Event Calculus Explained, LNAI (1600) 409-430, 1999, Citeseer: 10.1.1.596.7046
30    S.B. Shum, E. Motta, J. Domingue, ScholOnto: An ontology-based digital library server for research documents and discourse, International Journal on Digital Libraries, 3, 237–248, 2000, doi: 10.1007/s007990000034
31    D.E. Stokes, Pasteur's Quadrant – Basic Science and Technological Innovation, Brookings, 1997, ISBN 9780815781776
32    J.M. Swales, Research Genres: Explorations and Applications, Cambridge University Press, Cambridge UK, 2004
33    R.G. Zhai, Y. Cao, P.R. Hiesinger, Y. Zhou, S.Q. Mehta, K.K. Schulze, P. Verstreken, H.J. Bellen, Drosophila NMNAT maintains neural integrity independent of its NAD synthesis activity, PLOS Biology, 4, 2006, doi: 10.1371/journal.pbio.0040416



