=Paper=
{{Paper
|id=Vol-1087/paper3
|storemode=property
|title=A Lattice-Theoretic Approach for Representing and Managing Hypothesis-driven Research
|pdfUrl=https://ceur-ws.org/Vol-1087/paper3.pdf
|volume=Vol-1087
|dblpUrl=https://dblp.org/rec/conf/amw/GoncalvesP13
}}
==A Lattice-Theoretic Approach for Representing and Managing Hypothesis-driven Research==
A lattice-theoretic approach for representing and managing hypothesis-driven research Bernardo Gonçalves, Fabio Porto Extreme Data Lab (DEXL Lab) National Laboratory for Scientific Computing (LNCC), Av. Getulio Vargas 333, Petrópolis, Brazil {bgonc,fporto}@lncc.br Abstract. As the problems of interest in large-scale science are ever more complex or inter- twined, scientists can benefit from machinery to manage hypotheses and their interconnection in the context of a large-scale research project. A partial order is assumed to model hypoth- esis interconnectivity, meaning that hypotheses may be ‘based on or equal to’ one another. Then we investigate lattices as a mathematical abstraction for representing and managing hypothesis-driven research. We present queries designed to exploit lattice-theoretic topolog- ical features of a hypothesis-driven research and then discuss their complexity. Keywords: Scientific Hypotheses, Large-scale Science, Data Management, Lattice Theory. 1 Introduction Large-scale science relies on data management technology in order to deal with the increasing amount of observational and experimentally-produced data. As the problems of interest are ever more complex or intertwined, scientists can also benefit from machinery to manage hypotheses in the context of a large-scale research project. Hypotheses appear interconnected in a research, playing roles of assumption or proposition to explain phenomena [11]. We say that a research is hypothesis-driven if hypotheses are central elements for its conceptual organization and key to its fulfillment. Problem Statement. Representation and management of scientific hypotheses within the context of a large-scale research project. We address this problem by considering that hypothesis interconnectivity is order-theoretic in nature. We write hα ≤ hβ , read hypothesis hα ‘is based on or equal to’ hypothesis hβ . We aim at enabling a scientist end-user to query topological features of a research, such as hypothesis connectivity, paths and neighborhoods. There are, nonetheless, some special topological features that seem to be of particular interest in a hypothesis-driven research: given any two hypotheses in a research, (i) what is their infimum hypothesis (“weakest stronger claim”), and (ii) what is their supremum hypothesis (“strongest weaker claim”). Both are warranted to exist for all hα , hβ in a partially-ordered set R, if R is a lattice [6,10]. In this paper we investigate lattices as a mathematical abstraction for representing and managing hypothesis-driven research. Section 2 introduces notation and concepts from Lattice Theory (Order) for normalization and easy reference. Section 3 introduces the (hypothesis-driven) research lattice abstraction. Section 4 considers querying features on a research lattice. Section 5 briefly discusses related work on hypothesis data modeling. Section 6 concludes the paper. 2 Lattice Theory: Notation and Basic Concepts Order-theoretic properties of sets can be expressed in terms of an ordering relation ≤. Let A be an arbitrary set. Then, for all a, b, c ∈ A, we consider: (P1) a≤a (reflexivity) (P2) a ≤ b and b ≤ a implies that a = b (antisymmetry) (P3) a ≤ b and b ≤ c implies that a ≤ c (transitivity). 2 B. Gonçalves, F. Porto A partially ordered set, or poset hA; ri, A for short, consists of a non-empty set A and a binary relation r on A, such that r satisfies properties (P1)–(P3). Note that a ≥ b means b ≤ a, i.e., ≤ has an inverse; and that < is defined by ¬(P1) and (P3). If A is a poset, a, b ∈ A, then a and b are comparable if a ≤ b or b ≤ a. Otherwise, a and b are incomparable, in notation, a k b. A zero of a poset P is an element 0 with 0 ≤ x for all x ∈ P . A unit, 1, satisfies x ≤ 1 for all x ∈ P . There are at most one zero and at most one unit in a poset. A bounded poset is one that has both 0 and 1. A poset P can be represented by a list of the pairs of elements satisfying the partial order. It can also be represented by means of the covering relation: a is covered by b, in notation, a ≺ b, if a < b and for no x, a < x < b. This relation is enough to determine a finite poset [10], with hP ; ≺i for P, and it is used to build its Hasse diagram.1 Given two hypotheses hα , hβ ∈ R, we shall write: hα ≤ hβ (hα ‘is based on or equal to’ hβ ); hα < hβ (hα ‘is based on’ hβ ); hα ≺ hβ (hα ‘is directly based on’ hβ ); and hα k hβ (hα ‘is incomparable to’ hβ ). Let H ⊆ P , a ∈ P , for an arbitrary poset hP ; ≤i. Then a is an upper bound of H, if h ≤ a for all h ∈ H. An upper bound a of H is the least upper bound of H or supremum of H if, for any upper bound b of H, we have a ≤ b. It is written a = sup H, and the uniqueness of the supremum can be shown straightforwardly.2 The concepts of lower bound and greatest lower bound or infimum are similarly defined. The latter is denoted by inf H, and uniqueness of the infimum is verified likewise. The set of upper (lower) bounds of an element h is denoted ↑ h (↓ h). Def. 1. A poset L is a lattice if sup{a, b} and inf{a, b} exist for all a, b ∈ L. We can refer to lattices as algebraic structures by the notation: a ∨ b ≡ sup{a, b} a ∧ b ≡ inf{a, b} read ∨ the join, and ∧ the meet. In lattices, these are binary operations, i.e., they can be applied to any pair of elements a, b ∈ L to yield again an element of L. For all a, b ∈ L, these operations satisfy idempotency, commutativity, associativity and absorption. Lattice Theory has singled out a special kind of posets for detailed investigation. It is a vast mathematical theory with plenty of interesting results. In Theoretical Computer Science, for instance, lattices are used in the study of semantic properties of programs (e.g., confluence, termination) [6]. In the sequel we refer to lattices as a mathematical abstraction for hypothesis-driven research. The inf–sup existence property of lattices as special posets may translate into neat properties of a hypothesis-driven research as a scientific theory. 3 The Research Lattice Abstraction We characterize a hypothesis-driven research as a poset, further constrained to be a lattice. Def. 2. Let H be a set of hypotheses. The top, >, is a special hypothesis such that h ≤ >, for all h ∈ H. The bottom, ⊥, is another special hypothesis, dual of >, such that ⊥ ≤ h, for all h ∈ H. Def. 3. A hypothesis-driven research poset R is a finite, bounded poset, whose elements are hypotheses, > ∈ R is unit, and partial order ≤ reads ‘is based on or equal to.’3 Def. 4. Let R be a research poset. We say that R is lifted, written R⊥ , if ⊥ ∈ R. Then ⊥ is zero of R⊥ . 1 http://en.wikipedia.org/wiki/Hasse_diagram. 2 In fact, if a0 and a1 are both suprema of H, then a0 ≤ a1 , since a1 is an upper bound, and a0 is a supremum. Similarly, a1 ≤ a0 , since a0 is an upper bound, and a1 is a supremum. Thus, by (P2), a0 = a1 . 3 N.B.: since R is a finite poset, we can infer the set of partial order pairs ≤ from the set of covering pairs ≺. At a logical level, we shall refer to both somewhat interchangeably. At the user level, only ≺ is seen. A lattice-theoretic approach for managing hypothesis-driven research 3 The top hypothesis, >, serves as a root element under which a research poset is built. It assumes nothing, leaving everything open for consideration. The bottom hypothesis, ⊥, is meant to indicate that the research is yet unfinished, either because it is simply ongoing (without a final hypothesis), or because there are two or more incomparable hypotheses that can be considered to be competing. If there exists a hypothesis hυ 6= ⊥ that is directly based on two others, i.e., hυ ≺ hα and hυ ≺ hβ , then it is intuitive to say that hα and hβ are confluent to hυ as trains of thought. As a stronger notion, consider a mental operation in which a scientist practitioner takes two (or more) hypotheses hα and hβ as input and (non-deterministically) generates another hypothesis hυ as output. Def. 5 gives a precise meaning for that conceptual operation, which is founded in Conceptual Blending [7], an established theory in cognitive sciences. Def. 5. Let R be a research poset, and let S ⊂ R such that |S| > 1 and x k y for all x, y ∈ S. Then h ∈ R is a blend of S, written h C S, if h ≺ s for all s ∈ S and h ≺ s0 implies s0 ∈ S. Examples of research posets are presented in Fig. 1. They have been derived from authori- tative sources, i.e., taken from their (historical) research context and organized according to our conceptual framework—see Appendix B; note, in contrast to Fig. 1, that hypotheses lack their inherent interconnectivity when presented in tabular form. Those research posets are all lattices. Nonetheless, we need to investigate research posets in general w.r.t. the inf–sup existence property. In a poset R, the greatest lower bound inf{x, y} of whatever pair {x, y} ⊂ R may fail to exist for two different reasons [6]: (i) because x and y have no common lower bound, i.e., {x, y}` = ∅; or (ii) because they have no greatest lower bound, i.e., | Max{x, y}` | > 1. The rationale for sup{x, y} non-existence is dually the same. Lemma 1. Let R be a research poset. Then {x, y}` 6= ∅ and {x, y}u 6= ∅ for all x, y ∈ R. Proof. By Def. 3, R is bounded. Then it has both 0 and 1. Since it has 0, and 0 ≤ x for all x ∈ R, we have 0 ∈ {x, y}` . Thus {x, y}` 6= ∅. Similarly, since it has 1 and x ≤ 1 for all x ∈ R, we have 1 ∈ {x, y}u . Thus {x, y}u 6= ∅. By Lemma 1, we know that condition (i) above is avoided. We still need to ensure that condition (ii) is avoided as well. For that, we shall constrain our application of Conceptual Blending specifically for scientific thinking (Axiom 1). This axiom can be interpreted as a parsimony principle. Axiom 1. Let R be a research poset, and let h C S such that {x, y} ⊆ S, for h ∈ R and S ⊂ R. Then there can be no h0 6= h in R such that h0 C S 0 and {x, y} ⊆ S 0 . Theorem 1. Let R = hR; ≺i be a research poset, initialized ≺ := {(⊥, >)}, and let it be freely manipulated by Algorithms 1, 2 or 3 (cf. Appendix A) to derive an arbitrary poset R0 . We claim that R0 is a lattice. Proof. The proof is given in Appendix A. 4 Querying over a Research Lattice We can treat querying features related to inf{a, b} and sup{a, b} from an algebraic perspective through the operators meet and join (cf. Section 2). They are among the main querying capabilities we have in mind in view of a scientist end-user thinking over his/her research lattice. Let us consider some query examples (see Fig. 1). (Q1). Given (h2 ) Polynucletide structure and (h3 ) DNA storage of genetic information, find their join (“strongest weaker claim”): { h ∈ R | h = h2 ∨ h3 }. [ h1 ]. (Q2). Given (h2 ) Intra-species competition and (h5 ) Predator-prey coupling, find their meet (“weak- est stronger claim”): { h ∈ R | h = h2 ∧ h5 }. [ ⊥ ]. (Q3). Given (h3 ) Logistic growth and (h5 ) Predator-prey S∗ coupling, find the disjoint union of their complementary upper bounds: ( ↑ h3 \ ↑ h5 ) ( ↑ h5 \ ↑ h3 ). [ {(h3 , 1), (h2 , 1), (h5 , 2), (h4 , 2)} ]. 4 B. Gonçalves, F. Porto > > (h1 ) > Nucleic acid molecule (h2 ) (h1 ) (h4 ) Intra-species Intrinsic T rophic (h1 ) (h2 ) competition reaction interactions (h2 ) (h3 ) F alling Elliptical Polynucleotide DNA storage of bodies orbits structure genetic information (h3 ) (h5 ) Logistic Predator-prey (h4 ) (h3 ) growth coupling Chargaff ’s Gravitation rule — ⊥ (h5 ) is directly based on (upwards) DN A double helix Fig. 1. Hasse diagrams of research posets (lattices) from diverse fields of science. Hypotheses (the elements) are connected by line segments (the covering relation ≺) that go upward from hα to hβ whenever hα ‘is directly based on’ hβ . The ‘ID’ and ‘name’ properties for each element are presented (cf. Appendix B). These queries illustrate a potential interest (Q1) in the strongest claim once dropping h2 and h3 ; or (Q2) in the weakest claim once assuming h2 and h5 ; or (Q3) in clustering sets of competing hypotheses (one set competing with the other) for inspection. Retrieving the up-set ↑ h of a hypoth- esis (reflexive transitive closure of ≤) in a research lattice can also be relevant. For brevity, we are presenting small examples, but the research lattice abstraction is meant to be used in large-scale science. Furthermore, we can think of research lattices being further merged on the web (merge as an additional operation), such that large lattices are to be formed. The complexity of queries related to the meet and join operators (Q1-Q2) or to computing reflexive transitive closures (Q3) is affordable. Efficient methods for the two lattice operations that also involve computing the closures are available in the literature [1,15,8]. They have been motivated by theoretical concerns [15], multiple inheritance in programming language design [1], or querying over trees and directed acyclic graphs (DAGs) [8]. Taking advantage of pre-processing on specific data structures, they can be computed in almost constant time [1]. Besides, a connectivity- based search procedure for research lattices does not have to differ from one for DAGs. Depth-first search (DFS) [13], for example, is a classical graph-traversal algorithm that has a linear complexity on the size of its input, O(|V | + |E|), where V stand for vertices and E for edges. The adaptation of DFS for research lattices (seen as special DAGs to be traversed) must be straightforward. A systematic study of the complexity of operations and queries on a research lattice data structure shall be object of future work. 5 Related Work Most of the current work on hypothesis modeling is found embodied in the RDF data model.4 We point out the initiatives on the Robot Scientist [12], the SWAN Knowledge Base [9], and the Hypothesis Browser (or HyQue) [5]. The choice for RDF (with OWL on top), however, is presumably more related to its reasoning capabilities and web compliance, and less because the RDF triple structure is an accurate fit for the data (the hypotheses). Graph data models [3], in general, are in fact candidates to frame hypothesis-driven research. In this sense, the RDF query language still lacks support for topological features (connectivity, neighborhoods, etc) as standard abstractions [2]. This paper contributes to the investigation of a theoretical basis for managing hypothesis-driven research (Problem 1). Seen as graphs, lattices would have the special features of: (i) being hierarchical (DAGs) and bounded; (ii) having their edge labels defined by a notion of order (the covering relation); and (iii) the inf–sup existence property for each pair of nodes. This work can be complementary to initiatives on the (semi-)automatic retrieval of hypotheses (or claims) from the narrative structure of scientific documents. Primarily, we aim at providing sci- entists with a data management framework to operate their large-scale hypothesis-driven research. 4 http://www.w3.org/TR/rdf-primer/#rdfmodel. A lattice-theoretic approach for managing hypothesis-driven research 5 6 Conclusions In this paper we introduced a lattice-theoretic approach for managing hypothesis-driven research. To our knowledge, this is the first study on scientific hypotheses from a data management point of view. In this approach, scientists themselves are enabled with operations to insert/delete hypotheses into/from their research lattices and query them further. This research vision can be made into useful data management technology for hypothesis-driven large-scale science. References 1. H. Aı̈t-Kaci, R. Boyer, P. Lincoln, and R. Nasr. Efficient implementation of lattice operations. ACM T Progr Lang Sys, 11(1):115–46, 1989. 2. R. Angles and C. Gutierrez. Querying RDF data from a graph database perspective. In Proc. of ESWC, 2005. 3. R. Angles and C. Gutierrez. Survey of graph database models. ACM Computing Surveys, 40(1):1–39, 2008. 4. N. Bacaer. A Short History of Mathematical Population Dynamics. Springer, 2011. 5. A. Callahan, M. Dumontier, and N. H. Shah. HyQue: Evaluating hypotheses using semantic web technologies. Journal of Biomedical Semantics, 2(Suppl 2):S3, 2011. 6. B. A. Davey and H. A. Priestley. Introduction to Lattices and Order. Cambridge University Press, 2nd edition, 2002. 7. G. Fauconnier and M. Turner. The way we think: Conceptual blending and the mind’s hidden com- plexities. Basic Books, 2002. 8. J. Fischer and D. H. Huson. New common ancestor problems in trees and directed acyclic graphs. Information Processing Letters, 110(8-9):331–5, 2010. 9. Y. Gao et al. SWAN: A distributed knowledge infrastructure for alzheimer disease research. J. Web Semantics, 4(3):222–8, 2006. 10. G. A. Gratzer. General Lattice Theory. Birkhauser, 2nd edition, 1998. 11. N. R. Hanson. Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cam- bridge University Press, 1958. 12. R. D. King et al. The automation of science. Science, 324(5923):85–9, 2009. 13. J. Kleinberg and E. Tardos. Algorithm Design. Addison Wesley, 2005. 14. L. A. Pray. Discovery of DNA structure and function: Watson and Crick. Nature Education, 1(1):1–8, 2008. 15. M. Talamo and P. Vocca. A data structure for lattice representation. Theor. Comput. Sci., 175(2):373– 92, 1997. Appendix A: Proof of Theorem 1 Proof. The proof roadmap is as follows. We refer to the insertion and deletion operations to accomplish it by induction. So, consider research poset R⊥ = hR; ≺i, with ≺ = {(⊥, >)}, as given. We have inf{>, ⊥} = ⊥ and sup{>, ⊥} = >, therefore R⊥ is a lattice. Now, assume by the induction hypothesis that R of size |R| = k elements, k ≥ 2, is a lattice, either lifted or not.5 We must show that R0 := insert(R, h, ?), of size |R0 | = k + 1, is also a lattice, either after simple insertion, ? = s, or blending insertion, ? = S of a new element h. Furthermore, suppose the insertion operations have been shown to preserve the property such that R can be arbitrarily large, say |R| = M . Then, by reverse induction, assume that R of size |R| = k elements, 2 < k ≤ M , is a lattice, either lifted or not. We must show that R0 := delete(R, h), of size |R0 | = k − 1 is a lattice idem. Before we proceed with the induction argument for each operation, it is worth highlighting some general features of posets. If a ≤ b, then inf{a, b} = a and sup{a, b} = b; hence, to show that a poset is a lattice, it suffices to check the incomparable pairs, i.e., {a, b} such that a k b [6]. Consider also Def. 3, Lemma 1, to recall that after manipulation the poset must be kept bounded. In particular, after insertion of a new element h, we have to verify for all new pairs {h, x} ⊂ R0 \ R that both inf{h, x} and sup{h, x} exist; and after deletion of element h, we have to verify that it does not violate that property w.r.t. the remaining pairs. Simple insertion (Algorithm 1). New element h ≺ s is connected to its one cover s ∈ R. Then, if s = R(0), we have h = R0 (0). Thus inf{h, x} = h and sup{h, x} = x for all x ∈ R0 . Otherwise, we should ensure that R0 is bounded (cf. Def. 3). If R was already R⊥ , then just connect ⊥ to h making ⊥ ≺ h in R0⊥ . Else, lift R0 to connect ⊥ to both h and R(0). In any case, then, we have inf{h, s} = h and sup{h, s} = s, and inf{⊥, h} = ⊥ and sup{⊥, h} = h. Now, it remains to check all pairs {h, x} such that x ∈ R0 and h k x. Note, however, that ↓ h = {h, ⊥} and ⊥ ≤ x for all x ∈ R0 . Thus {h, x}` = {⊥}, i.e., inf{h, x} = ⊥. 5 As a special element, ⊥ is not counted for the purpose of the induction step, i.e., its insertion into R (“lifting”) or its removal from R (“unlifting”) is automated and not seen. 6 B. Gonçalves, F. Porto Besides, note that x, s ∈ R and R is a lattice by the induction hypothesis, therefore sup{s, x} must be well defined in R for all x ∈ R. Since {s, x}u ⊂ R does not change in R0 with the insertion of h, then sup{s, x} is well defined in R0 for all x ∈ R0 idem. Finally, since h ≺ s for exactly one s ∈ R0 , then sup{h, x} exists and is the same as sup{s, x}. Blending insertion (Algorithm 2). New element h C S is connected to its many covers s ∈ S. For blending insertion, it is required that R is already lifted R⊥ . If h is replacing ⊥ as R0 (0), then there is nothing to verify because R is a lattice by hypothesis and the connections h ≺ s in R0 just replace ⊥ ≺ s in R for all s ∈ S (the blending leads to an order-isomorphism ϕ : R → R0 ). Otherwise, connect ⊥ to h, and disconnect ⊥ from all p ⊆ S such that ⊥ ≺ p in R. Then, for all pairs {x, y} ⊆ S, we have sup{x, y} in R0 the same as in R (which, by the induction hypothesis, are well defined), and inf{x, y} = h. Finally, for all pairs {h, z} such that z ∈ R0 and h k z, we have inf{h, z} = ⊥. By the induction hypothesis, we know that sup{x, z} is well defined for all x ∈ S ⊂ R. Therefore sup{h, z} must exist for each z such that z k h and coincide to sup{x, z} for some x ∈ S. Deletion (Algorithm 3). If element h ∈ R, such that h 6= > and h 6= ⊥, is R(0), then just lift R to R0⊥ replacing h by ⊥ in R0 and there is nothing to verify. Else, it has associated non-empty sets Q0 = {q 0 ∈ R | q 0 ≺ h}, and P 0 = {p0 ∈ R | h ≺ p0 }. Consider, further, subsets Q ⊆ Q0 such that Q = {q ∈ R | q ≺ h and q ≺ x implies x = h}, and P ⊆ P 0 such that P = {p ∈ R | h ≺ p and x ≺ p implies x = h}. Then we can verify that R0 is a lattice by means of a twofold analysis, first based on Q and P , and then broadened to their supersets Q0 and P 0 . If |Q| = 1 and |P | ≥ 1, connect the element q ∈ Q to all p ∈ P and then remove h from R0 . When |P | > 1, we have inf{x, y} = q for all x, y ∈ P ⊂ R0 . Now, consider the superset P 0 of P . For all w ∈ P 0 \ P , (by construction) there exists some z ∈ R such that z 6= h and z ≺ w. Take any p ∈ P , and note that in R we had inf{p, w} = h. Moreover, by the induction hypothesis, inf{q, z} = ` for some ` ∈ R (which has not been affected by the deletion of h). Now, in R0 , inf{p, w} is the same as inf{q, z} and, therefore, exists. For q ∈ Q, the analysis of sup{q, r} for all r ∈ Q0 \ Q is similar to what we just have seen for the elements of P and P 0 . Besides, if |Q| > 1 and |P | = 1, connect all q ∈ Q to the only element p ∈ P and then remove h from R0 . That is the dual problem, thus the reasoning for the inf–sup existence is analogous. Otherwise, the logic follows idem. If |Q| > 0 then connect all q ∈ Q to >; if |P | > 0 then connect all p ∈ P to ⊥ (lifting R to R0⊥ in case ⊥ ∈ / R). Remove h from R0 . It is 0 straightforward to verify in R that, if |P | ≥ 2, then inf{x, y} = ⊥ for all x, y ∈ P ; and if |Q| ≥ 2, then sup{x, y} = > for all x, y ∈ Q. Algorithm 1 Simple insertion on R = hR; ≺i.6 procedure insert(R : poset, h : element, s : element) . returns: poset Require: h ∈ / R, h 6= ⊥, and s ∈ R, s 6= ⊥ Ensure: If R given is a lattice, then R returned after the insertion of h is idem R ← R ∪ {h} 3: zero ← R(0) . holds zero of R R≺ ← R≺ ∪ {(h, s)} . connects h to s if zero = s then . h is new zero of R 6: return R else if ⊥ ∈ R then . R is already lifted R⊥ R≺ ← R≺ ∪ {(⊥, h)} 9: if (⊥, s) ∈ R≺ then R≺ ← R≺ \ {(⊥, s)} . disconnects ⊥ else . lifts R to R⊥ 12: R ← R ∪ {⊥} R≺ ← R≺ ∪ {(⊥, h), (⊥, zero)} return R Algorithm 2 Blending insertion on R = hR; ≺i. procedure insert(R : poset, h : element, S : set) . returns: poset Require: h ∈/ R, ⊥ ∈ R, and S ⊂ R is parsimonious (cf. Axiom 1) Ensure: If R given is a lattice, then R returned after the insertion of h is idem R ← R ∪ {h} 3: if ⊥ C S then . h is replacing ⊥ as R(0) for all s ∈ S do R≺ ← R≺ ∪ {(h, s)} 6: R≺ ← R≺ \ {(⊥, s)} R ← R \ {⊥} else . h is set ⊥ ≺ h ≺ s for all s ∈ S 9: for all s ∈ S do R≺ ← R≺ ∪ {(h, s)} if (⊥, s) ∈ R≺ then 12: R≺ ← R≺ \ {(⊥, s)} R≺ ← R≺ ∪ {(⊥, h)} return R 6 By a slight abuse of notation, we write R≺ instead of ≺ to refer to the set of covering pairs of R = hR; ≺i. A lattice-theoretic approach for managing hypothesis-driven research 7 Algorithm 3 Deletion on R = hR; ≺i. procedure delete(R : poset, h : element) . returns: poset Require: h 6= > and h 6= ⊥ Ensure: If R given is a lattice, then R returned after the deletion of h is idem if h = R(0) then . lifts R to R⊥ 3: R ← R ∪ {⊥} S ← {s ∈ R | h ≺ s} for all s ∈ S do 6: R≺ ← R≺ ∪ {(⊥, s)} R≺ ← R≺ \ {(h, s)} . disconnects h else 9: C ← {c ∈ R | c ≺ h or h ≺ c} Q ← {q ∈ C | q ≺ h and q ≺ x implies x = h} P ← {p ∈ C | h ≺ p and x ≺ p implies x = h} 12: if |Q| = 1 and |P | ≥ 1 then for all p ∈ P do R≺ ← R≺ ∪ {(q, p)} 15: else if |Q| > 1 and |P | = 1 then for all q ∈ Q do R≺ ← R≺ ∪ {(q, p)} 18: else for all q ∈ Q do R≺ ← R≺ ∪ {(q, >)} 21: if ⊥ ∈ / R and |P | > 0 then . lifts R to R⊥ R ← R ∪ {⊥} for all p ∈ P do 24: R≺ ← R≺ ∪ {(⊥, p)} for all c ∈ C do . disconnects h if c ≺ h then 27: R≺ ← R≺ \ {(c, h)} else R≺ ← R≺ \ {(h, c)} 30: R ← R \ {h} . removes h return R Appendix B: Hypothesis in Tabular Form Table 1. Hypotheses in Celestial Mechanics that explain phenomena of motion of planets in the solar system. Adapted from: [11]. ID Author Name Statement h1 Galileo Falling bodies “The velocities of a freely falling body are proportional to the times.” h2 Kepler Elliptical orbits “Each planet moves in an ellipse with the sun in one focus.” h3 Newton Gravitation “The gravitational force exerted by the sun on each planet is proportional to its mass and inversely proportional to the square of its distance to the sun.” Table 2. Hypotheses in Population Dynamics that explain phenomena of population growth and decay. Adapted from: [4]. ID Author Name Statement h1 Malthus Intrinsic reaction “Population changes by intrinsic reaction measured as natality and mor- tality rates.” h2 Verhulst Intra-species competition “Population growth is limited by competition for resources among indi- viduals of the same species.” h3 Verhulst Logistic growth “Population grows by intrinsic reaction, but limited by competition.” h4 Lotka, Trophic interactions “Population growth/decay is limited by inter-species trophic interactions.” Volterra h5 Lotka, Predator-prey coupling “Population grows/decays by intrinsic reaction and is regulated by Volterra predator-prey interactions.” Table 3. Hypotheses in Molecular Biology that explain phenomena of genetic information storage in living cells. Adapted from: [14]. ID Author Name Statement h1 Miescher Nucleic acid molecule “The nucleic acid molecule is a new substance that is in some sense equivalent to a protein.” h2 Levene Polynucleotide structure “Nucleic acids are composed of a series of nucleotides, with each nu- cleotide composed of just one of four nitrogen-containing bases, a sugar molecule, and a phosphate group.” h3 Avery et al. DNA storage of genetic “DNA is the exclusive chromosomal component bearing the genetic information information of living cells.” h4 Chargaff et al. Chargaff’s rule “The total amount of purines and the total amount of pyrimidines are nearly equal.” h5 Watson, Crick DNA double helix “DNA has a three-dimensional double-helical structure.”