Give Reasoning a Trie

Thomas Prokosch, François Bry
Institute for Informatics, Ludwig-Maximilian University of Munich, Germany

Abstract
A data structure providing efficient storage and retrieval of logical expressions is a necessary component of every automated reasoning system. Most theorem provers rely on term indexes tuned to unification. Tabled logic programming systems, in contrast, favour indexes tuned to the retrieval of variants or instances. This article proposes the versatile data structure Instance Trie, which supports the unification of queries with stored expressions as well as the retrieval of variants, instances, and generalisations of queries. Instance tries are stable in the sense that their structure is independent of the order in which they are filled or updated. They give rise to incremental expression retrieval.

1. Introduction

Automated reasoning [1] relies on efficient storage of logical expressions as well as on various forms of expression retrieval. Resolution-based theorem proving systems require first and foremost the retrieval of expressions unifying with queries. If they perform ancestor resolution or make use of lemmas, they also require the retrieval of variants or instances of queries. Meta-programming [2], a form of logic programming, requires unification as well as the retrieval of instances, variants, and generalisations of queries. Tabling [3, 4, 5, 6], a form of memoing used in logic programming, requires the retrieval of variants or generalisations of queries, while logic programming’s core reasoning requires the retrieval of expressions unifying with queries.

This article proposes instance tries, primarily developed for tabled logic programming [3, 4, 5, 6] and meta-programming. Instance tries are versatile in the sense that they support the various retrieval modes needed in automated reasoning: the unification of queries with stored expressions and the retrieval of variants, instances, and generalisations of queries.
Instance tries exploit well-established techniques: the implementation of variables as pointers in programming languages’ run-time systems, and the substitution-based tries of the term indexes of resolution-based theorem proving [7, 8, 9, 10]. Instance tries re-use these techniques in a novel manner, though, resulting in a data structure which, in contrast to most of its predecessors, is stable in the sense that it is independent of the order in which it is filled or updated, and gives rise to incremental expression retrieval.

The article is structured as follows. Section 1 is this introduction. Section 2 is devoted to related work. Section 3 briefly recalls the concepts referred to and introduces the notations used in this article. Section 4 describes the structure of an instance trie. Section 5 addresses various data retrieval modes for instance tries. Section 6 concludes the article.

PAAR 2020: Seventh Workshop on Practical Aspects of Automated Reasoning, June 29–30, 2020, Paris, France (virtual). prokosch@pms.ifi.lmu.de (T. Prokosch); bry@lmu.de (F. Bry). ORCID: 0000-0003-4367-0832 (T. Prokosch); 0000-0002-0532-6005 (F. Bry). © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, pp. 93–108.

2. Related work

The book [8] and the book chapter [9] both survey “term indexing”, that is, data structures for the storage and retrieval of first-order logic terms or atoms. They complement each other: [8] covers “out-of-fashion” data structures, [9] covers data structures which did not exist when [8] was written.

Tries were introduced in [11] for storing strings; the name “trie” was introduced in [12]. Tries store strings in lexicographical order.
Tries have advantages over hash tables: they are not subject to collisions and are stable in the sense that their structures are independent of the insertion order.

The Path-Indexing Method [13] stores terms in “path lists”, themselves stored in tries or hash tables. A path-index is not “stable” in the sense that its structure depends on the insertion order, which might be beneficial in reasoning with commutative functions [13, p. 15] but makes retrieval less efficient. Path-Indexing is “versatile” in the sense that it supports all query modes: variant, instance, generalisation, and unification. Path-indexes disregard variables’ identities [13, p. 6] and therefore return false positives among the answers to retrieval queries.

Dynamic Path Indexing [14] fixes a deficiency of the Path-Indexing Method [13]. For efficiency reasons, a dynamic path index re-orders the terms it stores. Thus, a dynamic path index is not stable. Like a path-index, a dynamic path-index is versatile.

Extended Path Indexing [15] improves the efficiency of Path-Indexing for non-linear queries. While Extended Path Indexing’s algorithms rely on substitutions, its indexes store terms. Like a path-index and a dynamic path-index, an extended path-index is not stable but versatile.

Discrimination Trees [16, 17] store terms in their leaves (their inner nodes represent term prefixes). Like path-indexes, discrimination trees disregard variables’ identities and therefore return false positives, which makes post-retrieval matching or unification tests necessary. Discrimination trees identify and order terms with randomly assigned integers [17, p. 158], which makes them non-stable. They are versatile.

Deterministic Discrimination Trees [18] improve discrimination trees by making a post-retrieval matching test unnecessary for retrieving variants, instances, or generalisations of linear queries [18, p. 323]. Deterministic Discrimination Trees are not stable but they are versatile.
Adaptive Discrimination Trees [19] adapt the traversal order on every insertion, which yields faster retrieval of generalisations, the only query mode supported [19, p. 248]. Adaptive Discrimination Trees are not stable and not versatile.

Abstraction Trees [20] represent terms by substitutions and are structured by anti-unification [21, 22]: parent nodes carry strict generalisations of their children. The inner nodes of abstraction trees do not store terms. A heuristic decides which terms are anti-unified, which makes abstraction trees non-stable. They are versatile.

Substitution Trees [7] combine features of discrimination and abstraction trees and represent the stored terms by substitutions. They have better memory usage and retrieval time than abstraction trees. They are not stable but versatile.

Downward Substitution Trees [23] modify substitution trees to ensure stability and the linearity of deletions.

Code Trees [24] were developed for speeding up resolution. Each query mode requires a specific code tree: code trees are not versatile, but they are stable.

Coded Context Trees [25], like abstraction trees, represent terms by substitutions (and currying), are structured by anti-unification, and depend on the insertion order: they are not stable. They support only the retrieval of generalisations: they are not versatile.

Tabling [3, 4, 5, 6] consists in sharing, and therefore storing and retrieving, answers to (sub)goals so as to improve both the efficiency and the termination of logic programs. Dynamic Threaded Sequential Automata (DTSA) [26] have been developed for tabling. They are based on Sequential Factoring Automata (SFA), which were designed for speeding up resolution. DTSA are substitution-based tries which store data only in their leaves and whose parent-child relationship is based on unification.
The structure of a DTSA depends on the insertion order, which makes DTSA non-stable. They support only the retrieval of instances: they are not versatile.

Time Stamped Tries [4] have also been developed for tabling. They are simpler and require less memory than DTSA. Their retrieval prioritises the most often instantiated terms. They rely on a hash function based on the derivation time of the terms: they are not stable. They support only the retrieval of variants and instances: they are not versatile.

3. Concepts and notations

Expressions. Instance tries are defined in the following with reference to expressions, a generalisation of first-order terms and first-order atomic formulas.¹ Expressions are defined from symbols as follows.

Finitely many non-variable symbols and infinitely many variables are considered. The non-variable symbols are totally ordered by ≤𝑛𝑣 and the variables are totally ordered by ≤𝑣. A constructor 𝑐 is a pair 𝑠/𝑎 with 𝑠 a non-variable symbol and 𝑎 one of finitely many arities associated with the symbol 𝑠. An expression is either a variable or a non-variable expression. A non-variable expression is either a constructor 𝑐 of arity 0, or it has the form 𝑐(𝑒1, . . . , 𝑒𝑛) where 𝑐 is a constructor of arity 𝑛 ≥ 1 and 𝑒1, . . . , 𝑒𝑛 are expressions. Two expressions are variable-disjoint if none of the variables occurring in the one expression occur in the other. 𝑒1, . . ., and 𝑒𝑛 are the direct subexpressions of 𝑐(𝑒1, . . . , 𝑒𝑛). Observe that no proper prefix of an expression (in standard or prefix form) is an expression and that there are finitely many constructors.

Orders on expressions and sequences of expressions. A total order ≤𝑐 is (lexicographically [27, pp. 18–19] [28, pp. 88–89]) defined as follows on the variables and constructors from the orders ≤𝑛𝑣, ≤𝑣, and the order on the natural numbers:

¹ The distinction between terms and atomic formulas is irrelevant to storage and retrieval.
• 𝑣1 <𝑐 𝑣2 if 𝑣1 and 𝑣2 are variables and 𝑣1 <𝑣 𝑣2.
• 𝑣 <𝑐 𝑠/𝑎 if 𝑣 is a variable and 𝑠 is a non-variable symbol.
• 𝑠/𝑎1 <𝑐 𝑠/𝑎2 if 𝑠 is a non-variable symbol and 𝑎1 < 𝑎2.
• 𝑠1/𝑎1 <𝑐 𝑠2/𝑎2 if 𝑠1 and 𝑠2 are non-variable symbols and 𝑠1 <𝑛𝑣 𝑠2.

A total order ≤𝑒 on expressions is lexicographically derived from the total order ≤𝑐 on constructors. Similarly, a total order ≤𝑠 on finite sequences of expressions is lexicographically derived from the total order ≤𝑒 on expressions.

Prefix notation. An expression 𝑐(𝑒1, . . . , 𝑒𝑛) is in standard notation. This expression can also be written without parentheses in prefix notation (also known as Polish or Łukasiewicz notation) as 𝑐/𝑛 𝑒1/𝑎1 . . . 𝑒𝑛/𝑎𝑛 where 𝑎𝑖 is the arity of expression 𝑒𝑖 (1 ≤ 𝑖 ≤ 𝑛).

Notations. In the following, the lower case letters 𝑎, 𝑏, 𝑐, . . . , 𝑧 with the exception of 𝑣 denote the non-variable symbols and ≤𝑐 denotes the standard order on that set; 𝑣0, 𝑣1, 𝑣2, . . . (with subscripts) denote the variables and ≤𝑣 the order on variables: 𝑣𝑖 ≤𝑣 𝑣𝑗 if 𝑖 ≤ 𝑗. The notation 𝑣¹, . . . , 𝑣ⁿ (with superscripts) denotes 𝑛 arbitrary variables.

Example 1.
• 𝑎, 𝑎(𝑏) and 𝑎(𝑏, 𝑐) are expressions in standard notation. Their counterparts in prefix notation are 𝑎/0, 𝑎/1 𝑏/0 and 𝑎/2 𝑏/0 𝑐/0.
• 𝑣2 and 𝑝(𝑎, 𝑎(𝑣5)) are expressions in standard notation. Their counterparts in prefix notation are 𝑣2 and 𝑝/2 𝑎/0 𝑎/1 𝑣5.
• 𝑎 <𝑒 𝑎(𝑏) <𝑒 𝑎(𝑏, 𝑐) in standard notation and 𝑣2 <𝑒 𝑝/2 𝑎/0 𝑎/1 𝑣5 in prefix notation.
• 𝑣2 ≤𝑣 𝑣5.
• [𝑣6] <𝑠 [𝑣6, 𝑎(𝑏, 𝑐), 𝑝(𝑎, 𝑎(𝑣5))] <𝑠 [𝑎(𝑏, 𝑐), 𝑝(𝑎, 𝑎(𝑣5))] <𝑠 [𝑝(𝑎, 𝑎(𝑣5)), 𝑎(𝑏, 𝑐)] where [𝑒1, 𝑒2, . . . , 𝑒𝑛] denotes the sequence of expressions 𝑒1, 𝑒2, . . ., and 𝑒𝑛.

Substitutions. Let ℰ denote the set of expressions and 𝒱 the set of variables. A substitution² 𝜎 = {𝑣¹ ↦ 𝑒1, . . . , 𝑣ⁿ ↦ 𝑒𝑛} with 𝑛 ≥ 0 denotes a total function 𝒱 → ℰ such that
• 𝑣¹, . . . , 𝑣ⁿ are pairwise distinct variables,
• 𝜎(𝑣ⁱ) = 𝑒𝑖 for all 𝑖 = 1, . . .
, 𝑛, and
• 𝜎(𝑣) = 𝑣 if 𝑣 ≠ 𝑣ⁱ for all 𝑖 = 1, . . . , 𝑛.

The application 𝑒𝜎 of a substitution 𝜎 = {𝑣¹ ↦ 𝑒1, . . . , 𝑣ⁿ ↦ 𝑒𝑛} to an expression 𝑒 is the expression obtained from 𝑒 by simultaneously replacing, for all 𝑖 = 1, . . . , 𝑛, every occurrence of each 𝑣ⁱ in 𝑒 by 𝑒𝑖. The application of the empty substitution 𝜖 to an expression leaves it unchanged: for all expressions 𝑒, 𝑒𝜖 = 𝑒. If 𝑒 is an expression and if 𝜎 and 𝜏 are substitutions, then (𝑒𝜎)𝜏 = 𝑒(𝜎𝜏) where 𝜎𝜏 denotes the composition of 𝜎 and 𝜏.

² The binding of a variable 𝑣 to an expression 𝑒 is denoted 𝑣 ↦ 𝑒 because it is reminiscent of the implementation of variables as pointers. Another widespread notation for the same, used among others in [29], is 𝑒/𝑣.

Let 𝜎 = {𝑣¹ ↦ 𝑒1, . . . , 𝑣ⁿ ↦ 𝑒𝑛} be a substitution. The finite set of variables dom(𝜎) = {𝑣¹, . . . , 𝑣ⁿ} is the domain of 𝜎. The range of 𝜎 is the set of the expressions 𝑣𝜎 for the variables 𝑣 modified by 𝜎, that is, the variables 𝑣 such that 𝑣𝜎 ≠ 𝑣. The range of a substitution is finite. A renaming substitution is a substitution which is a permutation of the set 𝒱 of variables. Substitutions 𝜎1 and 𝜎2 are compatible if 𝑣𝜎1 = 𝑣𝜎2 for all 𝑣 ∈ dom(𝜎1) ∩ dom(𝜎2). If 𝜎1 and 𝜎2 are compatible substitutions, then 𝜎1 ∪ 𝜎2 is a well-defined substitution.

Variants, instances, generalisations, and unifiers. A variant of an expression 𝑒 is an expression 𝑒′ such that 𝑒′ = 𝑒𝜌 for some renaming substitution 𝜌. An instance of an expression 𝑒 is an expression 𝑒′ such that 𝑒′ = 𝑒𝜎 for some substitution 𝜎. An instance 𝑒′ of 𝑒 is said to match 𝑒. A strict instance of an expression 𝑒 is an instance of 𝑒 which is not a variant of 𝑒. A generalisation of an expression 𝑒 is an expression 𝑔 such that 𝑒 is an instance of 𝑔, that is, 𝑒 = 𝑔𝜎 for some substitution 𝜎. A strict generalisation of an expression 𝑒 is a generalisation of 𝑒 which is not a variant of 𝑒.
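The substitution definitions above can be sketched in a few lines of Python. This is only an illustration, not the paper’s low-level representation: expressions are modelled as nested tuples (constructor followed by its arguments) or variable-name strings, and the helper names `apply_subst` and `compose` are ours.

```python
def is_var(e):
    """A variable is modelled as a string; constructors are tuples."""
    return isinstance(e, str)

def apply_subst(e, sigma):
    """Apply a substitution sigma (a dict variable -> expression) to e,
    simultaneously replacing every occurrence of each bound variable."""
    if is_var(e):
        return sigma.get(e, e)       # unbound variables are left unchanged
    return (e[0],) + tuple(apply_subst(arg, sigma) for arg in e[1:])

def compose(sigma, tau):
    """Composition sigma-then-tau, so that (e sigma) tau == e (sigma tau)."""
    composed = {v: apply_subst(t, tau) for v, t in sigma.items()}
    for v, t in tau.items():
        composed.setdefault(v, t)
    return composed

# p(a, v1, v1) {v1 -> b} yields p(a, b, b):
e = ("p", ("a",), "v1", "v1")
assert apply_subst(e, {"v1": ("b",)}) == ("p", ("a",), ("b",), ("b",))

# The composition law (e sigma) tau == e (sigma tau):
sigma, tau = {"v1": ("q", "v2")}, {"v2": ("c",)}
assert apply_subst(apply_subst(e, sigma), tau) == apply_subst(e, compose(sigma, tau))
```

The empty substitution is simply the empty dict, for which `apply_subst` returns its argument unchanged.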
A unifier 𝜈 of expressions 𝑒1 and 𝑒2 is a substitution such that 𝑒1𝜈 = 𝑒2𝜈.

Matching and unification. A (first-order syntactic) matching problem, short matching problem, of an expression 𝑒1 in an expression 𝑒2 consists in finding substitutions 𝜈 such that 𝑒1 = 𝑒2𝜈. Thus, a matching problem consists in finding substitutions establishing an instance, or, symmetrically, a generalisation relationship between expressions [30].³

A first-order syntactic unification problem, short unification problem, consists in finding unifiers for two expressions.

If the matching (respectively unification) problem of 𝑒1 in 𝑒2 (respectively of 𝑒1 and 𝑒2) has solutions, then it has solutions 𝜇, the most general matchers, short mgm, of 𝑒1 in 𝑒2 (respectively the most general unifiers, short mgu, of 𝑒1 and 𝑒2), which are [30, 31]
• complete in the sense that for all matchers 𝜈 of 𝑒1 in 𝑒2 (respectively all unifiers 𝜈 of 𝑒1 and 𝑒2) there exists a substitution 𝜎 such that 𝜈 = 𝜇𝜎,
• equivalent in the sense that two distinct mgms (respectively mgus) 𝜇1 and 𝜇2 of 𝑒1 in 𝑒2 (respectively of 𝑒1 and 𝑒2) are identical up to a variable renaming (that is, there exists a renaming substitution 𝜎 such that 𝜇1𝜎 = 𝜇2).

A matching problem is a strengthening of a unification problem: if 𝑒1 = 𝑒2𝜏, then 𝜏 is a unifier of 𝑒1 and 𝑒2. The converse is false: some solutions to unification problems of 𝑒1 and 𝑒2 are solutions neither of the matching problem of 𝑒1 in 𝑒2, nor of the matching problem of 𝑒2 in 𝑒1. As a consequence, matching problems can be solved with algorithms (called matching algorithms) which might be, in some cases, more efficient than algorithms for the unification problem (called unification algorithms).

³ Matching has also been called filtering, one-sided unification, and semi-unification.

4. Instance tries: Structure

An instance trie is a tree 𝑇 such that:
• The root of 𝑇 carries, but does not store, a variable.
• Every node of 𝑇 except the root stores an expression.
• Every arc 𝑁1 → 𝑁2 of 𝑇 corresponds to a non-renaming substitution 𝜎 such that if 𝑒 is the expression stored at node 𝑁1, then the instance 𝑒𝜎 of 𝑒 is the expression stored at node 𝑁2.
• The children of a node are ordered (in a manner described below).

Since the substitution corresponding to an arc of an instance trie is not a renaming substitution, the expression stored at a child of a node 𝑁 is a strict instance of the expression stored at 𝑁.

Example 2. Abstract depiction of an instance trie storing the expressions 𝑝(𝑎, 𝑣1, 𝑣1), 𝑝(𝑎, 𝑏, 𝑏), 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐), and 𝑞(𝑣5, 𝑎, 𝑏, 𝑐):

v0
├── p(a,v1,v1)
│   └── p(a,b,b)
└── q(v5,v3,v4,c)
    └── q(v5,a,b,c)

The root of the instance trie, denoted above 𝑣0, is a variable which is not an expression stored in the instance trie. The children of the root, 𝑝(𝑎, 𝑣1, 𝑣1) and 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐), are ordered by <𝑒, the expression order.

In fact, an instance trie does not store expressions but substitutions. Thus, a more faithful though still abstract depiction of the instance trie given above is as follows:

v0
├── {v0 ↦ p(a,v1,v1)}
│   └── {v1 ↦ b}
└── {v0 ↦ q(v5,v3,v4,c)}
    └── {v3 ↦ a, v4 ↦ b}

Composing the substitutions along a path from the root to a node yields the expression stored at that node. 𝑞(𝑣5, 𝑎, 𝑏, 𝑐) is for example obtained by 𝑣0{𝑣0 ↦ 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐)}{𝑣3 ↦ 𝑎, 𝑣4 ↦ 𝑏}.

Instance tries do not store expressions in standard notation but in prefix notation, and furthermore using a low-level representation. The following paragraphs successively address the representations of expressions, substitution application, substitutions, and finally instance tries.

Representation of expressions. The representation of an expression in the memory of a run-time system is based on the expression’s prefix notation. Assuming that a constructor and a variable are stored in 4 bytes and storage begins at address 0, the representation of 𝑓(𝑎, 𝑣1, 𝑏, 𝑣1) is:⁴

⁴ Recall that 𝑎, 𝑏, 𝑐, . . .
, 𝑧 except 𝑣 denote non-variable symbols and 𝑣0, 𝑣1, 𝑣2, . . . denote variables.

address:  0    4    8    12   16
content:  f/4  a/0  nil  b/0  8

The leftmost, or first, occurrence of the variable 𝑣1 is represented by the value nil, which indicates that the variable is unbound. The second occurrence of the variable 𝑣1 is represented by an offset: the address of this second occurrence’s representation, 16, minus the offset, 8, is the address of the representation of the variable’s first occurrence, 8. Occurrences of (the representation of) a variable like the second occurrence of 𝑣1 in 𝑓(𝑎, 𝑣1, 𝑏, 𝑣1), and the cells representing such variables like the cell at address 16 in the above representation of 𝑓(𝑎, 𝑣1, 𝑏, 𝑣1), are called locally bound variables.

Three properties of an expression representation are worth stressing:

1. The variables’ names are irrelevant to expression representations. If 𝑒 is an expression and 𝜌 is a variable renaming, then 𝑒 and 𝑒𝜌 have exactly the same representation.⁵ For example, the expression 𝑝(𝑎, 𝑣1, 𝑏, 𝑣1) is represented exactly like 𝑝(𝑎, 𝑣0, 𝑏, 𝑣0) and the expression 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐) is represented exactly like 𝑞(𝑣0, 𝑣1, 𝑣3, 𝑐). However, this is not the case for the representation of substitution applications: as discussed below, 𝑓(𝑎, 𝑣1, 𝑣2, 𝑣3){𝑣2 ↦ 𝑏, 𝑣3 ↦ 𝑣1} is not represented like 𝑓(𝑎, 𝑣1, 𝑏, 𝑣1).
2. The representation of an expression is variable-linear, short linear, in the sense that a variable occurs at most once in an expression. Indeed, a non-linear expression (like 𝑓(𝑎, 𝑣1, 𝑏, 𝑣1)) is represented as the application of a substitution to a linear expression (like 𝑓(𝑎, 𝑣1, 𝑏, 𝑣2){𝑣2 ↦ 𝑣1}).
3. In an instance trie, two distinct variables cannot be bound to two distinct expressions in which the same variable occurs. The bindings 𝑣1 ↦ 𝑓(𝑣3) and 𝑣2 ↦ 𝑔(𝑣3), for example, cannot occur in an instance trie.
In an instance trie, such bindings would be expressed for example as 𝑣1 ↦ 𝑓(𝑣3) and 𝑣2 ↦ 𝑔(𝑣4), 𝑣4 ↦ 𝑣3. Thus, expressions can be ordered as if their variables were, in the order of their first occurrence, 𝑣0, 𝑣1, 𝑣2, . . ., that is, as if they were standardized [29].

The representation of 𝑓(𝑎, 𝑣1, 𝑏, 𝑣1) given above is abstract in the sense that it is a simplification. Its implementation in the heap of a run-time system, that is, the concrete representation, differs from the abstract representation as follows:

• A next-representation address is added at the beginning of a (connected) expression representation so as to ease garbage collecting (but not at the beginning of each subexpression).
• Instead of cells, tokens are considered, each of which consists of one or several consecutive cells.
• A three-valued flag at the beginning of every token indicates the type of its content: non-variable symbol, non-locally bound variable, or locally bound variable.
• The token for a constructor 𝑠/𝑎 consists of three successive memory cells, the first of which contains the afore-mentioned type flag, the second the code of the symbol 𝑠, and the third the arity 𝑎.

⁵ As a consequence, expressions can be considered standardized [29]. Recall that the (unique) standardized form of 𝑓(𝑣2, 𝑣5, 𝑣2) is 𝑓(𝑣0, 𝑣1, 𝑣0).
• A concrete expression representation might include additional information like variable names or a time stamp.⁶

Assuming that there is no additional information, that 𝑝 and 𝑎 are represented by their ASCII codes 112 and 97 respectively, that nil is represented as 999999⁷, that the type flag takes the values 0, 1, and 2, and using decimal instead of binary numbers for better readability, the concrete representation of 𝑝(𝑎, 𝑣1, 𝑣1) is the cell sequence:

token for p/3 (constructor):                   0 112 3
token for a/0 (constructor):                   0 97 0
token for v1 (non-locally bound variable):     1 999999
token for v1 (locally bound variable, offset): 2 5

Representation of substitution applications. Assuming that the expression 𝑝(𝑎, 𝑣1, 𝑣1) is stored at address 0 and the expression 𝑞(𝑏, 𝑣3) at address 23, the substitution application 𝑝(𝑎, 𝑣1, 𝑣1){𝑣1 ↦ 𝑞(𝑏, 𝑣3)} is represented as follows.

Representations of 𝑝(𝑎, 𝑣1, 𝑣1) and 𝑞(𝑏, 𝑣3) (before any substitution application):

address:  0    4    8    12   23   27   31
content:  p/3  a/0  nil  4    q/2  b/0  nil

Representations of the same after the application of 𝑝(𝑎, 𝑣1, 𝑣1){𝑣1 ↦ 𝑞(𝑏, 𝑣3)}:

address:  0    4    8    12   23   27   31
content:  p/3  a/0  23   4    q/2  b/0  nil

Thus, binding a variable (like 𝑣1) to an expression (like 𝑞(𝑏, 𝑣3)) is realised by storing the expression’s address (23) at the variable’s address (cell 8). Observe that the cell representing the second occurrence of the variable 𝑣1 (cell 12) keeps its offset (4) unchanged. Thus, binding a variable 𝑣 which occurs in an expression 𝑒 to an expression 𝑒′ consists in storing, at the leftmost occurrence of 𝑣 in the representation of 𝑒, the address of the representation of 𝑒′, leaving unchanged further occurrences of 𝑣 in the representation of 𝑒. This approach to binding variables ensures that the representation of a substitution application is unique.
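As an illustration of this binding scheme, the following Python sketch mimics the abstract cell layout with a list standing in for the heap: one list slot per cell, the string `"nil"` for an unbound variable, `("off", d)` for a locally bound occurrence pointing `d` slots back to the variable’s first occurrence, and a plain integer for an address binding. All names are ours, and slot indices replace byte addresses.

```python
NIL = "nil"
heap = []

def alloc(tokens):
    """Append a token sequence to the heap, returning its start address."""
    addr = len(heap)
    heap.extend(tokens)
    return addr

# p(a, v1, v1) in prefix notation: p/3 a/0 nil <offset 1>
p_addr = alloc([("p", 3), ("a", 0), NIL, ("off", 1)])
# q(b, v3) in prefix notation: q/2 b/0 nil
q_addr = alloc([("q", 2), ("b", 0), NIL])

def bind(var_addr, expr_addr):
    """Bind the unbound variable at var_addr to the expression at
    expr_addr by overwriting its nil cell with that address; later
    occurrences keep their offsets and resolve through this cell."""
    assert heap[var_addr] == NIL
    heap[var_addr] = expr_addr

# p(a, v1, v1){v1 -> q(b, v3)}:
bind(p_addr + 2, q_addr)
assert heap[p_addr + 2] == q_addr       # leftmost occurrence now an address
assert heap[p_addr + 3] == ("off", 1)   # second occurrence unchanged
```

Undoing the binding amounts to writing `NIL` back into the single overwritten cell, which is what makes this pointer-based scheme attractive for backtracking run-time systems.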
Representation of substitutions. In an instance trie, a substitution is applied at a node 𝑁 to the expression representation 𝑅 stored at the parent node 𝑃 of 𝑁. The substitution’s domain consists of all variables of 𝑅 with value nil, that is, of its non-locally bound variables. Consider for example the arc 𝑃: 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐) → 𝑁: 𝑞(𝑣5, 𝑎, 𝑏, 𝑐) or, in a more faithful depiction, 𝑃: {𝑣0 ↦ 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐)} → 𝑁: {𝑣3 ↦ 𝑎, 𝑣4 ↦ 𝑏}, of Example 3. The substitution {𝑣3 ↦ 𝑎, 𝑣4 ↦ 𝑏} is represented by two lists:

• The sequence of addresses of nil variables (that is, non-locally bound variables) in the expression 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐) at the parent node 𝑃 (in their order of occurrence): [𝑣5, 𝑣3, 𝑣4]
• The sequence of the values assigned to each of these variables, yielding the expression 𝑞(𝑣5, 𝑎, 𝑏, 𝑐) at the child node 𝑁: [nil, 𝑎, 𝑏]

⁶ Time stamps are needed by logic programming run-time systems for semi-naïve forward chaining and for tabling.
⁷ nil can be represented by any other value outside the heap’s address space.

More generally, a substitution {𝑣¹ ↦ 𝑒1, . . . , 𝑣ⁿ ↦ 𝑒𝑛} at a node 𝑁 is represented by two lists:

• The sequence of addresses of nil variables (that is, non-locally bound variables) in the expression representation at the parent node of 𝑁.
• The sequence of the expressions or variable addresses⁸ assigned to each of these variables at the child node 𝑁.

Observe that
• no variables occur in both lists since, as the expressions of the second list are created, their variables are represented by so far unused memory cells,
• the first list does not contain any variable which does not occur in the expression at the parent node.

Representation of instance tries. The following example illustrates the representation of substitutions in an instance trie.

Example 3.
Abstract depiction of an instance trie storing the expressions 𝑞(𝑣5, 𝑣3, 𝑣4, 𝑐), 𝑞(𝑣5, 𝑎, 𝑏, 𝑐), 𝑓(𝑎, 𝑣1, 𝑣1, 𝑣2), and 𝑓(𝑎, 𝑏, 𝑏, 𝑐):

v0
├── f(a,v1,v1,v2)
│   └── f(a,b,b,c)
└── q(v5,v3,v4,c)
    └── q(v5,a,b,c)

Faithful depiction of the same instance trie with a standard representation of substitutions:

v0
├── {v0 ↦ f(a,v1,v1,v2)}
│   └── {v1 ↦ b, v2 ↦ c}
└── {v0 ↦ q(v5,v3,v4,c)}
    └── {v3 ↦ a, v4 ↦ b}

Faithful depiction of the same instance trie with an abstract representation of substitutions (each node carries its expression sequence followed by its variable sequence):

([], [v0])
├── ([f(a,v1,v1,v2)], [v1,v2])
│   └── ([b,c], [])
└── ([q(v5,v3,v4,c)], [v5,v3,v4])
    └── ([nil,a,b], [v5])

A substitution {𝑣¹ ↦ 𝑒1, . . . , 𝑣ⁿ ↦ 𝑒𝑛} at a node 𝑁 is represented by the expression sequence [𝑒1, . . . , 𝑒𝑛] for the variables [𝑣¹, . . . , 𝑣ⁿ] in the parent node’s variable sequence.

⁸ Which turn nil variables in the parent node’s expression into locally bound variables in the child node’s expression.

The children of a node are ordered by their expression sequences. For all expressions 𝑒, nil ≤𝑒 𝑒, reflecting that a variable precedes every expression. The order on unbound variables is reflected by the positions of nil in expression sequences. The root of an instance trie has an empty expression sequence because it has no parent node. A node with an empty variable sequence [] (like the leftmost leaf in the above example) cannot have any children, which reflects that the expression stored at such a node has no strict instance.

A node of an instance trie carries two sequences:

• A sequence of expressions or nil representing the bindings of its parent node’s variables.
• The sequence of the addresses of the nil variables occurring in the node’s expression.

The representations of the two kinds of sequences are as follows:

• The first sequence, the sequence of variable bindings, is a sequence of representations of expressions. Since representations of expressions have variable lengths, the sequence is represented as a linked list.
• The second sequence, the sequence of nil variables, is a sequence of representations of variables, that is, of addresses of memory cells representing variables. Since such cells all have the same length, the sequence can be represented as a memory block beginning with the number of variable addresses stored in the block.

In addition to the afore-mentioned two lists, the representation of a node 𝑁 of an instance trie includes a data structure storing, and giving fast access to, the addresses of the children of 𝑁. If the number of children is small, a linked list can be used. Otherwise, a binary tree or a B-tree is convenient.

The expression stored at a node 𝑁 of an instance trie is obtained by successively applying to the instance trie’s root variable the substitutions along the path from the root to 𝑁. Because of this, instance tries are a kind of the substitution-based tries, or substitution trees, of [7, 8, 10].⁹ Most data structures so far designed for automated reasoning and logic programming are, like instance tries, substitution tries. They differ from each other in how the expression 𝑒𝑁 stored at a node 𝑁 relates to the expression 𝑒𝑃 stored at the parent node 𝑃 of 𝑁. In instance tries, this relationship is structural: 𝑒𝑁 is a strict instance of 𝑒𝑃. For most substitution tries proposed for automated reasoning, this relationship is random: it results from the storing or updating order of expressions.

5. Instance tries: Operations

Let 𝑞 be a query, that is, an expression to be evaluated against the expressions stored in an instance trie 𝑇. Four query modes are considered:

1. Variant: Is 𝑞 a variant of an expression stored in 𝑇?
2. Instance: Which expressions stored in 𝑇 are (strict) instances of 𝑞?

⁹ Substitution-based tries are not expression-based tries because an expression 𝑒𝐶 stored at a child 𝐶 of a node 𝑃 is not a strict prefix of the expression 𝑒𝑃 stored at 𝑃. Indeed, no strict prefix of an expression is an expression.
3. Generalisation: Which expressions stored in 𝑇 are (strict) generalisations of 𝑞?
4. Unification: With which expressions stored in 𝑇 does 𝑞 unify?

All four query modes are realized by both a traversal of the instance trie 𝑇 and a test, at each node, whether that node’s expression is an answer to the query.

Dereferencing. The tests are defined below referring to dereferenced representations of expressions.¹⁰ Consider the following two representations of 𝑓(𝑎, 𝑣1, 𝑣2, 𝑣1, 𝑔(𝑣1)):

address:  0    4    8    12   16   20   24
content:  f/5  a/0  nil  nil  8    g/1  16

address:  0    4    8    12   16   20   ...  36   40
content:  f/5  a/0  nil  nil  8    36        g/1  8

The first representation of 𝑓(𝑎, 𝑣1, 𝑣2, 𝑣1, 𝑔(𝑣1)) is dereferenced because, except for the representations of the second and third occurrences of the variable 𝑣1, the variables’ values are nil. The second and third occurrences of 𝑣1 cannot be dereferenced like a pointer, because this would result in the following representation of 𝑓(𝑎, 𝑣1, 𝑣2, 𝑣3, 𝑔(𝑣4)), not of 𝑓(𝑎, 𝑣1, 𝑣2, 𝑣1, 𝑔(𝑣1)):

address:  0    4    8    12   16   20   ...  36   40
content:  f/5  a/0  nil  nil  nil  36        g/1  nil

A dereferenced representation of an expression 𝑒 is generated from any representation 𝑅𝑒 of 𝑒 as follows. While traversing 𝑅𝑒 from left to right:

• If the cell reached contains a constructor, nil, or the offset of a locally bound variable, then copy its content to a new cell.
• Otherwise (the token reached is a non-locally bound variable 𝑣 storing the address 𝐴 of an expression representation), (recursively) dereference the expression representation at address 𝐴.

¹⁰ Dereferencing a pointer 𝑃 means determining the value 𝑃 points to.

We assume that when an expression representation is dereferenced, the list of the addresses of its nil variables (that is, non-locally bound variables) is constructed.
Constructing this list can be done while dereferencing the expression representation, that is, it does not require an additional traversal of the expression representation.

In implementing the algorithm sketched above, care must be taken not to overrun the ends of expression representations in recursive calls. This is easily taken care of, in the same traversal, by using the constructors’ arities as follows during a left-to-right traversal of an expression representation 𝐸: let 𝑅 denote the number of remaining (sub)expression representations; set 𝑅 := 1 before traversing 𝐸; at each constructor 𝑠/𝑛 perform the update 𝑅 := 𝑅 − 1 + 𝑛 (−1 for the (sub)expression beginning at that constructor and +𝑛 for the 𝑛 subexpression representations now to be traversed); and at each variable perform the update 𝑅 := 𝑅 − 1. The expression representation’s end is reached when 𝑅 = 0.

Since dereferencing an expression copies that expression, dereferencing expressions comes at a cost. This cost is acceptable, though. Indeed, dereferencing speeds up the tests, which in any case require copying the stored expressions they are performed against because they bind variables. Binding variables directly in an instance trie would destroy it.¹¹

Representation of a stored expression. An expression 𝑒 stored at a node 𝑁 of an instance trie and retrieved for comparison with a query 𝑞 has the form 𝑒 = 𝑒′𝜎 where 𝑒′ is the expression stored at the parent node of 𝑁 and 𝜎 is a non-renaming substitution, that is, 𝑒 is a strict instance of 𝑒′. As discussed in the previous paragraph, (the representation of) 𝑒′ is assumed to be dereferenced. Furthermore, 𝑞 and 𝑒′ as well as 𝑞 and 𝑒 are variable-disjoint.
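The arity-based end-of-expression test described in the dereferencing paragraph above can be sketched as follows, over a simplified token list in prefix notation: `(symbol, arity)` pairs stand for constructors and the string `"var"` for any variable occurrence (binding details are omitted, and the function name is ours).

```python
def expression_end(tokens, start):
    """Return the index one past the expression beginning at tokens[start],
    maintaining R, the number of remaining (sub)expressions."""
    r = 1                          # one expression remains: the one at start
    i = start
    while r > 0:
        tok = tokens[i]
        if tok == "var":
            r -= 1                 # a variable closes one subexpression
        else:
            _, arity = tok
            r = r - 1 + arity      # the constructor opens `arity` more
        i += 1
    return i

# f(a, v1, b, v1) in prefix notation: f/4 a/0 v b/0 v
tokens = [("f", 4), ("a", 0), "var", ("b", 0), "var"]
assert expression_end(tokens, 0) == 5   # the whole expression
assert expression_end(tokens, 1) == 2   # the subexpression a/0
```

The same counter tells a recursive dereferencing routine exactly where the representation it was asked to copy ends, so no length field is needed.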
The substitution 𝜎 is represented as described in Section 4, paragraph “Representation of substitutions”, by two lists: a list of variables and a list of expressions or nil. It is also assumed that (the representations of) the expressions in that second list are dereferenced. Useful information on 𝑒′ , the nature of which depends on the query mode, can be assumed to be known:

• If variants or instances of 𝑞 are searched for, then 𝑒′ is a strict generalisation of 𝑞 (or, equivalently, 𝑞 is a strict instance of 𝑒′ ). Indeed, otherwise the strict instance 𝑒 = 𝑒′ 𝜎 of 𝑒′ cannot be a variant or an instance of 𝑞, and the search through 𝑇 , which is described below, would not have led to 𝑒.

• If generalisations of 𝑞 are searched for, then 𝑒′ is already a generalisation of 𝑞. Indeed, otherwise the strict instance 𝑒 = 𝑒′ 𝜎, of which 𝑒′ is a strict generalisation, cannot be a generalisation of 𝑞, and the search through 𝑇 , which is described below, would not have led to 𝑒.

• If expressions unifying with 𝑞 are searched for, then 𝑞 and 𝑒′ unify. Indeed, if 𝑞 and 𝑒 = 𝑒′ 𝜎 unify, then there exists a substitution 𝜇 such that 𝑞𝜇 = 𝑒𝜇 = 𝑒′ 𝜎𝜇. Since the representations of 𝑞 and 𝑒′ are variable-disjoint and the domain of the representation of 𝜎 contains only variables occurring in 𝑒′ , 𝑞𝜎 = 𝑞. Therefore 𝑞𝜇 = 𝑞𝜎𝜇 = 𝑒′ 𝜎𝜇. That is, 𝜎𝜇 is a unifier of 𝑞 and 𝑒′ . Thus, if 𝑞 and 𝑒′ do not unify, then 𝑞 and 𝑒 = 𝑒′ 𝜎 cannot unify, and the search through 𝑇 , which is described below, would not have led to 𝑒.

Retrieval Both a query 𝑞 and an expression 𝑒 stored in an instance trie at a node 𝑁 and tested as a possible answer to 𝑞 are available as strict instances of the expression 𝑒′ stored at the parent node of 𝑁 . Indeed, by definition of an instance trie, 𝑒 is stored in that form, and during the instance trie traversal 𝑞 has been recognised as a strict instance of 𝑒′ , for otherwise 𝑞 and 𝑒 would not be compared.
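The storage of 𝑒 as 𝑒′ 𝜎 can be illustrated by a small sketch under simplifying assumptions (not the paper's two-list representation: here a dereferenced parent expression is a flat token list with None for nil variables, and 𝜎 is given positionally, with one entry per nil variable of 𝑒′ , either a replacement token list or None):

```python
# Sketch under simplified assumptions (not the paper's implementation):
# a parent expression e' is a flat token list with (name, arity) constructor
# pairs and None for nil variables; a substitution is a list with one entry
# per nil variable of e', either a replacement token list or None.

def apply_substitution(parent_tokens, replacements):
    """Build e = e'σ by splicing replacements into the nil variables of e'."""
    out, i = [], 0
    for tok in parent_tokens:
        if tok is None:                 # the i-th nil variable of e'
            rep = replacements[i]
            i += 1
            if rep is None:             # variable left unbound by σ
                out.append(None)
            else:                       # variable bound: splice the value
                out.extend(rep)
        else:                           # constructor: copied unchanged
            out.append(tok)
    return out

# e' = f(v1, v2); σ binds v1 to g(a) and leaves v2 unbound:
parent = [("f", 2), None, None]
child = apply_substitution(parent, [[("g", 1), ("a", 0)], None])
print(child)  # [('f', 2), ('g', 1), ('a', 0), None]
```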
Assume that 𝑞 = 𝑒′ 𝜎𝑞 and 𝑒 = 𝑒′ 𝜎𝑒 , and recall that each of the representations of 𝜎𝑞 and 𝜎𝑒 is a list consisting of 𝑛 dereferenced expression representations, where 𝑛 is the number of nil variables in the representation of 𝑒′ . The matching problems of 𝑞 in 𝑒 and of 𝑒 in 𝑞 therefore reduce to matching problems of these two lists:

• If the expressions in the lists representing 𝜎𝑞 and 𝜎𝑒 are pairwise variants of each other, then 𝑞 and 𝑒 are variants of each other.

• If every expression in the list representing 𝜎𝑞 is an instance of its counterpart in the list representing 𝜎𝑒 , then 𝑞 is an instance of 𝑒, and if furthermore at least one of these expressions is a strict instance of its counterpart, then 𝑞 is a strict instance of 𝑒.

• If every expression in the list representing 𝜎𝑞 is a generalisation of its counterpart in the list representing 𝜎𝑒 , then 𝑞 is a generalisation of 𝑒, and if furthermore at least one of these expressions is a strict generalisation of its counterpart, then 𝑞 is a strict generalisation of 𝑒.

• If the lists representing 𝜎𝑞 and 𝜎𝑒 unify, then 𝑞 and 𝑒 unify.

11 Copying the representation of the query is not necessary because, by using a log, bindings of query variables can be undone before the query is tested against another stored expression.

Versatile unification algorithm Instance tries make use of a unification algorithm which determines in a single left-to-right traversal of dereferenced expression representations 𝑅1 and 𝑅2 whether the represented expressions 𝑒1 and 𝑒2 are variants of each other, or 𝑒1 is a strict instance of 𝑒2 , or 𝑒1 is a strict generalisation of 𝑒2 , or none of these hold but 𝑒1 and 𝑒2 are unifiable, or, finally, 𝑒1 and 𝑒2 are not unifiable. This algorithm is called “versatile” because it is both a matching and a unification algorithm and because it further qualifies how matching expressions do match.
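The reduction described above, from comparing 𝑞 and 𝑒 to pairwise comparisons of their substitution lists, can be illustrated by a simplified sketch. It is not the paper's flat representation or its versatile algorithm: terms are nested tuples, variables are strings, and only the matching cases (variant, instance, generalisation) are classified, without strictness or the unification case.

```python
# Simplified sketch, not the paper's flat representation or its versatile
# algorithm: terms are nested tuples ('f', arg1, ...), variables are strings.

def is_var(t):
    return isinstance(t, str)

def match(pattern, term, binding):
    """Extend `binding` so that pattern instantiated by it equals `term`,
    or return None if no such extension exists."""
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        binding[pattern] = term
        return binding
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        if match(p, t, binding) is None:
            return None
    return binding

def list_matches(patterns, terms):
    """Do all terms match their counterpart patterns under one binding?"""
    binding = {}
    return all(match(p, t, binding) is not None
               for p, t in zip(patterns, terms))

def relation(qs, es):
    """Relation of q to e given their substitution lists qs and es."""
    q_instance = list_matches(es, qs)   # q is an instance of e
    e_instance = list_matches(qs, es)   # e is an instance of q
    if q_instance and e_instance:
        return "variant"
    if q_instance:
        return "instance"
    if e_instance:
        return "generalisation"
    return "incomparable"

print(relation([("f", "X", "X")], [("f", "Y", "Z")]))  # instance
print(relation([("f", "X")], [("f", "Y")]))            # variant
```

A single binding is threaded through the whole list so that variables shared between list entries are treated consistently; distinguishing strict from non-strict matches and handling the unification case are left to the versatile algorithm of [32].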
The versatile unification algorithm exploits the aforementioned representation of expressions in an instance trie to avoid unnecessary occurs-checks in some cases:

• on some offset variables under certain conditions

• as long as the left-to-right expression traversal is still in a matching mode (that is, one of variant, strict instance, or strict generalisation)

A full description of this unification algorithm is beyond the scope of this paper; it is given in [32].

Traversal A left-to-right depth-first traversal of an instance trie is, as for any tree structure, the algorithm of choice, because it has a simple recursive definition and minimises space complexity. All query modes but the query mode Unification affect that traversal:

• Variant: The traversal can be interrupted as soon as an answer is found, because an instance trie does not contain distinct variant expressions.

• Instance: Subtrees rooted at instances of the query do not have to be traversed, because all expressions they store necessarily are answers to the query.

• Generalisation: In a depth-first traversal of an instance trie, the first expression 𝑔 found which generalises 𝑞 and no children of which also generalise 𝑞 determines the answers to 𝑞: their set consists of 𝑔 and its ancestors (except the instance trie's root). The subtree rooted at such a node 𝑔 is not traversed.

Insertion and deletion The insertion of an expression 𝑒 in an instance trie 𝑇 is realised by first searching with the matching algorithm mentioned above for a strict generalisation 𝑔 of 𝑒 in 𝑇 , which is always found because the root of an instance trie is a strict generalisation of every expression, and second checking how the children of 𝑔 relate to 𝑒. If a child of 𝑔 is a variant of 𝑒, then nothing is done. If a child of 𝑔 is a strict instance of 𝑒, then 𝑒 is inserted between 𝑔 and that child (which requires no more than updating two pointers).
Otherwise, 𝑒 is inserted in the instance trie as a new child of 𝑔. In order to maintain the tree invariant (instances of an expression are always stored below their first generalisation in the trie), the children of 𝑔 to the right of 𝑒 need to be searched for instances of 𝑒, and such instances inserted below 𝑒 recursively.

The deletion of an expression 𝑒 from an instance trie 𝑇 is realised by searching with the matching algorithm mentioned above for a variant of 𝑒 in 𝑇 . If such a variant is found at a node 𝑁 , then 𝑁 is deleted (which requires no more than updating a pointer) and, after this deletion, each expression stored at a child node of 𝑁 is inserted in the subtree rooted at the node 𝑃 which, before the deletion, was the parent node of 𝑁 . Otherwise nothing is done.

6. Conclusion

This article has introduced the data structure Instance Trie, the salient properties of which are:

• Stability: Instance tries are stable in the sense that their structure is independent of the order in which they are filled or updated.

• Versatility: Instance tries are versatile in the sense that they are well-suited to the retrieval of stored expressions in four query, or retrieval, modes: variants, instances, generalisations, and unification of expressions.

• Incrementality: Instance tries' storage based on the instance relationship gives rise to incremental expression retrieval.

The companion article [32] gives more details on the versatile unification algorithm mentioned in Section 5. Further work will be devoted to analytical and empirical evaluations of instance tries and to deploying instance tries in a run-time system for tabled logic programming supporting meta-programming.

References

[1] J. A. Robinson, A. Voronkov (Eds.), Handbook of Automated Reasoning, volume 1, MIT Press, 2001.

[2] F. Bry, In praise of impredicativity: A contribution to the formalisation of meta-programming, Theory and Practice of Logic Programming 20 (2020) 99–146. URL: http://arxiv.org/abs/1807.06051.
[3] H. Tamaki, T. Sato, OLD resolution with tabulation, in: E. Shapiro (Ed.), Proceedings of the Third International Conference on Logic Programming (ICLP), volume 225 of LNCS, Springer, 1986, pp. 84–98.

[4] E. Johnson, C. R. Ramakrishnan, I. V. Ramakrishnan, P. Rao, A space efficient engine for subsumption-based tabled evaluation of logic programs, in: A. Middeldorp, T. Sato (Eds.), Functional and Logic Programming – Proceedings of the International Symposium on Functional and Logic Programming (FLOPS), number 1722 in LNCS, Springer, 1999, pp. 184–299.

[5] I. V. Ramakrishnan, P. Rao, K. Sagonas, T. Swift, D. S. Warren, Efficient Tabling Mechanisms for Logic Programs, Technical Report, Department of Computer Science, State University of New York at Stony Brook, 1999.

[6] F. M. F. Cruz, Call Subsumption Mechanism for Tabled Logic Programs, Ph.D. thesis, Departamento de Engenharia Informática, Faculdade de Engenharia da Universidade do Porto, Portugal, 2010.

[7] P. Graf, Substitution tree indexing, in: Proceedings of the International Conference on Rewriting Techniques and Applications (RTA), number 914 in LNCS, Springer, 1995, pp. 117–131.

[8] P. Graf, Term Indexing, volume 1053 of LNCS, Springer, 1996.

[9] R. Sekar, I. V. Ramakrishnan, A. Voronkov, Handbook of Automated Reasoning, volume 2, MIT Press, 2001, pp. 1853–1964.

[10] R. Nieuwenhuis, T. Hillenbrand, A. Riazanov, A. Voronkov, On the evaluation of indexing techniques for theorem proving, in: R. Gore, A. Leitsch, T. Nipkow (Eds.), Proceedings of the International Joint Conference on Automated Reasoning (IJCAR), number 2083 in LNCS, Springer, 2001, pp. 257–271.

[11] R. de la Briandais, File searching using variable length keys, in: Proceedings of the Western Joint Computer Conference, 1959, pp. 295–298.

[12] E. Fredkin, Trie memory, Communications of the ACM 3 (1960) 490–499.

[13] M. E.
Stickel, The Path-Indexing Method For Indexing Terms, Technical Note 473, SRI International, Menlo Park, California, USA, 1989.

[14] R. Letz, J. Schumann, S. Bayerl, W. Bibel, SETHEO: A high-performance theorem prover, Journal of Automated Reasoning 8 (1992) 183–212.

[15] P. Graf, Extended path-indexing, in: Proceedings of the 12th Conference on Automated Deduction (CADE), number 814 in LNAI, Springer, 1994, pp. 514–528.

[16] W. McCune, An indexing method for finding more general formulas, Association for Automated Reasoning Newsletter 1 (1988) 7–8.

[17] W. McCune, Experiments with discrimination-tree indexing and path-indexing for term retrieval, Journal of Automated Reasoning 9 (1992) 147–167.

[18] A. Gräf, Left-to-right tree pattern matching, in: Proceedings of the 4th International Conference on Rewriting Techniques and Applications (RTA), volume 488 of LNCS, Springer, 1991, pp. 323–334.

[19] R. Ramesh, I. Ramakrishnan, R. Sekar, Adaptive discrimination trees, in: Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP), volume 623 of LNCS, Springer, 1992, pp. 247–260.

[20] H. J. Ohlbach, Abstraction tree indexing for terms, in: Proceedings of the 9th European Conference on Artificial Intelligence (ECAI), 1990, pp. 479–484.

[21] G. Plotkin, A note on inductive generalization, Machine Intelligence 5 (1970) 153–163.

[22] G. Plotkin, A further note on inductive generalization, Machine Intelligence 6 (1971) 101–124.

[23] K. Hoder, A. Voronkov, Comparing unification algorithms in first-order theorem proving, in: B. Mertsching, M. Hund, M. Z. Aziz (Eds.), Proceedings of KI 2009 – Advances in Artificial Intelligence, 32nd Annual German Conference on AI, number 5803 in LNCS, Springer, 2009, pp. 435–443.

[24] A. Voronkov, The anatomy of Vampire – Implementing bottom-up procedures with code trees, Journal of Automated Reasoning 15 (1995) 237–265.

[25] H. Ganzinger, R.
Nieuwenhuis, P. Nivela, Fast term indexing with coded context trees, Journal of Automated Reasoning 32 (2004) 103–120.

[26] P. Rao, C. R. Ramakrishnan, I. V. Ramakrishnan, A thread in time saves tabling time, in: Proceedings of the Joint International Conference and Symposium on Logic Programming (JICSLP), The MIT Press, 1996, pp. 112–126.

[27] F. Baader, T. Nipkow, Term Rewriting and All That, Cambridge University Press, 1999.

[28] E. Harzheim, Ordered Sets, Springer, 2006.

[29] J. A. Robinson, A machine-oriented logic based on the Resolution Principle, Journal of the ACM 12 (1965) 23–41.

[30] H.-J. Bürckert, Matching – a special case of unification?, Journal of Symbolic Computation 8 (1989) 523–536.

[31] F. Baader, W. Snyder, Handbook of Automated Reasoning, Elsevier and MIT Press, 2001, pp. 445–532.

[32] T. Prokosch, F. Bry, Unification on the run, in: T. Kutsia, A. M. Marshall (Eds.), The 34th International Workshop on Unification (UNIF’20), number 20-10 in RISC Report Series, Research Institute for Symbolic Computation, Johannes Kepler University, Linz, Austria, 2020, pp. 13:1–13:5. URL: https://www.risc.jku.at/publications/download/risc_6129/proceedings-UNIF2020.pdf.