<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Reuse of Introduced Symbols in Automatic Theorem Provers</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Michael</forename><surname>Rawson</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Martin</forename><surname>Suda</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Robotics</orgName>
								<orgName type="institution" key="instit1">Czech Institute of Informatics</orgName>
								<orgName type="institution" key="instit2">Cybernetics</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Petra</forename><surname>Hozzová</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Giles</forename><surname>Reger</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">T</forename><forename type="middle">U</forename><surname>Wien</surname></persName>
						</author>
						<title level="a" type="main">Reuse of Introduced Symbols in Automatic Theorem Provers</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">315019E31B7785670EA5891E27131E2E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Automatic theorem provers may introduce fresh function or predicate symbols for various reasons. Sometimes, such symbols can be reused. We describe a simple form of symbol reuse in the first-order system Vampire, investigate its practical effect, and propose future schemes for more aggressive reuse.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>An automatic theorem prover might introduce a fresh symbol -that is, a symbol it has not previously seen -at any time between the start and end of its execution. During traditional preprocessing <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, the two most well-known examples are Skolemization to eliminate ∃-binders, and some variation of the Tseitin transformation 1 to avoid excessive duplication of subformulae. Fresh symbols may also be used in preprocessing to eliminate exotic input features <ref type="bibr" target="#b2">[3]</ref>, or to optimise inputs with respect to some search algorithm <ref type="bibr" target="#b3">[4]</ref>. New symbols may even be introduced during proof-search proper, such as for clause splitting <ref type="bibr" target="#b4">[5]</ref> or induction <ref type="bibr" target="#b5">[6]</ref>. In some cases, these symbols can be reused. We examine reuse of fresh symbols in the Vampire <ref type="bibr" target="#b6">[7]</ref> automatic theorem prover, but results could easily generalise to other similar systems.</p><p>For example, consider Skolemizing</p><formula xml:id="formula_0">∃x.P (x) ∨ Q(x) ∃x.P (x) ∨ Q(x)</formula><p>Plainly, these are the same formula and could employ the same Skolem constant, but both of Vampire's normal-form routines produce two constants -and therefore two clauses -instead of the possible one:</p><formula xml:id="formula_1">P (s 1 ) ∨ Q(s 1 ) P (s 2 ) ∨ Q(s 2 )</formula><p>PAAR'22: 8th Workshop on Practical Aspects of Automated Reasoning, August 11-12, 2022, Haifa, Israel</p><p>Vampire could in principle notice that these are the same formula, then only process it once. A natural way to achieve this is by formula sharing <ref type="bibr" target="#b7">[8]</ref>, but this would be quite a large change to Vampire and has some issues of its own. The example given here is very simple, but symbol reuse can be employed extensively: see §4 for some ideas.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.">Motivation</head><p>It may appear that reusing introduced symbols is a clear win for system performance.</p><p>In the case of definition introduction, removing duplicate definitions reduces the total number of clauses, which is considered likely to improve performance. However, reused symbols allow inference between parts of the search space that do not otherwise interact: in some cases this might shorten proofs or eliminate redundancy, but in others it expands the search space needlessly.</p><p>There is cause for optimism: Vampire uses an increasing amount of SAT/SMT technology to organise and perform ground reasoning tasks, such as AVATAR <ref type="bibr" target="#b8">[9]</ref>, AVATAR modulo theories <ref type="bibr" target="#b9">[10]</ref>, the InstGen calculus and global subsumption <ref type="bibr" target="#b10">[11]</ref>, MACE-style finite model building <ref type="bibr" target="#b3">[4]</ref>, theory instantiation <ref type="bibr" target="#b11">[12]</ref>, and more <ref type="bibr" target="#b12">[13]</ref>. Reusing symbols in this context reduces the SAT/SMT model space, which is likely to improve both ground and first-order performance. Additionally, concurrent proof attempts <ref type="bibr" target="#b13">[14]</ref> can share information over a common signature: reusing symbols consistently extends this common signature to introduced symbols.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2.">Justification</head><p>When is it possible to reuse a symbol? Informally, introduced symbols often represent or stand for some formula. Then, an existing symbol representing the same formula may be reused. This sleight of hand is most alluring with definition introduction, where a fresh predicate P is defined to be equivalent to F : P stands for F . Skolemization, when encountering ∃x.F , introduces a function s with semantics roughly "some s such that F": again, s stands for F .</p><p>It is possible to relax the requirement for identical formulae to merely equivalent formulae. Consider introducing a symbol s to represent F : naturally enough, s could also be used to refer to G if F ≡ G. First-order logic is sufficiently expressive to even allow s(x) to stand for F [x], and then use s(t) for F <ref type="bibr">[t]</ref>. Such relaxations can be powerful, see §4 for more on this.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.3.">Setting</head><p>Vampire <ref type="bibr" target="#b6">[7]</ref> is a state-of-the-art automatic theorem prover for first-order logic, with extensions to support various theories, induction <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b5">6]</ref>, rank-1 polymorphism <ref type="bibr" target="#b15">[16]</ref>, firstclass booleans <ref type="bibr" target="#b2">[3]</ref>, and higher-order logic <ref type="bibr" target="#b16">[17]</ref>. It implements many different techniques for reasoning efficiently despite these very hard problem domains. These techniques are not equally effective on all problems and have many tunable parameters, which can make the tool tricky for end-users: therefore, pre-tuned strategy schedules are provided that run a series of pre-tuned options.</p><p>Vampire operates exclusively within a suitably-extended clause normal form. All formulae not in this form are translated into it before proof search begins, using Vampire's advanced preprocessing routines <ref type="bibr" target="#b7">[8]</ref>. This preprocessing frequently introduces fresh symbols.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Implementation and Pitfalls</head><p>We implement symbol reuse for Skolemization and definition introduction in Vampire's preprocessing routines. Formulae are identified up to α-equivalence for this initial work, and symbol reuse for Skolem symbols and definitions can be switched on and off separately. We also describe some unpleasant bugs we encountered during testing (subsections beginning Pitfall ) which we hope the reader can now avoid in their own implementation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Key Functions</head><p>The basic implementation idea is to define a "key" function k from a formula to some reasonable key type, such as a formula, string or code sequence. This key can then be used to lookup and reuse symbols that have been used for the same key before. Suppose we need a symbol to stand for some formula F . If we have an existing symbol for k(F ), then we may re-use this symbol. Otherwise, we create a fresh symbol and store it with k(F ) so that we may reuse it later. In this work we consider a relatively simple key function (see below), but the general setting allows more complex key functions in future ( §4).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Detecting α-equivalence</head><p>The simplest key function is the identity function k id , but this will only reuse symbols for really trivial cases, and even a renaming of variables will defeat it. Consider introducing a symbol for</p><formula xml:id="formula_2">F ≡ ∀x∀y. H[x, y] versus G ≡ ∀y∀x. H[y, x]</formula><p>for example: k id (F ) = k id (G). We implement a slightly more sophisticated key function k α to detect α-equivalence, which canonically renames formulae left-to-right using a sequence x i for each new variable encountered. Then both the above formulae map to the same key,</p><formula xml:id="formula_3">k α (F ) = k α (G) = ∀x 0 ∀x 1 .H[x 0 , x 1 ]</formula><p>allowing more generous symbol reuse. This still does not handle more complex cases, such as identifying ∀x∀y. H[x, y] and ∀x∀y. H[y, x]. We note that for our purposes this is equivalent in power to more-complex schemes like de Bruijn indices, as we only wish to identify strictly α-equivalent formulae. We do not need other operations such as capture-avoiding substitution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Pitfall: Free Variables</head><p>Introduced symbols must usually close over the free variables of the formula they represent. For example, consider Skolemizing ∀x∀y∃z.P (x, y, z). The outer two universal quantifiers are removed, and we obtain P (x, y, s(x, y)), where s is a fresh symbol applied to the now-free variables x and y. The reuse key in this case is</p><formula xml:id="formula_4">k α (∃z.P (x, y, z)) = ∃x 0 .P (x 1 , x 2 , x 0 ).</formula><p>If we were now to additionally Skolemize ∀x∀y∃z.P (y, x, z), then the key is the same and P (y, x, s(y, x)) is a legitimate instance of symbol reuse. Here we draw the reader's attention to the order of variables in the Skolem term. However, Vampire's existing Skolemization routine created Skolem terms with variables in the order of binding, not of occurrence, and therefore produced P (y, x, s(x, y)), which is unsound. To see this, consider the satisfiable set of formulae</p><formula xml:id="formula_5">∀x∀y∃z. z = f (x, y) ∀x∀y∃z. z = f (y, x) f (a, b) = f (b, a)</formula><p>Implementors must ensure that terms are created with variables in the order they occur in subformulae (or at least in a consistent order with respect to keys), lest they fall prey to this mis-reuse.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Pitfall: Sorts and ad-hoc Polymorphism</head><p>Some symbols are usually assumed to be ad-hoc polymorphic, even in monomorphic many-sorted systems. A good example is the equality predicate, but Vampire also considers e.g. arithmetic operators to be ad-hoc polymorphic over the integers, rationals, and reals. The problem here is that the reuse key must differentiate between the types of arguments the overloaded symbol is applied to. Consider Skolemizing the input problem ∀x : σ. ∀y : σ. ∃z : ρ. P (z) ∨ x = y ∀x : τ. ∀y : τ. ∃z : ρ. P (z) ∨ x = y where σ, τ, ρ are distinct sorts. The reuse key in both cases is</p><formula xml:id="formula_6">k α (∃z : ρ. P (z) ∨ x = y) = ∃x 0 : ρ. P (x 0 ) ∨ x 1 = x 2</formula><p>But the symbol cannot be reused in this case as it would be ill-typed. One solution (which we implement) is to include the types of free variables in the reuse key, so that the keys are</p><formula xml:id="formula_7">x 1 : σ, x 2 : σ ∃x 0 : ρ. P (x 0 ) ∨ x 1 = x 2 x 1 : τ, x 2 : τ ∃x 0 : ρ. P (x 0 ) ∨ x 1 = x 2</formula><p>Another solution, perhaps neater, is annotating ad-hoc overloaded symbols with their argument types, resulting in keys</p><formula xml:id="formula_8">∃x 0 : ρ. P (x 0 ) ∨ x 1 = σ x 2 ∃x 0 : ρ. P (x 0 ) ∨ x 1 = τ x 2</formula><p>Yet another is to have the reused symbol be polymorphic over the ad-hoc type variables in the reuse key, although this does require rank-1 polymorphism. We implement the first solution for this work, although at some point in the future expect to switch to another.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Summary of Implemented Technique</head><p>To recap, we define a key function k α (F ) = Γ F , where Γ is an ordered list of pairs x : τ giving the types of free variables in F , and F is a canonically-renamed copy of F . If two formulae F and G have the same key, we may use the same introduced symbol s in terms/predicates representing F or G. We reuse symbols in this way while performing definition introduction or Skolemization. The term for a given formula F is constructed from a possibly-reused symbol s and the free variables of F in order of their occurrence.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.6.">Compromise: Quantifier Blocks and Skolemization</head><p>While developing the technique, we noticed that on certain extreme problems attempting to reuse Skolem symbols leads to a degradation of performance. A careful look revealed a prohibitive quadratic complexity of repetitive key computation for deep existential quantifier blocks. For instance, on a formula ∃x 1 x 2 . . . x n . F we would need to compute</p><formula xml:id="formula_9">k α (∃x 1 x 2 . . . x n . F ), k α (∃x 2 . . . x n . F ), k α (∃x 3 . . . x n . F ) .</formula><p>. . , which becomes too expensive for a large n and F . We decided to avoid this expensive case by storing Skolem symbols for reuse on perquantifier-block basis. This way, we only need one call to the key function for each (existential) quantifier block and nominally store a whole vector of Skolems with the key. In the actual implementation, however, only one symbol is stored, because we can assume the remaining ones occupy consecutive slots in the symbol table. As a mild concession, this trick gives up on the ability to reuse symbols within blocks: for example, when s 1 , s 2 . . . , s n get jointly stored at k α (∃x 1 x 2 . . . x n . F ), we are not able to later selectively retrieve, e.g., s n alone for k α (∃x n . F ). However, in practice, this seems to be a reasonable compromise.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Practical Effect</head><p>With the change described in §2, the example from §1 now works as expected. Practical effectiveness now depends upon two factors: whether benchmarks actually contain αequivalent (sub)formulae for which Vampire can reuse introduced symbols, and whether this reuse is beneficial for proof search in Vampire (or downstream users of Vampire's clausification routines). </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Reusable Symbols</head><p>We consider the untyped first-order ("FOF") benchmarks of the TPTP <ref type="bibr" target="#b17">[18]</ref> problem library, version 7.5.0, which amounts to 9091 problems over a moderate number of domains. Vampire processed these problems using its default clausification routine and attempted to reuse definition and Skolem symbols up to α-equivalence. Of these 9091 problems, Vampire could reuse at least one symbol for 4311 problems. In some cases, a large number of symbols could be reused many times: in the case of ITP024+4, Vampire was able to reuse one of 3978 introduced symbols on 19442 separate occasions, with the most commonly-reused symbol reused in 184 different locations. We consider this relatively strong evidence that attempting to reuse introduced symbols may be worthwhile, even for smaller systems that have otherwise simple preprocessing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Effect on Proof Search</head><p>To measure the effect of symbol reuse on proof search, we ran Vampire<ref type="foot" target="#foot_0">2</ref> (version 4.6.1) in its default (single-strategy) setting and its variations using symbol reuse on the mentioned 9091 FOF TPTP problems. The experiment was run on a server with Intel® Xeon® Gold 6140 CPUs clocked at 2.3 GHz with 500 GB RAM. To utilise the parallelism of our server while maintaining stability of the experiment we limited the runs using an instruction limit (rather than the more usual time limit). <ref type="foot" target="#foot_1">3</ref> More specifically, we used a limit of 50 billion instructions, which approximately corresponds to 10 seconds on the machine. The result of our experiment are shown in Table <ref type="table" target="#tab_0">1</ref>. We can see that there is a consistent (although modest) improvement by both symbol reuse applications and that there is a combined benefit of using then jointly. The column with unique solutions reminds us that neither of the techniques is beneficial universally. However, as a sign of complementarity, unique solutions indicate that the newly available symbol reuse options will be useful at constructing more powerful strategy schedules.</p><p>To get a more robust version of the results, resistant to possible influence of statistical noise, we rerun the whole experiment in a shuffling mode, randomizing the prover at various don't-care non-deterministic call sites and averaging the results over many independently seeded runs (please check out our previous work <ref type="bibr" target="#b18">[19]</ref> for the exact description of the method). Table <ref type="table" target="#tab_1">2</ref> reports on the computed averages as well as the estimated standard deviation (sigma). We can see that the positive effect of definition reuse on its own could not be confirmed, but the combined strategy using both definition reuse and Skolem symbol reuse is still the best one. All the averages are slightly lower than the values in Table <ref type="table" target="#tab_0">1</ref> due to the overhead of shuffling (in accord with a remark in <ref type="bibr" target="#b18">[19]</ref>).</p><p>We could identify a single TPTP problem as a robust gain of symbol reuse, namely the problem LCL667+1.010, which means the probability of solving this problem by default was 0.0 and by skr+dr 1.0, and no robust loss. Vampire reused 344 Skolem symbols on LCL667+1.010.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Bonus: Induction</head><p>We were pleasantly surprised to find that Vampire can now also reuse symbols generated during proof search with inductive reasoning. For inductive reasoning, Vampire uses many different inference rules <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b19">20,</ref><ref type="bibr" target="#b20">21]</ref>. The simplest is of the form</p><formula xml:id="formula_10">¬L[t] ∨ C cnf(F → ∀x.L[x])</formula><p>where t is a ground term, L is a ground literal, C is a clause, cnf(•) denotes a translation to clausal normal form, and F → ∀x.L[x] is a valid induction schema. The schema can be instantiated with various induction axioms, such as structural induction axiom for inductive datatypes or induction axiom with symbolic bounds for integers <ref type="bibr" target="#b5">[6]</ref>.</p><p>All clauses from the clausified conclusion of the rule contain the literal L[x]. After adding these clauses to the search space, Vampire immediately resolves them against the premise of the rule, ¬L[t] ∨ C, obtaining cnf(¬F ) ∨ C.</p><p>(</p><p>Since the resulting clauses (1) do not contain the term t from ¬L[t] ∨ C, the result of applying induction on ¬L[t]∨C is the same as on ¬L[t ]∨C for different t and t . Vampire therefore reduces redundancy by applying the induction rule on a premise ¬L[t ] ∨ C only if it did not already apply induction with the same axiom on some ¬L[t] ∨ C. However, this redundancy-avoiding heuristic does not work for some induction axioms having a more complex consequent, used in more-complex induction inference rules. One such case is the upward integer induction axiom with default bound <ref type="bibr" target="#b5">[6]</ref>:</p><formula xml:id="formula_12">L[0] ∧ ∀y.(y ≥ 0 ∧ L[y] =⇒ L[y + 1]) =⇒ ∀x.(x ≥ 0 =⇒ L[x])<label>(2)</label></formula><p>Consider applying the induction rule with the axiom (2) on the premise ¬L[t]. The result of clausifying and Skolemizing the axiom is:</p><formula xml:id="formula_13">¬L[0] ∨ s ≥ 0 ∨ x &lt; 0 ∨ L[x] ¬L[0] ∨ L[s] ∨ x &lt; 0 ∨ L[x] (3) ¬L[0] ∨ ¬L[s + 1] ∨ x &lt; 0 ∨ L[x]</formula><p>where s is a Skolem constant corresponding to y in <ref type="bibr" target="#b1">(2)</ref>. Resolving the clauses (3) with the premise ¬L[t] results in clauses containing t:</p><formula xml:id="formula_14">¬L[0] ∨ s ≥ 0 ∨ t &lt; 0 ¬L[0] ∨ L[s] ∨ t &lt; 0 (4) ¬L[0] ∨ ¬L[s + 1] ∨ t &lt; 0</formula><p>If Vampire later encounters ¬L[t ], it applies induction with the same axiom (2) on it, since the resulting clauses will be different -they will contain t instead of t. However, to do that, the negated antecedent of (2) needs to be clausified and Skolemized twice.</p><p>Vampire can therefore apply symbol reuse to obtain the clauses:</p><formula xml:id="formula_15">¬L[0] ∨ s ≥ 0 ∨ t &lt; 0 ¬L[0] ∨ L[s] ∨ t &lt; 0 ¬L[0] ∨ ¬L[s + 1] ∨ t &lt; 0</formula><p>with the same Skolem constant s as in <ref type="bibr" target="#b3">(4)</ref>. In this way, symbol reuse has the potential to reduce work when using some integer induction rules. For example, on one benchmark<ref type="foot" target="#foot_2">4</ref> from the inductive benchmark set <ref type="bibr" target="#b21">[22]</ref>, Vampire can reuse 6 Skolem constants, and generates only 1430 clauses to find a proof compared to 1737 without symbol reuse.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Future Schemes</head><p>If F ≡ G, then the same symbol can be used to stand for F or G. However, since this criterion is undecidable for first-order logic, we suggest some more-pragmatic schemes below that we have not yet implemented. If, on the other hand, reusing symbols is of paramount importance, sub-formulae are typically small and the criterion easy compared to the main problem, then determining F ≡ G with the theorem prover itself may be an option.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Normalised Formulae and AC Operators</head><p>Our existing scheme to detect α-equivalence ( §2.2) still cannot identify even some triviallyequivalent formulae: consider e.g. F ∧ G ≡ G ∧ F . By a suitable choice of key function it would be possible to identify some such cases with little computational effort. More aggressive schemes are possible in exchange for more computation, such as with the key "sorted conjunctive normal form". We note that even more care must be taken with free variables (refer §2.3) as the order of occurrence of variables may be changed in the resulting keys. We suspect this scheme may be particularly useful when considering clausification during proof search <ref type="bibr" target="#b22">[23]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Generalisation and Instantiation</head><p>The schemes discussed so far cannot reuse symbols for the following scenario. Suppose that we have an often-repeated formula F [t i ] containing a series of different terms t i :</p><formula xml:id="formula_16">F [t 1 ], F [t 2</formula><p>] and so on, and that we wish to introduce symbols to represent these formulae. Such formulae are quite common in common-sense reasoning over large knowledge bases, such as those exported from SUMO <ref type="bibr" target="#b23">[24]</ref>.</p><p>To fix this shortcoming, we can generalise</p><formula xml:id="formula_17">F [t i ] to F [x]</formula><p>for some fresh variable x, introduce a single symbol s(x) and use s(t i ) to represent each F [t i ]. However, producing candidate generalisations from a set of formulae is not easy to implement efficiently, and application of this technique would be necessarily heuristic since it is not clear which generalisation(s) are best for proof search. An alternative is to maximally generalise k gen (F ) such that no non-variable terms occur in it. For example, consider</p><formula xml:id="formula_18">F ≡ P (c, g(d)) ∧ ¬Q(c, x)</formula><p>Then, the reuse key is k gen (F ) = P (x 0 , x 1 ) ∧ ¬Q(x 2 , x 3 ) and the term s(c, g(d), c, x) can be used to represent F . While the potential for symbol reuse is very high with this approach (particularly when combined with §4.1), this technique also increases symbol arity enormously, which is generally viewed negatively from the perspective of system performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Related Work</head><p>Computing normal forms <ref type="bibr" target="#b0">[1]</ref> optimised for consumption <ref type="bibr" target="#b1">[2]</ref> by automated theorem provers is well-studied. However, there is less work that aims to reduce the number of introduced symbols specifically, rather than reducing number or size of clauses. Other techniques in Vampire have achieved such a reduction as a side-effect <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b14">15]</ref>. In the context of Spass <ref type="bibr" target="#b24">[25]</ref>, the technique of generalized renaming <ref type="bibr" target="#b25">[26]</ref> is effective at reducing the number of introduced definitions. Generalized renaming is similar in spirit to §4.2 but with a greater degree of sophistication.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>Reusing symbols is often overlooked as a topic for preprocessing, but can be effective, especially for systems using some amount of ground reasoning. We show that even a simple form of reuse can be practically effective in Vampire and suggest some future schemes for reusing a greater number of symbols. The techniques proposed and evaluated here are relatively easy to implement, despite the hazards that we highlight, and are widely-applicable to other ATP systems.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Number of TPTP problems solved by Vampire's default strategy under a 50 billion instruction limit, with formula definition reuse (dr), Skolem symbol reuse (skr), and both at once (skr+dr).</figDesc><table><row><cell></cell><cell cols="2">solved uniquely</cell></row><row><cell>default</cell><cell>4290</cell><cell>11</cell></row><row><cell>dr</cell><cell>4297</cell><cell>14</cell></row><row><cell>skr</cell><cell>4300</cell><cell>12</cell></row><row><cell>skr+dr</cell><cell>4305</cell><cell>15</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>An analogue of Table1for problems solved on average (based on the methodology from<ref type="bibr" target="#b18">[19]</ref>).</figDesc><table><row><cell></cell><cell cols="2">av. solved sigma</cell></row><row><cell>default</cell><cell>4281.3</cell><cell>10.4</cell></row><row><cell>dr</cell><cell>4281.2</cell><cell>10.5</cell></row><row><cell>skr</cell><cell>4287.2</cell><cell>10.4</cell></row><row><cell>skr+dr</cell><cell>4289.2</cell><cell>10.5</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://github.com/vprover/vampire</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">See appendix A of our previous work<ref type="bibr" target="#b18">[19]</ref> for more details on instruction limiting.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">https://github.com/vprover/inductive_benchmarks/blob/master/benchmarks/int/val/smt2/ declared_axall_conjall_valconst_unint.smt2</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Petra Hozzová and Michael Rawson were supported by the ERC CoG ARTIST 101002685 and the FWF grant LogiCS W1255-N23. Martin Suda was supported by the Czech Science Foundation project no. 20-06390Y and the project RICAIP no. 857306 under the EU-H2020 programme.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Normal form transformations</title>
		<author>
			<persName><forename type="first">M</forename><surname>Baaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Egly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Leitsch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Handbook of automated reasoning</title>
				<imprint>
			<publisher>Elsevier</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="273" to="333" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Computing small clause normal forms</title>
		<author>
			<persName><forename type="first">A</forename><surname>Nonnengart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Weidenbach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Handbook of automated reasoning</title>
				<imprint>
			<publisher>Elsevier</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="335" to="367" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The Vampire and the FOOL</title>
		<author>
			<persName><forename type="first">E</forename><surname>Kotelnikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs</title>
				<meeting>the 5th ACM SIGPLAN Conference on Certified Programs and Proofs</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="37" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">New techniques that improve MACE-style finite model finding</title>
		<author>
			<persName><forename type="first">K</forename><surname>Claessen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sörensson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the CADE-19 Workshop: Model Computation-Principles, Algorithms, Applications</title>
				<meeting>the CADE-19 Workshop: Model Computation-Principles, Algorithms, Applications</meeting>
		<imprint>
			<publisher>Citeseer</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="11" to="27" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Splitting without backtracking</title>
		<author>
			<persName><forename type="first">A</forename><surname>Riazanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, IJCAI 2001</title>
				<editor>
			<persName><forename type="first">B</forename><surname>Nebel</surname></persName>
		</editor>
		<meeting>the Seventeenth International Joint Conference on Artificial Intelligence, IJCAI 2001</meeting>
		<imprint>
			<publisher>Morgan Kaufmann</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="611" to="617" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Integer induction in saturation</title>
		<author>
			<persName><forename type="first">P</forename><surname>Hozzová</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of CADE</title>
				<meeting>CADE</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">12699</biblScope>
			<biblScope unit="page" from="361" to="377" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">First-order theorem proving and Vampire</title>
		<author>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Computer Aided Verification</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="1" to="35" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">New techniques in clausal form generation</title>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Suda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">GCAI 2016</title>
		<title level="s">EPiC Series in Computing</title>
		<imprint>
			<publisher>EasyChair</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="11" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">AVATAR: The architecture for first-order theorem provers</title>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Computer Aided Verification</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="696" to="710" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">AVATAR modulo theories</title>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">S</forename><surname>Bjørner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Suda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">GCAI 2016</title>
		<title level="s">EPiC Series in Computing</title>
		<imprint>
			<publisher>EasyChair</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="39" to="52" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Instantiation-based automated reasoning: From theory to practice</title>
		<author>
			<persName><forename type="first">K</forename><surname>Korovin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Automated Deduction</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="163" to="166" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Unification with abstraction and theory instantiation in saturation-based reasoning</title>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Suda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Tools and Algorithms for the Construction and Analysis of Systems</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="3" to="22" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">The uses of SAT solvers in Vampire</title>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Suda</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st and 2nd Vampire Workshops</title>
		<title level="s">EPiC Series in Computing</title>
		<meeting>the 1st and 2nd Vampire Workshops</meeting>
		<imprint>
			<publisher>EasyChair</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="63" to="69" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A multithreaded Vampire with shared persistent grounding</title>
		<author>
			<persName><forename type="first">M</forename><surname>Rawson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">FMCAD 2021</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page">280</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Induction in Saturation-Based Proof Search</title>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of CADE</title>
				<editor>
			<persName><forename type="first">P</forename><surname>Fontaine</surname></persName>
		</editor>
		<meeting>CADE</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">11716</biblScope>
			<biblScope unit="page" from="477" to="494" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A polymorphic Vampire</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bhayat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Joint Conference on Automated Reasoning</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="361" to="368" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A combinator-based superposition calculus for higher-order logic</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bhayat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Reger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Joint Conference on Automated Reasoning</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="278" to="296" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">The TPTP problem library and associated infrastructure</title>
		<author>
			<persName><forename type="first">G</forename><surname>Sutcliffe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Automated Reasoning</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="page" from="337" to="362" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">Vampire getting noisy: Will random bits help conquer chaos? (system description)</title>
		<author>
			<persName><forename type="first">M</forename><surname>Suda</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>EasyChair</publisher>
			<biblScope unit="volume">7719</biblScope>
		</imprint>
	</monogr>
	<note>EasyChair Preprint no</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Induction with Generalization in Superposition Reasoning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hajdú</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hozzová</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schoisswohl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-53518-6_8</idno>
	</analytic>
	<monogr>
		<title level="m">Proc. of CICM</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Benzmüller</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Miller</surname></persName>
		</editor>
		<meeting>of CICM</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">12236</biblScope>
			<biblScope unit="page" from="123" to="137" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Induction with recursive definitions in superposition</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hajdu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hozzová</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Formal Methods in Computer Aided Design (FMCAD)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2021">2021. 2021</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Inductive benchmarks for automated reasoning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hajdu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hozzová</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schoisswohl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Voronkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Intelligent Computer Mathematics</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Kamareddine</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Sacerdoti Coen</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="124" to="129" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Superposition with first-class booleans and inprocessing clausification</title>
		<author>
			<persName><forename type="first">V</forename><surname>Nummelin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bentkamp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tourret</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Vukmirovic</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of CADE</title>
				<meeting>CADE</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">12699</biblScope>
			<biblScope unit="page" from="378" to="395" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Large theory reasoning with SUMO at CASC</title>
		<author>
			<persName><forename type="first">A</forename><surname>Pease</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sutcliffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Siegel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Trac</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AI communications</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page" from="137" to="144" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Spass version 3.5</title>
		<author>
			<persName><forename type="first">C</forename><surname>Weidenbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dimova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fietzke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Suda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wischnewski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Automated Deduction</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="140" to="145" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Computing tiny clause normal forms</title>
		<author>
			<persName><forename type="first">N</forename><surname>Azmy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Weidenbach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of CADE</title>
				<meeting>CADE</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">7898</biblScope>
			<biblScope unit="page" from="109" to="125" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
