<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Decision Processes for Classical, Intuitionistic, and Modal Connection Calculi</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fredrik Rømming</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jens Otten</string-name>
          <email>jeotten@ifi.uio.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sean B. Holden</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>such MDPs.</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Building</institution>
          ,
          <addr-line>15 JJ Thomson Avenue, Cambridge CB3 0FD</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Reinforcement Learning</institution>
          ,
          <addr-line>Automated Reasoning, Connection Calculus, Intuitionistic Logic, Modal Logic</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Cambridge, Department of Computer Science and Technology</institution>
          ,
          <addr-line>The Computer Laboratory, William Gates</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Oslo</institution>
          ,
          <addr-line>Gaustadalléen 23B, 0373 Oslo</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <fpage>107</fpage>
      <lpage>118</lpage>
      <abstract>
        <p>This paper introduces a framework for integrating Reinforcement Learning (RL) with proof search in connection calculi for classical, intuitionistic, and modal logic. We specify a mapping from the relevant connection calculi to Markov Decision Processes (MDPs), and provide a Python library implementing http://www.cl.cam.ac.uk/~fr409 (F. Rømming); http://jens-otten.de/ (J. Otten); http://www.cl.cam.ac.uk/~sbh11 0000-0001-7545-4662 (F. Rømming); 0000-0002-4331-8698 (J. Otten); 0000-0001-7979-1148 (S. B. Holden) Workshop Proceedings htp:/ceur-ws.org CEUR Workshop Proceedings (CEUR-WS.org) ISN1613-073</p>
      </abstract>
      <kwd-group>
        <kwd>Calculi</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Automated Theorem Proving (ATP) is concerned with determining whether a given formula is
valid in a specific logic. Complementary to classical logic, intuitionistic logic is used within
interactive proof assistants such as NuPRL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Coq [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], while modal logics [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] have
applications in planning, natural language processing and program verification. The time complexity
of ATP in these non-classical logics is higher than in classical logic (PSPACE-complete [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]
compared to NP-complete [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for the propositional fragment). So far only a few ATP systems for
these first-order
      </p>
      <p>
        non-classical logics exist, even though applications would benefit from more
eficient ATP systems. One approach to dealing with these non-classical logics is to encode
their Kripke semantics with prefixes [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. Two powerful ATP systems—ileanCoP [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and
MleanCoP [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for intuitionistic and modal logic, respectively—use prefixes and are based on
(clausal) connection calculi for non-classical logics [
        <xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>
        ].
      </p>
      <p>
        Combining ATP and Machine Learning (ML) can enhance existing ATP systems (or theorem
provers) [
        <xref ref-type="bibr" rid="ref15 ref16 ref17 ref18">15, 16, 17, 18</xref>
        ]. Using ML to guide the proof search clearly has the potential to lead to
more eficient ATP systems, while preserving their ability to provide formal proofs. ML can be
used in fully automated ATP systems for premise selection, strategy choice [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and inference
choice. Whereas the first two approaches use ML in a pre-processing step, in the third approach
ML is tightly integrated into the proof search.
nEvelop-O
LGOBE
(S. B. Holden)
CEUR
      </p>
      <sec id="sec-2-1">
        <title>AReCCa 2023 107</title>
      </sec>
      <sec id="sec-2-2">
        <title>CEUR-WS.org</title>
        <p>This paper introduces a framework for integrating Reinforcement Learning (RL) with proof
search in connection calculi for classical, intuitionistic, and modal first-order logic. There are
two main contributions:
1. The discussion and definition of a mapping from classical and non-classical connection
calculi to Markov Decision Processes (MDPs).
2. A Python library implementing such MDPs providing seamless integration with the ML
ecosystem facilitating RL experiments for in-prover guidance.</p>
        <p>We introduce the connection calculi and ML techniques in Section 2. Section 3 specifies a
mapping from proof search to the RL setting. Section 4 describes the implementation of the
library. The paper concludes with a summary and a plan for future work in Section 5.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Preliminaries</title>
      <sec id="sec-3-1">
        <title>2.1. Classical, Intuitionistic and Modal Connection Calculi</title>
        <p>
          The following methods are based on uniform clausal connection calculi for classical, intuitionistic
and modal logic. We provide a short overview of these calculi; more details can be found
in [
          <xref ref-type="bibr" rid="ref11 ref12 ref20 ref21">20, 12, 21, 11</xref>
          ].
        </p>
        <p>
          An atomic formula (denoted by  ) is built up from predicate symbols (denoted by  , ,  ),
function symbols and term variables (,  ). A (first-order) formula (denoted by  ) is built up from
atomic formulae, the connectives ¬, ∧, ∨, ⇒, and the first-order quantifiers ∀ and ∃. A modal
formula might also include the modal operators □ and ◇. A literal  has the form  or ¬ . In
the (clausal) connection calculus a formula is represented as a matrix, which is a representation
of the formula in a (prefixed) clausal form.
2.1.1. Classical Logic
The classical matrix  ( ) of a formula  is its representation as a set of clauses, where each clause
is a set of literals. It is the representation of  in disjunctive normal form, where Skolemization
of the eigenvariables [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] is done in the usual way. In the graphical representation of a matrix,
its clauses are arranged horizontally, while the literals of each clause are arranged vertically.
In contrast to sequent calculi [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] and (standard) tableau calculi [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], the connection calculus
uses a connection-driven search to find a proof for the validity of a matrix  ( ) for a given
formula  . A connection is a set { 1, ¬ 2} of literals with the same predicate symbol but diferent
polarities. A term substitution   assigns terms to variables. A connection is   -complementary
if   ( 1) =   ( 2).
        </p>
        <p>
          The axiom and the three rules of the (clausal) connection calculus are given in Figure 1 (the
prefixes ∶ 1 and ∶ 2 are to be ignored for classical logic) [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The words/language of the
calculus are tuples “,  , Path”, where  is a matrix,  is a (subgoal) clause or  and (the active)
Path is a set of literals or  . A copy of a clause  is made by renaming all variables in  . The
rigid term substitution  =   is calculated by using one of the well-known term unification
algorithms whenever a connection is identified. A connection proof of  is a proof of ,  ,  in
the connection calculus with a substitution  =   .
Axiom (A)
{}, , ℎ
        </p>
        <p>Start (S)  2, , {}
, , 
and  2 is copy of  1∈
Reduction (R)
, , ℎ∪{</p>
        <p>2∶ 2}
∪{ 1∶ 1}, , ℎ∪{
2∶ 2}</p>
        <p>and { 1∶ 1,  2∶ 2} is  -complementary
Extension (E)  2⧵{ 2∶ 2}, , ℎ∪{</p>
        <p>
          1∶ 1} , , ℎ
∪{ 1∶ 1}, , ℎ
and  2 is a copy of  1∈ ,  2∶ 2∈ 2,
{ 1∶ 1,  2∶ 2} is  -complementary
2.1.2. Non-Classical Logics
For intuitionistic and modal logic, the matrix and the calculus are extended by prefixes,
representing world paths in the Kripke semantics; see [
          <xref ref-type="bibr" rid="ref24 ref8 ref9">24, 9, 8</xref>
          ]. A prefix  is a string consisting
of variables (denoted by  ,  ,
        </p>
        <p>) and constants (denoted by ,  ) and assigned to each literal.</p>
        <p>
          Skolemization is not only used for the (first-order) eigenvariables, but extended to prefix
constants [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Using the occurs-check during unification ensures that the reduction ordering [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
is acyclic. The intuitionistic and modal matrix  ( )
of a formula  is a representation of  in
standard clausal form, in which each literal  is marked with its prefix  ; see [
          <xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>
          ].
        </p>
        <p>
          A prefix substitution   assigns strings to prefix variables and is calculated by a prefix
unification that depends on the specific non-classical logic [
          <xref ref-type="bibr" rid="ref11 ref12 ref13">12, 13, 11</xref>
          ]. In intuitionistic and modal logic,
a connection { 1∶ 1,  2∶ 2} is  -complementary if both its literals and prefixes can be unified
under a combined substitution  = (  ,   ); that is, additionally   ( 1) =   ( 2) must hold. An
intuitionistic/modal connection proof of  is a proof of ,  , 
in the connection calculus of
Example 1. Consider the formula  1 ∶= ( ( ∨ ∀ ¬ (  ⇒  )) ∧  ) ⇒ ( ∧ )
following intuitionistic (prefixed) matrix  ∶=  (
1).
 ( 1) = { { 0∶  2∗,  0∶  2∗}, { 1∶  1,  0∶   
∗ ∗
1  1 }, { 1∶  2,  1∶   
where  1∗ ∶=  1( ,  , ,  )
,  1∗ ∶=  1( ,  , ,  )
        </p>
        <p>,  2∗ ∶=  2( ) ,  2∗ ∶=  2( ) .1 
graphical representation and graphical connection proof with the term substitution   () = 
and the prefix substitution   ( 1) =  2 ,   ( 2) =  2∗,   ( 3) =  1∗,   ( 4) =  2∗ (literals of each
∗
connection are connected with a line).</p>
        <p>. It has the
∗
1  3}, { 1∶  4} }
has the following


0
∶  2∗
[ [ 0
∶  2∗] [ 0∶   

1
∶  1
1∗ 1∗] [ 1∶   

1
∶  2</p>
        <p>1
1∗ 3] [ ∶  4] ]
The formal connection proof of  (where prefixes have been omitted for better readability) is
shown in Figure 2.</p>
        <p>
          1The polarities 0 and 1 are used to mark non-negated and negated literals, respectively (see [
          <xref ref-type="bibr" rid="ref23 ref8">23, 8</xref>
          ]).
{ }, , {
        </p>
        <p>1
{ }, , {
0
0,  0 ′} A
,  0 ′} R
{</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. MDPs and Reinforcement Learning</title>
        <p>
          We now provide an introduction to MDPs and RL—details can be found in [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ].
        </p>
        <p>Most proof procedures are search algorithms: there is an initial state, states can be modified by
actions, and the goal is to find a proof state. The use of heuristics is crucial for performance. For
example, in the case of saturation provers, for choosing a pair of clauses to resolve. One might
imagine an agent, armed with a heuristic, acting to change the initial state of its environment
into a state representing a proof.</p>
        <p>An MDP represents a more general formulation of this kind of problem. Let  denote the set
the environment moves to a new state  ′
∈  with probability  (
′
|, ) ; that is,  ′
∼  (
of states, and let  denote the set of actions. When an agent performs action  ∈ 
in state  ∈  ,
At the same time, the agent receives a reward ℛ( ; , ) ∈ ℝ . The tuple (, ,  , ℛ)
′
|, ) .</p>
        <p>defines
′
the MDP. Let subscripts  denote the sequence of states, actions and rewards through time, and
imagine the agent has a policy  ∶  →</p>
        <p>telling it which action to employ in any given state;
that is, at time  the agent always applies   =  (  ).2 Then, starting from a state  0, the agent
will move through states</p>
        <p>0 →  1 ∼  ( 1| 0,  ( 0)) →  2 ∼  ( 2| 1,  ( 1)) → ⋯
and receive a sequence of rewards</p>
        <p>0 = ℛ( 1;  0,  ( 0)) →  1 ∼ ℛ( 2;  1,  ( 1)) → ⋯ .</p>
        <p>A utility function</p>
        <p>
          () computes the overall accumulated reward associated with the use of  ,
starting from state  . As future rewards are often perceived as less valuable than short-term
rewards, a common function is
  () =  [ 0 +  1 +  2 2 + ⋯] =  [∑     ]
∞
=0
where the expected value is with respect to the randomness governing the state transitions,
 ⋆ is one satisfying  ⋆() = argmax 
 () , and which leads to utility 

() .
and  ∈ [
          <xref ref-type="bibr" rid="ref1">0, 1</xref>
          ] sets the trade-of between short-term and long-term rewards⋆. An optimal policy
2In general, policies may also be stochastic, i.e., distributions over actions given state. This is particularly useful
for exploration during the learning process.
        </p>
        <p>Both the optimal policy and its corresponding utility can be expressed by considering what
happens if we take particular actions from the current state and follow the optimal policy
thereafter




′
′
⋆</p>
        <p>() = max ∑  ( ′|, ) (ℛ( ′; , ) + 
 ⋆() = argmax ∑  ( ′|, ) (ℛ( ′; , ) +</p>
        <p>⋆
 ( ′))</p>
        <p>⋆
 ( ′)) .</p>
        <p>Numerous algorithms exist for inferring an optimal policy for an MDP, depending on what is
known about the MDP. If little is known, we must learn about the environment by exploring
actions and their efects, and this is what RL achieves.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Connection Calculi as Markov Decision Processes</title>
      <p>As described in the previous section, RL concerns agents interacting with environments. In
the case of proof procedures, one can consider an agent deciding which choice to make at each
point in the proof search. Hence, to apply RL we need to define the proof search environment
and its choice points. We now address the description of proof search procedures in connection
calculi using MDPs.</p>
      <p>While the reader familiar with the connection calculus might be tempted to see the relationship
between proof and MDP as straightforward, it is in fact more subtle than is apparent at first
glance, and requires some care to define correctly. While the connection calculi define procedures
whereby sequential decisions are made to find proofs, they do not directly define MDPs. For
confluent proof calculi such as resolution, one can treat the words of the calculus as observations
and inference rules as actions, since all information about the state of the proof is carried in the
most recently generated word. However, this cannot be done with connection calculi, because
it is unclear how to handle branching and backtracking. The words of the connection calculi do
not carry enough information to know whether the current state is a goal state (a proof), or to
know what inferences have or have not been attempted. Allowing dead-end states that are not
proofs would be unfaithful to the underlying dynamics (provers run until they have found a
proof), so backtracking should be incorporated as part of the MDP.</p>
      <p>We now specify the state space, action space, transition space, and reward function tuple
(, ,  , ℛ)</p>
      <p>for CC-MDP, one possible MDP representation of proof search in connection calculi.
Definition 1 (CC-MDP State Space). The state space  is defined as the set of all possible
derivations (a derivation is an incomplete proof in which some leaves are not closed by axioms) in the
connection calculus together with the substitution  =   . Further, to keep track of the open
backtracking choices, each node in the connection derivation is marked with the possible inference
steps that have not been attempted so far.</p>
      <p>
        In general, there might be exponentially many unifiers for one prefix equation (of one
connection) [
        <xref ref-type="bibr" rid="ref12 ref14 ref26">12, 14, 26</xref>
        ]. Therefore, the rigid prefix substitution   (for the non-classical logics)
is calculated only after a classical proof has been found. This has turned out to be the most
eficient approach [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].   is therefore not part of the states in  (see also the description of the
implementation in Section 4).
Example 2. Consider the formula ( ( ∨ ∀ ¬( ⇒ )) ∧  ) ⇒ ( ∧ ) from Example 1 and its
classical matrix  = {{ , }, {¬ , }, {¬ , ¬}, {¬}} . Figure 3 shows a (possible) state  ∈  in
the proof search of  .  includes a derivation of  together with the substitution   , and for
each node a list of non-attempted inference steps.
      </p>
      <p>Definition 2 (CC-MDP Action Space). The action space  consists of rule application actions
 inferences and a backtracking action   . There is a rule application action for each rule in
the connection calculus. Hence, a rule application action  , ∈  inferences is specified by the rule
name  ∈ {, , } (for Start, Reduction or Extension) and the associated clause and/or literal  .
Specifically, an action  , can have one of the following forms:  , 2 for the Start rule,  , 2 for the
Reduction rule and  , 2/ 2 for the Extension rule.</p>
      <p>Example 3. To get from the initial state ,  ,  to the state in Figure 3 one can take the actions:
 ,{,} ,  ,{¬, ′}/¬ ,  ,{¬,¬}/¬ . These actions are a start step with the clause { , } ,
followed by two extension steps connecting the leftmost literal 3 in the leftmost open subgoals of
the proof tree to the literal  in the clause {¬ ,  ′} and the literal ¬ in the clause {¬, ¬ } .</p>
      <p>
        We say that action  is valid in state  if action  =  backtrack or  is a rule application
action  , ∈  inferences denoting a valid rule application to the leftmost literal in the leftmost
open subgoal in the proof tree of  . A valid rule application takes the rigid term substitution
  into account, so   ( 1) =   ( 2). As for the states in the state space  , the rigid prefix
substitution   is not taken into account (and updated) for the non-classical logics. To handle
proper backtracking, when a rule application action  is taken from state  to  ′,  is no longer
counted as a non-attempted inference for the node in the tableau of  ′ corresponding to the
principal node [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] of  . The special action  backtrack backtracks the state’s derivation from
the leftmost literal in the leftmost subgoal to the previous choice point, which still has
nonattempted inferences. This is one of many ways to model backtracking. In particular, this
method ensures that the complete (leanCoP) policy: “always choose the first non-attempted
inference, otherwise backtrack” can be expressed easily.
      </p>
      <p>In general an MDP models state transitions stochastically—performing action  in state 
leads to a new state  ′ ∼  ( ′|, ) . In a connection prover the transition is deterministic, in the
sense that performing action  in state  leads reliably to a single outcome state  ′. This gives
rise to the following deterministic state transition function.</p>
      <p>3All clauses (including subgoal clauses) are treated as ordered sets of literals.
Definition 3 (CC-MDP Transition Function). The transition distribution  is defined as
 (
′ =  ′|, ) =
⎧1
⎨
⎩0
if  is valid in  and  ′ is the (deterministic) result
of applying  to 
otherwise.</p>
      <p>Notice that the transition function is necessarily deterministic, as any probabilistic transition
function would not accurately describe the dynamics of the underlying system.</p>
      <p>We only consider the first literal of the leftmost open subgoal for rule application. This
is because all subgoals need to be closed, so we do not consider alternative subgoals and
literals within subgoals as choice points for the MDP. Including these as choice points could
be interesting, but it would also increase the size of the action space and general complexity
introducing a trade-of.</p>
      <p>Definition 4 (CC-MDP Reward Function). To remain faithful to the underlying problem, we
consider the following relatively sparse reward function
ℛ( ′; , ) =
where  ′ is a proof if the derivation of  ′ is a proof under the unique (combined) substitution  =  
(for classical logic) or  = (  ,   ) of  ′.</p>
      <p>This reward function is the simplest function accurately describing the goal while preserving
optimality of solutions.</p>
      <p>Example 4. Figure 4 shows the graph representation of a part of CC-MDP.</p>
      <p>ε, {{P, R}, {¬P, Qx}, {¬P, ¬Qc}, {¬R}}, ε</p>
      <p>aS,{P,R}
{P, R}, M, {}
ε, {{P, R}, {¬P, Qx}, {¬P, ¬Qc}, {¬R}}, ε</p>
      <p>S
aE,{¬P,¬Qc}/¬Qc
{¬Qc} , M, {P } {R}, M, {} E</p>
      <p>{P, R}, M, {}
ε, {{P, R}, {¬P, Qx}, {¬P, ¬Qc}, {¬R}}, ε</p>
      <p>S
{Qx′} , M, {P } {R}, M, {} E</p>
      <p>{P, R}, M, {}
ε, {{P, R}, {¬P, Qx}, {¬P, ¬Qc}, {¬R}}, ε</p>
      <p>S
aE,{¬P,Qx}/¬P
aE,{¬P,¬Qc}/¬Qc
{¬P }, M, {P, Qx′} {}, M, {P } E
{Qx′} , M, {P }</p>
      <p>{P, R}, M, {}
ε, {{P, R}, {¬P, Qx}, {¬P, ¬Qc}, {¬R}}, ε</p>
      <p>S
{R}, M, {} E</p>
      <p>Proof search in the classical and non-classical connection calculi can be framed as an agent
interacting with the CC-MDP environment, giving the necessary theoretical framework for
applying RL to the proof search in these connection calculi. Furthermore, techniques such as
positive start clauses and iterative deepening can be accounted for with minor tweaks to the
components of CC-MDP.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Implementation</title>
      <sec id="sec-5-1">
        <title>4.1. Connection Calculi as MDPs in Python</title>
        <p>
          Connections [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] is a Python library of connection calculi implemented as MDPs, providing
environments for proof search in connection calculi. It provides OpenAI
Gym/Gymnasiumlike [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] environments for proof search in connection calculi for classical, intuitionistic, and
modal first-order logic. It currently supports the modal logics D, T, S4, and S5, each for the
constant, cumulative, and varying domains.
        </p>
        <p>
          Connections implements the basic calculi for classical, intuitionistic, and modal logic as
described above, enhanced by regularity [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]. The observation and action spaces are as described
in Section 3, treating literals and (first-order) terms as objects associated with locations in a
matrix (represented as a list of lists of literals) and as a tableau-like proof tree. For intuitionistic
and modal logic, literals and terms have an extra field for their prefixes represented as (first-order)
terms. Connections is implemented natively in Python with no dependencies. As Python is the
de-facto language for ML, the library provides an accessible and reproducible way to conduct
RL experiments with provers based on connection calculi. Using standardized frameworks
increases confidence in the correctness of the implementation, and the environments can easily
be incorporated into the rest of the ML ecosystem alongside frameworks such as RLlib [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ],
Stable Baselines [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ], PyTorch [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ], Tensorflow [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] and others. Figure 5 gives an overview of
Connections and how it fits into the RL setting.
        </p>
        <p>ConnectionEnv</p>
        <p>Prover agent
action
state
reward
Connections</p>
        <p>Connection Environments
Unification
Logical primitives</p>
        <p>Matrix
ConnectionAction</p>
        <p>ConnectionState
Term unification</p>
        <p>Prefix unification (D, T, S4, S5, Intuitionistic)</p>
        <p>Literal</p>
        <p>Term</p>
        <p>
          Compared to conducting learning experiments with external calls to Prolog and OCaml
implementations of leanCoP, using the Connections environments drastically reduces the complexity
needed to control the prover. This is due to eliminating the need for remote procedure calls, and
to the imperative basis of the environments, which allow fine-grained control while respecting
abstraction levels. Furthmore, using an imperative language to implement Connections allows
a more direct control of the proof search, e.g. of backtracking, than is possible in the declarative
Prolog programming language (see also [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ]).
        </p>
        <p>The non-classical Connections environments inherit from the classical environment, adding
logic-specific prefixes and prefix unification algorithms. The non-classical provers based on
Connections perform a classical proof search, in which the prefixes of the literals in each
connection are collected. After a classical proof is found, these prefixes are unified by a prefix
unification algorithm to ensure that the classical proof is also a valid non-classical proof.</p>
        <p>
          Besides the basic calculi, the Connections environments implement two additional
wellknown optimizations, significantly reducing the underlying search space while preserving
completeness. The start clause  1/ 2 of the start rule is restricted to positive start clauses (those
without negated literals, which likely represent conjecture clauses in the disjunctive normal
form) and the regularity condition [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] is employed. The translation into a (prefixed) matrix is
done in a pre-processing step. If the problem contains explicit axiom and conjecture formulas,
the standard/naive translation into clausal form is performed for the axiom formulas, while a
definitional translation [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] is performed for the conjecture formula. This approach has shown
the best performance [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ].
        </p>
        <p>The Connections environments are not provers by themselves; they expose an interface that
agents can use to train and make inferences, completing the RL agent-environment interaction
loop, as shown in Figure 5. A prover in this context is an agent making consecutive steps in a
Connections environment until it has found proof or timed out.</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Python Connection Provers for Classic and Non-classical Logics</title>
        <p>The Connections environments can be used to build both learning and non-learning connection
provers, depending on the agent used. For example, by using non-learning agents that always
choose the first available action, we obtain standard Python connection provers in an elegant
and straightforward way—the provers emerge from the interaction between the “always-first”
agent and a Connections environment. The complete Python code implementing such a prover
is shown in Figure 6.
env = ConnectionEnv("problem_path")
observation, info = env.reset()
while True:
action = env.action_space[0] # Always choose first available action
observation, reward, terminated, truncated, info = env.step(action)
if terminated or truncated:</p>
        <p>break</p>
        <p>
          Depending on the environment used (ConnectionEnv, IConnectionEnv or MConnectionEnv)
this results in three (stand-alone) Python provers [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] for classical, intuitionistic and modal
logics, called pyCoP, ipyCoP and mpyCoP respectively. These are based on the same connection
calculi as the leanCoP family of theorem provers implemented in Prolog [
          <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref20 ref21">20, 21, 12, 10, 13, 11</xref>
          ].
By design, the pyCoP provers mimic the classical proof steps of leanCoP 1.0, using the positive
start clause technique and iterative deepening on the size of the active path. However, the
pyCoP provers do not reorder clauses during proof search, and integrate an enhanced regularity
check [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]. This corresponds to version 1.0f of leanCoP [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ].
        </p>
        <p>While the main purpose of the Connections environments is to facilitate easy implementation
of learned provers for classical, intuitionistic, and modal logic using the MDP + agent view of
connection proof search, the pyCoP provers highlight the general (learning and non-learning)
capabilities of the Connections environments and give confidence in the correctness of their
implementation by showing that they can be used to emulate leanCoP, ileanCoP, and MleanCoP.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>We present a Python library providing a framework for ML in connection calculi for classical and
non-classical logics, and with specific emphasis on facilitating experiments using RL to guide
proof search. Aside from its ML-centric component, this also represents the first non-Prolog
implementation of provers based on the clausal non-classical connection calculi, and using
prefix unification to capture the Kripke semantics of intuitionistic and modal first-order logics.</p>
      <p>We are at present using this library to experiment with RL methods in an attempt to improve
the performance of the unmodified provers, and we hope that the library inspires and facilitates
others to explore their own ideas within this space.</p>
      <p>
        In future work, we intend to extend the library to allow us more fully to address restricted
backtracking, refutation techniques, and to include both further modal logics, and non-clausal
methods such as those of nanoCoP [
        <xref ref-type="bibr" rid="ref35 ref36">35, 36</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Constable</surname>
          </string-name>
          , et al.,
          <article-title>Implementing Mathematics with the NuPRL proof development system</article-title>
          , Prentice-Hall, Englewood Clifs, NJ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bertot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castéran</surname>
          </string-name>
          ,
          <article-title>Interactive Theorem Proving and Program Development Coq'Art: The Calculus of Inductive Constructions</article-title>
          , EATCS Series, Springer, Heidelberg,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. van Bentham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          , Handbook of Modal Logic, Elsevier, Amsterdam,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fitting</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Mendelsohn</surname>
          </string-name>
          ,
          <string-name>
            <surname>First-Order Modal</surname>
            <given-names>Logic</given-names>
          </string-name>
          , Kluwer, Dordrecht,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>R. E. Ladner,</surname>
          </string-name>
          <article-title>The computational complexity of provability in systems of modal propositional logic</article-title>
          ,
          <source>SIAM Journal on Computing</source>
          <volume>6</volume>
          (
          <year>1977</year>
          )
          <fpage>467</fpage>
          -
          <lpage>480</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Statman</surname>
          </string-name>
          ,
          <article-title>Intuitionistic propositional logic is polynomial-space complete</article-title>
          ,
          <source>Theoretical Computer Science</source>
          <volume>9</volume>
          (
          <year>1979</year>
          )
          <fpage>67</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Cook</surname>
          </string-name>
          ,
          <article-title>The complexity of theorem-proving procedures</article-title>
          ,
          <source>in: Third Annual ACM Symposium on Theory of Computing</source>
          , ACM, New York,
          <year>1971</year>
          , pp.
          <fpage>151</fpage>
          -
          <lpage>158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Wallen</surname>
          </string-name>
          ,
          <source>Automated Deduction in Non-Classical Logics</source>
          , MIT Press, Cambridge,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Waaler</surname>
          </string-name>
          ,
          <article-title>Connections in nonclassical logics</article-title>
          , in: A.
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Voronkov (Eds.),
          <source>Handbook of Automated Reasoning, Elsevier Science</source>
          , Amsterdam,
          <year>2001</year>
          , pp.
          <fpage>1487</fpage>
          -
          <lpage>1578</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>[10] J. Otten, leanCoP 2.0 and ileanCoP 1</source>
          .
          <article-title>2: High performance lean theorem proving in classical and intuitionistic logic</article-title>
          , in: A.
          <string-name>
            <surname>Armando</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Baumgartner</surname>
          </string-name>
          , G. Dowek (Eds.),
          <source>Automated Reasoning (IJCAR</source>
          <year>2008</year>
          ), volume
          <volume>5195</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , Springer, Heidelberg,
          <year>2008</year>
          , pp.
          <fpage>283</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>MleanCoP: A connection prover for first-order modal logic</article-title>
          , in: S. Demri,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kapur</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          Weidenbach (Eds.),
          <source>Automated Reasoning (IJCAR</source>
          <year>2014</year>
          ), volume
          <volume>8562</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , Springer, Heidelberg,
          <year>2014</year>
          , pp.
          <fpage>269</fpage>
          -
          <lpage>276</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>Clausal connection-based theorem proving in intuitionistic first-order logic</article-title>
          ,
          <source>in: TABLEAUX</source>
          <year>2005</year>
          , volume
          <volume>3702</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , Springer, Heidelberg,
          <year>2005</year>
          , pp.
          <fpage>245</fpage>
          -
          <lpage>261</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>Implementing connection calculi for first-order modal logics</article-title>
          , in: E. Ternovska,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korovin</surname>
          </string-name>
          , S. Schulz (Eds.),
          <source>9th International Workshop on the Implementation of Logics (IWIL</source>
          <year>2012</year>
          ), volume
          <volume>22</volume>
          of EPIC, EasyChair,
          <year>2012</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          , W. Bibel,
          <article-title>Advances in connection-based automated theorem proving</article-title>
          , in: M.
          <string-name>
            <surname>Hinchey</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          <string-name>
            <surname>Bowen</surname>
          </string-name>
          , E.-R. Olderog (Eds.),
          <source>Provably Correct Systems, NASA Monographs in Systems and Software Engineering</source>
          , Springer, Cham,
          <year>2017</year>
          , pp.
          <fpage>211</fpage>
          -
          <lpage>241</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Urban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vyskočil</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. Štěpánek,</surname>
          </string-name>
          <article-title>MaLeCoP Machine Learning Connection Prover</article-title>
          , in: K. Brünnler, G. Metcalfe (Eds.),
          <source>TABLEAUX 2011, Lecture Notes in Computer Science</source>
          , Springer, Berlin, Heidelberg,
          <year>2011</year>
          , pp.
          <fpage>263</fpage>
          -
          <lpage>277</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G.</given-names>
            <surname>Irving</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Alemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Een</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Chollet</surname>
          </string-name>
          , J. Urban, DeepMath
          <article-title>- Deep Sequence Models for Premise Selection</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>29</volume>
          ,
          <string-name>
            <surname>Curran</surname>
            <given-names>Associates</given-names>
          </string-name>
          , Inc.,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Kaliszyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Urban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Michalewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Olšák</surname>
          </string-name>
          ,
          <article-title>Reinforcement Learning of Theorem Proving</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>31</volume>
          ,
          <string-name>
            <surname>Curran</surname>
            <given-names>Associates</given-names>
          </string-name>
          , Inc.,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zombori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Urban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Brown</surname>
          </string-name>
          , Prolog Technology Reinforcement Learning Prover, in: N.
          <string-name>
            <surname>Peltier</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Sofronie-Stokkermans</surname>
          </string-name>
          (Eds.),
          <source>Automated Reasoning (IJCAR</source>
          <year>2020</year>
          ), Lecture Notes in Computer Science, Springer International Publishing, Cham,
          <year>2020</year>
          , pp.
          <fpage>489</fpage>
          -
          <lpage>507</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mangla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Holden</surname>
          </string-name>
          , L. Paulson,
          <article-title>Bayesian ranking for strategy scheduling in automated theorem provers</article-title>
          , in: J.
          <string-name>
            <surname>Blanchette</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Kovács</surname>
          </string-name>
          , D. Pattinson (Eds.),
          <source>Automated Reasoning (IJCAR</source>
          <year>2022</year>
          ), volume
          <volume>13385</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>559</fpage>
          -
          <lpage>577</lpage>
          . 19 pages.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          , W. Bibel,
          <article-title>leanCoP: lean connection-based theorem proving</article-title>
          ,
          <source>Journal of Symbolic Computation</source>
          <volume>36</volume>
          (
          <year>2003</year>
          )
          <fpage>139</fpage>
          -
          <lpage>161</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>Restricting backtracking in connection calculi</article-title>
          ,
          <source>AI Commun</source>
          .
          <volume>23</volume>
          (
          <year>2010</year>
          )
          <fpage>159</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>G.</given-names>
            <surname>Gentzen</surname>
          </string-name>
          ,
          <article-title>Untersuchungen über das Logische Schließen</article-title>
          ,
          <source>Mathematische Zeitschrift</source>
          <volume>39</volume>
          (
          <year>1935</year>
          )
          <fpage>176</fpage>
          -
          <lpage>210</lpage>
          ,
          <fpage>405</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>R. M. Smullyan</surname>
          </string-name>
          ,
          <string-name>
            <surname>First-Order</surname>
            <given-names>Logic</given-names>
          </string-name>
          ,
          <source>Ergebnisse der Mathematik und ihrer Grenzgebiete</source>
          , Springer-Verlag, Berlin, Heidelberg, New York,
          <year>1968</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>Non-clausal connection calculi for non-classical logics</article-title>
          , in: R.
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          Nalon (Eds.),
          <source>TABLEAUX</source>
          <year>2017</year>
          , volume
          <volume>10501</volume>
          <source>of LNAI</source>
          , Springer, Cham,
          <year>2017</year>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Barto</surname>
          </string-name>
          ,
          <source>Reinforcement Learning: An Introduction</source>
          , 2nd edition ed., MIT Press,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>Advancing automated theorem proving for the modal logics D and S5</article-title>
          , in: C.
          <string-name>
            <surname>Benzmüller</surname>
          </string-name>
          , J. Otten (Eds.),
          <source>Automated Reasoning in Quantified Non-Classical Logics (ARQNL</source>
          <year>2022</year>
          ),
          <source>CEUR Workshop Proceedings</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>81</fpage>
          -
          <lpage>91</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rømming</surname>
          </string-name>
          , Connections,
          <year>2023</year>
          . URL: https://github.com/fredrrom/connections.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>G.</given-names>
            <surname>Brockman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Cheung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pettersson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schulman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          , W. Zaremba, OpenAI Gym,
          <year>2016</year>
          . ArXiv:
          <volume>1606</volume>
          .01540 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>R.</given-names>
            <surname>Letz</surname>
          </string-name>
          , G. Stenz,
          <article-title>Model elimination and connection tableau procedures</article-title>
          ,
          <source>in: Handbook of Automated Reasoning</source>
          , Elsevier Science Publishers, Amsterdam,
          <year>2001</year>
          , pp.
          <fpage>2015</fpage>
          -
          <lpage>2112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>E.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Liaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nishihara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Moritz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Stoica</surname>
          </string-name>
          ,
          <article-title>RLlib: Abstractions for Distributed Reinforcement Learning</article-title>
          ,
          <source>in: Proceedings of the 35th International Conference on Machine Learning, PMLR</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>3053</fpage>
          -
          <lpage>3062</lpage>
          . ISSN:
          <fpage>2640</fpage>
          -
          <lpage>3498</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rafin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gleave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kanervisto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ernestus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Dormann</surname>
          </string-name>
          , Stable-baselines3:
          <article-title>Reliable reinforcement learning implementations</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>22</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>A.</given-names>
            <surname>Paszke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Massa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lerer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bradbury</surname>
          </string-name>
          , G. Chanan,
          <string-name>
            <given-names>T.</given-names>
            <surname>Killeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Gimelshein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Antiga</surname>
          </string-name>
          , et al.,
          <article-title>Pytorch: An imperative style, high-performance deep learning library</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>32</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Brevdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Citro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Devin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harp</surname>
          </string-name>
          , G. Irving,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jozefowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kudlur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Levenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mané</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Monga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Murray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Olah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Steiner</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Talwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vasudevan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viégas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Warden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wicke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <source>TensorFlow: Large-scale machine learning on heterogeneous systems</source>
          ,
          <year>2015</year>
          .
          <article-title>Software available from tensorflow</article-title>
          .
          <source>org.</source>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Holden</surname>
          </string-name>
          , Connect++
          <article-title>: A new automated theorem prover based on the connection calculus</article-title>
          ,
          <source>in: Proceedings of the Workshop on Automated Reasoning with Connection Calculi (AReCCa)</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>A non-clausal connection calculus</article-title>
          , in: K. Brünnler, G. Metcalfe (Eds.),
          <source>TABLEAUX</source>
          <year>2011</year>
          , volume
          <volume>6793</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , Springer, Heidelberg,
          <year>2011</year>
          , pp.
          <fpage>226</fpage>
          -
          <lpage>241</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>J.</given-names>
            <surname>Otten</surname>
          </string-name>
          ,
          <article-title>nanoCoP: A non-clausal connection prover</article-title>
          , in: N.
          <string-name>
            <surname>Olivetti</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Tiwari (Eds.),
          <source>Automated Reasoning (IJCAR</source>
          <year>2016</year>
          ), volume
          <volume>9706</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , Springer, Heidelberg,
          <year>2016</year>
          , pp.
          <fpage>302</fpage>
          -
          <lpage>312</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>