<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>German Conference on Artificial Intelligence, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards Explainability of Approximate Lifted Model Construction: A Geometric Perspective</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jan Speller</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Malte Luttermann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcel Gehrke</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tanya Braun</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>German Research Center for Artificial Intelligence (DFKI)</institution>
          ,
          <addr-line>Ratzeburger Allee 160, 23562 Lübeck</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Hamburg, Institute for Humanities-Centered AI (CHAI)</institution>
          ,
          <addr-line>Warburgstraße 28, 20354 Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Münster, Computer Science Department</institution>
          ,
          <addr-line>Einsteinstraße 62, 48149 Münster</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>16</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
<p>Advanced colour passing (ACP) is the state-of-the-art algorithm for lifting a propositional probabilistic model to a first-order level by combining exchangeable factors, enabling the use of lifted inference algorithms to allow for tractable probabilistic inference with respect to domain sizes. More recently, an approximate version of ACP, called ε-ACP, ensures the practical applicability of ACP by accounting for inaccurate estimates of underlying distributions. ε-ACP permits underlying distributions, encoded as potential-based factorisations, to slightly deviate depending on a hyperparameter ε while maintaining a bounded approximation error. To navigate through different levels of compression versus accuracy, a hierarchical version of ε-ACP has emerged that builds a hierarchy of ε values. In a drive towards interpretability of results, this paper looks at geometric properties of ε-equivalence, a central notion employed by ε-ACP and its hierarchical version to quantify the maximum allowed deviation between potentials. Specifically, we present a unified view on the results for ε-ACP and its hierarchical version and provide a geometric interpretation of ε-equivalence in ℒ_p, thereby making results more interpretable.</p>
      </abstract>
      <kwd-group>
<kwd>Lifted Inference</kwd>
        <kwd>Geometric Interpretability</kwd>
        <kwd>Approximate Model</kwd>
        <kwd>Factor Graph</kwd>
        <kwd>Colour Passing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Intelligent systems tasked with modelling their environment have to contend with uncertainty in the world
as well as objects and relations among them. The former is canonically modelled using probabilistic
graphical models, encoding features as random variables (randvars) and relations between them in a
factorised probability distribution. The latter is often represented using some form of higher-order logic,
describing a world through hard constraints on objects and their relations. Combining both perspectives
has led to various probabilistic relational formalisms: Parametric factor graphs (PFGs) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] emerge from
the uncertainty perspective with logical constructs added to model objects and relations among them
under uncertainty. Markov logic networks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] originate from the logic perspective with weights added to
the constraints to denote how likely they are to hold. Both formalisms follow grounding semantics [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ],
yielding a full joint probability distribution over indistinguishable grounded (propositional) randvars.
Over the years, many researchers have focused on developing efficient inference algorithms in such
models, exploiting these first-order structures under the name of lifting [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for episodic reasoning
[
        <xref ref-type="bibr" rid="ref10 ref5 ref6 ref7 ref8 ref9">5, 6, 7, 8, 9, 10</xref>
        ], queries [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ], evidence [
        <xref ref-type="bibr" rid="ref13 ref9">9, 13</xref>
        ], temporal reasoning [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ], and decision making
[
        <xref ref-type="bibr" rid="ref16 ref17 ref18 ref19 ref20 ref21 ref22">16, 17, 18, 19, 20, 21, 22</xref>
        ], allowing for tractable inference in the number of objects [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>
        While these works document the impressive progress made over the years, they usually
assume a first-order model lifted from the ground level to start with. How to lift a propositional
model has been a more overlooked research question. Colour Passing (CP) is the state-of-the-art
algorithm to turn a propositional model into a first-order one, specifically lifting a factor graph (FG)
to a PFG, grouping factors with identical potentials [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], based on the Weisfeiler-Leman algorithm for
graphs [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. CP has been revisited over ten years later to further optimise the algorithm, considering
commutativity among the randvars in factors [
        <xref ref-type="bibr" rid="ref25 ref26 ref27">25, 26, 27</xref>
        ] as well as scaling of potentials [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. To deal
with the fact that identical potentials are highly unlikely to emerge naturally in a ground FG, even if the
underlying objects behave almost indistinguishably, -Advanced Colour Passing (-ACP) constitutes an
approximate extension that groups factors whose potentials difer by a factor of at most (1 ± ), with 
as a hyperparameter [
        <xref ref-type="bibr" rid="ref29">29</xref>
] . In a step towards making the algorithm hyperparameter-free, Hierarchical
Advanced Colour Passing (HACP) proceeds to build a hierarchy of ε values that lead to increasingly
more compact encodings at the expense of a higher approximation error [30], while keeping previous
compressions intact to improve explainability.
      </p>
      <p>This paper takes the most recent advances in approximating Advanced Colour Passing (ACP) and
builds toward making the resulting compressed representation more interpretable by considering a
geometric perspective. Specifically, this paper contributes
(i) a unified overview of the results for ε-ACP and HACP,
(ii) proofs of novel properties of ε-equivalence in ℒ_p,
(iii) a discussion of their value for interpretability, and
(iv) a glimpse of potential advances for approximate lifted model construction.</p>
      <p>
        The remainder of this paper is structured as follows. We start with basic definitions and a recap of the results for ε-ACP and HACP. The main part then presents the geometric view of ε-equivalence in the ℒ_p space, including a closer look at the special case of the Euclidean space. The paper ends with a discussion and conclusion about the potential benefits of the new view and the resulting possibilities for further generalisations of ACP and its approximate variants.
2. Factor Graphs and Approximate Colour Passing
To establish a formal foundation for lifted probabilistic inference under approximation, we use the same notation and definitions as Luttermann et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] and its continuation [30]. The following sequence introduces FGs and probabilistic queries, and then progresses towards the central notion of ε-equivalent factors, which builds the backbone of the initial approximate lifted model construction approaches (i.e., ε-ACP and HACP) and the foundation of the geometric perspective developed in Section 3. First, properties of the introduced concepts are presented, followed by a summary of the algorithms ε-ACP and HACP for approximate lifted model construction and their asymptotic behaviour in Section 2.1 and Section 2.2.
Definition 1 (Factor graph [29, Def. 1]). An FG G = (V, E) is an undirected bipartite graph consisting of a node set V = R ∪ Φ, where R = {R_1, . . . , R_n} is a set of randvars and Φ = {φ_1, . . . , φ_m} is a set of factors (functions), as well as a set of edges E ⊆ R × Φ. There is an edge between a randvar R ∈ R and a factor φ ∈ Φ in E if R appears in the argument list of φ. A factor φ(R_φ) defines a function φ: ×_{R ∈ R_φ} range(R) → ℝ_{>0} that maps the ranges of its arguments R_φ (a sequence of randvars from R) to a positive real number, called potential. The term range(R) denotes the possible values a randvar R can take. We further define the joint potential for an assignment v (with v being a shorthand for V = v) as given in Eq. (1), where v_i is the projection of v to the argument list of φ_i. With Z = Σ_v ∏_{i=1}^{m} φ_i(v_i) as the normalisation constant, the full joint probability distribution encoded by G is then given by Eq. (2):
      </p>
      <p>φ_G(v) = ∏_{i=1}^{m} φ_i(v_i),   (1)</p>
      <p>P_G(v) = (1/Z) · ∏_{i=1}^{m} φ_i(v_i) = (1/Z) · φ_G(v).   (2)</p>
      <p>[Figure 1: An FG with randvars A, B, C and factors φ_1(A, B) and φ_2(B, C), shown together with the potential tables of φ_1 and φ_2, which map the assignments (true, true), (true, false), (false, true), and (false, false) to the potentials ϕ_1, ϕ_2, ϕ_3, and ϕ_4, respectively.]</p>
      <p>Example 1 (Factor Graph). Take a look at the FG depicted in Fig. 1. For the sake of the example, it holds that R = {A, B, C}, Φ = {φ_1, φ_2}, E = {{A, φ_1}, {B, φ_1}, {B, φ_2}, {C, φ_2}}, and range(A) = range(B) = range(C) = {true, false}. The potential tables (i.e., function definitions) of the factors φ_1 and φ_2 are shown next to the graph on the right. Specifically, it holds that φ_1(true, true) = ϕ_1, φ_1(true, false) = ϕ_2, and so on, where ϕ_j ∈ ℝ_{>0}, j = 1, . . . , 4, are arbitrary positive real numbers.</p>
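      <p>To make Eqs. (1) and (2) concrete, the following minimal sketch (not the authors' code; the concrete potential values are illustrative assumptions) builds the FG of Example 1 and computes the normalised joint distribution:</p>
      <p>
```python
import itertools

# Minimal sketch (not the authors' code) of the FG from Example 1:
# randvars A, B, C with range {True, False} and factors phi1(A, B), phi2(B, C).
# The concrete potential values are illustrative assumptions.
phi1 = {(True, True): 1.0, (True, False): 2.0, (False, True): 3.0, (False, False): 4.0}
phi2 = {(True, True): 1.0, (True, False): 2.0, (False, True): 3.0, (False, False): 4.0}

def joint_potential(a, b, c):
    """Eq. (1): the product of all factor potentials for the assignment (a, b, c)."""
    return phi1[(a, b)] * phi2[(b, c)]

# Eq. (2): Z sums the joint potential over all assignments, normalising it to P_G.
Z = sum(joint_potential(a, b, c)
        for a, b, c in itertools.product([True, False], repeat=3))

def prob(a, b, c):
    return joint_potential(a, b, c) / Z

total = sum(prob(*v) for v in itertools.product([True, False], repeat=3))
print(round(total, 10))  # 1.0 -- a properly normalised distribution
```
      </p>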
      <p>An FG enables an efficient representation of complex distributions by decomposing them into factors over subsets of variables. Each factor captures an aspect of the joint distribution, giving rise to natural queries that we define next to determine the objectives of probabilistic inference.
Definition 2 (Query [29, Def. 2]). A query P(Q | E_1 = e_1, . . . , E_k = e_k) consists of a query term Q and a set of events {E_j = e_j}_{j=1}^{k}, where Q and all E_j, j = 1, . . . , k, are randvars. To query a specific probability instead of a distribution, the query term is an event Q = q.</p>
      <p>Example 2 (Query). The query P(A | B = true) asks for the probability distribution of A given that the event B = true is observed.</p>
      <p>
        To enable approximate inference and structure compression, the notion of ε-equivalence has been
introduced by Luttermann et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] . ε-equivalence relaxes exact equality between potentials and
enables coarser, more efficient representations under bounded deviation.
      </p>
      <p>Definition 3 (ε-Equivalence [29, Def. 3]). Let ε ∈ ℝ_{>0} be a positive real number. Two potentials ϕ_1, ϕ_2 ∈ ℝ_{>0} are ε-equivalent, denoted as ϕ_1 =_ε ϕ_2, if ϕ_1 ∈ [ϕ_2 · (1 − ε), ϕ_2 · (1 + ε)] and ϕ_2 ∈ [ϕ_1 · (1 − ε), ϕ_1 · (1 + ε)]. Further, two factors φ_1(R_1, . . . , R_n) and φ_2(R′_1, . . . , R′_n) are ε-equivalent, denoted as φ_1 =_ε φ_2, if there exists a permutation π of {1, . . . , n} such that for all assignments (r_1, . . . , r_n) ∈ ×_{i=1}^{n} range(R_i), where φ_1(r_1, . . . , r_n) = ϕ_1 and φ_2(r_{π(1)}, . . . , r_{π(n)}) = ϕ_2, it holds that ϕ_1 =_ε ϕ_2.</p>
      <p>Example 3 (ε-Equivalence). Consider the potentials ϕ_1 = 0.49 and ϕ_2 = 0.5, and let ε = 0.1. Due to ϕ_2 = 0.5 ∈ [ϕ_1 · (1 − ε) = 0.441, ϕ_1 · (1 + ε) = 0.539] and ϕ_1 = 0.49 ∈ [ϕ_2 · (1 − ε) = 0.45, ϕ_2 · (1 + ε) = 0.55], it holds that ϕ_1 and ϕ_2 are ε-equivalent (for ε = 0.1).</p>
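      <p>The mutual interval check of Definition 3 can be sketched for single potentials as follows (an illustrative sketch, not the authors' implementation):</p>
      <p>
```python
def eps_equivalent(p1, p2, eps):
    """Definition 3 for single potentials: mutual containment in the (1 ± eps) intervals."""
    return (p2 * (1 + eps) >= p1 >= p2 * (1 - eps)
            and p1 * (1 + eps) >= p2 >= p1 * (1 - eps))

# Example 3: 0.49 and 0.5 are eps-equivalent for eps = 0.1 ...
print(eps_equivalent(0.49, 0.5, 0.1))   # True
# ... but not for a much tighter eps.
print(eps_equivalent(0.49, 0.5, 0.01))  # False
```
      </p>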
      <p>
        In general, it might happen that indistinguishable randvars are located at different positions in the
argument list of their respective factors, which is the reason the definition of ε-equivalence involves
permutations of arguments. For simplicity, in this paper, we stipulate that π is the identity function
(that is, we assume that for two ε-equivalent factors, all potentials in their potential tables are row-wise
ε-equivalent). However, all results of this paper also apply to any other choice of π [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>While the above definition only indirectly captures the smallest possible value ε_0 for which ε-equivalence holds, Speller et al. [30] introduce easier access to the concept via a vector-based distance measure called the one-dimensional ε-equivalence distance (1DEED), which enables us to quantify deviations pairwise between factors and decide ε-equivalence in a scalable and computationally more accessible manner. In the following, we denote the potential table of a factor φ as a vector in ℝ^n_{>0}, where φ(i) denotes the i-th entry, i.e., the potential associated with the i-th row, in the potential table of φ.</p>
      <p>Definition 4 (One-dimensional ε-equivalence distance [30, Def. 4]). 1DEED, defined as the mapping d_∞: ℝ^n_{>0} × ℝ^n_{>0} → ℝ for two n-dimensional vectors φ_1, φ_2 ∈ ℝ^n_{>0}, is given by
d_∞(φ_1, φ_2) := max_{i=1,...,n} { |φ_1(i) − φ_2(i)| / min{|φ_1(i)|, |φ_2(i)|} }.   (3)</p>
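      <p>A direct transcription of 1DEED (a sketch using the notation above, not the authors' code) makes the ε-equivalence decision of Theorem 1 a one-liner:</p>
      <p>
```python
def d_inf(phi1, phi2):
    """1DEED (Def. 4): the largest entry-wise relative deviation,
    |phi1(i) - phi2(i)| divided by min(|phi1(i)|, |phi2(i)|)."""
    return max(abs(a - b) / min(abs(a), abs(b)) for a, b in zip(phi1, phi2))

# Theorem 1: the factors are eps-equivalent iff d_inf is at most eps,
# so d_inf is also the smallest eps for which equivalence holds.
phi1 = [0.49, 1.0, 2.0]
phi2 = [0.50, 1.05, 1.9]
print(0.1 >= d_inf(phi1, phi2))  # True: eps-equivalent for eps = 0.1
```
      </p>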
      <sec id="sec-1-1">
        <title>Re-Expressing ε-Equivalence via 1DEED</title>
        <p>With Def. 4, the concept of ε-equivalence can be re-expressed in terms of 1DEED, which Speller et al. [30] demonstrate to be equivalent to the original definition of ε-equivalence (Def. 3) via the upcoming theorem.</p>
        <p>Theorem 1 ([30, Thm. 2]). Two vectors φ_1, φ_2 ∈ ℝ^n_{>0} are ε-equivalent (Definition 3) if and only if d_∞(φ_1, φ_2) ≤ ε holds.</p>
        <p>The next lemma provides a brief overview of the main characteristics of 1DEED and ε-equivalence.</p>
        <p>Lemma 2 ([30, Cor. 1, Prop. 9]). The following properties hold.
(i) 1DEED is non-negative and symmetric.
(ii) It holds that d_∞(φ_1, φ_2) = 0 if and only if |φ_1(i) − φ_2(i)| = 0 for all i = 1, . . . , n, which holds if and only if φ_1 = φ_2.
(iii) ε-equivalence is not transitive.</p>
        <p>Since transitivity of ε-equivalence does not hold (Lemma 2, iii), ε-equivalence cannot be considered an equivalence relation and thus is not suitable for defining equivalence classes. This is the main reason why the consideration of a hierarchical variant makes sense, because there is not necessarily a unique ordering of groups for all possible ε values.</p>
        <p>
          For more comprehensive examples of the defined concepts and the proofs of their first properties, we refer the reader to the detailed descriptions given by Luttermann et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] and Speller et al. [30].
        </p>
        <sec id="sec-1-1-1">
          <title>2.1. The Algorithms ε-ACP and HACP</title>
          <p>
            Both the ε-ACP algorithm and the HACP algorithm are primarily based on the ACP algorithm [
            <xref ref-type="bibr" rid="ref25">25</xref>
            ]. The ACP algorithm aims to identify symmetries within an FG, specifically those arising from exactly matching factors (that is, factors whose potentials are strictly equal). In this context, scalar multiples of factors can also be treated as equivalent, as a rescaling to the same magnitude can be achieved via normalisation [31].
          </p>
        </sec>
      </sec>
      <sec id="sec-1-2">
        <title>Approximating Model Construction with ε-Equivalence</title>
        <p>In case the number of detected symmetries is insufficient or an FG with potentials estimated from data is given, for which scalability is desired, the concept of ε-equivalence (Def. 3 and Def. 4) has been introduced to approximate the construction of a lifted (compressed) model. The concept of ε-equivalence forms the foundation for the development of the ε-ACP algorithm.</p>
        <p>Given ε ≥ 0, groups g = {φ_1, . . . , φ_k} of pairwise ε-equivalent factors φ_j ∈ ℝ^n_+ are determined.</p>
        <p>A representative factor is chosen for every group g by computing the optimal solution that minimises the loss function between the representative and every factor in g, where the loss function between two factors φ_i and φ_j is defined as the sum of squared deviations
d(φ_i, φ_j) = Σ_{r_1,...,r_n} (φ_i(r_1, . . . , r_n) − φ_j(r_1, . . . , r_n))^2,   (4)
with r_1, . . . , r_n denoting all possible assignments to the arguments of φ_i and φ_j. The representative φ*_g of each group is chosen as
φ*_g = arg min_φ Σ_{φ_j ∈ g} d(φ, φ_j),   (5)
to minimise the total deviation between φ*_g and the group. All φ ∈ g are replaced by the representative φ*_g, which is computed as the arithmetic mean over all factors in the group g [29, Thm. 1]:
φ*_g(r) = (1/k) · Σ_{j=1}^{k} φ_j(r).   (6)</p>
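        <p>The grouping and representative-selection steps can be sketched as follows. This is a simplified, greedy illustration under the definitions above, not the exact ε-ACP procedure:</p>
        <p>
```python
def d_inf(phi1, phi2):
    # 1DEED from Def. 4; Theorem 1 turns it into an eps-equivalence test.
    return max(abs(a - b) / min(abs(a), abs(b)) for a, b in zip(phi1, phi2))

def greedy_groups(factors, eps):
    """Greedy sketch of the grouping step (illustrative, not the exact eps-ACP
    procedure): put each factor into the first group it is pairwise
    eps-equivalent with, otherwise open a new group."""
    groups = []
    for phi in factors:
        for g in groups:
            if all(eps >= d_inf(phi, other) for other in g):
                g.append(phi)
                break
        else:
            groups.append([phi])
    return groups

def representative(group):
    """Entry-wise arithmetic mean over the group, cf. Eq. (6) and [29, Thm. 1]."""
    k = len(group)
    return [sum(phi[i] for phi in group) / k for i in range(len(group[0]))]

factors = [[1.0, 2.0], [1.05, 1.9], [5.0, 5.0]]
groups = greedy_groups(factors, eps=0.1)
print(len(groups))  # 2: the first two factors share a group
print([round(x, 6) for x in representative(groups[0])])  # [1.025, 1.95]
```
        </p>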
        <p>
          Replacing a group g of factors by a single representative significantly reduces the storage requirements
and thus also drastically speeds up run times for probabilistic inference depending on the size of g,
provided sufficient symmetries are present. For more details, see the work by Luttermann et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ].
        </p>
        <p>Building on the sorted groups of pairwise ε-equivalent factors, a hierarchical algorithm (HACP) has been introduced [30, Alg. 2] to ensure a consistent comparison of various approximate compressed FGs and to guide the choice of suitable ε values. To this end, an ordering of group memberships is first established based on 1DEED [30, Alg. 1], which can be interpreted as an agglomerative clustering algorithm using 1DEED as a base distance. Only completely pairwise ε-equivalent groups are allowed to be merged (complete linkage within maximal deviation). The design of the implementation of the ε-ACP algorithm and the HACP algorithm guarantees the following useful properties.
Corollary 3 ([29, Cor. 2]). If ε = 0, ε-ACP returns the same PFG as ACP.</p>
        <p>Corollary 4. If ε = 0, HACP returns the same PFG as ACP.</p>
        <p>Proof. Follows directly from [30, Prop. 4], with ε = 0.
2.2. Asymptotic Properties
A key advantage of the HACP algorithm lies in its ability to retain the same upper bound on the deviation of probabilistic queries as established for ε-ACP. For a fixed ε > 0, ε-ACP ensures that the deviation of any query result in the approximate compressed FG from the original distribution remains within a boundary. HACP, although introducing a hierarchical structure, builds upon the same guarantee. To quantify the change in query results between the original and modified (compressed) model, we adopt the symmetric divergence measure introduced by Chan and Darwiche [32], which provides a tight bound on the maximal deviation between two distributions P_G and P_{G′} with respect to any assignment v via
D(P_G, P_{G′}) := ln max_v (P_{G′}(v) / P_G(v)) − ln min_v (P_{G′}(v) / P_G(v)).   (7)</p>
        <p>The next results are proven asymptotic properties for the ε-ACP and HACP algorithms, shown by</p>
      </sec>
      <sec id="sec-1-3">
        <title>Luttermann et al. [29] and Speller et al. [30], respectively.</title>
        <p>Theorem 5 ([29, Thm. 7], [30, Prop. 4]). Let G = (R ∪ Φ, E) be an FG and let G′ be the output of ε-ACP or the output of HACP when run on G. With P_G and P_{G′} being the underlying full joint probability distributions encoded by G and G′, respectively, and m = |Φ|, it holds that
D(P_G, P_{G′}) ≤ ln( ((1 + ε) / (1 + ε)^{−1})^m ) = ln((1 + ε)^{2m}),   (8)
where the bound given in Eq. (8) is optimal (i.e., sharp).</p>
        <p>Even though the worst-case bound can be attained in constructed examples, such cases are rare in practice. Nonetheless, the need to control this bound persists, as the selection of a specific ε for either algorithm can be guided by a prior assessment of the maximum permissible deviation. This enables a more informed decision before executing the actual lifted probabilistic inference algorithm and, consequently, before executing the ε-ACP or HACP algorithm.</p>
        <p>Theorem 6 ([30, Thm. 5]). The maximal absolute deviation between any initial probability p = P_G(r | e) of r given e in model G and the probability p′ = P_{G′}(r | e) in the modified model G′ resulting from running ε-ACP or HACP on G can be bounded by</p>
        <p>max Δ := max_{r, e} |p − p′| ≤ (√(e^d) − 1) / (√(e^d) + 1),   (9)
with d = D(P_G, P_{G′}).</p>
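        <p>The deviation bound of Theorem 6 is straightforward to evaluate numerically; a minimal sketch (our transcription of the Chan and Darwiche [32] bound, not the authors' code):</p>
        <p>
```python
import math

def max_query_deviation(d):
    """Theorem 6 (Chan and Darwiche [32]): given the divergence d = D(P, P'),
    any conditional probability can shift by at most (sqrt(e^d) - 1)/(sqrt(e^d) + 1)."""
    s = math.sqrt(math.exp(d))
    return (s - 1.0) / (s + 1.0)

print(max_query_deviation(0.0))                              # 0.0
print(max_query_deviation(0.5) > max_query_deviation(0.25))  # True: monotone in d
```
        </p>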
        <p>Due to the monotonicity of (√(e^d) − 1)/(√(e^d) + 1) in d, Theorem 6 can easily be converted into the next corollary.
Corollary 7 ([30, Cor. 6]). The change in any probabilistic query in an initial model G and a modified model G′ obtained by running ε-ACP or HACP with ε ∈ (0, 1) is bounded by
max Δ ≤ (√(e^{d_1}) − 1) / (√(e^{d_1}) + 1) with d_1 = D(P_G, P_{G′}),
which is smaller or equal to
(√(e^{d_2}) − 1) / (√(e^{d_2}) + 1) with d_2 = ln((1 + ε)^{2m}).   (10)</p>
        <p>While we have calculated the deviation that follows from a specific choice of ε so far, we can also calculate how large ε can be chosen to obtain a maximal deviation of Δ*_max that we want to allow.
Theorem 8 ([30, Thm. 7]). For any given Δ*_max ∈ (0, 1/2], there is a closed-form value ε_1(Δ*_max), depending only on Δ*_max and the number of factors m [30, Thm. 7], such that running ε-ACP or HACP with any ε ≤ ε_1(Δ*_max) guarantees the bound max Δ ≤ Δ*_max.
3. Geometric View of ε-Equivalence</p>
        <p>Therefore, a calculation of ε_1(Δ*_max) bounds the maximal deviation of ε-ACP and HACP, respectively, before the algorithm has been started.</p>
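        <p>One way to choose ε before running either algorithm is to invert a worst-case bound of the form ((1 + ε)^m − 1)/((1 + ε)^m + 1) ≤ Δ*_max. The sketch below assumes this form (our reading of Corollary 7) and is not the exact closed form of [30, Thm. 7]:</p>
        <p>
```python
def eps_for_max_deviation(delta_max, m):
    """Solve ((1+eps)^m - 1)/((1+eps)^m + 1) == delta_max for eps.
    Assumes the worst-case bound has this form (our reading of Corollary 7);
    the exact closed form of [30, Thm. 7] may differ."""
    assert delta_max > 0 and 0.5 >= delta_max and m >= 1
    return ((1 + delta_max) / (1 - delta_max)) ** (1.0 / m) - 1.0

def max_dev_bound(eps, m):
    t = (1 + eps) ** m
    return (t - 1.0) / (t + 1.0)

eps1 = eps_for_max_deviation(0.1, m=4)
print(0.1 + 1e-12 >= max_dev_bound(eps1, 4))  # True: the target deviation is respected
```
        </p>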
        <p>Although ε-equivalence works well in practice, it introduces a conceptual barrier due to its reduction of the relation between two factors to a single relative distance via 1DEED. This section aims to enhance the interpretability of the concept and to enable potential generalisations for ε-ACP and HACP. We investigate the geometric implications of ε-equivalence and the behaviour of pairwise ε-equivalent groups. All factors are assumed to lie in ℝ^n_+, which corresponds to an orthant of the Euclidean space. However, to allow for a more general setting, we consider the general ℒ_p norm in Section 3.1 rather than sticking to the Euclidean setting.</p>
        <p>Before we start, we summarise the proven results on the relative ordering of lengths directly derived from the original definition of ε-equivalence (Def. 3). Two ε-equivalent factors may deviate maximally in the positive and negative direction, respectively, and due to symmetry, the expression (1 − ε) can more precisely be replaced by the expression 1/(1 + ε).</p>
        <p>Lemma 9 ([29, Lem. 6]). For two ε-equivalent factors φ_1, φ_2 ∈ ℝ^n_+, it holds that φ_1(i) ∈ [φ_2(i) · 1/(1 + ε), φ_2(i) · (1 + ε)] and φ_2(i) ∈ [φ_1(i) · 1/(1 + ε), φ_1(i) · (1 + ε)] for i = 1, . . . , n.</p>
      </sec>
      <sec id="sec-1-4">
        <title>From Lemma 9, we derive a corollary reformulating the same statement in terms of quotients.</title>
        <p>Corollary 10. If φ_1 =_ε φ_2 holds for two factors, then it also holds that
φ_1(i) / φ_2(i) ∈ [1/(1 + ε), 1 + ε] for all i ∈ {1, . . . , n}.   (11)</p>
        <p>Corollary 10 implies that the length deviations per dimension remain relatively small, and that deviations can be interpreted around the multiplicative identity 1.</p>
        <p>Considering distances between two ε-equivalent, normalised factors yields the following result.
Lemma 11. Let φ_1, φ_2 be two factors in ℝ^n_+ ∩ S^{n−1}_p with S^{n−1}_p := {x ∈ ℝ^n : ‖x‖_p = 1} being the unit (n − 1)-sphere in ℒ_p for p ≥ 1. If φ_1 =_ε φ_2 holds, then we have
(i) ‖φ_1 − φ_2‖_p ≤ ε, and
(ii) φ_{3−j} ∈ B_ε(φ_j) ∩ S^{n−1}_p for j = 1, 2, with B_ε(x) := {y ∈ ℝ^n : ‖y − x‖_p ≤ ε} being the closed p-ball with centre x and radius ε > 0.</p>
        <p>Proof. By definition of ε-equivalence, we get for (i) the property
φ_1(i) ∈ [(1 − ε) · φ_2(i), (1 + ε) · φ_2(i)] for all i = 1, . . . , n,   (12)
which is equivalent to
φ_1(i) − φ_2(i) ∈ [−ε · φ_2(i), ε · φ_2(i)] for all i = 1, . . . , n,   (13)
or in shorter form
|φ_1(i) − φ_2(i)| ≤ ε · φ_2(i) for all i = 1, . . . , n.   (14)
Substituting this into the ℒ_p norm yields
‖φ_1 − φ_2‖_p^p = Σ_{i=1}^{n} |φ_1(i) − φ_2(i)|^p ≤ Σ_{i=1}^{n} ε^p · |φ_2(i)|^p = ε^p · ‖φ_2‖_p^p,   (15)
and thus, taking the p-th root,
‖φ_1 − φ_2‖_p ≤ ε · ‖φ_2‖_p.   (16)
Taking into account the normalisation ‖φ_2‖_p = 1, we obtain the desired result ‖φ_1 − φ_2‖_p ≤ ε.</p>
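        <p>Lemma 11 can be checked numerically; the following sketch (with illustrative vectors, not from the paper) normalises two ε-equivalent factors onto the unit p-sphere and confirms property (i):</p>
        <p>
```python
def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def normalise(v, p):
    n = lp_norm(v, p)
    return [x / n for x in v]

def d_inf(phi1, phi2):
    # 1DEED; by Theorem 1, d_inf at most eps means eps-equivalence.
    return max(abs(a - b) / min(abs(a), abs(b)) for a, b in zip(phi1, phi2))

# Two eps-equivalent factors normalised onto the unit p-sphere (illustrative values).
p, eps = 2, 0.1
phi1 = normalise([1.0, 2.0, 3.0], p)
phi2 = normalise([1.05, 1.9, 3.1], p)
print(eps >= d_inf(phi1, phi2))  # True: still eps-equivalent after normalisation
diff_norm = lp_norm([a - b for a, b in zip(phi1, phi2)], p)
print(eps >= diff_norm)          # True: Lemma 11 (i)
```
        </p>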
      </sec>
      <sec id="sec-1-5">
        <title>Property (ii) follows directly from property (i). The previous proof also contains a result for non-normalised factors, presented in the next corollary.</title>
        <p>Corollary 12. Let φ_1, φ_2 be two ε-equivalent factors in ℝ^n_+. Then, ‖φ_1 − φ_2‖_p ≤ ε · ‖φ_j‖_p holds for j = 1, 2.</p>
        <p>Proof. Follows from Eq. (16). By symmetry, the claim holds for j = 2 and also for j = 1.</p>
        <p>In contrast, when viewing the problem from the opposite direction, we can only establish a relatively weak statement about ε-equivalence.</p>
        <p>Lemma 13. Let φ_1, φ_2 be two factors in ℝ^n_+. If ‖φ_1 − φ_2‖_p ≤ ε, then φ_1 =_{ε′} φ_2 holds with ε′ := ε / min_{i=1,...,n, j=1,2} {φ_j(i)}.</p>
      </sec>
      <sec id="sec-1-7">
        <title>Proof.</title>
        <p>Consider ‖φ_1 − φ_2‖_p ≤ ε,   (17)
which raised to the p-th power yields
Σ_{i=1}^{n} |φ_1(i) − φ_2(i)|^p = ‖φ_1 − φ_2‖_p^p ≤ ε^p.   (18)
Since |φ_1(i) − φ_2(i)|^p ≥ 0 for all i, Eq. (18) also holds for every i individually: |φ_1(i) − φ_2(i)| ≤ ε. Therefore, we get
φ_1(i) ≤ ε + φ_2(i) = (1 + ε/φ_2(i)) · φ_2(i)   (19)
and
φ_1(i) ≥ −ε + φ_2(i) = (1 − ε/φ_2(i)) · φ_2(i),
which implies φ_1(i) ∈ [(1 − ε/φ_2(i)) · φ_2(i), (1 + ε/φ_2(i)) · φ_2(i)]. Analogously, we get the opposite inequality and interval inclusion for φ_2(i), which is given by φ_2(i) ∈ [(1 − ε/φ_1(i)) · φ_1(i), (1 + ε/φ_1(i)) · φ_1(i)]. To guarantee ε′-equivalence with a fixed ε′ for j = 1, 2 and for i = 1, . . . , n, we need to choose the largest ε/φ_j(i) value for ε′. It is easy to see that the broadest interval is generated by the smallest denominator among all possibilities, which is given by min_{i=1,...,n, j=1,2} {φ_j(i)}.</p>
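        <p>Lemma 13's weakness shows up numerically: the ε′ guaranteed by a small norm distance can be large when some entry is small. A sketch using the vectors of Example 4:</p>
        <p>
```python
def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def d_inf(phi1, phi2):
    return max(abs(a - b) / min(abs(a), abs(b)) for a, b in zip(phi1, phi2))

# Example 4's vectors: a tiny L_p distance, yet a large smallest eps.
phi1, phi2, eps, p = [0.01, 1.0], [0.02, 1.0], 0.01, 2
print(round(lp_norm([a - b for a, b in zip(phi1, phi2)], p), 6))  # 0.01, i.e. eps
eps_prime = eps / min(min(phi1), min(phi2))  # Lemma 13's eps'
print(eps_prime)                             # 1.0
print(round(d_inf(phi1, phi2), 6))           # 1.0: equivalence only for eps' = 1
```
        </p>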
        <p>For a given ε &gt; 0, we can construct an example that hits the same bound as mentioned in Lemma 13 independently of the choice of the p-norm, because the worst deviation may occur in a single dimension:</p>
        <p>Example 4. Let p be any value in [1, ∞) and ε = 0.01. Then the ℒ_p norm of the difference of the two factors φ_1 = (0.01, 1.0)^T and φ_2 = (0.02, 1.0)^T is given by
‖φ_1 − φ_2‖_p = (Σ_{i=1}^{2} |φ_1(i) − φ_2(i)|^p)^{1/p} = (0.01^p + 0)^{1/p} = 0.01 = ε
and therefore fulfils the conditions of Lemma 13. Still, it leads to a comparably high value for ε′-equivalence via φ_2(1) = 2.0 · φ_1(1) = (1 + ε′) · φ_1(1), i.e., ε′ = 1. This example can be extended with identical values φ_1(i) = φ_2(i) for all i &gt; 1 to arbitrarily large dimensions n. When normalising the two vectors, ‖φ_j‖_p = 1 for j = 1, 2, the differences in one dimension will affect the others, resulting in an ε′ that depends on ε and the number of dimensions n. Moreover, even normalisation cannot improve upon the result of Lemma 13, as entries of a single dimension i can be arbitrarily small, leading to a large discrepancy between φ_1(i) and φ_2(i) in relative terms.</p>
        <p>3.2. Groups of Pairwise ε-Equivalent Factors</p>
        <p>The current implementation of ε-ACP and HACP uses the arithmetic mean over all factors in a group of pairwise ε-equivalent factors. An important property is that the factor defined as the arithmetic mean over a group of pairwise ε-equivalent factors is itself again ε-equivalent to all factors in the group.</p>
        <p>Lemma 14 ([29, Lem. 5]). Let g = {φ_1, . . . , φ_k} denote a group of pairwise ε-equivalent factors in ℝ^n_+ and let φ*(r) = (1/k) · Σ_{j=1}^{k} φ_j(r) for all assignments r. Then, g* = {φ_1, . . . , φ_k, φ*} is a group of pairwise ε-equivalent factors.</p>
        <p>According to Luttermann et al. [29, Thm. 1], the arithmetic mean is the optimal choice for the squared loss objective used for minimisation within the error function d(φ_i, φ_j) (see Eq. (4)). However, one may wonder whether the arithmetic mean is always the optimal choice for any given task. In principle, any weighted mean can be used as a representative ε-equivalent factor, as shown in the next theorem.</p>
        <p>Theorem 15. Let g = {φ_1, . . . , φ_k} denote a group of pairwise ε-equivalent factors and let φ_ω(r) = Σ_{j=1}^{k} ω_j φ_j(r) be the weighted mean with weights ω_j ≥ 0 and Σ_{j=1}^{k} ω_j = 1 for all assignments r. Then, g* = {φ_1, . . . , φ_k, φ_ω} is a group of pairwise ε-equivalent factors.</p>
        <p>Proof. We show the claim in two directions by proving that φ_ω(r) ∈ [φ_j(r) · (1 − ε), φ_j(r) · (1 + ε)] and φ_j(r) ∈ [φ_ω(r) · (1 − ε), φ_ω(r) · (1 + ε)] hold for any assignment r and φ_j ∈ g.</p>
        <p>For the first direction, let r be an arbitrary assignment and let φ_j ∈ g. As all factors in g are pairwise ε-equivalent, it holds that
φ_j(r) · (1 − ε) ≤ min_{φ∈g} φ(r) = min_{φ∈g} φ(r) · Σ_{i=1}^{k} ω_i = Σ_{i=1}^{k} ω_i · min_{φ∈g} φ(r) ≤ Σ_{i=1}^{k} ω_i · φ_i(r) = φ_ω(r),
and analogously also for the upper bound
φ_j(r) · (1 + ε) ≥ max_{φ∈g} φ(r) = max_{φ∈g} φ(r) · Σ_{i=1}^{k} ω_i = Σ_{i=1}^{k} ω_i · max_{φ∈g} φ(r) ≥ Σ_{i=1}^{k} ω_i · φ_i(r) = φ_ω(r).</p>
        <p>For the second direction, it holds that for any assignment r, every φ_j(r) with φ_j ∈ g is contained in the interval [φ_i(r) · (1 − ε), φ_i(r) · (1 + ε)] for any i ∈ {1, . . . , k}. Therefore, we can get the lower bound for a specific j ∈ {1, . . . , k} and obtain
(1 − ε) · φ_ω(r) = (1 − ε) · Σ_{i=1}^{k} ω_i · φ_i(r) = Σ_{i=1}^{k} ω_i · (1 − ε) · φ_i(r) ≤ Σ_{i=1}^{k} ω_i · φ_j(r) = φ_j(r) · Σ_{i=1}^{k} ω_i = φ_j(r),
as well as the upper bound for a specific j:
(1 + ε) · φ_ω(r) = (1 + ε) · Σ_{i=1}^{k} ω_i · φ_i(r) = Σ_{i=1}^{k} ω_i · (1 + ε) · φ_i(r) ≥ Σ_{i=1}^{k} ω_i · φ_j(r) = φ_j(r).
Therefore, we get for every φ_j the property φ_j(r) ∈ [(1 − ε) · φ_ω(r), (1 + ε) · φ_ω(r)].</p>
        <p>Geometrically, this property can be viewed as follows: the convex hull of a group of pairwise ε-equivalent factors g consists entirely of points that are ε-equivalent to each other:
conv(φ_1, . . . , φ_k) := { Σ_{j=1}^{k} ω_j φ_j | ω_j ≥ 0, Σ_{j=1}^{k} ω_j = 1 }.</p>
        <p>Besides the arithmetic mean, this also includes the geometric mean of the original factors, which can most easily be shown by an appropriate choice of weights.</p>
        <p>Corollary 16. Let g = {φ_1, . . . , φ_k} denote a group of pairwise ε-equivalent factors and let φ_gm(r) := (∏_{j=1}^{k} φ_j(r))^{1/k} be the geometric mean for all assignments r. Then, g* = {φ_1, . . . , φ_k, φ_gm} is a group of pairwise ε-equivalent factors.</p>
        <p>Proof. Since φ_gm(r) ∈ [min_{j=1,...,k} φ_j(r), max_{j=1,...,k} φ_j(r)], we can use
1 ≥ ω_1 := (max_{j=1,...,k} φ_j(r) − φ_gm(r)) / (max_{j=1,...,k} φ_j(r) − min_{j=1,...,k} φ_j(r)) ≥ 0 and
1 ≥ ω_2 := (φ_gm(r) − min_{j=1,...,k} φ_j(r)) / (max_{j=1,...,k} φ_j(r) − min_{j=1,...,k} φ_j(r)) ≥ 0,
which summarises to ω_1 + ω_2 = 1, and with Theorem 15, the proof can be completed:
φ_gm(r) = ω_1 · min_{j=1,...,k} φ_j(r) + ω_2 · max_{j=1,...,k} φ_j(r).</p>
        <p>Note 1. For positive real numbers φ_1(r), . . . , φ_k(r), the geometric mean is always smaller than or equal to the arithmetic mean,
φ_gm(r) = (∏_{j=1}^{k} φ_j(r))^{1/k} ≤ (1/k) · Σ_{j=1}^{k} φ_j(r) = φ*(r),
and more robust against outliers.</p>
        <p>The results shown previously in this subsection required no ℒ_p normalisation. For normalised factors, the weighted average localises on a spherical shell.</p>
        <p>Lemma 17. Let g = {φ_1, . . . , φ_k} denote a group of pairwise ε-equivalent factors with φ_j ∈ S^{n−1}_p for j = 1, . . . , k and let φ_ω(r) = Σ_{j=1}^{k} ω_j φ_j(r) be the weighted mean with weights ω_j ≥ 0 and Σ_{j=1}^{k} ω_j = 1 for all assignments r. Then, the length of φ_ω is bounded by ‖φ_ω‖_p ∈ [1/(1 + ε), 1]. In other words, φ_ω ∈ B_1(0) ∖ B̊_{1/(1+ε)}(0) = {x ∈ ℝ^n : 1/(1 + ε) ≤ ‖x‖_p ≤ 1}.</p>
        <p>Proof. For p ≥ 1 we can use the Jensen inequality for the convex function f(x) := |x|^p (also called
convexity of the ℒ_p norm, [33]) and get
‖φ‖_p^p = ‖∑_{i=1}^{n} w_i φ_i‖_p^p = ∑_a (∑_{i=1}^{n} w_i φ_i(a))^p ≤ ∑_a ∑_{i=1}^{n} w_i φ_i(a)^p = ∑_{i=1}^{n} w_i ∑_a φ_i(a)^p = ∑_{i=1}^{n} w_i ‖φ_i‖_p^p = ∑_{i=1}^{n} w_i = 1.</p>
        <p>By the pairwise ε-equivalence of G and Theorem 15, we get φ(a) ≥ φ_j(a) / (1 + ε) for j = 1, . . . , n,
and hence φ(a) ≥ max_{j=1,...,n}{φ_j(a)} / (1 + ε). We therefore obtain
‖φ‖_p^p = ∑_a φ(a)^p ≥ ∑_a (max_{j=1,...,n}{φ_j(a)})^p / (1 + ε)^p ≥ ∑_a φ_1(a)^p / (1 + ε)^p = 1 / (1 + ε)^p.
The last inequality is a reduction of the maximum to one arbitrary, but specific factor (in this case φ_1),
allowing us to use its unit property in ℒ_p. Taking the p-th root yields the desired lower bound and
completes the proof.</p>
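        <p>Lemma 17 can likewise be checked numerically. In the sketch below (illustrative potentials; empirical_eps is a helper introduced here, not defined in the paper), each factor is normalised to unit ℒ_p length and the weighted mean is verified to lie on the spherical shell between radii 1/(1 + ε) and 1:</p>
        <p>
```python
import numpy as np

def empirical_eps(factors):
    """Smallest eps for which the given factors are pairwise eps-equivalent."""
    return max(np.max(np.abs(fi / fj - 1.0)) for fi in factors for fj in factors)

p = 2  # Lemma 17 holds for any p >= 1
rng = np.random.default_rng(2)
base = rng.uniform(1.0, 2.0, size=8)
raw = [base * rng.uniform(0.97, 1.03, size=8) for _ in range(4)]
factors = [f / np.linalg.norm(f, ord=p) for f in raw]  # unit length in L_p

eps = empirical_eps(factors)
w = rng.dirichlet(np.ones(4))
mean = sum(wi * fi for wi, fi in zip(w, factors))
norm = np.linalg.norm(mean, ord=p)
# Lemma 17: the weighted mean lies between radius 1/(1+eps) and radius 1
assert norm >= 1.0 / (1.0 + eps) and 1.0 + 1e-12 >= norm
```
        </p>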
        <p>Consequently, we can naturally combine the idea of the convex hull from Theorem 15 with the
normalisation from Lemma 17 to obtain an intuitive visual perspective. Given a set of factors φ_1, . . . , φ_n ∈ R^m_+,
we normalise them to ensure fair starting conditions such that φ_i^norm := φ_i / ‖φ_i‖_p ∈ R^m_+ ∩ S_p^{m−1}, i.e.,
each lies on the positive orthant of the ℒ_p unit sphere. If they are pairwise ε-equivalent, then any
weighted mean in their convex hull remains pairwise ε-equivalent to them. Additionally, the ℒ_p norm
of any such mean lies between 1/(1 + ε) and 1. However, this norm is in general strictly less than 1 (except
for boundary factors), implying that the normalised property is not preserved under convex
combinations. This leads to the subsequent observation, which prevents the geometric interpretation
from being further simplified to a convex cone.</p>
        <p>Note 2. Let C := cone(G) = {x ∈ R^m_+ : x = ∑_{i=1}^{n} w_i φ_i with w_i ≥ 0, φ_i ∈ G} be the finitely
generated convex cone from a group of factors G = {φ_1, . . . , φ_n} whose normalised set G^norm :=
{φ_i^norm = φ_i / ‖φ_i‖, i = 1, . . . , n} is pairwise ε-equivalent. Then C contains normalised factors ψ^norm ∈ C with
‖ψ^norm‖ = 1 that are not pairwise ε-equivalent to all elements of G^norm. Formally, there exists ψ^norm ∈ C
with ‖ψ^norm‖ = 1 such that for some i ∈ {1, . . . , n}, ψ^norm is not ε-equivalent to φ_i^norm. Example 5 provides a
counterexample that invalidates this geometric interpretation in general.</p>
        <p>Example 5. When normalising any factor within the generated cone, it is not necessarily ε-equivalent to
the original generating normalised factors. This is illustrated by the following example with three factors φ_1, φ_2, φ_3.
We begin by determining the ε-equivalence of the original factors:</p>
        <p>d_∞(φ_1, φ_2) = 0.1 = d_∞(φ_1, φ_3) and d_∞(φ_2, φ_3) = 0.05.</p>
        <p>After normalisation, for instance using the ℒ_2 norm via φ_i^norm := φ_i / ‖φ_i‖_2 for all i, we obtain:
d_∞(φ_1^norm, φ_2^norm) = 0.07244888 = d_∞(φ_1^norm, φ_3^norm) and d_∞(φ_2^norm, φ_3^norm) = 0.05.
Due to Def. 4, we have pairwise ε_0-equivalence with ε_0 := 0.07244888
for this triple of normalised factors. For the convex combination ψ := 0 · φ_1^norm + 0.1 · φ_2^norm + 0.9 · φ_3^norm,
pairwise ε_0-equivalence still holds (see also Theorem 15) with d_∞(ψ, φ_i^norm) ≤ ε_0 for i = 1, 2, 3. However,
the normalised version ψ^norm := ψ / ‖ψ‖_2 has d_∞(ψ^norm, φ_1^norm) = 0.07256301 &gt; ε_0 and is in particular
not ε_0-equivalent to φ_1^norm and automatically not pairwise ε-equivalent to the normalised group anymore.</p>
        <p>
          This example shows that the original idea of finding symmetries within the ACP framework can
be extended from exact symmetries to normalised and also to ε-equivalent approximate symmetries.
However, the latter cannot be interpreted as a finitely generated convex cone. Consequently, scaling
assumes a more profound role than previously anticipated, particularly in determining a unique order
in the hierarchical version HACP, making its choice significant. Nonetheless, the proven results offer
positive prospects by enabling extensions of the ε-ACP and HACP algorithms via alternative loss
functions that permit different optimal representatives for a pairwise ε-equivalent group through
weighted mean selection. In addition, the convex hull of a group of pairwise ε-equivalent factors is the
smallest set containing these weighted elements. If it can be shown that a new factor is also pairwise
ε-equivalent to each factor of the given group, the new convex hull is a superset and extends the possible
representatives while retaining all previous ones.</p>
      </sec>
      <sec>
        <title>3.3. Special Case: Euclidean Perspective</title>
        <p>We now focus on the geometric interpretation in the Euclidean space ℒ_2 as a special case of Section 3.1.
Although the counterexample from Example 5 still applies, this setting has advantages. In particular, it
allows a more interpretable and intuitive understanding of ε-equivalence due to the accessibility of
Euclidean norms and angles, which provide a one-dimensional comparison similar to 1DEED; see also
[
          <xref ref-type="bibr" rid="ref15">15, 31</xref>
          ] for related discussions.
        </p>
        <p>Definition 5. The Cosine distance d_cos for two non-zero vectors φ_1, φ_2 ∈ R^m is defined as
d_cos(φ_1, φ_2) := 1 − ⟨φ_1, φ_2⟩ / (‖φ_1‖_2 · ‖φ_2‖_2).</p>
        <p>In [31], a refined definition of the Cosine distance for factors is given, which is based more precisely
on the assignments . It also includes the notion of exchangeable factors, i.e., two factors that are
identical after scaling and permutation of potentials.</p>
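        <p>A minimal sketch of this plain form of the Cosine distance (without the assignment-based refinement of [31]); the vectors are illustrative:</p>
        <p>
```python
import numpy as np

def d_cos(phi1, phi2):
    """Cosine distance of Definition 5."""
    return 1.0 - phi1 @ phi2 / (np.linalg.norm(phi1) * np.linalg.norm(phi2))

phi = np.array([0.2, 0.5, 0.1, 0.9])
# positive rescaling leaves the distance at zero ...
assert 1e-12 > abs(d_cos(phi, 3.7 * phi))
# ... while orthogonal vectors are at maximal distance 1
assert 1e-12 > abs(d_cos(np.array([1.0, 0.0]), np.array([0.0, 1.0])) - 1.0)
```
        </p>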
        <p>Theorem 18 (Theorem 2, [31]). Let φ_1(R_1, . . . , R_m) and φ_2(R_1, . . . , R_m) denote two factors. If φ_1 and
φ_2 are exchangeable, then it holds that d_cos(φ_1, φ_2) = 0.</p>
        <p>When normalising the factors in ℒ_2, the Cosine distance satisfies the following well-known property.</p>
        <p>Lemma 19. The Cosine distance and the squared Euclidean distance for
two normalised factors φ_1, φ_2 ∈ R^m_{&gt;0} with ‖φ_i‖_2 = 1 for i = 1, 2 are the same besides a scaling factor, i.e.,
2 d_cos(φ_1, φ_2) = ‖φ_1 − φ_2‖_2².</p>
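        <p>The identity of Lemma 19 is easy to verify numerically for illustrative, ℒ_2-normalised positive vectors:</p>
        <p>
```python
import numpy as np

rng = np.random.default_rng(3)
v1, v2 = rng.uniform(0.1, 1.0, size=(2, 6))
phi1 = v1 / np.linalg.norm(v1)  # normalised: Euclidean length 1
phi2 = v2 / np.linalg.norm(v2)

dcos = 1.0 - phi1 @ phi2  # both norms are 1, so no division is needed
# Lemma 19: twice the Cosine distance equals the squared Euclidean distance
assert 1e-12 > abs(2.0 * dcos - np.linalg.norm(phi1 - phi2) ** 2)
```
        </p>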
        <p>As the Cosine distance for positive factors can be seen as an angle between two factors, we obtain a bound on
the maximal angle difference between ε-equivalent factors.</p>
        <p>Lemma 20. The angle α ∈ [0, π/2] of two ε-equivalent factors φ_1, φ_2 ∈ R^m_{&gt;0} ∩ S_2^{m−1} is at most
α(ε) ≤ arccos(1 − ε²/2) for ε ≤ √2 and α(ε) ≤ π/2 for ε &gt; √2.</p>
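        <p>The bound of Lemma 20 can be checked as follows; the sketch assumes, as in the proof, that the Euclidean distance of the two normalised factors serves as ε (all vectors are illustrative):</p>
        <p>
```python
import numpy as np

def angle_bound(eps):
    """Lemma 20's bound on the angle of two eps-equivalent unit factors."""
    if eps > np.sqrt(2):
        return np.pi / 2
    return np.arccos(1 - eps ** 2 / 2)

rng = np.random.default_rng(4)
v = rng.uniform(0.2, 1.0, size=6)
phi1 = v / np.linalg.norm(v)
u = v * rng.uniform(0.98, 1.02, size=6)  # small positive perturbation
phi2 = u / np.linalg.norm(u)

eps = np.linalg.norm(phi1 - phi2)  # distance bound fed into the lemma
alpha = np.arccos(np.clip(phi1 @ phi2, -1.0, 1.0))  # actual angle
assert angle_bound(eps) + 1e-12 >= alpha
```
        </p>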
      </sec>
      <sec id="sec-1-8">
        <title>Proofs of Lemma 19 and Lemma 20</title>
        <p>Proof (Lemma 19). With the normalised vectors ‖φ_i‖_2 = 1 for i = 1, 2, consider
‖φ_1 − φ_2‖_2² = (φ_1 − φ_2)^⊤(φ_1 − φ_2) = ‖φ_1‖_2² − 2⟨φ_1, φ_2⟩ + ‖φ_2‖_2² = 2(1 − ⟨φ_1, φ_2⟩) = 2 d_cos(φ_1, φ_2).</p>
        <p>Proof (Lemma 20). The angle of two vectors within an orthant of the Euclidean space is always at most π/2, which
gives the upper bound for ε &gt; √2. For the other case of 0 ≤ ε ≤ √2, we get from Lemma 11 and
Lemma 19 the inequality
1 − cos(α) = d_cos(φ_1, φ_2) = ‖φ_1 − φ_2‖_2² / 2 ≤ ε² / 2.
In the last step, taking the strictly monotonically decreasing arccos for values in [0, 1], or α ∈ [0, π/2], on
both sides switches the direction of the inequality and yields α ≤ arccos(1 − ε²/2). The operation remains
well-defined if ε ≤ √2.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Discussion</title>
      <p>Understanding of ε-equivalence.</p>
      <p>As an extension of exchangeable factors, the general concept
of ε-equivalence and the more manageable 1DEED seem to be a promising option to take lifted
inference to a new level via approximation of similar factors. Therefore, the understanding of similarity
in this context is key. Building on the maximum metric (Chebyshev distance), 1DEED takes relative
lengths into account, resulting in the lack of transitivity. However, the property of ε-equivalence is
easy to check and leads to multiple practical properties, including the guaranteed asymptotic bounds
of Section 2.2 and a compressed (compact) and more interpretable model depending on the choice
of ε. The hierarchical approach additionally helps to understand the complexity of the given FG by
pre-analysing the 1DEED of all factors.</p>
      <p>Geometric interpretability.</p>
      <p>Understanding differences between factors as an angle between vectors
within the Euclidean space is not a new concept [31]. However, it is one way to interpret the concept of
ε-equivalence. Pairwise ε-equivalent factors can also be seen as a set of m-dimensional vectors, which
bound and generate a compact set (the convex hull) within the ℒ_p space. This geometric object is
basically the minimal set containing the group of ε-equivalent factors. If another factor that is pairwise
ε-equivalent to all factors in a group is added to the group, the resulting set becomes a superset and its
newly generated convex hull is again a set of pairwise ε-equivalent factors containing the old set.</p>
      <p>New potential for future exploration.</p>
      <p>Using weighted means as representatives for groups of
ε-equivalent factors enriches the possibilities of how the algorithms ε-ACP and HACP can be applied.
The usage of weighted means opens the whole concept of approximate lifted model construction based
on ε-equivalence to possible robustifications and generalisations by changing the original squared loss
function and its optimal choice of the arithmetic mean as representative for groups of ε-equivalent
factors. Choosing different loss functions, each with its corresponding optimal representative,
increases the flexibility of the ε-ACP algorithm and the HACP algorithm.</p>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion</title>
      <p>We presented fundamental properties of ε-equivalence related to lifted model construction under a
unified view and have proven several additional properties for the ℒ_p space. By viewing the concept
of ε-equivalence from a geometric perspective in ℒ_p, we provided a solid foundation for geometric
interpretability and gave a new perspective on the concept of pairwise ε-equivalence via the convex hull
of groups of ε-equivalent factors. Our geometric interpretation opens up advancements with respect
to the ε-ACP algorithm and its hierarchical version HACP in terms of generalisation and robustification
of the previously introduced compression techniques.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
      <p>This work was partially funded by the Ministry of Culture and Science of the German State of North
Rhine-Westphalia. The research of Malte Luttermann was funded by the BMBF project AnoMed
16KISA057.</p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any generative AI tools.</p>
      <p>Construction, in: Proceedings of the Thirty-Fourth International Joint Conference on Artificial
Intelligence (IJCAI-2025), 2025. https://arxiv.org/abs/2504.20784.</p>
      <p>[30] J. Speller, M. Luttermann, M. Gehrke, T. Braun, Compression versus Accuracy: A Hierarchy of
Lifted Models, 2025. https://arxiv.org/abs/2505.22288.</p>
      <p>[31] M. Luttermann, R. Möller, M. Gehrke, Lifted Model Construction without Normalisation: A
Vectorised Approach to Exploit Symmetries in Factor Graphs, in: Proceedings of the Third
Learning on Graphs Conference (LoG-2024), PMLR, 2025, pp. 46:1–46:17.</p>
      <p>[32] H. Chan, A. Darwiche, A Distance Measure for Bounding Probabilistic Belief Change, International
Journal of Approximate Reasoning 38 (2005) 149–174.</p>
      <p>[33] J. L. W. V. Jensen, Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta
Mathematica 30 (1906) 175–193.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Poole</surname>
          </string-name>
          ,
          <article-title>First-order Probabilistic Inference</article-title>
          ,
          <source>in: Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-2003)</source>
          , IJCAI Organization,
          <year>2003</year>
          , pp.
          <fpage>985</fpage>
          -
          <lpage>991</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Richardson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          , Markov Logic Networks,
          <source>Machine Learning</source>
          <volume>62</volume>
          (
          <year>2006</year>
          )
          <fpage>107</fpage>
          -
          <lpage>136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fuhr</surname>
          </string-name>
          ,
          <article-title>Probabilistic Datalog - A Logic for Powerful Retrieval Methods</article-title>
          , in: SIGIR-95
          <source>Proc. of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , ACM,
          <year>1995</year>
          , pp.
          <fpage>282</fpage>
          -
          <lpage>290</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <article-title>A Statistical Learning Method for Logic Programs with Distribution Semantics</article-title>
          ,
          <source>in: Proc. of the 12th International Conference on Logic Programming</source>
          , MIT Press,
          <year>1995</year>
          , pp.
          <fpage>715</fpage>
          -
          <lpage>729</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ahmadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kersting</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mladenov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Natarajan</surname>
          </string-name>
          ,
          <article-title>Exploiting Symmetries for Scaling Loopy Belief Propagation and Relational Training</article-title>
          ,
          <source>Machine Learning</source>
          <volume>92</volume>
          (
          <year>2013</year>
          )
          <fpage>91</fpage>
          -
          <lpage>132</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          , Lifted Junction Tree Algorithm,
          <source>in: Proceedings of the Thirty-Ninth German Conference on Artificial Intelligence (KI-2016)</source>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>30</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Gogate</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          <article-title>Probabilistic Theorem Proving</article-title>
          ,
          <source>in: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI-2011)</source>
          , AUAI Press,
          <year>2011</year>
          , pp.
          <fpage>256</fpage>
          -
          <lpage>265</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hartwig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <article-title>An Extended View on Lifting Gaussian Bayesian Networks</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>330</volume>
          (
          <year>2024</year>
          )
          <fpage>104082</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Taghipour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fierens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Blockeel</surname>
          </string-name>
          , Lifted Variable Elimination:
          <article-title>Decoupling the Operators from the Constraint Language</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          <volume>47</volume>
          (
          <year>2013</year>
          )
          <fpage>393</fpage>
          -
          <lpage>439</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>Van den Broeck</surname>
          </string-name>
          , N. Taghipour,
          <string-name>
            <given-names>W.</given-names>
            <surname>Meert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davis</surname>
          </string-name>
          , L. De Raedt,
          <article-title>Lifted Probabilistic Inference by First-order Knowledge Compilation</article-title>
          ,
          <source>in: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI-2011)</source>
          , IJCAI Organization,
          <year>2011</year>
          , pp.
          <fpage>2178</fpage>
          -
          <lpage>2185</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>Parameterised Queries and Lifted Query Answering</article-title>
          ,
          <source>in: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-2018)</source>
          , IJCAI Organization,
          <year>2018</year>
          , pp.
          <fpage>4980</fpage>
          -
          <lpage>4986</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>G. Van den Broeck</surname>
          </string-name>
          ,
          <source>Lifted Inference and Learning in Statistical Relational Models, Ph.D. thesis, KU Leuven</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Van den Broeck</surname>
          </string-name>
          , J. Davis,
          <article-title>Conditioning in First-order Knowledge Compilation and Lifted Probabilistic Inference</article-title>
          ,
          <source>in: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-2012)</source>
          , AAAI Press,
          <year>2012</year>
          , pp.
          <fpage>1961</fpage>
          -
          <lpage>1967</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>Relational Forward Backward Algorithm for Multiple Queries</article-title>
          , in: FLAIRS-19
          <source>Proc. of the 32nd International Florida Artificial Intelligence Research</source>
          Society Conference, AAAI Press,
          <year>2019</year>
          , pp.
          <fpage>464</fpage>
          -
          <lpage>469</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>Taming Reasoning in Temporal Probabilistic Relational Models</article-title>
          ,
          <source>in: 9th International Workshop on Statistical Relational AI at the 34th AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>U.</given-names>
            <surname>Apsel</surname>
          </string-name>
          ,
          <string-name>
            <surname>R. I. Brafman</surname>
          </string-name>
          ,
          <article-title>Extended Lifted Inference with Joint Formulas</article-title>
          , in: UAI-11
          <source>Proc. of the 27th Conference on Uncertainty in Artificial Intelligence</source>
          , AUAI Press,
          <year>2011</year>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          <article-title>A Language for Relational Decision Theory</article-title>
          , in
          <source>: Proc. of the 6th International Workshop on Statistical Relational Learning</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>Lifted Temporal Maximum Expected Utility</article-title>
          ,
          <source>in: Proc. of the 32nd Canadian Conference on Artificial Intelligence</source>
          ,
          <source>Canadian AI</source>
          <year>2019</year>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>380</fpage>
          -
          <lpage>386</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Polovina</surname>
          </string-name>
          ,
          <article-title>Restricting the Maximum Number of Actions for Decision Support under Uncertainty</article-title>
          ,
          <source>in: ICCS-20 Proc. of the 25th International Conference on Conceptual Structures</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>145</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sanner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kersting</surname>
          </string-name>
          ,
          <article-title>Symbolic Dynamic Programming for First-order POMDPs</article-title>
          , in: AAAI-10
          <source>Proc. of the 24th AAAI Conference on Artificial Intelligence</source>
          , AAAI Press,
          <year>2010</year>
          , pp.
          <fpage>1140</fpage>
          -
          <lpage>1146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>Lifting in Multi-agent Systems under Uncertainty</article-title>
          ,
          <source>in: UAI-22 Proc. of the 38th Conference on Uncertainty in Artificial Intelligence</source>
          , AUAI Press,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Karabulut</surname>
          </string-name>
          , T. Braun, Lifting Partially Observable Stochastic Games, in: SUM-24
          <source>Proceedings of the 16th International Conference on Scalable Uncertainty Management</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>201</fpage>
          -
          <lpage>216</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Niepert</surname>
          </string-name>
          , G. Van den Broeck,
          <article-title>Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference</article-title>
          ,
          <source>in: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI-2014)</source>
          , AAAI Press,
          <year>2014</year>
          , pp.
          <fpage>2467</fpage>
          -
          <lpage>2475</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>B.</given-names>
            <surname>Weisfeiler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Leman</surname>
          </string-name>
          ,
          <article-title>A Reduction of a Graph to a Canonical Form and an Algebra Arising during this Reduction</article-title>
          ,
          <source>Nauchno-Technicheskaya Informatsia</source>
          <volume>2</volume>
          (
          <year>1968</year>
          )
          <fpage>12</fpage>
          -
          <lpage>16</lpage>
          . Translated from Russian.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Luttermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          , Colour Passing Revisited:
          <article-title>Lifted Model Construction with Commutative Factors</article-title>
          ,
          <source>in: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-2024)</source>
          , AAAI Press,
          <year>2024</year>
          , pp.
          <fpage>20500</fpage>
          -
          <lpage>20507</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>M.</given-names>
            <surname>Luttermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Machemer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <article-title>Efficient Detection of Exchangeable Factors in Factor Graphs</article-title>
          , in: FLAIRS-24
          <source>Proceedings of the 37th International FLAIRS Conference</source>
          , volume
          <volume>37</volume>
          ,
          <publisher-name>Florida Online Journals</publisher-name>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>M.</given-names>
            <surname>Luttermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Machemer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <article-title>Efficient Detection of Commutative Factors in Factor Graphs</article-title>
          ,
          <source>in: Proceedings of the 12th International Conference on Probabilistic Graphical Models (PGM-24)</source>
          , volume
          <volume>246</volume>
          ,
          <publisher-name>PMLR</publisher-name>
          ,
          <year>2024</year>
          , pp.
          <fpage>38</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>M.</given-names>
            <surname>Luttermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <article-title>Lifted Model Construction without Normalisation: A Vectorised Approach to Exploit Symmetries in Factor Graphs</article-title>
          ,
          <source>in: Proceedings of the Third Learning on Graphs Conference (LoG-2024)</source>
          , PMLR,
          <year>2025</year>
          , pp.
          <fpage>46:1</fpage>
          -
          <lpage>46:17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>M.</given-names>
            <surname>Luttermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Speller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Braun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hartwig</surname>
          </string-name>
          , Approximate Lifted Model
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>