<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>TyRaL: End-to-End Document-level Relation Extraction via Type-Constrained Rule Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mierzhati Alimu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chaochao Du</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaowang Zhang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Intelligence and Computing, Tianjin University</institution>
          ,
          <addr-line>Tianjin, 300350</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In recent years, Document-level Relation Extraction (DocRE) has encountered significant challenges in capturing complex entity relationships and reasoning over long-range dependencies. Existing methods primarily focus on learning implicit representations or applying chain-like logical rules, but they often overlook differences in entity types and the significance of type constraints, potentially leading to errors in relation reasoning. This poster introduces a type-constrained enhanced chain-like rule (TC rule) and proposes an end-to-end document-level relation extraction framework (TyRaL) to address this issue. By incorporating a novel rule reasoning module, TyRaL transforms the discrete rule learning problem into a parameter optimization task in continuous space, enabling both explicit and implicit learning of entity type constraint rules and thereby enhancing the model's logical consistency and interpretability. Experimental results on the standard DWIE dataset show that TyRaL significantly outperforms existing rule-enhanced methods in both F1 and Ign F1 metrics. It demonstrates superior logical modeling and semantic reasoning capabilities while offering new perspectives and solutions for research in the DocRE field.</p>
      </abstract>
      <kwd-group>
<kwd>Document-level Relation Extraction</kwd>
        <kwd>Logical Rules</kwd>
        <kwd>Type Constraints</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>Figure 1: Overview of TyRaL</title>
        <p>[Figure 1 illustrates the TyRaL pipeline on an example input document: "[1] Prince Harry gets engaged to actress Meghan Markle. [2] Britain's Prince Harry is engaged to his US partner Meghan Markle, his father Prince Charles has announced. [3] ... and the couple are to live in Kensington Palace. [4] Ashwathy Kurup, better known by her stage name Parvathy, is an Indian film actress and classical dancer ..." Backbone logits feed N type-constrained rule reasoning modules, each performing predicate selection for chained atoms and predicate selection for type constraints (l = 1 through l = L), followed by entity embedding, LogSumExp, Softmax, a classifier, and a residual connection that combines the rule logits with the backbone logits. TyRaL outperforms existing rule-based models in logical consistency and relation extraction.]</p>
      </sec>
      <sec id="sec-1-5">
        <title>2. Approach</title>
        <p>[Figure 1 also depicts training: the overall loss combines two terms, L = λL1 + L2, and shows example logical rules such as hasChild(x,y) ← hasSpouse(x,z) ∧ hasChild(z,y); hasFather(x,y) ← hasParent(x,z) ∧ Male(y); hasFather(x,y) ← hasParent(x,z) ∧ (brotherOf(y,u) ∨ uncleOf(y,v)); and maleLeadOf(x,y) ← actCharacter(x,z) ∧ CharacterOf(z,y) ∧ mainCharacter(z) ∧ (brotherOf(x,u) ∨ uncleOf(x,v)), applied to the example entity pair (Meghan Markle, Harry).]</p>
        <sec id="sec-1-5-2">
          <title>2.1. Problem Definition</title>
          <p>Given a document D containing a set of named entities E = {e_i}_{i=1}^{n}, the goal of DocRE is to predict the semantic relation r ∈ R ∪ {NA} between all distinct entity pairs (e_h, e_t), where R denotes a set of predefined relation types and NA represents no relation. An entity e may have multiple mentions in the document, so the existence of a relationship between entities needs to be judged based on comprehensive contextual evidence across these mentions in the document.</p>
          <p>An original DocRE model usually calculates a score vector s(e_h, e_t, D) ∈ ℝ^{|R|+1} for each entity pair, where the k-th element represents the logit value of the k-th relation type, and the last element corresponds to "no relation" NA. During the training phase, Binary Cross-Entropy (BCE) or Adaptive Thresholding (AT) loss functions are usually used. In the inference phase, the model uses an activation function σ (such as Softmax) to map logits to probability values, and filters them according to a threshold to predict the set of relation triples, which is of the form: T = {(e_h, r_k, e_t) ∣ [σ(s(e_h, e_t, D))]_k &gt; δ}, where δ is the set confidence threshold.</p>
        </sec>
        <sec id="sec-1-5-5">
          <title>2.2. Chain-like and Type-Constrained Rules</title>
          <p>We introduce an interpretable logical rule structure to model implicit semantic paths between entities in a document. Define a binary variable r(x, y) to indicate whether the relation r ∈ R holds between entities x and y: when the relation is true, r(x, y) = 1; otherwise, r(x, y) = 0.</p>
          <p>A chain-like logical rule consists of a rule head and a rule body. The rule head represents the target relation r_head(x, y), and the rule body is a conjunction of binary atoms, where each body atom shares one variable with the adjacent previous atom and another variable with the adjacent next atom, forming a chain structure. The general form of a chain-like logical rule is as follows: r_head(x, y) ← r_1(x, z_1) ∧ r_2(z_1, z_2) ∧ ⋯ ∧ r_L(z_{L−1}, y).</p>
          <p>A type-constrained (TC) rule additionally attaches a type atom to each entity variable on the path: r_head(x, y) ← r_1(x, z_1) ∧ ⋯ ∧ r_L(z_{L−1}, y) ∧ t_0(x) ∧ t_1(z_1) ∧ ⋯ ∧ t_L(y), where t_i ∈ T are entity types and r_i are intermediate relation paths. This rule not only depends on the relational path structure but also requires each entity node on the path to satisfy specific type conditions, thereby improving the semantic rationality and interpretability of the rule.</p>
        </sec>
        <sec id="sec-1-5-6">
          <title>2.3. Type-Constrained Rule Reasoning Module</title>
          <p>This module estimates, for each entity pair, the degree to which every relation can be derived through TC rules, and its output is combined with the downstream relation prediction target.</p>
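As an illustration of how a type-constrained chain rule is evaluated on a toy knowledge graph (the facts, type map, and helper `tc_rule_holds` are our own, not the paper's code):

```python
# Toy sketch of evaluating the type-constrained chain rule
#   hasChild(x, y) ← hasSpouse(x, z) ∧ hasChild(z, y) ∧ Person(x) ∧ Person(z) ∧ Person(y)
# The relational path must hold AND every node on it must satisfy its type atom.
FACTS = {("Meghan", "hasSpouse", "Harry"),
         ("Harry", "hasChild", "Archie")}
TYPES = {"Meghan": "Person", "Harry": "Person", "Archie": "Person",
         "Kensington Palace": "Location"}

def tc_rule_holds(x, y, body, types):
    """body: chain of relation names from x to y;
    types: required entity type at each node x, z_1, ..., y."""
    frontier = {x} if TYPES.get(x) == types[0] else set()
    for rel, required in zip(body, types[1:]):
        # follow one chain atom, keeping only targets of the required type
        frontier = {t for z in frontier
                    for (h, r, t) in FACTS
                    if h == z and r == rel and TYPES.get(t) == required}
    return y in frontier

print(tc_rule_holds("Meghan", "Archie",
                    ["hasSpouse", "hasChild"],
                    ["Person", "Person", "Person"]))
```

The same rule with a wrong type atom (e.g. requiring the intermediate node to be a Location) fails, which is exactly the extra filtering power the type constraints provide.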
          <p>Let N be the maximum number of rules to be learned and L the maximum number of atoms in each rule, and define the extended relation set as R* = R ∪ R⁻ ∪ {r_I}, where R = {r_k}_{1≤k≤K} denotes the original relation set, R⁻ = {r_k}_{K+1≤k≤2K} the inverse relations, and r_I = r_{2K+1} the identity relation. We define the extended logit s⁺(x, y, D) ∈ ℝ^{2K+1}, where [s⁺(x, y, D)]_k = [σ(s(x, y, D))]_k and [s⁺(x, y, D)]_{K+k} = [σ(s(y, x, D))]_k for all 1 ≤ k ≤ K, and [s⁺(x, y, D)]_{2K+1} = 1 if x = y, or 0 otherwise, with σ denoting the sigmoid function.</p>
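A minimal sketch of how the extended logit vector s⁺ can be assembled from a backbone scorer (the helper names and toy scores are illustrative assumptions, not the paper's implementation):

```python
# Build s+(x, y, D) in R^{2K+1}: K sigmoid forward logits, K sigmoid logits of
# the reversed pair (inverse relations), and one identity slot (1 iff x == y).
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def extended_logits(x, y, score_fn):
    """score_fn(h, t) returns the K raw relation logits s(h, t, D)."""
    forward = [sigmoid(v) for v in score_fn(x, y)]
    inverse = [sigmoid(v) for v in score_fn(y, x)]
    identity = [1.0 if x == y else 0.0]
    return forward + inverse + identity  # length 2K + 1

toy_scores = {("a", "b"): [2.0, -1.0], ("b", "a"): [-3.0, 0.0]}  # K = 2
s_plus = extended_logits("a", "b", lambda h, t: toy_scores[(h, t)])
print(len(s_plus))  # 2*2 + 1 = 5
```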
          <p>The goal of our rule reasoning module is, given an entity pair (x, y) ∈ E × E and a document D, to estimate a truth degree v^{(r,L)}_{x,y,D} for each relation r ∈ R*, indicating whether the relation can be inferred through at most N type-constrained rules of length L. For each original relation r ∈ R, the n-th rule (1 ≤ n ≤ N), and the l-th rule atom (1 ≤ l ≤ L), the intermediate truth degree v^{(n,l)}_{x,y,D,r} is defined as follows:</p>
          <p>v^{(n,1)}_{x,y,D,r} = τ^{(n,1)}(x) · τ^{(n,2)}(y) · ∑_{k=1}^{2K+1} w^{(r,n,1)}_k [s⁺(x, y, D)]_k (4)</p>
          <p>v^{(n,l)}_{x,y,D,r} = τ^{(n,l+1)}(y) · ∑_{z ∈ E} ∑_{k=1}^{2K+1} w^{(r,n,l)}_k v^{(n,l−1)}_{x,z,D,r} [s⁺(z, y, D)]_k, 2 ≤ l ≤ L (5)</p>
          <p>where w^{(r,n,l)} ∈ [0, 1]^{2K+1} is the predicate selection weight of the l-th atom in the n-th rule, normalized by Softmax to approximate one-hot, simulating the predicate selection process.</p>
          <p>τ^{(n,l)}(e) is a type constraint function representing the score that entity e satisfies specific type conditions:</p>
          <p>τ^{(n,l)}(e) = clip01(α^{(n,l)} ∑_{t=1}^{T} h^{(n,l)}_t 𝟙(type(e) = t) + β^{(n,l)} ∑_{t=1}^{T+2} h^{(n,l)}_t g_{e,t}) (6)</p>
          <p>where clip01(a) = max(min(a, 1), 0), h^{(n,l)} ∈ [0, 1]^{T+2} are trainable type selection weights, and g_{e,k} = e⊤ W r_k denotes the interaction between entity e and relation r_k. The parameters α^{(n,l)} and β^{(n,l)} control whether explicit and implicit type constraints are applied.</p>
          <p>The ultimate truth degree is calculated by aggregating the intermediate degrees of the N rules: v^{(r,L)}_{x,y,D} = ∑_{n=1}^{N} c^{(n)} · v^{(n,L)}_{x,y,D,r}, where c^{(n)} ∈ [−1, 1] is the confidence of rule n, normalized by the Tanh activation function.</p>
          <p>Then, we define the final logit prediction by combining the output logits from the original DocRE model with the ultimate truth degrees from the type-constrained rule reasoning module: [s̃(x, y, D)]_r = [s(x, y, D)]_r + v^{(r,L)}_{x,y,D} (7)</p>
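The recursion and aggregation above can be sketched numerically for a single rule and target relation (the toy extended logits, the all-ones type scores, and every name below are our own illustrative assumptions; the real module learns w, τ, and c end-to-end with the backbone):

```python
# Rough pure-Python sketch of the truth-degree recursion (Eqs. 4-5) and the
# confidence-weighted aggregation, for one rule of length L = 2.
import math

ENTITIES = ["x", "z", "y"]
K2P1 = 3  # extended relation slots: 1 forward, 1 inverse, 1 identity

def s_plus_toy(a, b):  # toy extended logits per pair, already in [0, 1]
    table = {("x", "z"): [0.9, 0.1, 0.0], ("z", "y"): [0.8, 0.2, 0.0]}
    return table.get((a, b), [0.0, 0.0, 1.0 if a == b else 0.0])

def tau(l, e):  # toy type score: every entity satisfies every slot's type
    return 1.0

def truth_degree(x, y, w):
    """w[l] is the (near one-hot) predicate selection weight of atom l+1."""
    # Eq. (4): base case, l = 1, computed for every intermediate entity z
    v = {z: tau(1, x) * tau(2, z) *
            sum(w[0][k] * s_plus_toy(x, z)[k] for k in range(K2P1))
         for z in ENTITIES}
    # Eq. (5): chain one more atom, l = 2
    v = {t: tau(3, t) * sum(v[z] * w[1][k] * s_plus_toy(z, t)[k]
                            for z in ENTITIES for k in range(K2P1))
         for t in ENTITIES}
    return v[y]

w = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # both atoms select the forward relation
c = math.tanh(1.5)                       # rule confidence in [-1, 1]
print(c * truth_degree("x", "y", w))     # ≈ tanh(1.5) * 0.9 * 0.8
```

With one-hot predicate weights the recursion reduces to multiplying the path's edge scores (0.9 · 0.8 = 0.72), which the rule confidence then scales before being added to the backbone logit as in Eq. (7).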
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Experiments</title>
      <p>We uniformly denote the enhanced model as TyRaL-X, where X represents the name of the original
DocRE model. Table 1 shows the experimental results of TyRaL on the DWIE dataset. The results
indicate that TyRaL achieves stable and significant performance on all integrated DocRE backbone
models, and is comprehensively superior to the original models in F1 and Ign F1 metrics,
demonstrating good generality and robustness. Compared with the current state-of-the-art rule-enhanced
methods, CaDRL and JMRL, TyRaL introduces key innovations in logical modeling. CaDRL relies on
differentiable chain-like rule learning to improve logical consistency, while JMRL alleviates the error
propagation problem through a joint training mechanism. In contrast, TyRaL proposes more refined
type-constrained rules, significantly expanding the expressive power of the rules and enabling the
capture of more fine-grained semantic constraints and structural relationships between entity types—rules
of this kind have not been systematically modeled in existing methods. In our experiments, we adopt
the F1 metric. However, some relational facts appear in both the training and the dev/test sets. As a
result, a model may memorize these relations during training and achieve artificially high performance
on the dev/test set, introducing evaluation bias. Such overlap is inevitable, since many common
relational facts are likely to occur across different documents. Therefore, we also report the F1 scores after
excluding those relational facts shared by the training and dev/test sets, which we denote as Ign F1.</p>
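The Ign F1 computation described above can be sketched as follows (an illustrative toy scorer, not the official evaluation script; triple values are our own examples):

```python
# Illustrative sketch of F1 vs. Ign F1: Ign F1 discards relational facts that
# already appear in the training set before scoring, so memorized facts
# cannot inflate the result.
def f1(pred, gold):
    tp = len(pred & gold)
    if not pred or not gold or tp == 0:
        return 0.0
    p, r = tp / len(pred), tp / len(gold)
    return 2 * p * r / (p + r)

def ign_f1(pred, gold, train_facts):
    return f1(pred - train_facts, gold - train_facts)

train = {("Harry", "hasSpouse", "Meghan")}
gold = {("Harry", "hasSpouse", "Meghan"),
        ("Harry", "livesIn", "Kensington Palace")}
pred = {("Harry", "hasSpouse", "Meghan")}
print(f1(pred, gold))             # 2/3: the memorizable fact still counts
print(ign_f1(pred, gold, train))  # 0.0: only unseen facts are scored
```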
    </sec>
    <sec id="sec-3">
      <title>4. Limitation</title>
      <p>While our study has made some progress, several limitations remain. First, the experiments were
conducted exclusively on the DWIE dataset, which raises concerns about the generalizability of the
findings to other domains and datasets. In addition, the current evaluation relies primarily on quantitative
metrics and lacks case studies. We plan to address these limitations in future work.</p>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>In this poster, we propose an end-to-end learning framework, TyRaL, featuring a type-constrained rule
reasoning module that simulates logical rules to enhance reasoning ability. Experiments on the DWIE
dataset demonstrate its effectiveness and superiority. Future work will explore integrating logical
constraints into large language models to discover more accurate and generalizable rules.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was supported by the Project of Science and Technology Research and Development Plan
of China Railway Corporation (N2023J044).</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, we used ChatGPT for grammar and spelling checking.
After using this tool, we reviewed and edited the content as needed and take full responsibility for the
publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Niu</surname>
          </string-name>
          ,
          <article-title>Boosting document-level relation extraction by mining and injecting logical rules</article-title>
          ,
          <source>in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>10311</fpage>
          -
          <lpage>10323</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Document-level relationship extraction by bidirectional constraints of beta rules</article-title>
          ,
          <source>in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>2256</fpage>
          -
          <lpage>2266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>CaDRL: Document-level relation extraction via context-aware differentiable rule learning</article-title>
          ,
          <source>in: Proceedings of the 31st International Conference on Computational Linguistics</source>
          ,
          <year>2025</year>
          , pp.
          <fpage>8272</fpage>
          -
          <lpage>8284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <article-title>End-to-end learning of logical rules for enhancing document-level relation extraction</article-title>
          ,
          <source>in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>7247</fpage>
          -
          <lpage>7263</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Zaporojets</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Deleu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Develder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Demeester</surname>
          </string-name>
          ,
          <article-title>DWIE: An entity-centric dataset for multi-task document-level information extraction</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>58</volume>
          (
          <year>2021</year>
          )
          <fpage>102563</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Docred: A large-scale document-level relation extraction dataset</article-title>
          , arXiv preprint arXiv:1906.06127 (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Double graph based reasoning for document-level relation extraction</article-title>
          , arXiv preprint arXiv:2009.13752 (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Document-level relation extraction with adaptive thresholding and localized context pooling</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>35</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>14612</fpage>
          -
          <lpage>14620</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>