<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
<article-title>Effects of Graph Neural Network Aggregation Functions on Generalizability for Solving Abstract Argumentation Semantics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dennis Craandijk</string-name>
          <email>d.f.w.craandijk@uu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Floris Bex</string-name>
          <email>f.bex@uu.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>National Police Lab AI</institution>
          ,
          <institution>Netherlands Police</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Tilburg University</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Utrecht University</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
<p>Abstract argumentation has gained significant attention as a formal framework for representing and reasoning about complex decision-making processes [1]. Advancements in various domains, such as legal reasoning, multi-agent systems and human-computer interaction, highlight the importance of developing efficient solvers for the computational reasoning problems within this approach. Computational problems, such as enumerating extensions or deciding whether an argument is accepted in one or all such extensions, are commonly solved with exact solvers [2, 3]. While exact solvers are effective for small-scale problems, they can struggle with the computational demands of large and complex argumentation frameworks. Several recent approaches define approximate algorithms, which aim to provide solutions close to the exact ones while being computable more efficiently [4, 5, 6, 7].</p>
      </abstract>
      <kwd-group>
        <kwd>Aggregation</kwd>
        <kwd>Abstract</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
<p>Graph neural networks (GNNs) have recently gained attention due to their ability to learn from graph-structured data. Since an argumentation framework (AF) can naturally be represented as a graph, GNNs can effectively capture the interactions between arguments and counterarguments, allowing them to approximate exact reasoning about argumentation semantics. The allure of this approach is that GNNs can solve problems with linear time complexity relative to the input size, while being learned from data and thus alleviating the need for manual feature engineering and expert knowledge.</p>
      <p>Several works have shown that GNNs are able to predict argument acceptance under various argumentation semantics with high accuracy [<xref ref-type="bibr" rid="ref4 ref5 ref6 ref7">4, 5, 6, 7</xref>]. These studies employ various GNN architectures, mainly characterized by distinct aggregation and update functions, and different training and evaluation regimes. The lack of a systematic and uniform evaluation protocol has hindered direct comparisons between these approaches, obscuring insights into the most effective design choices. To address this gap, this work uniformly compares different GNN architectures based on their performance on a number of benchmark datasets. We specifically focus on identifying the most effective aggregation function, contributing to the development of accurate and robust models.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <sec id="sec-2-1">
        <title>2.1. Abstract argumentation</title>
        <sec id="sec-2-1-1">
          <p>We recall Dung’s abstract argumentation frameworks [8].</p>
          <p>Definition 1. An abstract argumentation framework (AF) is a pair F = (A, R) where A is a (finite) set of arguments and R ⊆ A × A is the attack relation. The pair (a, b) ∈ R means that a attacks b. A set S ⊆ A attacks b if there is an a ∈ S such that (a, b) ∈ R. An argument a ∈ A is defended by S ⊆ A if, for each b ∈ A such that (b, a) ∈ R, S attacks b.</p>
          <p>Dung-style semantics define the sets of arguments that can jointly be accepted (extensions). A σ-extension refers to an extension under semantics σ. We consider admissible sets and the preferred and grounded semantics, with the respective functions adm, prf and grd.</p>
          <p>Definition 2. Let F = (A, R) be an AF. A set S ⊆ A is conflict-free (in F) if there are no a, b ∈ S such that (a, b) ∈ R. The collection of conflict-free sets is denoted by cf(F). For S ∈ cf(F) it holds that: S ∈ adm(F) if each a ∈ S is defended by S; S ∈ prf(F) if S ∈ adm(F) and for each T ∈ adm(F), S ⊄ T; S ∈ grd(F) if S ∈ adm(F), each a ∈ A defended by S is contained in S, and S is minimal with respect to set inclusion among such sets.</p>
          <p>Furthermore, for an argument a ∈ A we can determine whether it is credulously accepted under semantics σ (a is contained in at least one σ-extension) or sceptically accepted under σ (a is contained in all σ-extensions).</p>
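<p>The definitions above can be made concrete with a small brute-force sketch. This is illustrative only: the function and variable names are ours, and enumeration over all subsets is exponential in |A|, unlike the exact solvers discussed in the introduction.</p>

```python
from itertools import combinations

def extensions(args, attacks):
    """Brute-force enumeration of Definitions 1 and 2 over all subsets of A."""
    attacks = set(attacks)

    def conflict_free(S):
        return not any((a, b) in attacks for a in S for b in S)

    def defends(S, a):
        # every attacker b of a must itself be attacked by some c in S
        return all(any((c, b) in attacks for c in S)
                   for b in args if (b, a) in attacks)

    subsets = [frozenset(c) for r in range(len(args) + 1)
               for c in combinations(sorted(args), r)]
    adm = [S for S in subsets
           if conflict_free(S) and all(defends(S, a) for a in S)]
    prf = [S for S in adm if not any(S < T for T in adm)]  # maximal admissible
    # complete sets: admissible and containing every argument they defend
    com = [S for S in adm if all(a in S for a in args if defends(S, a))]
    grd = min(com, key=len)  # grounded: the unique minimal complete set
    return adm, prf, grd

# Example AF: a attacks b, b attacks c
adm, prf, grd = extensions({"a", "b", "c"}, {("a", "b"), ("b", "c")})
```

<p>For this AF the grounded and the single preferred extension coincide: {a, c}, since a defends c against its attacker b.</p>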
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Graph neural networks</title>
        <p>We recall graph neural networks as used for abstract argumentation [<xref ref-type="bibr" rid="ref4">4</xref>]. Let G = (V, E) be a graph representation of an AF F, where at message passing step t = 0 each node v is assigned a real-valued vector h_v^0 and each edge between nodes u and v is assigned a vector e_uv indicating whether it represents a directed, reciprocal or self-loop edge. At subsequent message passing steps t, each node v aggregates messages m from its neighbours and updates its vector representation, such that</p>
        <p>m_uv^(t+1) = MSG(h_u^t, h_v^t, e_uv) (1)</p>
        <p>h_v^(t+1) = UPDT(h_v^t, AGGR({m_uv^(t+1) : u ∈ N(v)})) (2)</p>
        <p>where N(v) denotes all neighbours of node v. The message function MSG computes a message based on the vectors of two connected nodes along with the edge representation. The update function UPDT updates the node representation based on the previous node representation and the messages aggregated by AGGR. The message and update functions are parameterized neural networks which, in conjunction with the aggregation function, yield a neural message passing algorithm whose parameters can be optimised (i.e. learned) based on data. After each step t, node representations can be read out with the readout function READ that maps a node representation to a likelihood p_v = READ(h_v^t) of the respective argument being accepted.</p>
        <p>Figure 1: Messages are aggregated from neighboring nodes, expressing the local graph structure when updating node representations. To generate the embedding for node A, the messages from A’s graph neighbors are aggregated. In turn, the messages coming from these neighbors are based on the aggregated messages from their respective neighbors, and so on. (source: Hamilton et al. [<xref ref-type="bibr" rid="ref9">9</xref>])</p>
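<p>One message-passing step following equations (1) and (2) can be sketched as below. The linear parameterizations of MSG and UPDT, the embedding size, and the default Sum aggregation are illustrative assumptions, not the specific architectures evaluated in this paper.</p>

```python
import numpy as np

d = 4  # illustrative embedding size

def msg_fn(h_u, h_v, e_uv, W):
    # MSG (Eq. 1): a single linear layer over the concatenated
    # sender, receiver and edge vectors (parameterization assumed)
    return np.tanh(W @ np.concatenate([h_u, h_v, e_uv]))

def updt_fn(h_v, agg, U):
    # UPDT (Eq. 2): combine the previous representation with the aggregate
    return np.tanh(U @ np.concatenate([h_v, agg]))

def mp_step(h, edges, edge_feat, W, U, aggr=np.sum):
    """One message-passing step. h maps node -> vector, edges is a list
    of directed (u, v) pairs, edge_feat maps (u, v) -> edge vector."""
    new_h = {}
    for v in h:
        msgs = [msg_fn(h[u], h[v], edge_feat[(u, w)], W)
                for (u, w) in edges if w == v]
        agg = aggr(np.stack(msgs), axis=0) if msgs else np.zeros(d)
        new_h[v] = updt_fn(h[v], agg, U)
    return new_h

rng = np.random.default_rng(0)
h0 = {n: rng.normal(size=d) for n in "abc"}
edges = [("a", "b"), ("b", "c")]  # a attacks b, b attacks c
ef = {e: rng.normal(size=d) for e in edges}
W = rng.normal(size=(d, 3 * d))
U = rng.normal(size=(d, 2 * d))
h1 = mp_step(h0, edges, ef, W, U)
```

<p>Stacking several such steps and applying READ to the final node representations yields the per-argument acceptance likelihoods described above.</p>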
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Aggregation function</title>
      <p>A central component of GNNs is the aggregation function, which summarizes the information from neighboring nodes (see Figure 1). Different aggregation functions can impact the model’s performance and the type of information it captures from the graph. We discuss five common aggregation functions: Sum, Max, Mean, Attention, and Convolution.</p>
      <p>The Sum aggregation function adds up the feature vectors of neighboring nodes, allowing each node to accumulate information from its neighbors equally. This approach is straightforward and has been shown to be one of the most expressive aggregation functions [<xref ref-type="bibr" rid="ref10">10</xref>]. However, it may lead to unstable node embeddings when scaling up to AFs larger than those seen during training [<xref ref-type="bibr" rid="ref11">11</xref>].</p>
      <p>Max aggregation selects the maximum value for each feature dimension across the neighbors, making it useful for highlighting the most significant features. This method might lose nuanced information, since it does not consider the entire neighborhood’s contribution. Mean aggregation computes the average of the feature vectors of neighboring nodes, providing a balanced representation. This approach can help mitigate the influence of irrelevant neighbors but may fail to capture the unique importance of individual nodes. Attention mechanisms compute a weighted Mean of the neighbors’ features based on their relevance to the central node. By assigning higher weights to more important neighbors, Attention aggregation enables the model to focus on the most relevant information and improves its discriminative power [<xref ref-type="bibr" rid="ref12">12</xref>]. Finally, convolutional aggregation functions, inspired by Convolutional Neural Networks (CNNs), apply learnable filters to the neighborhood features [<xref ref-type="bibr" rid="ref13">13</xref>]. This approach allows the model to capture spatial patterns and local structures within the graph, making it particularly suitable for tasks that require an understanding of geometric properties.</p>
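<p>The five aggregators can be contrasted on a toy stack of neighbour messages. The simplified attention score and GCN-style degree normalisation below are illustrative stand-ins for the learned variants of [12] and [13], not their exact formulations.</p>

```python
import numpy as np

def aggregate(msgs, kind, h_v=None, att_w=None):
    """Aggregate a (num_neighbours, d) array of incoming messages."""
    if kind == "sum":
        return msgs.sum(axis=0)
    if kind == "mean":
        return msgs.mean(axis=0)
    if kind == "max":  # feature-wise maximum across neighbours
        return msgs.max(axis=0)
    if kind == "attention":
        scores = msgs @ (att_w @ h_v)   # relevance of each neighbour
        alpha = np.exp(scores - scores.max())
        alpha /= alpha.sum()            # softmax weights
        return alpha @ msgs             # weighted mean
    if kind == "convolution":
        # GCN-style normalisation of the sum by neighbourhood size
        return msgs.sum(axis=0) / np.sqrt(len(msgs))
    raise ValueError(kind)

msgs = np.array([[1.0, 2.0], [3.0, 4.0]])  # two neighbour messages
```

<p>Note that with uniform attention weights the attention aggregator reduces to the Mean, which is why it is described above as a weighted Mean.</p>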
      <p>In the field of computational argumentation, different works have used different aggregation functions in their GNN architectures. Whereas Kuhlmann and Thimm [<xref ref-type="bibr" rid="ref5">5</xref>] and Malmqvist et al. [<xref ref-type="bibr" rid="ref6">6</xref>] use the Convolution aggregation function, Craandijk and Bex [<xref ref-type="bibr" rid="ref4">4</xref>] use the Sum aggregator and Cibier and Mailly [<xref ref-type="bibr" rid="ref7">7</xref>] use the Attention aggregator. While Craandijk and Bex [<xref ref-type="bibr" rid="ref14">14</xref>] use a combination of aggregators, to the best of our knowledge no work has yet evaluated the performance of the Max and Mean aggregators on predicting argument acceptability. Additionally, all mentioned works employ the various aggregators in combination with different update and message functions and different training and evaluation regimes, making them difficult to compare. In this work we compare all mentioned aggregators uniformly in terms of architectural design and evaluation setup.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Data</title>
      <p>Since GNNs learn to solve problems based on data, the characteristics of the data used to train and evaluate an architecture are consequential to its performance. The different works on using GNNs to learn argumentation semantics use different datasets, hindering direct comparison between methods. Additionally, most works train and validate on datasets with the same characteristics. Kuhlmann and Thimm [<xref ref-type="bibr" rid="ref5">5</xref>] show that, while different GNN architectures generally yield high-quality results when tested on in-distribution AFs, performance severely degrades when generalizing to AF types not seen during training. This indicates that GNNs tend to learn superficial features of the data rather than a generally applicable rule (a fundamental problem found in various fields of deep learning).</p>
      <p>In this work we aim to compare GNN design choices with the goal of developing accurate and robust approximate solvers. We therefore adopt the evaluation datasets of Kuhlmann and Thimm [<xref ref-type="bibr" rid="ref5">5</xref>] (i.e. PBBG, KWT and ICCMA) with the aim of testing the scalability and generalization of different aggregation functions. Generalization and scalability ensure that a learned GNN solver for abstract argumentation can be reliably deployed in various settings. A scalable solver ensures that the system can handle increasingly larger argumentation frameworks without a significant drop in performance. Generalizability refers to the model’s ability to perform well on AFs with graph properties not seen during training (i.e. out-of-domain AFs).</p>
      <p>The PBBG dataset consists of AFs produced by generators from ICCMA 2020 [<xref ref-type="bibr" rid="ref3">3</xref>] as used by Craandijk and Bex [<xref ref-type="bibr" rid="ref4">4</xref>], namely AFBenchGen2, AFGen Benchmark Generator, GroundedGenerator, SccGenerator and StableGenerator. This set of generators can generate AFs of various sizes, making them suitable for evaluating in-domain scalability. The KWT dataset is tailored by Kuhlmann et al. [<xref ref-type="bibr" rid="ref15">15</xref>] towards generating abstract argumentation frameworks that are particularly hard for tasks related to deciding preferred acceptance, by avoiding as much as possible the easy cases (where the accepted arguments are (almost) identical to the grounded extension). These AFs are particularly suited to test out-of-domain generalization. The ICCMA dataset consists of AFs that are part of the ICCMA 2017 benchmark competition.<sup>1</sup> These are large AFs with on average around 650 arguments. Notably, part of this set is generated by generators not in PBBG, namely ABA2AF, admbuster, Planning2AF, sembuster, and traffic. These AFs are suitable to evaluate both scalability and generalization.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental analysis</title>
      <p>To assess the performance of different Graph Neural Network (GNN) aggregation functions in predicting sceptical argument acceptance under preferred semantics, we conduct a series of experiments. We focus on this particular problem due to its computational complexity and the fact that the KWT dataset, specifically designed for this task, presents a challenging scenario for GNNs to solve [<xref ref-type="bibr" rid="ref15">15</xref>]. We create a training dataset, PBBG-train, comprising 100,000 argumentation frameworks (AFs) with argument counts ranging from 5 to 25. Utilizing the same set of generators, we generate three additional datasets: PBBG-val and PBBG-test, each containing 1,000 AFs with exactly 25 arguments, and PBBG-scale, which consists of 1,000 AFs with 100 arguments. PBBG-val is used as a validation set, PBBG-test serves to test the model’s learning efficacy, while PBBG-scale evaluates in-domain scalability. Lastly, we use the same 1,000 KWT instances as previously employed by Kuhlmann and Thimm [<xref ref-type="bibr" rid="ref15">15</xref>] to test out-of-domain generalization and the same 450 AFs from the ICCMA 2017 competition to assess both scalability and out-of-domain generalization capabilities. We adopt the training procedure<sup>2</sup> as described by Craandijk and Bex [<xref ref-type="bibr" rid="ref4">4</xref>]. We evaluate with the Matthews correlation coefficient (MCC), as it is regarded as a balanced measure even when classes are unequally distributed [<xref ref-type="bibr" rid="ref16">16</xref>]. The MCC is a correlation coefficient between -1 and +1, where +1 represents a perfect prediction, 0 an average random prediction and -1 an inverse prediction.</p>
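<p>As a sketch, the MCC can be computed directly from binary confusion-matrix counts (tp, tn, fp, fn), with the common convention of returning 0 when the denominator vanishes:</p>

```python
def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return num / den if den else 0.0  # convention: 0 when undefined
```

<p>A perfect prediction (no false positives or negatives) yields +1, a fully inverted prediction yields -1, and chance-level agreement yields 0.</p>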
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>MCC scores of each aggregation function on the four evaluation datasets.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Aggregator</th><th>PBBG-test</th><th>PBBG-scale</th><th>KWT</th><th>ICCMA</th></tr>
          </thead>
          <tbody>
            <tr><td>Sum</td><td>0.99</td><td>0.81</td><td>0.57</td><td>0.25</td></tr>
            <tr><td>Mean</td><td>0.98</td><td>0.94</td><td>0.61</td><td>0.23</td></tr>
            <tr><td>Max</td><td>0.95</td><td>0.92</td><td>0.72</td><td>0.58</td></tr>
            <tr><td>Attention</td><td>0.99</td><td>0.93</td><td>0.88</td><td>0.58</td></tr>
            <tr><td>Convolution</td><td>0.98</td><td>0.96</td><td>0.62</td><td>0.30</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-5-1">
        <p><sup>1</sup> http://argumentationcompetition.org/2017/ <sup>2</sup> https://github.com/DennisCraandijk/DL-abstract-argumentation</p>
        <p>As Table 1 illustrates, all aggregation functions are able to capture the characteristics of the PBBG-train data, as demonstrated by the results on the PBBG-test dataset. When scaling to larger graphs within the same domain, as evidenced by the PBBG-scale dataset, all aggregation functions generalize well. Only the performance of the Sum aggregator drops, which may be caused by unstable node embeddings, as mentioned in Section 3. A disparity between aggregators emerges on the KWT dataset. Here, the Sum, Mean, and Convolution aggregators struggle to generalize beyond their training data, yielding MCC scores below 0.65. Conversely, the Max and Attention aggregators excel, likely due to their ability to concentrate on specific features within a node’s neighborhood rather than merging all features indiscriminately. Despite this not being the main goal of this work, both aggregators even surpass the previous state-of-the-art result, as reported by Kuhlmann et al. [<xref ref-type="bibr" rid="ref15">15</xref>], in this training regime by a large margin. This effect is also visible on the ICCMA dataset, which nonetheless proves challenging for all GNN variants, as scores top out at 0.58 MCC.</p>
        <p>The Max and Attention aggregators emerge as optimal choices for GNN applications in
abstract argumentation. Contrary to the Mean, Sum, and Convolution aggregators, which
aggregate neighbor features uniformly, the Max and Attention mechanisms empower GNNs to
selectively hone in on specific information. This selective focus allows GNNs to capitalize on
the most significant interactions between arguments and counterarguments, thereby enhancing
their performance in abstract argumentation tasks.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper, we explored the effects of different aggregation functions on the generalizability of GNNs for solving abstract argumentation semantics. Through a comprehensive experimental analysis, we demonstrated that the choice of aggregation function plays a central role in determining the performance and robustness of GNNs in this context. Our findings highlight that while most aggregation functions perform similarly in terms of in-domain scalability, significant differences emerge when evaluating out-of-domain generalization. Specifically, the Max and Attention aggregation functions show better performance in handling AFs with graph properties not seen during training, indicating their potential to capture the dynamics between arguments in argumentation frameworks. Future research could investigate the impact of other architectural design choices and training strategies on model performance and generalizability, especially on the challenging ICCMA dataset.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Atkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Baroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Giacomin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hunter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Prakken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Reed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. R.</given-names>
            <surname>Simari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thimm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Villata</surname>
          </string-name>
          , Towards artificial argumentation,
          <source>AI Mag</source>
          .
          <volume>38</volume>
          (
          <year>2017</year>
          )
          <fpage>25</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Charwat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dvorák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Gaggl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Wallner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Woltran</surname>
          </string-name>
          ,
          <article-title>Methods for solving reasoning problems in abstract argumentation - A survey</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>220</volume>
          (
          <year>2015</year>
          )
          <fpage>28</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Gaggl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Linsbichler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maratea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Woltran</surname>
          </string-name>
          ,
          <article-title>Design and results of the second international competition on computational models of argumentation</article-title>
          ,
          <source>Artif. Intell.</source>
          <volume>279</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Craandijk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bex</surname>
          </string-name>
          ,
          <article-title>Deep learning for abstract argumentation semantics</article-title>
          ,
          <source>in: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020</source>
          , ijcai.org,
          <year>2020</year>
          , pp.
          <fpage>1667</fpage>
          -
          <lpage>1673</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kuhlmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thimm</surname>
          </string-name>
          ,
          <article-title>Using graph convolutional networks for approximate reasoning with abstract argumentation frameworks: A feasibility study</article-title>
          , in: N. B. Amor, B. Quost, M. Theobald (Eds.),
          <source>Scalable Uncertainty Management - 13th International Conference, SUM</source>
          <year>2019</year>
          , Compiègne, France,
          <source>December 16-18</source>
          ,
          <year>2019</year>
          , Proceedings, volume
          <volume>11940</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2019</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Malmqvist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nightingale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manandhar</surname>
          </string-name>
          ,
          <article-title>Determining the acceptability of abstract arguments with graph convolutional networks</article-title>
          , in: S. A. Gaggl, M. Thimm, M. Vallati (Eds.),
          <source>Proceedings of the Third International Workshop on Systems and Algorithms for Formal Argumentation co-located with the 8th International Conference on Computational Models of Argument (COMMA</source>
          <year>2020</year>
          ),
          <year>September 8</year>
          ,
          <year>2020</year>
          , volume
          <volume>2672</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cibier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mailly</surname>
          </string-name>
          ,
          <article-title>Graph convolutional networks and graph attention networks for approximating arguments acceptability - technical report</article-title>
          ,
          <source>CoRR abs/2404.18672</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Dung</surname>
          </string-name>
          ,
          <article-title>On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>77</volume>
          (
          <year>1995</year>
          )
          <fpage>321</fpage>
          -
          <lpage>358</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W. L.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ying</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <article-title>Representation learning on graphs: Methods and applications</article-title>
          ,
          <source>IEEE Data Engineering Bulletin</source>
          <volume>40</volume>
          (
          <year>2017</year>
          )
          <fpage>52</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jegelka</surname>
          </string-name>
          ,
          <article-title>How powerful are graph neural networks?</article-title>
          ,
          <source>in: 7th International Conference on Learning Representations, ICLR</source>
          <year>2019</year>
          , New Orleans, LA, USA, May 6-9, 2019.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>C. K.</given-names> <surname>Joshi</surname></string-name>, <string-name><given-names>Q.</given-names> <surname>Cappart</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Rousseau</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Laurent</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Bresson</surname></string-name>,
          <article-title>Learning TSP requires rethinking generalization</article-title>
          , CoRR abs/2006.07054 (
          <year>2020</year>
          ). arXiv:2006.07054.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Veličković</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cucurull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Casanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Romero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Graph Attention Networks</article-title>
          ,
          <source>International Conference on Learning Representations</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Kipf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Semi-supervised classification with graph convolutional networks</article-title>
          ,
          <source>in: 5th International Conference on Learning Representations, ICLR</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D.</given-names>
            <surname>Craandijk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bex</surname>
          </string-name>
          ,
          <article-title>Enforcement heuristics for argumentation with deep reinforcement learning</article-title>
          ,
          <source>in: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelfth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022, Virtual Event, February 22 - March 1, 2022</source>
          , AAAI Press,
          <year>2022</year>
          , pp.
          <fpage>5573</fpage>
          -
          <lpage>5581</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kuhlmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wujek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thimm</surname>
          </string-name>
          ,
          <article-title>On the impact of data selection when applying machine learning in abstract argumentation</article-title>
          , in:
          <string-name>
            <given-names>F.</given-names>
            <surname>Toni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Polberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Booth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Caminada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kido</surname>
          </string-name>
          (Eds.),
          <source>Computational Models of Argument - Proceedings of COMMA 2022, Cardiff, Wales, UK, 14-16 September 2022</source>
          , volume
          <volume>353</volume>
          of
          <source>Frontiers in Artificial Intelligence and Applications</source>
          , IOS Press,
          <year>2022</year>
          , pp.
          <fpage>224</fpage>
          -
          <lpage>235</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Powers</surname>
          </string-name>
          ,
          <article-title>Evaluation: From precision, recall and F-measure to ROC, informedness, markedness &amp; correlation</article-title>
          ,
          <source>Journal of Machine Learning Technologies</source>
          <volume>2</volume>
          (
          <year>2011</year>
          )
          <fpage>37</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>