Graph Neural Networks For Affective Social Media: A
Comprehensive Overview
Michail Karavokyrisβˆ— , Spyros Sioutas
Computer Engineering and Informatics Department, University of Patras, Patras 26504, Hellas


                                        Abstract
                                        Social media have become the main platforms for expressing and supplementing nuanced human activity such as engaging
                                        in public and private conversations, creating and sharing multimedia content, participating in digital culture events, and
                                        recently describing emotions about events, places, or even products. In this survey, we provide a comprehensive overview of
                                        graph mining and machine learning on affective social media through graph neural networks (GNNs). The latter are capable
                                        of performing a variety of tasks, such as graph and vertex classification, link prediction, and graph clustering using vertex
                                        information, edge information, and topological structure. These capabilities are critical in harnessing the vast emotional
                                        information available in social media in order to generate meaningful and scalable affective analytics.

                                        Keywords
                                        graph neural networks, distributed computation, graph mining, graph convolution, network topology, convergence, link
                                        prediction, label prediction, community discovery, affective computing, PyTorch



1. Introduction

Currently social media are widely considered to be the digital reflection, or even the digital twin in certain cases, of individuals and groups. Among the prime information found in social media are affective indicators such as the emotional polarity of posts or reactions to them. This is especially true of Twitter, which abounds with long conversations full of emotionally charged replies [1][2], whereas Facebook [3] and LinkedIn [4][5] have dedicated emotional reaction buttons for each post. Even Instagram contains images which have been reported to elicit emotional responses [6].
   Typically, in deep learning applications such as fraud detection, natural language processing (NLP), biomedical image processing, and computer vision, the datasets are represented as manifolds in Euclidean space. However, recently the number of engineering scenarios which require non-Euclidean data and instead rely on graphs has been rising. Therein topological relations and interconnectivity play a major role. Graphs enable the modeling of important problems in various scientific fields including complex systems, social networks, protein-protein interaction networks, logistics and long supply chains, transportation networks, knowledge graphs, and others.
   Graph Neural Networks (GNNs) constitute a broad class of neural network architectures which depend strongly on information propagation mechanisms, such as message passing between graph nodes or attention functions between network layers, to encapsulate the higher order communication flow and interplay inherent in graphs. Although their functionality may resemble that of other architectures, like the established multilayer perceptrons (MLPs) found in many machine learning (ML) applications, it is fundamentally different, mainly because the role of higher order patterns is more intense.
   The primary research objective of this conference paper is the presentation of the predominant GNN architectures and their primary properties, as well as how they can be applied to basic tasks related to affective social network analysis. This will give the interested reader a brief yet concise view of the research landscape of a field which is the focus of intense interdisciplinary research.
   The remainder of this work is structured as follows. In section 2 the recent scientific literature regarding GNNs, affective social media, and graph mining is overviewed. Then in section 3 the primary properties of GNNs are enumerated in detail, whereas in section 4 the applications of GNNs to affective social network analysis are presented.

CIKM’22: 31st ACM International Conference on Information and Knowledge Management (companion volume), October 17–21, 2022, Atlanta, GA
βˆ— Corresponding author.
✉ karavokyrism@gmail.com (M. Karavokyris); sioutas@ceid.upatras.gr (S. Sioutas)
ORCID 0000-0002-1263-0785 (M. Karavokyris); 0000-0003-1825-5565 (S. Sioutas)
Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Table 1
Notation Summary

    Symbol                        Meaning                     First in
    ≜                             Equality by definition      Eq. (1)
    ẋ                             First vector derivative     Eq. (4)
    tanh(β‹…)                       Hyperbolic tangent          Eq. (8)
    deg(v)                        Degree of vertex v          Eq. (1)
    diag[d_{1,1}, …, d_{n,n}]     Diagonal matrix             Eq. (1)
    I_n                           n Γ— n identity matrix       Eq. (2)
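To make the notation of table 1 concrete, the following short Python sketch (a toy example added here for illustration; the triangle graph and the helper names are assumptions, not from the paper) constructs the degree matrix of equation (1) and the graph Laplacian of equation (2), assuming zero-degree vertices have already been removed so that D is invertible:

```python
# Illustrative only: a toy 3-vertex triangle graph, not an example from the paper.
# D = diag[deg(v_1), ..., deg(v_n)]   -- equation (1)
# L = I_n - D^{-1} A                  -- equation (2)
# Zero-degree vertices are assumed removed beforehand, so D is invertible.

def degree_matrix(A):
    """Return the diagonal degree matrix D of an adjacency matrix A."""
    n = len(A)
    return [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]

def random_walk_laplacian(A):
    """Return L = I_n - D^{-1} A, computed entry by entry."""
    n = len(A)
    D = degree_matrix(A)
    return [[(1.0 if i == j else 0.0) - A[i][j] / D[i][i] for j in range(n)]
            for i in range(n)]

# Triangle graph: every vertex has degree 2.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

L = random_walk_laplacian(A)
print(L[0])  # [1.0, -0.5, -0.5]
```

Each row of L summing to zero is a quick sanity check before the Laplacian is used for spectral convolution later in the text.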



                                                                                                         1
Michail Karavokyris et al. CEUR Workshop Proceedings                                                               1–10



Future research directions are given in section 5. Capital boldface letters denote matrices, small boldface letters vectors, and normal small letters scalars. Acronyms are explained the first time they are encountered in the text. Additionally, the terms vertex and node are used interchangeably in this work. The same holds true for the terms edge and link. In function definitions parameters follow the respective arguments after a semicolon. Finally, in table 1 the notation used in this work is summarized.


2. Related Work

As stated earlier, GNNs are neural networks tailored for natively handling graphs, or any kind of linked data for that matter [7]. Techniques for doing so include graph embedding [8], message passing [9], and attention mechanisms [10], the latter primarily in the form of graph attention networks (GATs) [11]. The current state of the art in GNNs allows them to perform link prediction [12], graph convolution [13], semi-supervised [14] and unsupervised [15] graph clustering, and node classification [16]. Regarding applications, GNNs have been used to evaluate the affective coherence of ordinary [17] and fuzzy [18] Twitter graphs, to perform content filtering [19], to yield social recommendations [20], to compute recommendations in large scale systems [21], to perform image classification [22], to classify vertices based on their susceptibility in SIS-type propagation models [23], for fake news discovery [24], and for rumor tracing [25]. Comprehensive field reviews regarding GNNs can be found in [26] and also in [27].
   Neural network architectures are ubiquitous in ML [28, 29], especially in conjunction with low rank tensor approximation [30] and signal processing [31]. Bayesian neural networks stem directly from non-classical signal estimation theory [32]. Convolutional neural networks (CNNs) are extensively used in image processing [33]. Recently deep neural networks have been trained to obey physical laws [34]. In [35] a sequence of social graphs is compressed with the two dimensional discrete cosine transform (DCT2) but expanded with a tensor stack network (TSN) trained with information from the entire sequence. Moreover, TSNs have been used for sound classification [36] and large scale urban network speed prediction [37]. Self organizing maps (SOMs) for cultural content recommendation are described in [38]. Recent and extensive reviews on neural network architectures include [39] and [40], where an extended neural network taxonomy is described as well.
   Graph mining aims at locating and extracting latent and non-trivial knowledge from graphs, such as cycle lengths in massive graphs [41], higher order spatiotemporal patterns [42], and triangles [43]. Techniques include employing intelligent agents for autonomous mining [44], approximating directed graphs with undirected ones based on energy criteria [45], managing graph streams with relational algebra [46], computing graph topological correlation [47], efficiently inferring graph isomorphism [48] and performing generic pattern search [49] with GNNs on graphs, and massive graph visualization with feedback for graph matching [50]. Applications of graph mining include among others co-author recommendation [51], efficient new drug discovery [52], consensus protocols in blockchains [53], and energy management in smart power grids [54]. Other considerations include fairness [55], explainability and automation [48], and application to the emerging field of microservices [56].
   Social network analysis, although it relies heavily on graph mining [57], is a distinct field since it also focuses on social media functionality [58], which includes posts [59], conversations [60], and even digital trust as a conditional extension of the one found in the real world [61, 62]. Moreover, psychological aspects such as self-esteem [63] and cognitive ones like consumer engagement and online time [64] play a central role. Among the numerous social media applications can be found stock market trend prediction [65], the acceleration under suitable conditions of open innovation [66], the selection of database architecture according to social queries regarding Twitter account influence [67], the alteration of the value of NFTs depending on the Twitter influence of the respective holder [68], and the data-driven deployment of digital marketing [69]. Reviews of the field include [70], which places special emphasis on community structure discovery, [71], which explores the dynamics of academic social networks and online communities, and [72], where collaborative innovation processes are explored.


3. Graph Neural Networks

3.1. Overview

In this section first the most frequent tasks performed by GNN architectures are described. Then, the most prominent GNN types and their properties are presented.

3.2. GNN Tasks

Typically, every application for affective social media fits into one of the following basic tasks:

    β€’ Node classification: The goal is to predict missing node labels in a social network using the labels of the neighbor nodes. For example, the emotional state of a user can be predicted as a function of the attributes of that user and of its neighbors.
    β€’ Link prediction: In this scenario the objective is to predict the link between various entities in a







      network by utilizing a partial or otherwise incomplete adjacency matrix. This task is frequently used in social network settings because it can predict whether any two vertices, which may well be accounts, pages, or even entire communities, are likely to be connected. Moreover, in certain cases and depending on the available features, the strength of this link may be estimated as well.
    β€’ Community detection: The goal here is to allocate nodes into clusters whose size is unknown beforehand, namely it is a clustering problem. This can be done by partitioning the vertex set based on edge features like weights or, alternatively, by viewing the nodes as items and grouping together items with comparable properties. For instance, community detection can be used in affective social media analysis to locate communities with similar emotional characteristics.

3.3. Architectures

GNNs constitute a class of neural networks based on the dependence between the elements of the graph. The term GNN does not refer to a single algorithm or architecture but rather to a plethora of distinct algorithms. The common denominator for every GNN is the ability to exploit the information inherent in graph topology in order to compute a global steady state. This is most evident in the message passing architectures, but it can also be seen in the other common GNN architectures developed in recent years, like graph convolutional networks (GCNs) and graph attention networks (GATs). In table 2 the architectures examined here and their main properties are presented.

3.3.1. Graph Convolutional Networks

Graph convolutional networks (GCNs), which seek to imitate the functionality of ordinary CNNs, are currently the prime candidate architectures for most real life applications. Specifically, the main idea behind GCNs is to adapt CNNs to natively handle linked data, namely graphs. CNNs, in order to create highly expressive representations, extract multiscale localized spatial information and combine it to yield the final result. In this sense, they exploit the higher order patterns inherent in the data. Since CNNs are able to capture meaningful features across entire data sets, GCNs adjust the operation of convolution from grid data to graph data. Graph convolution uses the features of the neighbors of a given node to make predictions by transforming the features of that node in a latent space. The objective for these models is to train a function of features on a graph, where the input is a set of nodes and edges described by a feature vector that contains their attributes.
   There are two different types of graph convolution operations, which in turn determine the domain a given GCN is defined on:

    β€’ Spatial convolution: These GCNs operate directly on the graph adjacency matrix as if it were a grid but with additional constraints. Thus, convolution is performed in a way similar to images by using spatial features learned from the graph. This is the equivalent of time domain filtering.
    β€’ Spectral convolution: These GCNs utilize the eigendecomposition of the graph Laplacian matrix in order to propagate information across nodes. Therefore, processing takes place in the two-dimensional spatial frequency domain, akin to the transform domain adaptive algorithms.

   Recall that the graph Laplacian of equation (2) can be defined based on the graph degree matrix of equation (1). Observe that nodes of zero degree essentially do not contribute to the overall graph structure and thus are considered to have been removed during a preprocessing stage. Therefore, matrix D is always invertible.

    D ≜ diag[deg(v_1), …, deg(v_n)]    (1)

   With this knowledge the graph Laplacian matrix can then be constructed from the respective adjacency matrix A as shown in equation (2). The eigenexpansion of L is the graph spectrum on the corresponding basis.

    L ≜ I_n βˆ’ D^{βˆ’1} A    (2)

   Although spectral GCNs can construct powerful graph representations and act as convolutional filters for graph classification with considerable accuracy, they fail to utilize the feature locality commonly found in most graphs. Additionally, spectral GCNs come with great computational cost, especially for large networks.
   In order to address the issues of locality and computational complexity, ChebNets were developed to combine CNNs with spectral network theory. Thus, in ChebNets the representation of any feature vector should only be influenced by its k-hop neighbors. ChebNets provide the essential algorithmic foundation and effective schemes, since the convolution is computed using Chebyshev polynomials instead of the eigenvectors of the Laplacian matrix. Therefore, spectral GCNs can be considered as ChebNets where the neighborhood depth equals one. The objective of this model is to learn a function of features which operates on a graph G represented as in equation (3):

    G ≜ (V, E)    (3)

   Specifically, a ChebNet is designed to build an N Γ— F output matrix where F is the number of output attributes






and N is the number of vertices. Said matrix is iteratively constructed given the following graph input.

    β€’ Feature description vectors, one for each of the N nodes, which are stacked to form an N Γ— D feature matrix, where D denotes the number of features.
    β€’ The N Γ— N graph adjacency matrix. Therein are contained all local patterns, and its powers encode all higher order ones.

   Each network layer has a nonlinear function which acts as the ChebNet propagation rule. Models vary based on the choice of the propagation rule and the number of times it is successively applied. The most common propagation rule is ReLU operating on a linear combination of the outputs of the previous layers. The features processed at each layer are aggregated to form the attributes of the following layer. This implies that each node in the k-th layer will collect information from its k-hop neighbors. It has been observed that a small number of layers, typically at most four, suffices.
   Since in this model the aggregated representation of each vertex includes only local features, namely those of its neighbors, this has to be taken into consideration in the structure of the adjacency matrix. This is done in two ways: by adding the identity matrix to it, to allow the construction of its powers, and also by normalizing it similarly to the graph Laplacian of (2). So when GCNs and ChebNets are trained by stochastic gradient descent algorithms, which tend to be sensitive to the scale of input features, there are no vanishing or exploding gradients, which frequently delay or even derail training.
   It should also be mentioned that GCNs are mainly used for semi-supervised node classification, whether binary or multi-class, by adding a softmax layer at the end. Also, by combining graph convolution layers with graph pooling layers, the GCN model is able to predict the class labels for an entire graph.

3.3.2. Graph Attention Networks

Analogous to GCNs, GATs average hidden attributes on a local level. But unlike GCNs, which compute the propagation weights explicitly during training, GATs define them implicitly. This is accomplished by the attention mechanism, namely a learnable function which re-weights synapses between neurons as a function of the values of the hidden features. In this way, the significance of each node can be specified by utilizing more information than the structure of the graph and the connectivity patterns contained in the latter. However, this local aggregation has to be eventually compensated for when values are propagated to other layers, and this is in fact one of the factors differentiating GATs.
   In particular, the synaptic weights are computed as a result of an attention mechanism which computes the normalised coefficients from the unnormalized ones. Typically, the softmax function is the key to normalizing these coefficients, as it can convert a set of raw scores to an exponentially weighted distribution.

3.3.3. Message Passing Neural Networks

Message passing neural networks (MPNNs) are decentralized architectures which rely heavily on message passing in order to perform a given computation. Such communication may take place synchronously or asynchronously. Each node starts with a local ground truth vector and, progressively, based on input from neighboring vertices, evolves into a steady state vector. Although initially the information exchanged between vertices may be inaccurate, this is remedied at later stages, provided the update mechanisms are designed to do so. This is by no means a trivial task, as essentially this is a decentralized nonlinear control problem. Therefore, extended care must be taken beforehand in order to avoid effects such as Witsenhausen’s counterexample [73].
   In contrast to other neural network architectures, MPNNs have a flat architecture in the sense that there are no layers. This implies that the diameter of the network plays a crucial role, as it represents the maximum amount of time, measured in the number of hops, which is necessary in order for a given piece of information to be transmitted across the MPNN. Related metrics such as the effective diameter reveal the links necessary for a considerable segment of the graph to be reached. Strong locality, expressed in the number of triangles or, equivalently, in the clustering coefficient, contributes to quick propagation. On the contrary, bridges may be congestion points. In any case, topology is central in MPNNs and its effects are more intense compared to other GNN types.
   In table 2 the GNN architectures discussed in this section are summarized.

Table 2
GNN Architectures

    GNN architecture     Description
    Message passing      Communication with messages
    Graph convolution    Aggregation of hidden features
    ChebNet              Aggregation of attributes
    Graph attention      Self attention mechanism

3.4. Convergence

3.4.1. State Vectors

Convergence is a major topic since GNNs are distributed and, hence, there is no single point of centralized control. As such, various techniques based on traditional control equations, such as those describing continuous,






linear, and time invariant systems as in equation (4), do not directly apply. Therein A is the system plant, b is the input distribution vector, and x is the state vector.

    ẋ ≜ A x + b u,    A ∈ ℝ^{n×n}, b ∈ ℝ^{n×1}    (4)

   In equation (4) ẋ is defined as the column vector containing the first time derivatives of the control variables x[1] to x[n], as shown in equation (5). The selection of these variables essentially determines the graph model.

    ẋ ≜ [βˆ‚x[1]/βˆ‚t  βˆ‚x[2]/βˆ‚t  …  βˆ‚x[n]/βˆ‚t]^T ∈ ℝ^{n×1}    (5)

   Another control model, based also on the concept of the state vector, which is more general but at the same time less tractable, is the nonlinear control model of equation (6). In the latter f(β‹…) is a nonlinear differentiable vector valued function codifying network dynamics.

    ẋ ≜ f(x, u),    f : ℝ^{(n+1)×1} → ℝ^{n×1}    (6)

   Although the nonlinear control model of (6) covers more cases than its linear counterpart of (4), there are fewer analytical tools to explore and handle it. Moreover, many control related results depend heavily on the properties of f(β‹…). On the contrary, the control model of (4) is appealing for a number of reasons, including tractability and explainability. To this end, many instances of (6) are often linearized with various methods to a time varying version of equation (4), where the properties of the latter hold true locally.

3.4.2. Brouwer’s Fixed Point Lemma

For most message passing architectures an alternative methodology to monitor convergence lies in Brouwer’s fixed point lemma (BFPL). The latter states that any continuous function f(β‹…) mapping any interval I_0 to itself has at least one fixed point s_0 ∈ I_0, as shown in (7).

    s_0 = f(s_0),    f : I_0 β†’ I_0    (7)

   The existence of the fixed point s_0 guarantees that the MPNN cannot escape from it, and as such it is in one of the potentially many steady states. However, this requires that a significant number of neurons reach that state before they start propagating it to their neighbors. Moreover, methodologies based on the BFPL are considered to be indirect in the sense that they monitor the output s of each node and not its internal state vector as before. Therefore, global convergence is tracked through individual vertices. Still, they have been applied successfully, especially when the processing involves smooth functions, in cases where the local computation yields a single scalar. For instance, the BFPL has been applied to MPNNs which employ, with proper scaling, the sigmoid or the hyperbolic tangent as activation function, as shown in equation (8).

    φ(s; α_0, β_0) ≜ α_0 tanh(β_0 s) = α_0 (e^{β_0 s} − e^{−β_0 s}) / (e^{β_0 s} + e^{−β_0 s})    (8)

   As stated earlier, topology plays a central role in convergence, since it determines the average and maximum rate of spatial information propagation in terms of the number of links between any two processing vertices. In table 3 are listed some of the most representative convergence schemes proposed in the bibliography.

Table 3
Graph Neural Network Convergence Criteria

    Type                 Description
    BFPL                 Based on continuous maps
    State convergence    Aggregation of local convergence

3.5. Learning Tasks

Irrespective of their architectural classification, GNNs are called to perform the following fundamental algorithmic tasks across a broad spectrum of applications. These include discovering graph community structure, setting up a message passing mechanism, performing vertex classification, and doing graph convolution.

Figure 1: Graph community discovery.

   Graph community structure discovery is paramount in graph mining as it reveals latent dynamics, as shown in figure 1. Still, in the scientific literature there is more than one definition of what makes a community, as this may well depend on the semantics of the underlying domain. For instance, graphs may be weighted, signed, or undirected. Each such property adds constraints to community discovery. Moreover, since this task relies on higher order patterns, it is also computationally chal-




                                                                 5
Michail Karavokyris et al. CEUR Workshop Proceedings                                                                  1–10



lenging. Consequently, a number of diverse heuristics
have been developed for it.
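One of the simplest such heuristics, label propagation, can be sketched in plain Python as follows; the toy graph, the deterministic tie-breaking rule, and the round limit are illustrative assumptions rather than a canonical algorithm.

```python
from collections import Counter

def label_propagation(adj, rounds=10):
    """Synchronous label propagation: in every round each vertex adopts
    the most frequent label among its neighbors, with ties broken by the
    smallest label; vertices sharing a final label form a community."""
    labels = {v: v for v in adj}  # every vertex starts in its own community
    for _ in range(rounds):
        updated = {}
        for v, neigh in adj.items():
            if not neigh:
                updated[v] = labels[v]
                continue
            votes = Counter(labels[u] for u in neigh)
            top = max(votes.values())
            updated[v] = min(l for l, c in votes.items() if c == top)
        if updated == labels:  # no vertex changed its label
            break
        labels = updated
    return labels

# two triangles joined by the single bridge edge (2, 3)
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(label_propagation(adj))  # → {0: 0, 1: 0, 2: 0, 3: 2, 4: 2, 5: 2}
```

Real deployments additionally handle weighted or signed edges which, as noted above, constrain what counts as a community.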
   Message passing mechanisms are crucial in most engineering scenarios involving graphs, even if indirectly, since most networks are set up in order to achieve coherency and communication. Especially in MPNNs, selecting the attributes represented in the ground truth vector of each vertex is of paramount importance, since that determines what is exchanged during communication. A static snapshot of message passing is shown in figure 2.
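To make the exchange concrete, the sketch below performs synchronous message passing with a single scalar per vertex and monitors convergence indirectly through the per-vertex outputs, in the spirit of the BFPL criterion of section 3.4.2; the averaging update rule and the toy path graph are illustrative assumptions, not a prescribed MPNN design.

```python
def mp_round(adj, state):
    """One synchronous round: every vertex averages its own scalar output
    with the mean of the outputs received from its neighbors."""
    return {
        v: 0.5 * state[v] + 0.5 * sum(state[u] for u in neigh) / len(neigh)
        for v, neigh in adj.items()
    }

def run_until_stable(adj, state, tol=1e-9, max_rounds=10_000):
    """Indirect convergence monitoring: stop once no vertex output moves
    by more than tol between two consecutive rounds."""
    for rounds in range(1, max_rounds + 1):
        nxt = mp_round(adj, state)
        if max(abs(nxt[v] - state[v]) for v in adj) < tol:
            return nxt, rounds
        state = nxt
    return state, max_rounds

# path graph 0 - 1 - 2 with arbitrary initial outputs
adj = {0: [1], 1: [0, 2], 2: [1]}
final, rounds = run_until_stable(adj, {0: 0.0, 1: 1.0, 2: 0.5})
print(round(final[0], 6), round(final[1], 6), round(final[2], 6))
```

Under this contraction-like update all vertices settle on a common value, which is precisely the kind of steady state the indirect criterion detects.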


Figure 2: GNN message passing.

Figure 3: Node classification.

Figure 4: Graph convolution.

   Node classification is another important task, where each vertex is assigned one out of many possible labels drawn from a finite label set based on a decision rule. This functionality is shown in figure 3. Labels may be repeated and, depending on the problem, some vertices may already have a label. Moreover, this task has close ties with the community discovery task, although in classification nonadjacent nodes may have the same label. More recently, ML models which can utilize structural and functional attributes, whenever the latter are available, have been proposed in the literature. It should be noted, though, that functional features depend heavily on the underlying domain, whereas structural attributes can be applied to any scenario.
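A minimal sketch of one such decision rule, majority voting over already labeled neighbors, is given below in plain Python; the seeded labels, the tie-breaking rule, and the toy path graph are hypothetical choices for illustration only.

```python
from collections import Counter

def classify_nodes(adj, seed_labels, rounds=5):
    """Semi-supervised decision rule: unlabeled vertices repeatedly adopt
    the majority label among their already labeled neighbors, while the
    seeded vertices keep their labels; ties break on the smaller label."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        for v in adj:
            if v in seed_labels:
                continue
            votes = Counter(labels[u] for u in adj[v] if u in labels)
            if votes:
                top = max(votes.values())
                labels[v] = min(l for l, c in votes.items() if c == top)
    return labels

# path 0 - 1 - 2 - 3 - 4 with only the endpoints labeled
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = classify_nodes(adj, {0: "happy", 4: "sad"})
```

The seeded vertices correspond to the case, mentioned above, where some vertices already have a label.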
   Graph convolution is an operation involving a pair of graphs which yields a larger one whose topology depends on theirs. This allows the efficient discovery of local patterns and, depending on how convolution is defined, even of their variants or incomplete instances. This operation initially appeared in the field of computer vision and has since found numerous applications in social media analysis and ML. Figure 4 shows an instance of this operation.

Figure 5: Link prediction.

   Finally, in figure 5 the task of link prediction is shown. It is an important task where, given a partial or an evolving graph, a decision rule must be devised which can predict whether a link between any two given nodes exists. In order to determine whether such a link should be added to the graph, a segment of the graph considered as ground truth is used, along with the assumption that scale-free graphs exhibit self-similarity at many levels. Alternatively, state vectors in every vertex or structural patterns may be used to train an ML model. Either case may require a considerable amount






Table 4
Computational Tasks For Each Affective Computing Task

                  Affective task              Computational tasks
                  Node affective state        Graph attention, node classification
                  Edge emotional potential    Node classification, message passing, graph attention
                  Post emotional potential    Node classification, link prediction, graph convolution
                  Node affective influence    Message passing, link prediction, node classification
                  Affective communities       Community discovery, node classification, link prediction



of computational resources, depending on the algorithm.


4. Affective Social Media Analysis

Affective computing is a recent field which extends the existing knowledge in social network analysis with emotional attributes and their study. It has already borne fruit [5, 4] and its prospects look bright with the advent of sophisticated DL techniques such as the GNN architectures described earlier, but also autoencoders, graph adversarial networks (GANs), and CNNs. All these models operate on a plethora of affective attributes including, among others, word length and polarity, number of sentences, use of punctuation, mentions, and words having special meaning such as modifiers, negations, and words of considerable emotional weight.
   As stated above, affective social media analysis places emphasis on the emotional state of social media accounts through their posts as well as through the interactions between them. The methodologies most commonly found in the scientific literature can be broadly divided into the following categories. Furthermore, table 4 shows how each of the affective applications presented in this section can take advantage of the potential offered by the learning tasks of GNNs.
   The determination of the affective state of a node or a group of nodes is paramount as it allows, among others, for locating potential starting points for various online digital campaigns with political, commercial, or social topics. Moreover, it determines which sort of messages are appropriate for a given node given its affective state. To this end, a number of node classification techniques or, more recently, graph attention-based mechanisms can be applied. Given the phenomenon of homophily in social media, stating that nodes with similar behavior eventually tend to connect with each other, the neighborhood of the vertex under consideration may well provide additional affective attributes.
   In a sense, the dual problem of the above is finding the affective potential of an edge, as the latter is primarily a function of the affective state of its endpoints. However, since links in a network may accommodate other communication needs, for instance that of the respective communities in the case of a bridge, it also depends on its functionality. As such, in addition to node classification and graph attention, analysis pertaining to message passing should be employed.
   Tracing the emotional effect of a post is more challenging, since a number of interconnected instances of the previous problem should be studied as a post propagates through a graph. Moreover, possible variations of or intentional modifications to the latter should also be taken into consideration, as well as the overall information context of the adjacent edges and vertices. Consequently, the entire route of a post should be analyzed in this case using graph convolutions and node classification, whereas certain propagation patterns of important posts may be explained with link prediction techniques.
   The affective influence of a node can be considered as a generalization of a potentially nonlinear combination of determining the emotional state of a number of vertices with evaluating the impact of the posts of the node under consideration. This happens as influence is frequently taken to be a function of the topological properties of its high order neighborhood and of the emotional potential of its posts. In order to evaluate said affective influence, node classification techniques, message passing, and link prediction are frequently employed.
   Finally, affective community discovery is perhaps the most challenging of the tasks commonly encountered in affective social media analysis, since it entails the computation of various higher order influence metrics. Therefore, a considerable portion of, or even the entire, graph topology and, depending on the problem, perhaps the associated functionality must be factored in. However, a far more accurate insight into the total network dynamics is obtained. Therefore, approximate analysis of an evolving network for a number of steps can take place before such a computation can be performed again.


5. Conclusions

This conference paper focuses on a comprehensive presentation of a large number of graph neural network architectures tailored for performing affective analysis on social media. The latter abound with heterogeneous







human emotional information coming from sources as diverse as text, music, images, and even direct emotional markings. Therefore, there is more than sufficient space in social networks to develop information processing strategies aiming at deducing numerous affective attributes such as word and sentence emotional polarities. Such attributes are critical in applications such as political or commercial digital campaigns, or even in assisting professionals in timely diagnosing mental illness.
   Regarding future research directions, more affective applications of GNNs can be explored. Moreover, new GNN architectures may be better suited for the tasks presented here.


Acknowledgments

This conference paper is part of Project 451, a long term research initiative with a primary objective of developing novel, scalable, numerically stable, and interpretable higher order analytics.


References

 [1] S. Rani, A. K. Bashir, A. Alhudhaif, D. Koundal, E. S. Gunduz, et al., An efficient CNN-LSTM model for sentiment detection in #BlackLivesMatter, Expert Systems with Applications 193 (2022).
 [2] W. Luo, W. Zhang, Y. Zhao, A survey of transformer and GNN for aspect-based sentiment analysis, in: CISAI, IEEE, 2021, pp. 353–357.
 [3] A. Rodriguez, Y.-L. Chen, C. Argueta, FADOHS: Framework for detection and integration of unstructured data of hate speech on Facebook using sentiment and emotion analysis, IEEE Access 10 (2022) 22400–22419.
 [4] M. Bossetta, R. Schmøkel, Cross-platform emotions and audience engagement in social media political campaigning: Comparing candidates' Facebook and Instagram images in the 2020 US election, Political Communication (2022) 1–21.
 [5] V. Ahire, S. Borse, Emotion detection from social media using machine learning techniques: A survey, in: Applied Information Processing Systems, Springer, 2022, pp. 83–92.
 [6] S. E. McComb, J. S. Mills, Young women's body image following upwards comparison to Instagram models: The role of physical appearance perfectionism and cognitive emotion regulation, Body Image 38 (2021) 49–62.
 [7] B. Sanchez-Lengeling, E. Reif, A. Pearce, A. B. Wiltschko, A gentle introduction to graph neural networks, Distill 6 (2021) e33.
 [8] S. Kumar, A. Mallik, A. Khetarpal, B. Panda, Influence maximization in social networks using graph embedding and graph neural network, Information Sciences 607 (2022) 1617–1636.
 [9] F. Gao, J. Zhang, Y. Zhang, Neural enhanced dynamic message passing, in: International Conference on Artificial Intelligence and Statistics, PMLR, 2022, pp. 10471–10482.
[10] S. Miao, M. Liu, P. Li, Interpretable and generalizable graph learning via stochastic attention mechanism, in: ICML, PMLR, 2022, pp. 15524–15543.
[11] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, 2017. arXiv:1710.10903.
[12] Y. Long, M. Wu, Y. Liu, Y. Fang, C. K. Kwoh, J. Chen, J. Luo, X. Li, Pre-training graph neural networks for link prediction in biomedical networks, Bioinformatics 38 (2022) 2254–2262.
[13] K. Han, Y. Wang, J. Guo, Y. Tang, E. Wu, Vision GNN: An image is worth graph of nodes, 2022. arXiv:2206.00272.
[14] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, 2016. arXiv:1609.02907.
[15] A. Gupta, P. Matta, B. Pant, Graph neural network: Current state of art, challenges and applications, Materials Today: Proceedings 46 (2021) 10927–10932.
[16] Y. Chen, Y. Zheng, Z. Xu, T. Tang, Z. Tang, J. Chen, Y. Liu, Cross-domain few-shot classification based on lightweight Res2Net and flexible GNN, Knowledge-Based Systems 247 (2022).
[17] G. Drakopoulos, I. Giannoukou, P. Mylonas, S. Sioutas, A graph neural network for assessing the affective coherence of Twitter graphs, in: IEEE Big Data, IEEE, 2020, pp. 3618–3627. doi:10.1109/BigData50022.2020.9378492.
[18] G. Drakopoulos, E. Kafeza, P. Mylonas, S. Sioutas, A graph neural network for fuzzy Twitter graphs, in: G. Cong, M. Ramanath (Eds.), CIKM companion volume, volume 3052, CEUR-WS.org, 2021.
[19] Y. Liu, K. Zeng, H. Wang, X. Song, B. Zhou, Content matters: A GNN-based model combined with text semantics for social network cascade prediction, in: PAKDD, Springer, 2021, pp. 728–740.
[20] Z. Guo, H. Wang, A deep graph neural network-based mechanism for social recommendations, IEEE Transactions on Industrial Informatics 17 (2020) 2776–2783.
[21] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, J. Leskovec, Graph convolutional neural networks for Web-scale recommender systems, in: KDD, ACM, 2018, pp. 974–983.
[22] G. Drakopoulos, I. Giannoukou, P. Mylonas, S. Sioutas, On tensor distances for self organizing maps: Clustering cognitive tasks, in: DEXA, volume 12392 of Lecture Notes in Computer Science, Springer, 2020, pp. 195–210. doi:10.1007/978-3-030-59051-2_13.
[23] W. Xia, Y. Li, J. Wu, S. Li, DeepIS: Susceptibility estimation on social networks, in: WSDM, 2021, pp. 761–769.
[24] A. Benamira, B. Devillers, E. Lesot, A. K. Ray, M. Saadi, F. D. Malliaros, Semi-supervised learning and graph neural networks for fake news detection, in: ASONAM, IEEE, 2019, pp. 568–569.
[25] S. Xu, X. Liu, K. Ma, F. Dong, B. Riskhan, S. Xiang, C. Bing, Rumor detection on social media using hierarchically aggregated feature via graph neural networks, Applied Intelligence (2022) 1–14.
[26] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M. Sun, Graph neural networks: A review of methods and applications, AI Open 1 (2020) 57–81.
[27] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, S. Y. Philip, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems 32 (2020) 4–24.
[28] R. Zhao, Z. Yang, H. Zheng, Y. Wu, F. Liu, Z. Wu, L. Li, F. Chen, S. Song, J. Zhu, et al., A framework for the general design and computation of hybrid neural networks, Nature Communications 13 (2022) 1–12.
[29] K. T. Schütt, M. Gastegger, A. Tkatchenko, K.-R. Müller, R. J. Maurer, Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions, Nature Communications 10 (2019) 1–10.
[30] X. Jiang, L. Zhang, L. Qiao, D. Shen, Estimating functional connectivity networks via low-rank tensor approximation with applications to MCI identification, IEEE Transactions on Biomedical Engineering 67 (2019) 1912–1920.
[31] R. Agarwal, L. Melnick, N. Frosst, X. Zhang, B. Lengerich, R. Caruana, G. E. Hinton, Neural additive models: Interpretable machine learning with neural nets, Advances in Neural Information Processing Systems 34 (2021) 4699–4711.
[32] L. V. Jospin, H. Laga, F. Boussaid, W. Buntine, M. Bennamoun, Hands-on Bayesian neural networks – A tutorial for deep learning users, IEEE Computational Intelligence Magazine 17 (2022) 29–48.
[33] J. Guo, K. Han, H. Wu, Y. Tang, X. Chen, Y. Wang, C. Xu, CMT: Convolutional neural networks meet vision transformers, in: CVPR, IEEE/CVF, 2022, pp. 12175–12185.
[34] L. G. Wright, T. Onodera, M. M. Stein, T. Wang, D. T. Schachter, Z. Hu, P. L. McMahon, Deep physical neural networks trained with backpropagation, Nature 601 (2022) 549–555.
[35] G. Drakopoulos, E. Kafeza, P. Mylonas, L. Iliadis, Transform-based graph topology similarity metrics, NCAA 33 (2021) 16363–16375. doi:10.1007/s00521-021-06235-9.
[36] A. Khamparia, D. Gupta, N. G. Nguyen, A. Khanna, B. Pandey, P. Tiwari, Sound classification using convolutional neural network and tensor deep stacking network, IEEE Access 7 (2019) 7717–7727.
[37] L. Zhou, S. Zhang, J. Yu, X. Chen, Spatial–temporal deep tensor neural networks for large-scale urban network speed prediction, IEEE Transactions on Intelligent Transportation Systems 21 (2019) 3718–3729.
[38] G. Drakopoulos, I. Giannoukou, S. Sioutas, P. Mylonas, Self organizing maps for cultural content delivery, NCAA (2022). doi:10.1007/s00521-022-07376-1.
[39] H. Song, M. Kim, D. Park, Y. Shin, J.-G. Lee, Learning from noisy labels with deep neural networks: A survey, IEEE Transactions on Neural Networks and Learning Systems (2022).
[40] H. Yuan, H. Yu, S. Gui, S. Ji, Explainability in graph neural networks: A taxonomic survey, IEEE Transactions on Pattern Analysis and Machine Intelligence (2022).
[41] L. Torres, P. Suárez-Serrato, T. Eliassi-Rad, Non-backtracking cycles: Length spectrum theory and graph mining applications, Applied Network Science 4 (2019) 1–35.
[42] Y. Gao, X. Li, J. Li, Y. Gao, N. Guo, Graph mining-based trust evaluation mechanism with multidimensional features for large-scale heterogeneous threat intelligence, in: IEEE International Conference on Big Data, IEEE, 2018, pp. 1272–1277.
[43] M. Koohi Esfahani, P. Kilpatrick, H. Vandierendonck, LOTUS: Locality optimizing triangle counting, in: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2022, pp. 219–233.
[44] M. Yoon, T. Gervet, B. Hooi, C. Faloutsos, Autonomous graph mining algorithm search with best performance trade-off, Knowledge and Information Systems (2022) 1–32.
[45] G. Drakopoulos, E. Kafeza, P. Mylonas, S. Sioutas, Approximate high dimensional graph mining with matrix polar factorization: A Twitter application, in: IEEE Big Data, IEEE, 2021, pp. 4441–4449. doi:10.1109/BigData52589.2021.9671926.
[46] K. Wang, Z. Zuo, J. Thorpe, T. Q. Nguyen, G. H. Xu, RStream: Marrying relational algebra with streaming for efficient graph mining on a single machine, in: OSDI, 2018, pp. 763–782.
[47] G. Drakopoulos, E. Kafeza, One dimensional cross-correlation methods for deterministic and stochastic graph signals with a Twitter application in Julia, in: SEEDA-CECNSM, IEEE, 2020. doi:10.1109/SEEDA-CECNSM49515.2020.9221815.
[48] Z. Chen, S. Villar, L. Chen, J. Bruna, On the equivalence between graph isomorphism testing and function approximation with GNNs, Advances in Neural Information Processing Systems 32 (2019).
[49] W. Jin, Graph mining with graph neural networks, in: Proceedings of the 14th ACM International Conference on Web Search and Data Mining, 2021, pp. 1119–1120.
[50] Q. Zhang, X. Song, Y. Yang, H. Ma, R. Shibasaki, Visual graph mining for graph matching, Computer Vision and Image Understanding 178 (2019) 16–29.
[51] F. Ebrahimi, A. Asemi, A. Nezarat, A. Ko, Developing a mathematical model of the co-author recommender system using graph mining techniques and big data applications, Journal of Big Data 8 (2021) 1–15.
[52] R. S. Olayan, H. Ashoor, V. B. Bajic, DDR: Efficient computational method to predict drug–target interactions using graph mining and machine learning approaches, Bioinformatics 34 (2018) 1164–1173.
[53] G. Drakopoulos, E. Kafeza, H. Al Katheeri, Proof systems in blockchains: A survey, in: SEEDA-CECNSM, IEEE, 2019. doi:10.1109/SEEDA-CECNSM.2019.8908397.
[54] C. Fan, M. Song, F. Xiao, X. Xue, Discovering complex knowledge in massive building operational data using graph mining for building energy management, Energy Procedia 158 (2019) 2481–2487.
[55] J. Kang, J. He, R. Maciejewski, H. Tong, InFoRM: Individual fairness on graph mining, in: KDD, 2020, pp. 379–389.
[56] W. Lin, M. Ma, D. Pan, P. Wang, Facgraph: Frequent anomaly correlation graph mining for root cause diagnose in micro-service architecture, in: IPCCC, IEEE, 2018, pp. 1–8.
[57] S. Tabassum, F. S. Pereira, S. Fernandes, J. Gama, Social network analysis: An overview, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8 (2018).
[58] F. Baumann, P. Lorenz-Spreen, I. M. Sokolov, M. Starnini, Modeling echo chambers and polarization dynamics in social networks, Physical Review Letters 124 (2020).
[59] R. Urena, G. Kou, Y. Dong, F. Chiclana, E. Herrera-Viedma, A review on trust propagation and opinion dynamics in social networks and group decision making frameworks, Information Sciences 478 (2019) 461–475.
[60] D. Ye, S. Pennisi, Analysing interactions in online discussions through social network analysis, Journal of Computer Assisted Learning 38 (2022) 784–796.
[61] M. Zhang, P. Xu, Y. Ye, Trust in social media brands and perceived media values: A survey study in China, Computers in Human Behavior 127 (2022).
[62] G. Drakopoulos, E. Kafeza, P. Mylonas, H. Al Katheeri, Building trusted startup teams from LinkedIn attributes: A higher order probabilistic analysis, in: ICTAI, IEEE, 2020, pp. 867–874. doi:10.1109/ICTAI50040.2020.00136.
[63] D. P. Cingel, M. C. Carter, H.-V. Krause, Social media and self-esteem, Current Opinion in Psychology (2022).
[64] S. P. Eslami, M. Ghasemaghaei, K. Hassanein, Understanding consumer engagement in social media: The role of product lifecycle, Decision Support Systems 162 (2022).
[65] D. Valle-Cruz, V. Fernandez-Cortez, A. López-Chau, R. Sandoval-Almazán, Does Twitter affect stock market decisions? Financial sentiment analysis during pandemics: A comparative study of the H1N1 and the COVID-19 periods, Cognitive Computation 14 (2022) 372–387.
[66] J. R. Saura, D. Palacios-Marqués, D. Ribeiro-Soriano, Exploring the boundaries of open innovation: Evidence from social media mining, Technovation (2022).
[67] M. Marountas, G. Drakopoulos, P. Mylonas, S. Sioutas, Recommending database architectures for social queries: A Twitter case study, in: AIAI, Springer, 2021. doi:10.1007/978-3-030-79150-6_56.
[68] A. Kapoor, D. Guhathakurta, M. Mathur, R. Yadav, M. Gupta, P. Kumaraguru, TweetBoost: Influence of social media on NFT valuation, in: Companion Proceedings of the Web Conference, 2022, pp. 621–629.
[69] B. Nyagadza, Search engine marketing and social media marketing predictive trends, Journal of Digital Media & Policy (2022).
[70] D. Naik, D. Ramesh, A. H. Gandomi, N. B. Gorojanam, Parallel and distributed paradigms for community detection in social networks: A methodological review, Expert Systems with Applications 187 (2022).
[71] X. Kong, Y. Shi, S. Yu, J. Liu, F. Xia, Academic social networks: Modeling, analysis, mining and applications, Journal of Network and Computer Applications 132 (2019) 86–103.
[72] M. B. Dahesh, G. Tabarsa, M. Zandieh, M. Hamidizadeh, Reviewing the intellectual structure and evolution of the innovation systems approach: A social network analysis, Technology in Society 63 (2020).
[73] T. Basar, Variations on the theme of the Witsenhausen counterexample, in: Conference on Decision and Control, IEEE, 2008, pp. 1614–1619.