<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SEBD</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Multi-Perspective Approach for Risky User Identification in Social Networks</article-title>
        <subtitle>Discussion Paper</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio Pellicani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianvito Pio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michelangelo Ceci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Big Data Lab, National Interuniversity Consortium for Informatics (CINI)</institution>
          ,
          <addr-line>Via Volturno, 58, 00185 Roma</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Computer Science, University of Bari "Aldo Moro"</institution>
          ,
          <addr-line>Via E. Orabona, 4, 70125 Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Jožef Stefan Institute</institution>
          ,
          <addr-line>Jamova Cesta 39, 1000 Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>31</volume>
      <fpage>02</fpage>
      <lpage>05</lpage>
      <abstract>
        <p>Social networks have become an integral part of modern communication, allowing people to connect and interact across the globe. However, they also bring along some negative phenomena, such as cyberbullying and social media addiction. As a result, monitoring user behavior and content has become essential to ensure a safe and responsible use of social networks. In this context, we recently proposed a novel system called SAIRUS, that we describe in this discussion paper. SAIRUS adopts three separate models to learn from multiple perspectives of social network data, namely the content posted by users, their relationships and their spatial closeness. We compare the system performance with 13 competitors on two real world datasets, demonstrating its superiority in identifying risky users and its usefulness as a tool for social network analysis.</p>
      </abstract>
      <kwd-group>
        <kwd>Social Network Analysis</kwd>
        <kwd>User Risk Identification</kwd>
        <kwd>Spatial Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Social networks enable people to connect and share news, opinions, and ideas through actions
such as posting, liking, and following each other. This peculiarity fosters the creation of
relationships and facilitates engagement in discussions on diverse topics and events. The widespread
use of social networks has stimulated extensive research by the scientific community, mainly
based on the use of Social Network Analysis (SNA) processes to explore the relationships and
information exchange among users in the network [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In this context, our goal is to analyze
social networks and identify risky users who engage in bad or illegal activities, such as drug
selling or promotion, political or religious extremism, and discrimination against specific groups.
      </p>
      <p>
        The identification of risky users is important for suspending suspicious accounts and
preventing harmful behaviors in social network platforms. Many recent studies have focused on
this area, including works on cyber-extremism and the identification of jihadist accounts [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
Methodologically, the identification of risky users can be approached as a node classification task
and thus can be generally categorized into three approaches: content-based, topology-based,
and hybrid. Content-based approaches focus on analyzing user-generated content [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], while
topology-based approaches consider only user relationships (e.g., established through following,
liking, or commenting actions) in the network [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. Finally, hybrid approaches combine the
strengths of content-based and topology-based methods, making them particularly effective in
classifying borderline users who may have a mix of both safe and unsafe content or relationships
[
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. Journalists are a well-known example of users who may fall into this category.
      </p>
      <p>
        It is noteworthy that social networks have become popular due to the possibility of interacting
with them using mobile devices, which also integrate geolocation mechanisms. However, there
have only been a few early attempts to use the spatial dimension in the analysis of (social)
network data [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ], and many of the existing general approaches are unable to take into
account the information conveyed by the geographic locations of the users, which can implicitly
define new relationships among them. To fill this gap, we proposed SAIRUS [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which takes
into account the content generated by users, their relationships in the network, and their
geographical positions to identify risky users. SAIRUS fuses three node classification models,
each learned from a different perspective, using a stacked generalization approach to obtain a
more robust final model, which also exploits the uncertainty of the predictions.
      </p>
      <p>Unlike existing hybrid approaches that inject artificially-defined features related to a
perspective into the other(s), SAIRUS allows a separate focus on each perspective and ultimately
combines their contribution to learn a final classifier. Specifically, for the user-generated content,
SAIRUS uses word embeddings to train two autoencoders specialized in identifying safe and
risky users; for user relationships and spatial information, two separate embeddings based on
the analysis of network data are extracted and two classifiers are trained on top. In the following
section, we provide some details about such approaches adopted by SAIRUS.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The method SAIRUS</title>
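      <p>Before introducing SAIRUS, we provide a formal definition of a social network as a 4-tuple
<inline-formula><tex-math>\langle N, C, A, T \rangle</tex-math></inline-formula>, where:</p>
      <list list-type="bullet">
        <list-item>
          <p><inline-formula><tex-math>N = N_L \cup N_U</tex-math></inline-formula> (<inline-formula><tex-math>N_L \cap N_U = \emptyset</tex-math></inline-formula>) is the set of users, either labeled (<inline-formula><tex-math>N_L</tex-math></inline-formula>) or unlabeled (<inline-formula><tex-math>N_U</tex-math></inline-formula>). Each labeled user is associated with the category safe or risky.</p>
        </list-item>
        <list-item>
          <p><inline-formula><tex-math>C</tex-math></inline-formula> is the set of textual documents produced by users, that is, the posts. Each document <inline-formula><tex-math>c \in C</tex-math></inline-formula> is associated with a timestamp and a geographical location.</p>
        </list-item>
        <list-item>
          <p><inline-formula><tex-math>A \subseteq N \times C</tex-math></inline-formula> refers to the relationship between users and the textual content they produce or share, specifically the action of creating or posting a particular textual content.</p>
        </list-item>
        <list-item>
          <p><inline-formula><tex-math>T \subseteq N \times N</tex-math></inline-formula> represents the topology of the social network, determined by the connections established between users through social relationships, e.g., follows.</p>
        </list-item>
      </list>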
      <p>In Figure 1, the four key stages performed by SAIRUS are depicted: i) the semantic content
analysis of the textual documents generated by users, ii) the network topology analysis of user
relationships, iii) the analysis of spatial closeness among users, and iv) the model fusion. In the
following subsections, we briefly detail each of them.</p>
      <sec id="sec-2-1">
        <title>2.1. Semantic analysis of the user-generated content</title>
        <p>The goal of this stage is to analyze the textual content produced by users and classify them as
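either safe or risky. It takes as input the set of textual documents <inline-formula><tex-math>C</tex-math></inline-formula> and the set of relationships
<inline-formula><tex-math>A</tex-math></inline-formula> representing the link between users and the textual documents they posted. SAIRUS first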
applies standard Natural Language Processing (NLP) techniques such as tokenization, stopword
removal, and stemming. Then, it concatenates all preprocessed documents posted by each
user, taking into account the temporal order of the documents. This choice allows SAIRUS to
implicitly capture the temporal evolution of the topics discussed by the user.</p>
        <p>
          Subsequently, SAIRUS generates a <inline-formula><tex-math>k_c</tex-math></inline-formula>-dimensional feature vector for each user, by applying
the Word2Vec embedding method on each word of the concatenated documents. Specifically, an
embedding for each user is obtained by summing up the embeddings of the words composing
his/her concatenated document, according to the additive compositionality property [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
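        <p>The additive composition described above can be illustrated with a minimal sketch. It assumes that word vectors are already available as a plain dictionary (here a hypothetical toy table, not a trained Word2Vec model) and that out-of-vocabulary words are simply skipped; neither detail is specified in this paper.</p>
        <preformat>
```python
from typing import Dict, List


def user_embedding(tokens: List[str],
                   word_vectors: Dict[str, List[float]],
                   dim: int) -> List[float]:
    """Sum the embeddings of the known words of a user's concatenated document."""
    total = [0.0] * dim
    for token in tokens:
        vector = word_vectors.get(token)
        if vector is None:
            # Assumption: words without an embedding are ignored.
            continue
        for i, value in enumerate(vector):
            total[i] += value
    return total


# Hypothetical 3-dimensional vectors standing in for a trained model.
toy_vectors = {"match": [0.5, 0.0, 1.0], "tonight": [0.25, 1.0, 0.0]}

# Preprocessed posts of one user, already sorted by timestamp.
posts_in_time_order = [["match", "tonight"], ["match", "unknown"]]
tokens = [token for post in posts_in_time_order for token in post]

print(user_embedding(tokens, toy_vectors, dim=3))
```
</preformat>
        <p>In this sketch the user vector has the same dimensionality as the word vectors, which is why a single embedding size can be used for the whole content perspective.</p>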
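        <p>In the final step, we focus on the labeled users <inline-formula><tex-math>N_L</tex-math></inline-formula>. We train two distinct
one-class classifiers using stacked autoencoders: <inline-formula><tex-math>AE_R</tex-math></inline-formula> on the vector representations of labeled
risky users and <inline-formula><tex-math>AE_S</tex-math></inline-formula> on the vector representations of labeled safe users. For each unlabeled user
<inline-formula><tex-math>u \in N_U</tex-math></inline-formula>, we provide the corresponding vector representation to both autoencoders <inline-formula><tex-math>AE_S</tex-math></inline-formula> and <inline-formula><tex-math>AE_R</tex-math></inline-formula>
and calculate their reconstruction errors <inline-formula><tex-math>RE_S(u)</tex-math></inline-formula> and <inline-formula><tex-math>RE_R(u)</tex-math></inline-formula>. As a result, the semantic analysis
of a user's textual content produces three outputs: i) the reconstruction error <inline-formula><tex-math>RE_S(u)</tex-math></inline-formula> obtained
by the autoencoder <inline-formula><tex-math>AE_S</tex-math></inline-formula>, ii) the reconstruction error <inline-formula><tex-math>RE_R(u)</tex-math></inline-formula> obtained by the autoencoder <inline-formula><tex-math>AE_R</tex-math></inline-formula>,
iii) the predicted label <inline-formula><tex-math>l_c(u) \in \{S, R\}</tex-math></inline-formula> (safe or risky), which corresponds to the autoencoder
that achieves the minimum reconstruction error. These outputs are used in the model fusion phase (see Figure 1).</p>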
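        <p>The decision rule of the content perspective can be sketched as follows. The two autoencoders are replaced by hypothetical stand-in functions, the reconstruction error is assumed to be the mean squared error, and ties are assumed to be resolved in favor of the safe label; the paper does not state these details.</p>
        <preformat>
```python
from typing import Callable, List, Tuple

Vector = List[float]


def reconstruction_error(x: Vector, reconstruct: Callable[[Vector], Vector]) -> float:
    """Mean squared error between a vector and its reconstruction (assumed metric)."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def content_outputs(x: Vector,
                    safe_ae: Callable[[Vector], Vector],
                    risky_ae: Callable[[Vector], Vector]) -> Tuple[float, float, str]:
    """Return both reconstruction errors and the label of the better-fitting model."""
    err_safe = reconstruction_error(x, safe_ae)
    err_risky = reconstruction_error(x, risky_ae)
    # Assumption: a tie is labeled as safe.
    label = "risky" if err_safe > err_risky else "safe"
    return err_safe, err_risky, label


# Hypothetical stand-ins for trained autoencoders: each one "reconstructs"
# every input as the centre of the class it was trained on.
def safe_ae(x: Vector) -> Vector:
    return [0.0 for _ in x]


def risky_ae(x: Vector) -> Vector:
    return [1.0 for _ in x]


print(content_outputs([0.9, 1.1], safe_ae, risky_ae))
```
</preformat>
        <p>Returning the two errors together with the label keeps the information that the fusion stage needs to judge how clear-cut the content-based decision was.</p>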
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Analysis of the network of relationships</title>
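        <p>SAIRUS considers the topology of the social network by directly analyzing the adjacency matrix
<inline-formula><tex-math>M_r \in \mathbb{R}^{|N| \times |N|}</tex-math></inline-formula>, where <inline-formula><tex-math>(M_r)_{ij} = 1</tex-math></inline-formula> if <inline-formula><tex-math>(u_i, u_j) \in T</tex-math></inline-formula>, <inline-formula><tex-math>(M_r)_{ij} = 0</tex-math></inline-formula> otherwise, and <inline-formula><tex-math>u_i</tex-math></inline-formula> and <inline-formula><tex-math>u_j</tex-math></inline-formula> are the <inline-formula><tex-math>i</tex-math></inline-formula>-th
and the <inline-formula><tex-math>j</tex-math></inline-formula>-th user of the network, respectively. However, the analysis of adjacency matrices
may lead to issues due to high dimensionality and sparseness, since each user usually tends to
establish relationships with a very small percentage of the whole set of users.</p>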
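        <p>A small sketch of the adjacency matrix construction, together with a density measure that makes the sparseness issue concrete. The user identifiers and the follow relationships are hypothetical, and the relation is treated as directed, as in the definition above.</p>
        <preformat>
```python
from typing import List, Sequence, Set, Tuple


def adjacency_matrix(users: Sequence[str],
                     edges: Set[Tuple[str, str]]) -> List[List[int]]:
    """Build a binary matrix with a 1 wherever (source, target) is a relationship."""
    index = {user: i for i, user in enumerate(users)}
    matrix = [[0] * len(users) for _ in users]
    for source, target in edges:
        row, col = index[source], index[target]
        matrix[row][col] = 1
    return matrix


def density(matrix: List[List[int]]) -> float:
    """Fraction of non-zero cells; low values indicate a sparse matrix."""
    return sum(map(sum, matrix)) / (len(matrix) ** 2)


users = ["u1", "u2", "u3", "u4"]
follows = {("u1", "u2"), ("u3", "u2")}  # hypothetical "follows" relationships

m_r = adjacency_matrix(users, follows)
print(m_r)
print(density(m_r))
```
</preformat>
        <p>Each row has as many columns as there are users, so on a real network both the dimensionality and the share of zero entries grow quickly, which motivates the dimensionality reduction step.</p>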
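        <p>Many existing works rely on dimensionality reduction techniques to address the high
dimensionality and sparseness problems. SAIRUS can work directly on the adjacency matrix
<inline-formula><tex-math>M_r \in \mathbb{R}^{|N| \times |N|}</tex-math></inline-formula>, or on a transformed matrix <inline-formula><tex-math>M'_r \in \mathbb{R}^{|N| \times k_r}</tex-math></inline-formula> resulting from the application of a
dimensionality reduction technique to <inline-formula><tex-math>M_r</tex-math></inline-formula>, where <inline-formula><tex-math>k_r</tex-math></inline-formula> is a user-defined parameter. Specifically,
SAIRUS can exploit PCA, autoencoders and Node2Vec, although other techniques can be easily
plugged into the workflow.</p>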
        <p>
          A node classification model is finally trained using the entire set of labeled users <inline-formula><tex-math>N_L</tex-math></inline-formula>. In
this phase, SAIRUS exploits tree-based classifiers, since they have been shown to perform
well on classification problems in the semi-supervised scenario [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. When provided with
an unlabeled user <inline-formula><tex-math>u \in N_U</tex-math></inline-formula>, the learned decision tree returns the predicted label <inline-formula><tex-math>l_r(u)</tex-math></inline-formula> and a
confidence value <inline-formula><tex-math>c_r(u)</tex-math></inline-formula>, which is based on the purity of the training examples associated with
the leaf node into which <inline-formula><tex-math>u</tex-math></inline-formula> falls. The predicted label and confidence value are then used in the
model fusion phase, as illustrated in Figure 1.
        </p>
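        <p>One plausible reading of a purity-based confidence is the fraction of training examples in the leaf that belong to the majority class. The sketch below assumes this reading and takes as input the labels of the training examples that reached the same leaf as the user to classify; it does not build the tree itself.</p>
        <preformat>
```python
from collections import Counter
from typing import List, Tuple


def leaf_prediction(leaf_training_labels: List[str]) -> Tuple[str, float]:
    """Return the majority label of a leaf and its share among the leaf's examples."""
    counts = Counter(leaf_training_labels)
    label, count = counts.most_common(1)[0]
    # Assumption: confidence = purity of the leaf with respect to the majority class.
    return label, count / len(leaf_training_labels)


# Labels of the (hypothetical) training users that ended up in one leaf.
print(leaf_prediction(["risky", "risky", "risky", "safe"]))
```
</preformat>
        <p>A pure leaf yields a confidence of 1.0, while a leaf split evenly between the two classes yields 0.5, which gives the fusion stage a graded signal rather than a bare label.</p>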
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Analysis of the spatial closeness among users</title>
        <p>
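          Like the analysis of the network of relationships, the spatial analysis exploits an
adjacency matrix built from the social network. In this case, SAIRUS uses a weighted matrix
<inline-formula><tex-math>M_s \in \mathbb{R}^{|N| \times |N|}</tex-math></inline-formula>, where <inline-formula><tex-math>(M_s)_{ij} = sc(u_i, u_j)</tex-math></inline-formula> corresponds to the spatial closeness between the
user <inline-formula><tex-math>u_i</tex-math></inline-formula> and the user <inline-formula><tex-math>u_j</tex-math></inline-formula>. Specifically, <inline-formula><tex-math>sc(u_i, u_j)</tex-math></inline-formula> is based on the geodetic distance <inline-formula><tex-math>d(u_i, u_j)</tex-math></inline-formula>
between the geographical locations of the users <inline-formula><tex-math>u_i</tex-math></inline-formula> and <inline-formula><tex-math>u_j</tex-math></inline-formula>, which are estimated as the mode of
the geographical locations associated with their posts on the social network. We standardize
the distance <inline-formula><tex-math>d(u_i, u_j)</tex-math></inline-formula> using the <inline-formula><tex-math>z</tex-math></inline-formula>-score normalization, obtaining <inline-formula><tex-math>z(u_i, u_j)</tex-math></inline-formula>, which allows us to
distinguish two groups of user pairs: those who are spatially closer than the average (with
<inline-formula><tex-math>z(u_i, u_j) &lt; 0</tex-math></inline-formula>) and those who are spatially more distant than the average (with <inline-formula><tex-math>z(u_i, u_j) \geq 0</tex-math></inline-formula>).
Accordingly, we calculate <inline-formula><tex-math>sc(u_i, u_j)</tex-math></inline-formula> as follows:
          <disp-formula>
            <label>(1)</label>
            <tex-math><![CDATA[
sc(u_i, u_j) =
\begin{cases}
\dfrac{z(u_i, u_j)}{z_{min}}, & \text{if } z(u_i, u_j) < 0 \\
0, & \text{otherwise}
\end{cases}
]]></tex-math>
          </disp-formula>
where <inline-formula><tex-math>z_{min}</tex-math></inline-formula> is the minimum of the normalized distances between two users. Note that we
further normalize <inline-formula><tex-math>z(u_i, u_j)</tex-math></inline-formula> over <inline-formula><tex-math>z_{min}</tex-math></inline-formula> in order to obtain a value in the range [0, 1], where 0
means that the users <inline-formula><tex-math>u_i</tex-math></inline-formula> and <inline-formula><tex-math>u_j</tex-math></inline-formula> are very far from each other (actually, farther than the average)
and 1 means that <inline-formula><tex-math>u_i</tex-math></inline-formula> and <inline-formula><tex-math>u_j</tex-math></inline-formula> are located precisely at the same location.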
        </p>
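        <p>The spatial closeness computation can be sketched end to end as follows. The sketch makes several assumptions that are not stated in the paper: the geodetic distance is approximated with the haversine formula on a spherical Earth, the z-score uses the population standard deviation over all distinct user pairs, and the diagonal of the matrix is left at 0. The coordinates are hypothetical.</p>
        <preformat>
```python
import math
from collections import Counter
from statistics import mean, pstdev

EARTH_RADIUS_KM = 6371.0  # spherical approximation (assumption)


def mode_location(post_locations):
    """Most frequent (latitude, longitude) among a user's geotagged posts."""
    return Counter(post_locations).most_common(1)[0][0]


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closeness_matrix(locations):
    """Closeness = z / z_min for pairs closer than average, 0 otherwise."""
    n = len(locations)
    matrix = [[0.0] * n for _ in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return matrix
    dist = {(i, j): haversine_km(locations[i], locations[j]) for i, j in pairs}
    values = list(dist.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All pairs are equally distant: no pair is closer than the average.
        return matrix
    z = {pair: (d - mu) / sigma for pair, d in dist.items()}
    z_min = min(z.values())
    for (i, j), value in z.items():
        if value >= 0:
            continue  # at least as far as the average: closeness stays 0
        matrix[i][j] = matrix[j][i] = value / z_min
    return matrix


bari, ljubljana = (41.12, 16.87), (46.05, 14.51)
users_posts = [[bari, bari, ljubljana], [bari], [ljubljana, ljubljana]]
locations = [mode_location(posts) for posts in users_posts]
closeness = closeness_matrix(locations)
print(closeness)
```
</preformat>
        <p>In this toy case the first two users share the same estimated location, so their pair attains the minimum standardized distance and a closeness of 1, while both pairs involving the third user are farther than the average and receive 0.</p>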
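        <p>After computing the matrix <inline-formula><tex-math>M_s</tex-math></inline-formula>, we use a dimensionality reduction technique to obtain the
reduced matrix <inline-formula><tex-math>M'_s \in \mathbb{R}^{|N| \times k_s}</tex-math></inline-formula>, where <inline-formula><tex-math>k_s</tex-math></inline-formula> is a user-defined parameter. Then, we train a node
classification model on the labeled users <inline-formula><tex-math>N_L</tex-math></inline-formula>. Similar to the approach used for the network
of relationships, we use a decision tree learner, which provides a predicted label <inline-formula><tex-math>l_s(u)</tex-math></inline-formula> and a
confidence value <inline-formula><tex-math>c_s(u)</tex-math></inline-formula> for any unlabeled user <inline-formula><tex-math>u \in N_U</tex-math></inline-formula>. These outputs are then used in the
model fusion phase (see Figure 1).</p>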
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Model Fusion</title>
        <p>
          The aim of the last step is to combine the results of the models based on the textual content,
the network topology, and the spatial dimension to classify the unlabeled users in <inline-formula><tex-math>N_U</tex-math></inline-formula>. In
SAIRUS, we use a Multi-Layer Perceptron (MLP) model to perform this task, following the
Stacked Generalization approach [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
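        <p>The chosen MLP architecture is depicted at the bottom of Figure 1. It has an input layer
comprising 7 neurons, which considers the following inputs for a given user <inline-formula><tex-math>u</tex-math></inline-formula>: i) the
reconstruction error values of the safe autoencoder <inline-formula><tex-math>RE_S(u)</tex-math></inline-formula> and risky autoencoder <inline-formula><tex-math>RE_R(u)</tex-math></inline-formula>, along
with the predicted label <inline-formula><tex-math>l_c(u)</tex-math></inline-formula> derived from the semantic analysis component for the textual
content; ii) the predicted label <inline-formula><tex-math>l_r(u)</tex-math></inline-formula> and confidence value <inline-formula><tex-math>c_r(u)</tex-math></inline-formula> obtained from the component
responsible for analyzing the network of relationships; iii) the predicted label <inline-formula><tex-math>l_s(u)</tex-math></inline-formula> and the
confidence value <inline-formula><tex-math>c_s(u)</tex-math></inline-formula> obtained from the component responsible for the spatial analysis. We
use the sigmoid activation function in the hidden layer to capture any non-linear relationships
between the input and output variables. In contrast, we use the softmax activation function in
the output layer for the final classification.</p>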
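        <p>The fusion network can be sketched as a forward pass over the seven inputs listed above, with a sigmoid hidden layer and a softmax output over the two classes. The number of hidden units, the 0/1 encoding of the labels and the random (untrained) weights are assumptions made only for illustration; training is omitted.</p>
        <preformat>
```python
import math
import random


def fusion_features(err_safe, err_risky, label_content,
                    label_rel, conf_rel, label_spatial, conf_spatial):
    """Assemble the 7-dimensional input of the fusion model."""
    encode = {"safe": 0.0, "risky": 1.0}  # assumed label encoding
    return [err_safe, err_risky, encode[label_content],
            encode[label_rel], conf_rel,
            encode[label_spatial], conf_spatial]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer without activation."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """Sigmoid hidden layer followed by a softmax over (safe, risky)."""
    hidden = [sigmoid(v) for v in dense(features, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))


rng = random.Random(0)
hidden_units = 4  # hypothetical size, not taken from the paper
w_hidden = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(hidden_units)]
b_hidden = [0.0] * hidden_units
w_out = [[rng.uniform(-1, 1) for _ in range(hidden_units)] for _ in range(2)]
b_out = [0.0, 0.0]

x = fusion_features(1.01, 0.01, "risky", "safe", 0.75, "risky", 0.6)
probs = forward(x, w_hidden, b_hidden, w_out, b_out)
print(probs)
```
</preformat>
        <p>Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how the outputs of the three perspectives are arranged into a single input vector and mapped to a class distribution.</p>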
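        <p>It is noteworthy that our approach, which uses the stacked generalization framework, does
not require any user-defined criteria/weight to merge the outputs of the three distinct models.
Moreover, in contrast to ensemble techniques that solely rely on combining predictions (<inline-formula><tex-math>l_c(u)</tex-math></inline-formula>,
<inline-formula><tex-math>l_r(u)</tex-math></inline-formula>, and <inline-formula><tex-math>l_s(u)</tex-math></inline-formula>, in our case), SAIRUS can incorporate other features such as reconstruction
errors <inline-formula><tex-math>RE_S(u)</tex-math></inline-formula> and <inline-formula><tex-math>RE_R(u)</tex-math></inline-formula>, and prediction confidences <inline-formula><tex-math>c_r(u)</tex-math></inline-formula> and <inline-formula><tex-math>c_s(u)</tex-math></inline-formula>, which make it more robust
to the uncertainty of the predictions and to the possible presence of noise in the data.</p>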
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>We collected a real-world dataset from Twitter to evaluate the performance of SAIRUS. The
dataset was associated with sentiment scores for each tweet, which were computed using the
Stanford CoreNLP Toolkit and manually revised by three domain experts.</p>
      <p>To label users as either risky or safe, two strategies were employed. The first strategy relied
on identifying tweets containing specific keywords related to threats, terrorism, hate against
immigrants, and women. The second strategy assigned a score to each user by summing the
sentiment scores of their tweets. The assumption was that users with a higher number of
negative sentiment tweets are more likely to be risky.</p>
      <p>To ensure the accuracy of the labeling process, we initially labeled the top-ranked users as safe
and the bottom-ranked users as risky, and three expert reviewers manually inspected their posts.
We also introduced a set of borderline users, who were initially classified as risky
but had mostly safe connections, to introduce noisy data under controlled conditions. These
users may correspond to journalists who share negative content for informational purposes, but
have mostly connections with safe users. The resulting datasets consisted of 2241 safe users
(including 263 borderline users) and 1467 risky users for the keyword strategy, and 2047 safe
users (including 304 borderline users) and 1033 risky users for the sentiment strategy, with
11,659,043 and 13,970,379 tweets, respectively.</p>
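      <p>The sentiment-based labeling strategy can be sketched as follows. The sketch assumes that higher scores denote more positive sentiment and that the numbers of top-ranked and bottom-ranked users to label are given as parameters (hypothetical names <monospace>n_top</monospace> and <monospace>n_bottom</monospace>); the cut-offs actually used are not reported here, and the manual inspection step is not modeled.</p>
      <preformat>
```python
from collections import defaultdict


def rank_users_by_sentiment(tweets):
    """Rank users by the sum of their tweet sentiment scores, highest first."""
    totals = defaultdict(float)
    for user, sentiment in tweets:
        totals[user] += sentiment
    return sorted(totals, key=lambda user: totals[user], reverse=True)


def initial_labels(ranking, n_top, n_bottom):
    """Label the top of the ranking as safe and the bottom as risky."""
    labels = {user: "safe" for user in ranking[:n_top]}
    bottom = ranking[len(ranking) - n_bottom:]
    labels.update({user: "risky" for user in bottom})
    return labels


# Hypothetical (user, sentiment score) pairs.
tweets = [("a", 1.0), ("b", -2.0), ("a", 0.5), ("c", 0.0), ("b", -1.0)]
ranking = rank_users_by_sentiment(tweets)
print(ranking)
print(initial_labels(ranking, n_top=1, n_bottom=1))
```
</preformat>
      <p>Users in the middle of the ranking receive no initial label in this sketch, which mirrors the idea of labeling only the two extremes of the ranking.</p>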
      <p>
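        We assessed the performance of SAIRUS using PCA, Node2Vec, and Autoencoders for the
reduction of the dimensionality. We also evaluated the results with different values of the
embedding dimensionality, namely <inline-formula><tex-math>k_c</tex-math></inline-formula> for the semantic analysis of the textual content, <inline-formula><tex-math>k_r</tex-math></inline-formula> for the
analysis of the network of relationships, and <inline-formula><tex-math>k_s</tex-math></inline-formula> for the spatial analysis. After conducting some
preliminary evaluations, we chose the following parameter combinations for the experiments:
<inline-formula><tex-math>\langle k_c=128, k_r=256, k_s=256 \rangle</tex-math></inline-formula>, <inline-formula><tex-math>\langle k_c=256, k_r=128, k_s=128 \rangle</tex-math></inline-formula>, and <inline-formula><tex-math>\langle k_c=512, k_r=128, k_s=128 \rangle</tex-math></inline-formula>. Due to space
constraints, here we report the best results (all the results can be found in [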
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]).
      </p>
      <p>We compared the performance of SAIRUS with several other methods, including a Random
Forest model (RF) with 100 trees, and two one-class classifiers based on autoencoders (1C-AEs)
designed for content-based analysis, which is consistent with the methodology used in SAIRUS.
We used different feature sets, each focusing on one or more perspectives, such as content
(C), relationships (R), or spatial (S). For multiple perspectives, we concatenated the feature
sets of each single perspective (C+R, C+S, R+S, and C+R+S). To embed the textual content, we
used state-of-the-art systems such as Word2Vec (w2v) and Doc2Vec (d2v), with embedding
dimensionality set to <inline-formula><tex-math>k_c</tex-math></inline-formula>, which is the same as that used by SAIRUS. To embed the
network of relationships and the spatial closeness, we used Node2Vec (n2v) with embedding
dimensionality set to <inline-formula><tex-math>k_r</tex-math></inline-formula> and <inline-formula><tex-math>k_s</tex-math></inline-formula>, respectively, following the setting adopted for SAIRUS.</p>
      <p>We adopted a stratified 5-fold cross-validation technique, which preserved the proportion of
safe and risky users, as well as the ratio of borderline users within safe users. Our evaluation
metrics included precision, recall, F1-Score, and accuracy, with the positive class being the risky
label. In addition, we computed these measures specifically on the borderline users to determine
the performance of the methods in handling noisy data.</p>
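      <p>The evaluation protocol can be sketched with a simple stratified fold assignment and the four metrics, using risky as the positive class. The sketch assumes that stratification is done over three groups (risky, safe and borderline), which preserves both the class proportion and the share of borderline users among safe users; shuffling and model training are omitted, and the group sizes are hypothetical.</p>
      <preformat>
```python
from collections import defaultdict


def stratified_folds(strata, k=5):
    """Assign each index to one of k folds, round-robin within each stratum."""
    by_stratum = defaultdict(list)
    for index, stratum in enumerate(strata):
        by_stratum[stratum].append(index)
    folds = [[] for _ in range(k)]
    for members in by_stratum.values():
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def scores(y_true, y_pred, positive="risky"):
    """Precision, recall, F1 for the positive class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": correct / len(pairs)}


# Hypothetical population: borderline users are a subset of the safe class.
strata = ["risky"] * 10 + ["safe"] * 15 + ["borderline"] * 5
folds = stratified_folds(strata, k=5)
print([len(fold) for fold in folds])

y_true = ["risky", "risky", "safe", "safe"]
y_pred = ["risky", "safe", "risky", "safe"]
print(scores(y_true, y_pred))
```
</preformat>
      <p>Restricting <monospace>y_true</monospace> and <monospace>y_pred</monospace> to the borderline users before calling <monospace>scores</monospace> gives the same measures on the noisy subset.</p>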
      <sec id="sec-3-1">
        <title>3.1. Results</title>
        <p>
          In Tables 1 and 2, we show the results obtained on the sentiment dataset and on the keywords
dataset, respectively, where we emphasize the best result obtained for a given evaluation
measure. By looking at the competitor solutions solely based on textual content, we notice that
the use of w2v generally leads to better results than d2v (as also observed in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]). On the other hand, [...] one solution over the other. However, the adoption of features related to user relationships (R),
to the spatial closeness (S), or a combination of these perspectives did not seem to provide a
clear contribution to the competitors. This result confirms that simply injecting features coming
from one perspective into the other could also compromise the classifier performance due to
the possible introduction of issues related to the curse of dimensionality.
        </p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Results on the sentiment dataset, with <inline-formula><tex-math>k_c = 512</tex-math></inline-formula>, <inline-formula><tex-math>k_r = 128</tex-math></inline-formula>, <inline-formula><tex-math>k_s = 128</tex-math></inline-formula>.</p>
          </caption>
        </table-wrap>
        <p>In contrast, SAIRUS achieved the best results when leveraging the network of user
relationships or the spatial dimension (or both). This was particularly evident in the sentiment dataset,
where the F1-score reached ∼ 0.8 when both user relationships and the spatial analysis were
considered. These results demonstrate that the fusion strategy adopted by SAIRUS is more
effective than the concatenation of features. In the keywords dataset, the configuration that
leveraged both textual content and spatial analysis slightly emerged as the best. These results
confirmed the relevance of the spatial perspective and the importance of properly modeling
and exploiting it through a smart fusion strategy. Moreover, the obtained results suggest that the
spatial dimension is an important factor for predicting borderline users, regardless of the network
representation used. In other words, incorporating spatial information improves the accuracy
of predictions for borderline users.</p>
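        <p>SAIRUS outperformed competitors in both datasets, demonstrating its ability to effectively
distinguish between safe and risky users in social networks, paving the way towards its
adoption for the analysis of large amounts of data from geo-located mobile devices.</p>
        <table-wrap>
          <label>Table 2</label>
          <caption>
            <p>Results on the keywords dataset, with <inline-formula><tex-math>k_c = 128</tex-math></inline-formula>, <inline-formula><tex-math>k_r = 256</tex-math></inline-formula>, <inline-formula><tex-math>k_s = 256</tex-math></inline-formula>.</p>
          </caption>
        </table-wrap>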
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>This paper discussed SAIRUS, a novel approach for identifying risky users in social networks.
By combining multiple perspectives of social network data, including textual content, user
relationships, and spatial closeness, SAIRUS can accurately classify users, outperforming 13
competitor systems that exploit either one perspective at a time or a combination thereof. In
our experiments, SAIRUS also proved to be robust to the presence of noisy users.</p>
      <p>In addition to its current capabilities, SAIRUS has the potential to incorporate the temporal
dimension related to textual content and detect sudden changes in user behavior. Therefore,
future work will focus on extending SAIRUS to make it able to capture the dynamism of the
network of relationships and spatial closeness among users, providing a more comprehensive
risk assessment of social network users.</p>
      <p>The authors acknowledge the support of the European Commission through the H2020 Project
“CounteR - Privacy-First Situational Awareness Platform for Violent Terrorism and Crime
Prediction, Counter Radicalisation and Citizen Protection” (Grant N. 101021607). This work
was also partially supported by the project FAIR - Future AI Research (PE00000013), Spoke 6
Symbiotic AI, under the NRRP MUR program funded by the NextGenerationEU.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tabassum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. S.</given-names>
            <surname>Pereira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <article-title>Social network analysis: An overview</article-title>
          ,
          <source>Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</source>
          <volume>8</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>I.</given-names>
            <surname>Awan</surname>
          </string-name>
          ,
          <article-title>Cyber-Extremism: Isis and the Power of Social Media</article-title>
          ,
          <source>Society</source>
          <volume>54</volume>
          (
          <year>2017</year>
          )
          <fpage>138</fpage>
          -
          <lpage>149</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Al-Rawi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Groshek</surname>
          </string-name>
          ,
          <article-title>Jihadist Propaganda on Social Media: An Examination of ISIS Related Content on Twitter</article-title>
          ,
          <source>Int. Journal of Cyber Warfare and Terrorism</source>
          <volume>8</volume>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Uzel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Saraç</given-names>
            <surname>Eşsiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Ayşe</given-names>
            <surname>Özel</surname>
          </string-name>
          ,
          <article-title>Using fuzzy sets for detecting cyber terrorism and extremism in the text</article-title>
          ,
          <source>in: ASYU</source>
          <year>2018</year>
          ,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Distributed representations of sentences and documents</article-title>
          ,
          <source>in: International conference on machine learning</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1188</fpage>
          -
          <lpage>1196</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Macskassy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Provost</surname>
          </string-name>
          ,
          <article-title>Classification in networked data: A toolkit and a univariate case study</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>8</volume>
          (
          <year>2007</year>
          )
          <fpage>935</fpage>
          -
          <lpage>983</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bilgic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Getoor</surname>
          </string-name>
          ,
          <article-title>Effective label acquisition for collective classification</article-title>
          ,
          <source>in: Proc. ACM SIGKDD</source>
          <year>2008</year>
          , KDD '08,
          <publisher-name>ACM</publisher-name>
          ,
          <year>2008</year>
          , pp.
          <fpage>43</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mateen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Iqbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Aleem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <article-title>A hybrid approach for spam detection for twitter</article-title>
          ,
          <source>in: 2017 14th International Bhurban Conference on Applied Sciences and Technology (IBCAST)</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>466</fpage>
          -
          <lpage>471</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Slimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Bounhas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Slimani</surname>
          </string-name>
          ,
          <article-title>A hybrid approach for fake news detection in twitter based on user features and graph embedding</article-title>
          ,
          <source>in: Distributed Computing and Internet Technology</source>
          , Springer International Publishing, Cham,
          <year>2020</year>
          , pp.
          <fpage>266</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Medina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hepner</surname>
          </string-name>
          ,
          <article-title>Geospatial analysis of dynamic terrorist networks</article-title>
          ,
          <source>in: Values and violence</source>
          , Springer,
          <year>2008</year>
          , pp.
          <fpage>151</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Masood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Abbasi</surname>
          </string-name>
          ,
          <article-title>Using graph embedding and machine learning to identify rebels on twitter</article-title>
          ,
          <source>Journal of Informetrics</source>
          <volume>15</volume>
          (
          <year>2021</year>
          )
          <fpage>101121</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pellicani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Redavid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <article-title>SAIRUS: Spatially-aware identification of risky users in social networks</article-title>
          ,
          <source>Information Fusion</source>
          <volume>92</volume>
          (
          <year>2023</year>
          )
          <fpage>435</fpage>
          -
          <lpage>449</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          ,
          <source>CoRR</source>
          abs/1310.4546 (
          <year>2013</year>
          ). arXiv:1310.4546.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Levatic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kocev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dzeroski</surname>
          </string-name>
          ,
          <article-title>Semi-supervised trees for multi-target regression</article-title>
          ,
          <source>Inf. Sci.</source>
          <volume>450</volume>
          (
          <year>2018</year>
          )
          <fpage>109</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Wolpert</surname>
          </string-name>
          ,
          <article-title>Stacked generalization</article-title>
          ,
          <source>Neural Networks</source>
          <volume>5</volume>
          (
          <year>1992</year>
          )
          <fpage>241</fpage>
          -
          <lpage>259</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
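          <string-name>
            <given-names>G.</given-names>
            <surname>De Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <article-title>PRILJ: an efficient two-step method based on embedding and clustering for the identification of regularities in legal case judgments</article-title>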
          ,
          <source>Artificial Intelligence and Law</source>
          <volume>30</volume>
          (
          <year>2022</year>
          )
          <fpage>359</fpage>
          -
          <lpage>390</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>