<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Data Min. Knowl. Discov.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/S10618-012-0278-6</article-id>
      <title-group>
        <article-title>Integrating Semantic, Social, and Spatial Dimensions for Inductive Malicious User Detection in Social Networks (Discussion Paper)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Benedetti</string-name>
          <email>francesco.benedetti@phd.unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Pellicani</string-name>
          <email>antonio.pellicani@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianvito Pio</string-name>
          <email>gianvito.pio@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michelangelo Ceci</string-name>
          <email>michelangelo.ceci@uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data Science Lab, National Interuniversity Consortium for Informatics (CINI)</institution>
          ,
          <addr-line>Via Volturno, 58, 00185 Roma</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Computer Science, University of Bari</institution>
          ,
          <addr-line>Via Edoardo Orabona 4, Bari, 70125</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dept. of Computer Science, University of Pisa</institution>
          ,
          <addr-line>Largo Bruno Pontecorvo 3, Pisa, 56127</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Jožef Stefan Institute</institution>
          ,
          <addr-line>Jamova Cesta 39, 1000 Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <volume>25</volume>
      <issue>2012</issue>
      <abstract>
        <p>Social networks have become central platforms for shaping public discourse, influencing opinions, and facilitating communication. However, these platforms also increasingly serve as breeding grounds for radicalization and the dissemination of hateful or criminal ideologies. With the exponential growth of users and content on social networks, effective monitoring and detection of harmful actors have become critical for both societal wellbeing and security. In this discussion paper, we introduce IMMENSE, a novel machine learning-based system for detecting malicious social media accounts. Our framework leverages a multi-perspective approach that integrates three complementary dimensions to classify users: the semantics of the content they generate, the topology of their social relationship network, and the spatial information derived from their geographical position. The key innovation of our system lies in its inductive architecture, which enables generalization to previously unseen users or entirely new networks without requiring retraining, thus achieving a significant advancement in both efficiency and practical applicability. We validate IMMENSE against a state-of-the-art transductive system using two diverse datasets extracted from the X social network, demonstrating competitive performance despite the inherent additional challenges introduced by the inductive setting.</p>
      </abstract>
      <kwd-group>
        <kwd>Social network analysis</kwd>
        <kwd>Malicious user classification</kwd>
        <kwd>Inductive learning</kwd>
        <kwd>Multi-perspective classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Social networks are online platforms that facilitate interpersonal connections, communication, and
interest-sharing. Over the last decade, they have become increasingly popular, emerging as one of
the primary media for communication, information dissemination, and entertainment for a significant
portion of the population. Their social impact is multifaceted, as their proliferation also came with
a substantial increase in malicious activities such as cyberbullying, spam attacks, misinformation
propagation, extreme political or religious views, and recruitment for illicit purposes. This increase in
harmful behaviors highlights the need for robust detection mechanisms to identify potentially dangerous
users and ensure the safety and integrity of these platforms.</p>
      <p>
        Social networks can formally be modeled as graphs, where nodes represent users and edges denote
the relationships between them, such as following or friendship. Accordingly, the detection of dangerous
users falls under the node classification umbrella, which corresponds to labeling each user as either risky
or safe. In the literature, several works have attempted to solve this task using different approaches. Methods
that classify users based solely on the posted content are commonly referred to as content-based, and are
mostly employed to classify individual posts rather than users. An example is [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], where the authors
aim to detect bot accounts. The authors of [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] further extend this work by evaluating various classifiers
that exploit content and metadata from user profiles to identify spambots and fake followers on X.
      </p>
      <p>
        On the other hand, approaches that focus on the structure of the network of user relationships are
commonly called topology-based. A relevant example is SybilRank [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which exploits early-terminated
random walks to detect fake users in a social network. Similarly, [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] applies random walk-based
techniques to identify scammers in Web3 transaction graphs.
      </p>
      <p>
        Finally, hybrid methods aim to integrate multiple aspects. Ribeiro et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] analyze both user-generated
contents and social relationship graphs, comparing a GNN-based model against gradient-boosted tree
classifiers for identifying hateful users. Wang et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] consider users’ demographic features, as well
as the social connections, the generated content, and dynamic features in the Momo social network
to detect malicious accounts. Similarly, a hybrid approach was implemented for detecting malicious
users on GitHub in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which evaluates profile characteristics, user activity patterns, and interactions
with both other users and repositories. Hybrid strategies have proven to be the most effective for
detecting malicious users/accounts, as they consider multiple perspectives and are more difficult to
deceive. These approaches often analyze different aspects of social networks using separate specialized
modules, with a decision maker providing the final classification. This approach is adopted in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], although in the latter the authors employed this strategy to analyze textual (i.e., posts authored
by the users and received comments) and non-textual (user characteristics and social relationships)
attributes. Another example of a hybrid approach is SAIRUS [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a multi-view user classification
framework that integrates three distinct perspectives: user-generated content, social relationships, and
geospatial proximity among users. For network representation, SAIRUS adopts Node2Vec, which, as a
random walk-based embedding technique, operates in a transductive setting: SAIRUS requires all
users (even the unlabeled ones) to be known during the training phase. This limitation makes
it inapplicable in dynamic real-world environments, where the classification of a novel user (unseen
during the training of the model) would require a full re-training of the model.
      </p>
      <p>To fill this gap, in this paper we discuss our method IMMENSE, a hybrid, multi-perspective, inductive
system for the detection of malicious/risky users. The adoption of inductive techniques makes IMMENSE
capable of learning models that can generalize to new, unseen users. In the following sections, we
provide a detailed description of IMMENSE and evaluate its effectiveness on two real-world datasets.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The proposed method</title>
      <p>Before describing our method, we formalize some key aspects. A social network can be represented as
a graph defined by a triple $\langle N, C, E \rangle$, where:
• $N$ is the set of nodes, with each node representing a user.
• $C$ is the set of textual contents, i.e., posts created by each user. Each post can possibly be associated
with the geographical position in which the user was located when such a post was generated.
• $E \subseteq N \times N$ is the set of topological relationships among users. Without loss of generality,
social relationships are represented as directed links.</p>
      <p>IMMENSE employs three specialized modules, respectively, for the semantic analysis of the textual
content, for the analysis of social relationships, and for the analysis of spatial relationships among users,
followed by a model that combines their contributions to make the final decision. A graphical view of
IMMENSE is provided in Figure 1, while in the following subsections we describe in detail each module.</p>
      <sec id="sec-2-1">
        <title>2.1. Semantic Content analysis</title>
        <p>
          The goal of this module is to provide an initial classification of users considering only the semantics
of the content they posted. For this purpose, we initially perform a preprocessing step that, for each
user, concatenates the posts into a unified text and performs tokenization, stopword removal, and
stemming. Then, a Word2Vec [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] model is trained on the resulting corpus. In particular, for each
user, the embeddings of the words appearing in their posts are aggregated through summation into a
single vector representing the semantics of the published content. Although more recent approaches
(e.g., BERT-based or LLM-based) could be used to identify a proper embedding of the textual content,
Word2Vec proved to be more accurate in previous studies [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ].
        </p>
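        <p>As a minimal sketch of the aggregation step described above, the following Python snippet sums the embeddings of the words appearing in a user's preprocessed posts into a single vector. The toy embeddings dictionary and the function name user_vector are illustrative assumptions; in the actual pipeline, the word vectors would come from the trained Word2Vec model.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, List


def user_vector(tokens: List[str], embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Sum the embeddings of all known tokens; unknown tokens are skipped."""
    total = [0.0] * dim
    for token in tokens:
        vec = embeddings.get(token)
        if vec is None:
            continue  # out-of-vocabulary token: contributes nothing
        for i, value in enumerate(vec):
            total[i] += value
    return total


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
embeddings = {
    "attack": [1.0, 0.0, 2.0],
    "city": [0.5, 1.0, 0.0],
}
# Tokens of one user after concatenation, tokenization, stopword removal, and stemming.
tokens = ["attack", "city", "attack", "unknown"]
vector = user_vector(tokens, embeddings, dim=3)
```
]]></preformat>
        <p>Because the aggregation is a sum rather than a mean, repeated words weigh more, so users who post more text tend to have vectors with larger norms.</p>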
        <p>Finally, a dedicated autoencoder is trained for safe users and for risky users, separately. Specifically,
the autoencoder $AE_{safe}$ is trained using the semantic vectors of users labeled as safe, while the autoencoder
$AE_{risky}$ is learned from the vector representations of risky users. Subsequently, each user embedding vector
is processed through both autoencoders, leading to two reconstruction errors $e_{safe}$ and $e_{risky}$, calculated as
the mean squared error between the original and reconstructed vectors. Therefore, this module outputs
the two reconstruction errors $e_{safe}$ and $e_{risky}$, and a label $l_C(u)$ computed as $l_C(u) = 0$ if $e_{safe} &lt; e_{risky}$, and $l_C(u) = 1$
otherwise, where 0 and 1 indicate the safe and risky labels, respectively.</p>
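        <p>The decision rule of this module can be illustrated with a short sketch. The two "autoencoders" below are hypothetical stand-ins (plain functions that pull the input halfway toward a class prototype), used only to make the example runnable; the names content_module and make_toy_autoencoder are assumptions, not part of the original system.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, List, Tuple

Vector = List[float]


def mse(original: Vector, reconstructed: Vector) -> float:
    """Mean squared error between a vector and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def content_module(
    x: Vector,
    ae_safe: Callable[[Vector], Vector],
    ae_risky: Callable[[Vector], Vector],
) -> Tuple[float, float, int]:
    """Return both reconstruction errors and the label (0 = safe, 1 = risky)."""
    e_safe = mse(x, ae_safe(x))
    e_risky = mse(x, ae_risky(x))
    label = 0 if e_safe < e_risky else 1
    return e_safe, e_risky, label


def make_toy_autoencoder(prototype: Vector) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in: reconstructs the input halfway toward a prototype."""
    def reconstruct(x: Vector) -> Vector:
        return [(a + p) / 2 for a, p in zip(x, prototype)]
    return reconstruct


ae_safe = make_toy_autoencoder([0.0, 0.0])
ae_risky = make_toy_autoencoder([4.0, 4.0])
result = content_module([1.0, 1.0], ae_safe, ae_risky)
```
]]></preformat>
        <p>Here the input lies closer to the safe prototype, so the safe autoencoder reconstructs it with the smaller error and the label is 0.</p>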
        <p>
          We use two separate autoencoders because, as emphasized in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], employing one distinct model per
class rather than a single binary classifier provides greater stability in situations of label imbalance,
which is the situation we expect in our scenario.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Analysis of social relationships</title>
        <p>Nowadays, social networks host a significant number of passive users, i.e., individuals who
rarely share content but primarily follow and consume content posted by other users. For these
users, without complementary information, it would be very difficult to provide an estimate of the risk.
Therefore, it is fundamental to integrate such information with interactions among users. Consequently,
in this module, we analyze the social relationships among users, as represented by the set of directed links $E$.</p>
        <p>
          For this task, Graph Neural Networks (GNNs) have emerged as state-of-the-art approaches for
processing graph-structured data and extracting latent representations of nodes [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. An example of
such GNNs is GraphSAGE [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], which generates the embedding of each node in the graph by sampling
and aggregating features from neighboring nodes. In particular, GraphSAGE offers two significant
advantages over other approaches (e.g., Node2Vec, adopted in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]): i) it samples only a subset of
each node’s neighborhood, enabling efficient scaling to large networks, and ii) it provides inductive
embedding capabilities, allowing generalization to previously unseen nodes or entirely new networks.
Therefore, in IMMENSE, we perform the analysis of social relationships by stacking two GraphSAGE
layers (see Figure 1). Notably, each layer allows expansion of the considered neighborhood by one
hop, resulting in a comprehensive two-hop neighborhood analysis when computing user embeddings.
Furthermore, since GraphSAGE directly exploits the features of neighboring nodes, we associated
each user in the graph with features corresponding to the user’s semantic representation obtained
in the semantic content analysis. In this way, although possibly introducing some redundancy with
the information conveyed by the previous module, relationships are modeled while simultaneously
considering the content posted by users.
        </p>
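        <p>To illustrate why two stacked layers cover a two-hop neighborhood, the sketch below applies a heavily simplified GraphSAGE-style step twice. It is an assumption-laden toy: it averages a node's vector with the mean of a sampled subset of its neighbors and omits the learned weight matrices and non-linearities of the real model. The function name sage_layer and the example graph are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random
from typing import Dict, List


def sage_layer(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
    sample_size: int,
    rng: random.Random,
) -> Dict[str, List[float]]:
    """One simplified step: mix each node's vector with the mean of sampled neighbors."""
    updated = {}
    for node, own in features.items():
        neigh = neighbors.get(node, [])
        if len(neigh) > sample_size:
            neigh = rng.sample(neigh, sample_size)  # keep only a subset of the neighborhood
        if not neigh:
            updated[node] = list(own)  # isolated node keeps its own features
            continue
        mean = [sum(features[n][i] for n in neigh) / len(neigh) for i in range(len(own))]
        updated[node] = [(o + m) / 2 for o, m in zip(own, mean)]
    return updated


# Directed chain a -> b -> c; only c carries a non-zero feature.
neighbors = {"a": ["b"], "b": ["c"], "c": []}
features = {"a": [0.0], "b": [0.0], "c": [8.0]}
rng = random.Random(0)
one_hop = sage_layer(features, neighbors, sample_size=5, rng=rng)
two_hop = sage_layer(one_hop, neighbors, sample_size=5, rng=rng)
```
]]></preformat>
        <p>After one step, node a is still unaffected by c, which is two hops away; after the second step, information from c has reached a through b.</p>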
        <p>
          For this module, contrary to the previous one, we adopt a tree-based classifier, due to the proven
performance of such classifiers on network data [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Specifically, the node embeddings resulting from GraphSAGE
serve as input to a random forest classifier that produces two outputs for each user: the classification
label $l_R$ and an associated confidence value $c_R$. The confidence is computed by averaging the purity
of the leaf nodes where each instance falls across all trees in the forest.
        </p>
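        <p>The confidence computation can be sketched as follows. Two points are assumptions rather than details given in the text: the purity of a leaf is taken to be the fraction of its training samples that belong to its majority class, and the forest label is obtained by majority vote. Each tree is represented only by the class counts of the leaf reached by the instance.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple

LeafCounts = Tuple[int, int]  # (safe, risky) training samples in the reached leaf


def forest_output(leaves: List[LeafCounts]) -> Tuple[int, float]:
    """Return (label, confidence) from the leaves reached in each tree."""
    votes = [0 if safe >= risky else 1 for safe, risky in leaves]
    label = 1 if sum(votes) * 2 > len(votes) else 0  # majority vote (assumption)
    purities = [max(safe, risky) / (safe + risky) for safe, risky in leaves]
    confidence = sum(purities) / len(purities)  # mean leaf purity across trees
    return label, confidence


# Leaves reached by one user in a hypothetical forest of four trees.
leaves = [(1, 9), (2, 8), (6, 4), (0, 10)]
label, confidence = forest_output(leaves)
```
]]></preformat>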
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Analysis of spatial relationships</title>
        <p>Users who live in close geographical proximity often share similar real-life experiences, cultural contexts,
and have increased opportunities for in-person interactions. These shared environmental factors can
significantly influence opinions, behaviors, and vulnerability to certain types of content. In order to
capture this valuable information, we construct a network of spatial relationships that models the
geographical connections among users. This task is performed through two key steps. First, we estimate
each user’s physical location by identifying the most frequent coordinates (latitude and longitude)
associated with published posts<sup>1</sup>. Such coordinates, when available, are metadata associated with the
posts. Second, we compute a weight to associate with the connections among users, which is inversely
proportional to their geographical distance. Specifically, for any pair of users $u_1$ and $u_2$ with respective
latitudes $\varphi_1$, $\varphi_2$ and longitudes $\lambda_1$, $\lambda_2$, we compute their geodetic distance as:
<disp-formula id="eq1"><label>(1)</label><tex-math><![CDATA[d(u_1, u_2) = 2r \cdot \arctan\left(\sqrt{\frac{h(u_1, u_2)}{1 - h(u_1, u_2)}}\right)]]></tex-math></disp-formula>
where $r$ is the Earth radius ($\approx 6371$ km) and $h(u_1, u_2) = \sin^2\left(\frac{\varphi_1 - \varphi_2}{2}\right) + \cos(\varphi_1) \cdot \cos(\varphi_2) \cdot \sin^2\left(\frac{\lambda_1 - \lambda_2}{2}\right)$
is the Haversine formula. Then, we compute the mean $\mu_d$ and standard deviation $\sigma_d$ among all distances,
which are used to compute the z-score normalization $z(u_1, u_2) = (d(u_1, u_2) - \mu_d) / \sigma_d$ of the distance $d(u_1, u_2)$ in $\mathcal{N}(0,1)$. Finally, the
weight of the edge linking the two users in the graph is defined as:
<disp-formula id="eq2"><label>(2)</label><tex-math><![CDATA[w(u_1, u_2) = \begin{cases} \dfrac{z(u_1, u_2)}{m} & \text{if } z(u_1, u_2) < 0 \\ 0 & \text{otherwise} \end{cases}]]></tex-math></disp-formula>
where $m$ is the minimum of the normalized distances among users. In other words, if two users are
closer than the average, the closeness score will range in the interval (0, 1]; otherwise, their closeness
score will be set to zero (meaning that they are not connected in the spatial network).</p>
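        <p>The construction of the spatial edge weights can be sketched in a few lines of Python. The sketch assumes coordinates in degrees, uses the population standard deviation for the z-score, and relies on atan2, which is equivalent to the arctangent of the square-root ratio but avoids a division by zero; the function names and the four example locations are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from statistics import mean, pstdev

EARTH_RADIUS_KM = 6371.0


def geodetic_distance(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon1 - lon2)
    h = math.sin((phi1 - phi2) / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    h = min(1.0, max(0.0, h))  # guard against rounding just outside [0, 1]
    # atan2(sqrt(h), sqrt(1 - h)) equals arctan(sqrt(h / (1 - h))) without dividing by zero
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(h), math.sqrt(1 - h))


def spatial_weights(locations):
    """Map each user pair to a closeness weight: in (0, 1] if closer than average, else 0."""
    users = sorted(locations)
    pairs = [(u, v) for i, u in enumerate(users) for v in users[i + 1:]]
    dist = {(u, v): geodetic_distance(*locations[u], *locations[v]) for u, v in pairs}
    values = list(dist.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {pair: 0.0 for pair in pairs}  # all distances equal: nobody is closer than average
    z = {pair: (d - mu) / sigma for pair, d in dist.items()}
    m = min(z.values())  # most negative normalized distance (the closest pair)
    return {pair: (z[pair] / m if z[pair] < 0 else 0.0) for pair in pairs}


# Hypothetical estimated locations (latitude, longitude) of four users.
locations = {"u1": (41.12, 16.87), "u2": (43.72, 10.40), "u3": (41.90, 12.50), "u4": (46.06, 14.51)}
weights = spatial_weights(locations)
closest_pair = max(weights, key=weights.get)
```
]]></preformat>
        <p>By construction, the closest pair receives weight 1, and every pair at or beyond the average distance receives weight 0, i.e., no edge in the spatial network.</p>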
        <p>Once the spatial network is built, it is processed in the same way as the social relationships network:
a GraphSAGE-based model is trained, and the obtained embeddings are fed to a random forest classifier.
This module outputs two values for each user: the label $l_S$ and the confidence value $c_S$.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Model Fusion</title>
        <p>After examining each individual perspective within the social network, the final step consists in
integrating their contribution to make a final prediction. For this purpose, we adopt a Multi-Layer
Perceptron (MLP) architecture following the stacked generalization approach. As illustrated in the
right side of Figure 1, the employed MLP processes seven inputs, derived from the preceding
single-perspective analyses. More specifically:</p>
        <p>• Semantic content analysis: the reconstruction errors $e_{safe}$ and $e_{risky}$, and the classification label $l_C$;
• Social relationships analysis: the classification label $l_R$ and its confidence value $c_R$;
• Spatial relationships analysis: the classification label $l_S$ and its confidence value $c_S$.
The MLP-based model fusion component processes these diverse features through one hidden layer
with ReLU activation function and produces a final binary classification for each user, exploiting the
softmax function. Notably, this integration approach potentially allows IMMENSE to make more robust
predictions than any single analysis module could achieve independently, in particular when certain
features might be ambiguous or misleading when considered in isolation.</p>
        <p><sup>1</sup> Note that if a user never shares the position of their posts, they will be represented as an isolated node in the spatial graph, with their
embedding corresponding to their original feature vector.</p>
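        <p>The fusion step can be sketched as a forward pass through a small MLP with seven inputs, one ReLU hidden layer, and a two-way softmax. The hidden size, the random (untrained) weights, and the example input values are assumptions made only so that the snippet runs; in the actual system the weights would be learned from the outputs of the three modules.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(inputs, params):
    """Forward pass: 7 inputs -> hidden layer (ReLU) -> 2-way softmax."""
    hidden = relu(dense(inputs, params["w1"], params["b1"]))
    probs = softmax(dense(hidden, params["w2"], params["b2"]))
    return probs.index(max(probs)), probs


HIDDEN_UNITS = 8  # hypothetical size; the text does not specify it
rng = random.Random(42)
params = {  # untrained random weights, for illustration only
    "w1": [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(HIDDEN_UNITS)],
    "b1": [0.0] * HIDDEN_UNITS,
    "w2": [[rng.uniform(-1, 1) for _ in range(HIDDEN_UNITS)] for _ in range(2)],
    "b2": [0.0, 0.0],
}
# Order: content errors (safe, risky) and label, social label and confidence,
# spatial label and confidence.
inputs = [0.25, 2.25, 0, 0, 0.9, 1, 0.6]
label, probs = fuse(inputs, params)
```
]]></preformat>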
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>
        To evaluate IMMENSE, we conducted comparative experiments against its closest counterpart, namely
SAIRUS [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], using two real-world datasets derived from X (formerly Twitter). The first dataset, denoted
as $D_1$, is the same one used in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The ground truth for $D_1$ was established using a keyword-based approach:
tweets containing terms from two manually curated lists related to terrorism, extremism, hate speech,
and discrimination against minorities<sup>2,3</sup> were flagged as risky. Subsequently, users were classified as
risky if the majority of their published tweets received this flag. The second dataset, denoted as $D_2$,
is a novel dataset constructed using the X API and keywords related to radicalism identified through
the Horizon 2020 project CounteR<sup>4</sup>. The data collection followed a multi-step process: i) we began
by retrieving up to 1500 tweets containing each radicalism-related keyword; ii) for each author of
these tweets, we collected up to 1000 of their followers to establish a connected social network; iii) to
ensure sufficient network connectivity, we kept users who follow more than 5 other users within our
dataset; iv) we gathered the 20 most recent tweets from each user. Then, to create the ground truth
labels, we implemented a semantic similarity approach using Google’s pre-trained Word2Vec model<sup>5</sup>.
Specifically, we generated semantic vector representations for each user, along with a reference vector
derived from a corpus of known malicious content provided by CounteR project partners. Then, users
whose semantic similarity to the malicious vector exceeded a defined threshold $\Delta = 0.88$ received the
risky label, while the remaining users were labeled as safe. Such a threshold was determined through
a preliminary analysis, and led to around 7% of users being labeled as risky, which can be considered
reasonable. Furthermore, to incorporate the possible influence of social network relationships, we
enhanced this labeling strategy by switching safe users to the risky class if at least 10% of their followed
accounts were already identified as risky. We report key statistics about both datasets in Table 1.
      </p>
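      <p>The labeling strategy adopted for the second dataset can be sketched as follows. The sketch assumes cosine similarity as the semantic similarity measure and a single propagation pass based on the initially risky users only; neither detail is stated explicitly in the text, and the function names and toy vectors are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_users(vectors, malicious_vector, follows, threshold=0.88, followed_share=0.10):
    """Label users as risky (1) or safe (0) by similarity, then by followed accounts."""
    labels = {u: int(cosine(v, malicious_vector) > threshold) for u, v in vectors.items()}
    seed_risky = {u for u, lab in labels.items() if lab == 1}
    for user, followed in follows.items():
        if labels.get(user) == 0 and followed:
            share = sum(1 for f in followed if f in seed_risky) / len(followed)
            if share >= followed_share:
                labels[user] = 1  # safe user following enough risky accounts
    return labels


# Toy semantic vectors and follow lists (hypothetical).
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 1.0]}
follows = {"b": ["a", "c"], "c": ["b"]}
labels = label_users(vectors, [1.0, 0.0], follows)
```
]]></preformat>
      <p>In this toy case, user b is switched to risky because half of the accounts it follows are risky, while user c stays safe because b was not risky before the propagation pass.</p>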
      <sec id="sec-3-1">
        <title>3.1. Experimental setup</title>
        <p>In order to show the advantages derived from the multi-perspective approach, we performed experiments
with both IMMENSE and SAIRUS, considering several combinations of the three considered dimensions,
namely content (C), social relationships (R), and spatial relationships (S). When a given dimension is
not considered, the values outputted by its corresponding module are set to zero for all the users.</p>
        <p>Both datasets $D_1$ and $D_2$ were split using 80% for training and the remaining 20% for testing. It
is important to note that SAIRUS, being transductive, requires access to the complete network of
relationships and geographical information during training, and cannot provide predictions for users
absent from those networks. IMMENSE, on the other hand, exhibits a key advantage since it enables
generalization to completely new networks in the testing phase. This aspect makes the comparison
inherently unfair in favor of SAIRUS, since SAIRUS is aware of the users in the testing set,
while we purposely make IMMENSE unaware of them. However, the comparison allows us to assess
the performance of IMMENSE in this more challenging scenario against a transductive
counterpart that could not be applied at all to unseen users.</p>
        <p>As evaluation measures, we considered precision, recall, F1 score, and accuracy, computed on the
whole test set and for each class. Given that both datasets exhibit class imbalance, with safe users being
the majority, our primary interest lies in how effectively the systems detect risky users.</p>
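        <p>For reference, the per-class measures can be computed as in the sketch below, which uses a small hypothetical set of true and predicted labels (0 = safe, 1 = risky). The function names are illustrative, and the macro F1 is taken as the unweighted mean of the two per-class F1 scores.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced test set: six safe users and four risky users.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
risky = class_metrics(y_true, y_pred, positive=1)
safe = class_metrics(y_true, y_pred, positive=0)
macro_f1 = (risky[2] + safe[2]) / 2
acc = accuracy(y_true, y_pred)
```
]]></preformat>
        <p>In this toy example the accuracy is 0.7 while the F1 score of the risky class is only about 0.57, which shows why class-specific scores are more informative than accuracy under class imbalance.</p>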
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Results and discussion</title>
        <p>In Tables 2 and 3, we present the experimental results achieved for datasets $D_1$ and $D_2$, respectively.
The best values achieved for accuracy and micro/macro F1 scores are highlighted in bold.</p>
        <p>Examining the results for dataset $D_1$, we observe a clear benefit of considering multiple perspectives
instead of only the semantics of the posted content. Indeed, when relying solely on the content, both
IMMENSE and SAIRUS demonstrated suboptimal performance, with a similar F1 score and accuracy
of 0.76 and 0.71, respectively. However, incorporating additional dimensions consistently improved
the overall effectiveness. Notably, the inclusion of the spatial dimension generally yielded the most
significant gains, highlighting the importance of geographical relationships in identifying risky users.
Indeed, when the spatial dimension was considered, SAIRUS achieved the best F1 score of 0.88, while
IMMENSE reached 0.86. Generally, IMMENSE is slightly outperformed by SAIRUS, but we recall that
SAIRUS is aware of the testing nodes during training. These results are further confirmed by looking
at class-specific performance, especially for the risky class. Indeed, in their best configuration (which
combines the same two dimensions for both systems), SAIRUS again achieves slightly better metrics compared to IMMENSE.</p>
        <p>Shifting our focus to dataset $D_2$, we can see that the results are quite different. Indeed, it is evident
that, in general, relying solely on the content is enough to achieve strong performance, with both systems
showing identical F1 score (0.92) and accuracy (0.98). Examining the impact of multiple dimensions,
SAIRUS shows modest improvements when adding spatial relationships (see the $C + S$ configuration),
reaching its best F1 score of 0.94 for all users and 0.88 for risky users. On the other hand, IMMENSE
behaves differently across the configurations. In particular, the use of spatial relationships
(see the $C + S$ configuration) leads to a significant drop in terms of F1 score, with 0.72 for all users and 0.29
for risky users. This different behavior may be due to the fact that, unlike $D_1$, only a small portion of users
in $D_2$ have associated location data (see Table 1). As a result, many users in the spatial graph remain
isolated, causing their node embeddings to merely reflect their initial features without neighborhood
aggregation. SAIRUS does not appear to suffer from this limitation, possibly because of its transductive
nature. Indeed, the locations of nodes in the testing set are known during training and, therefore,
during the embedding. Consequently, it can exploit such a perspective more comprehensively, even when
it is available for a limited number of users. In any case, IMMENSE achieves its optimal performance
when all three dimensions are exploited ($C + R + S$), attaining an F1 score of 0.94 that matches
SAIRUS’s best result. More importantly, when examining the risky class specifically, IMMENSE’s full
configuration outperforms all SAIRUS configurations with an F1 score of 0.89 versus SAIRUS’s best of
0.88, even though it works in the more challenging inductive setting.</p>
        <p>Overall, we can conclude that IMMENSE achieves performance comparable
to that of its closest transductive competitor, SAIRUS, in most of the configurations. This is an important
result, considering the additional information from the test set available during the training for SAIRUS.
Moreover, these results confirm the practical applicability of IMMENSE in real-world environments,
since it can be adopted to estimate the risk of new users, without a full re-training of the model.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>In this paper, we presented IMMENSE, an inductive, multi-perspective method for detecting malicious
accounts in social networks. Unlike previous approaches that work in the transductive setting, thus
requiring complete retraining to analyze new users, the primary contribution of IMMENSE is its ability
to generalize to previously unseen users through an inductive learning framework.</p>
      <p>
        Our experimental evaluation on two real-world datasets from X demonstrates that IMMENSE achieves
competitive performance compared to the state-of-the-art transductive approach SAIRUS [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], despite
the additional challenges introduced by the inductive setting. Furthermore, the experiments confirm
that integrating multiple perspectives benefits classification performance. In particular,
the spatial dimension, when available, proved to be valuable in identifying risky users, highlighting the
importance of geographical information in understanding user behavior.
      </p>
      <p>For future work, we plan to consider an additional critical aspect, namely the temporal dimension.
The personalities of users evolve over time, and so do their behaviors on social media platforms. This
can lead safe users to gradually move towards risky views, or vice versa.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The authors acknowledge the support of the EU Commission through the H2020 Project “CounteR
- Privacy-First Situational Awareness Platform for Violent Terrorism and Crime Prediction, Counter
Radicalisation and Citizen Protection” (Grant N. 101021607), and of the project FAIR - Future AI Research
(PE00000013), spoke 6 - Symbiotic AI, under the NRRP MUR program funded by the NextGenerationEU.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors used Grammarly for grammar and spelling check. The authors reviewed and edited the
content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Igawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Barbon Jr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C. S.</given-names>
            <surname>Paulo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Kido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Guido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L. P.</given-names>
            <surname>Júnior</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. N.</given-names>
            <surname>da Silva</surname>
          </string-name>
          ,
          <article-title>Account classification in online social networks with LBCA and wavelets</article-title>
          ,
          <source>Information Sciences</source>
          <volume>332</volume>
          (
          <year>2016</year>
          )
          <fpage>72</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharyya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kulkarni</surname>
          </string-name>
          ,
          <article-title>Machine learning-based detection and categorization of malicious accounts on social media</article-title>
          , in: International Conference on Human-Computer Interaction, Springer,
          <year>2024</year>
          , pp.
          <fpage>328</fpage>
          -
          <lpage>337</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sirivianos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pregueiro</surname>
          </string-name>
          ,
          <article-title>Aiding the detection of fake accounts in large scale social online services</article-title>
          ,
          <source>in: Proceedings of the 9th USENIX Symposium on NSDI 2012, San Jose, CA, USA, April 25-27, 2012</source>
          , USENIX Association,
          <year>2012</year>
          , pp.
          <fpage>197</fpage>
          -
          <lpage>210</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <article-title>Detecting malicious accounts in Web3 through transaction graph</article-title>
          ,
          <source>in: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>2482</fpage>
          -
          <lpage>2483</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Calais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Almeida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Meira</surname>
            <suffix>Jr.</suffix>
          </string-name>
          ,
          <article-title>Characterizing and detecting hateful users on Twitter</article-title>
          ,
          <source>in: Proceedings of the International AAAI Conference on Web and Social Media</source>
          , volume
          <volume>12</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Deep learning-based malicious account detection in the Momo social network</article-title>
          ,
          <source>in: 2018 27th International Conference on Computer Communication and Networks (ICCCN)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <article-title>Detecting malicious accounts in online developer communities using deep learning</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>35</volume>
          (
          <year>2023</year>
          )
          <fpage>10633</fpage>
          -
          <lpage>10649</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/TKDE.2023.3237838</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhang</surname>
          </string-name>
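          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>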
          ,
          <article-title>Uncovering malicious accounts in open mobile social networks using a graph and text-based attention fusion algorithm</article-title>
          ,
          <source>IEEE Internet of Things Journal</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pellicani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Redavid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <article-title>SAIRUS: spatially-aware identification of risky users in social networks</article-title>
          ,
          <source>Inf. Fusion</source>
          <volume>92</volume>
          (
          <year>2023</year>
          )
          <fpage>435</fpage>
          -
          <lpage>449</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1016/J.INFFUS.2022.11.029</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Efficient estimation of word representations in vector space</article-title>
          , in: Y. Bengio, Y. LeCun (Eds.),
          <source>1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>De Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <article-title>PRILJ: an efficient two-step method based on embedding and clustering for the identification of regularities in legal case judgments</article-title>
          ,
          <source>Artif. Intell. Law</source>
          <volume>30</volume>
          (
          <year>2022</year>
          )
          <fpage>359</fpage>
          -
          <lpage>390</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/S10506-021-09297-1</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
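          <string-name>
            <given-names>G.</given-names>
            <surname>De Martino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <article-title>Multi-view overlapping clustering for the identification of the subject matter of legal judgments</article-title>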
          ,
          <source>Information Sciences</source>
          <volume>638</volume>
          (
          <year>2023</year>
          )
          <fpage>118956</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bellinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Japkowicz</surname>
          </string-name>
          ,
          <article-title>One-class versus binary classification: Which and when?</article-title>
          ,
          <source>in: 2012 11th International Conference on Machine Learning and Applications</source>
          , volume
          <volume>2</volume>
          ,
          <year>2012</year>
          , pp.
          <fpage>102</fpage>
          -
          <lpage>106</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/ICMLA.2012.212</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Murgod</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gaddam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Sundaram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Anitha</surname>
          </string-name>
          ,
          <article-title>A survey on graph neural networks and its applications in various domains</article-title>
          ,
          <source>SN Computer Science</source>
          <volume>6</volume>
          (
          <year>2025</year>
          )
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>W. L.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ying</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <article-title>Inductive representation learning on large graphs</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1024</fpage>
          -
          <lpage>1034</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Stojanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Appice</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dzeroski</surname>
          </string-name>
          ,
          <article-title>Network regression with predictive clustering</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>