<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bio-inspired algorithms for effective social media profile authenticity verification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nadir Mahammed</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Badia Klouche</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Imène Saidi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miloud Khaldi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahmoud Fahsi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EEDIS Laboratory, Djillali Liabes University</institution>
          ,
          <addr-line>P.O 89 Sidi Bel Abbès 22000</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LabRI-SBA Laboratory, Ecole Superieure en Informatique Sidi Bel Abbes</institution>
          ,
          <addr-line>P.O 73, El Wiam Sidi Bel Abbès 22016</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the ever-evolving digital era, the profound impact of online social networks is omnipresent. Platforms like Instagram, Facebook, and Twitter grapple persistently with the challenge of distinguishing genuine user profiles from a rising tide of counterfeit or dormant accounts. This predicament underscores the critical need to adeptly differentiate between authentic and misleading user profiles, particularly in light of the increasing prevalence of online deception. This research centers on introducing an innovative approach to profile validation, highlighting the pivotal task of identifying and mitigating the presence of fake profiles across social media platforms. The methodology employed is groundbreaking, strategically integrating cutting-edge bio-inspired algorithms, with a specific emphasis on the application of metaheuristics. Unlike conventional machine learning techniques, this approach navigates the intricate landscape of online social networks with unparalleled agility and adaptability. Despite the inherent challenges posed by the nature and scarcity of datasets available on the web, the empirical results are remarkably compelling. The approach consistently demonstrates a high level of accuracy in classification tests, showcasing its efficacy in addressing the pervasive issue of fake profiles in the digital realm.</p>
      </abstract>
      <kwd-group>
        <kwd>Social media</kwd>
        <kwd>fake profile detection</kwd>
        <kwd>bio-inspired algorithm</kwd>
        <kwd>machine learning</kwd>
        <kwd>simulation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the ever-evolving landscape of online social networks, as exemplified by the behemoths Facebook and Twitter, a remarkable surge in user engagement has occurred over recent years. This rapid growth, however, has been accompanied by a troubling escalation in the presence of fake accounts and online impersonation. This issue is not only on the rise but has also gained significant scholarly attention, as evident in the report [1] on detecting fake profiles. The essence of these fake profiles lies in their representation of fictitious personas or entities that expertly mimic real users, raising pertinent concerns within the online social network ecosystem.</p>
      <p>One of the fundamental challenges in this domain is the absence of robust authentication mechanisms on many social networking platforms. These mechanisms are instrumental in effectively distinguishing between genuine user accounts and fraudulent counterparts. As underscored by [2] in their 2022 survey, the deficiencies in these mechanisms exacerbate the proliferation of fake accounts, thus prompting a dire need for an innovative and effective solution. Such a solution is essential to identify and mitigate the presence of counterfeit accounts, ultimately ensuring the creation of a secure and trustworthy environment for the multitude of users frequenting social networking sites.</p>
      <p>In addressing this pressing concern, the authors of this study have embarked on a transformative journey, departing from the well-trodden path of Machine Learning (ML) methods to explore the promising realm of metaheuristics. Within this domain, they have harnessed the capabilities of the Fire Hawk Optimizer (FHO), a contemporary bio-inspired algorithm, to address the multifaceted challenge of fake profile detection. This unconventional approach represents a noteworthy departure from conventional methodologies and stands as a beacon of innovation, poised to revolutionize the field of online social network analysis.</p>
      <p>The ensuing sections of this comprehensive study delve into the foundational principles and practical implications of this pioneering approach. By elucidating its diverse facets, the study aims to underscore the transformative potential of FHO in the context of enhancing the security and authenticity of online social networks on a global scale. Thus, it transcends mere theoretical exploration and emerges as a promising catalyst for substantive change in the landscape of social network analysis and the broader digital sphere.</p>
      <p>6th International Hybrid Conference On Informatics And Applied Mathematics, December 6-7, 2023, Guelma, Algeria. * Nadir Mahammed. n.mahammed@esi-sba.dz (N. Mahammed); b.klouche@esi-sba.dz (B. Klouche); i.saidi@esi-sba.dz (I. Saidi); m.khladi@esi-sba.dz (M. Khaldi); mahmoud.fahsi@univ-sba.dz (M. Fahsi)</p>
      <p>0000-0001-7865-5937 (N. Mahammed); 0000-0001-7417-612X (I. Saidi); 00000022896136X (M. Fahsi)</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        • Diverse Research Efforts: The table underscores a broad spectrum of research initiatives aimed at fake profile detection [3, 4, 5, 6, <xref ref-type="bibr" rid="ref3">7</xref>, 8, 9], indicating a heightened awareness of the severity of fake profiles in Online Social Networks (OSNs) and the urgency to address this issue. This diversity suggests multiple avenues are being explored to tackle the problem.
        • OSN-Specific Approaches: Several studies focus on specific OSNs like Facebook, Instagram, and Twitter, acknowledging the unique characteristics and challenges of each platform. This prompts the question of whether a universal model can effectively detect fake profiles across various OSNs or whether tailored solutions are necessary.
        • Machine Learning and Metaheuristics: Utilized techniques range from traditional machine learning algorithms (Decision Trees, Random Forest, Support Vector Machine, and K-means) to bio-inspired metaheuristics (Satin Bowerbird Optimization and Grey Wolf Optimizer). This mix indicates exploration of both data-driven and heuristic-driven approaches, warranting research into their relative efficacy and optimal use.
        • Incorporation of Deep Learning: Some studies incorporate deep learning methods, such as Convolutional Neural Networks, Long Short-Term Memory, and Recurrent Neural Networks, highlighting the need for advanced methods to combat sophisticated fake profiles that employ deep learning in their creation.
      </p>
      <p>Table 1 summarizes these studies by target OSN (Facebook, Instagram, Twitter), machine learning technique (SVM, NB, RF, KNN, DT, K-means), metaheuristic (SBO, GWO), other methods (CNN, LSTM, RNN, CDS, AdaBoost, MapReduce), and dataset.</p>
      <sec id="sec-2-11">
        <p>
        • Dataset Size and Quality: Dataset size plays a pivotal role, with some studies employing datasets containing millions of instances. While larger datasets offer more robust training, they also demand greater computational resources. Additionally, dataset quality is crucial, necessitating research into effective collection and curation techniques.
        • Accuracy Achievements: Notably, some studies achieve very high accuracy levels (e.g., 0.98 and 0.99). While promising, it is vital to scrutinize the generalization capabilities of these models, as high accuracy on one dataset does not guarantee success on new, unseen data.
        • Challenges and Future Directions: Challenges include the evolving techniques in fake profile creation and the need for real-time or near-real-time detection. Future research should address these challenges and explore methods for dynamic model adaptation.
        • Integration and Model Ensemble: Combining strengths from different models or creating ensemble models can potentially enhance detection accuracy. Research in this direction could lead to more robust solutions.
        • Explainability and Interpretability: As fake profile detection systems are deployed, there is a growing need for interpretability and explainability in model decisions, especially in legal and ethical contexts.
        • Scalability: Ensuring scalability of fake profile detection methods to handle the increasing volume of data on OSNs is a significant concern. Research should focus on algorithm efficiency in large-scale scenarios.
        </p>
        <p>From this bibliographic study, it is deduced that employing metaheuristics for detecting fake profiles on social networks proves to be a crucial approach. These optimization methods offer notable advantages in terms of efficiency, computation time, and resilience to data variations, key elements in the field of fake profile detection on social networks. Metaheuristics excel in effectively exploring solution spaces, adapting well to complex landscapes. This enhanced exploration capability enables convergence toward high-quality solutions, even in poorly defined search spaces. Moreover, metaheuristics are recognized for their computational efficiency, often converging to acceptable solutions within reasonable timeframes, making them particularly well-suited for complex problems. Furthermore, they exhibit robustness in the face of data variations, requiring less dependence on the specific nature of the data and demonstrating adaptability to incomplete or noisy datasets. Employing metaheuristics for detecting fake profiles on social networks thus emerges as a promising approach for addressing challenges in artificial intelligence and machine learning in this specific domain, offering high-quality solutions, optimized computation time, and independence from data variations.</p>
        <p>3. Material and Methodology</p>
        <p>3.1. Dataset</p>
        <p>Employing distinct batches for labeling, the dataset construction involved the first batch, which comprised Twitter data sourced from previously banned pro-ISIS accounts, serving as positive labels. Specifically, the dataset "How ISIS Uses Twitter" was utilized (https://www.kaggle.com/fifthtribe/how-isisuses-twitter), encompassing 17,350 tweets from over 110 pro-ISIS accounts. This dataset includes attributes (see Table 2) such as Name, Username, Description, Location, Number of followers at the time of tweet download, Number of statuses by the user when the tweet was downloaded, Date and timestamp of the tweet, and the tweet itself. To address Arabic content, the Google Translate API was utilized for translation.</p>
        <p>For the second batch, the Global Terrorism Database (GTD) was employed as a negatively labeled dataset [10, 11]. The GTD contains information on over 180,000 terrorist attacks worldwide since 1970. Filtering events from 2002 onwards, data was extracted from the "summary" column, which provides summaries of each attack.</p>
        <sec id="sec-2-11-1">
          <title>3.2. Text classification</title>
          <p>The process of discerning information from textual input involves three principal stages, as depicted in Figure 1.
• Natural Language Processing (NLP): This initial phase focuses on preprocessing textual data, ensuring a well-structured format for ease of understanding and processing. The analysis of textual data unfolds in four essential steps: tagging, annotating, co-reference resolution, and sentiment analysis [12].
• Word Embedding: Embracing the N-gram language model [13], the probability of the last word is estimated based on preceding words. This choice is informed by its superior performance compared to the TF-IDF model [14].
• Classification: After word embedding, the textual content takes on a numerical form, making it machine-readable. This numerical representation is then input into a classifier, allowing the model to effectively perform the classification task.</p>
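          <p>The N-gram idea, estimating the probability of a word from its predecessors, can be illustrated with a minimal bigram sketch. This is not the model used in the study (which may use larger N and smoothing); it only shows the raw count-based estimate P(w_i | w_{i-1}).</p>

```python
from collections import Counter

# Minimal bigram sketch: estimate P(word | prev) from raw adjacent-pair counts.
def bigram_prob(tokens, prev, word):
    bigrams = Counter(zip(tokens, tokens[1:]))  # counts of adjacent word pairs
    unigrams = Counter(tokens[:-1])             # counts of context words
    if unigrams[prev] == 0:
        return 0.0                              # unseen context
    return bigrams[(prev, word)] / unigrams[prev]

tokens = "the fake account posted the fake link".split()
print(bigram_prob(tokens, "the", "fake"))  # 1.0: "the" is always followed by "fake"
```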
        </sec>
        <sec id="sec-2-11-2">
          <title>3.3. Preprocessing</title>
          <p>Data preprocessing is the process of converting raw data into a format that can be readily understood by machine learning algorithms. As detailed in [15], the data preparation procedures for the different datasets employed in this research are succinctly outlined below:
1. Data Scrutiny: Eliminate duplications and rectify errors.
a) Eliminate duplications, superfluous data points, and inaccuracies.
b) Omit irrelevant data points and redundant columns (such as 'id' and 'id-name').
2. Address disparities, anomalies, and missing data.
3. Standardize and adapt the data through scaling.
4. Prune interrelated variables and streamline the dataset.</p>
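          <p>The four steps above can be sketched on a toy profile table. This is a hedged illustration, not the authors' exact pipeline: the column names ('id', 'id-name', 'followers', 'statuses'), the min-max scaling, and the 0.95 correlation threshold are assumptions made here for concreteness.</p>

```python
import pandas as pd

# Sketch of preprocessing steps 1-4 on an illustrative profile table.
def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                                   # 1a) duplicate rows
    df = df.drop(columns=["id", "id-name"], errors="ignore")    # 1b) redundant columns
    df = df.fillna(df.mean(numeric_only=True))                  # 2) impute missing values
    num = df.select_dtypes("number").columns
    df[num] = (df[num] - df[num].min()) / (df[num].max() - df[num].min())  # 3) min-max scale
    corr = df[num].corr().abs()                                 # 4) prune correlated variables
    drop = {c for i, c in enumerate(num) for p in num[:i] if corr.loc[p, c] > 0.95}
    return df.drop(columns=drop)

demo = pd.DataFrame({
    "id": [1, 2, 3, 1],
    "id-name": ["a", "b", "c", "a"],
    "followers": [10.0, 20.0, 30.0, 10.0],
    "statuses": [4.0, 8.0, None, 4.0],
})
print(preprocess(demo))
```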
        </sec>
        <sec id="sec-2-11-3">
          <title>3.4. Machine Learning Algorithms</title>
          <p>3.4.1. Induction of Decision Tree
When considering decision tree induction, it is
noteworthy that ID3 operates as a supervised learning algorithm.</p>
          <p>This method constructs a tree based on information
derived from training instances, utilizing it for classifying
test data [16].
3.4.2. K-means Algorithm
A cornerstone in unsupervised learning for pattern
recognition and machine learning, the K-means algorithm is
renowned for its simplicity and widespread use among
iterative and hill-climbing clustering algorithms [17].
3.4.3. Hierarchical Clustering Analysis
Hierarchical clustering (HC) groups similar objects into
clusters. Starting with each object as a separate cluster,
it iteratively merges the closest clusters until forming a
single, hierarchical structure. This method is valuable for
revealing data patterns and relationships [18].
3.4.4. Nearest Neighbor Classification
Often referred to as K-nearest neighbors (KNN), this method is grounded in the concept that the patterns nearest to a target pattern, for which a label is sought, offer valuable label information [?].
3.4.5. Naive Bayes Classifier
Commonly known as NB, the Naive Bayes classifier is a supervised learning algorithm rooted in Bayes' theorem. It operates on the simplifying assumption that attribute values are conditionally independent when considering the target value [19].
3.4.6. Random Forest Machine
Random forests (RF) represent an amalgamation of tree predictors. Each tree relies on the values of a random vector, independently sampled with a uniform distribution shared across all trees within the forest [20].
3.4.7. Support Vector Machine
The Support Vector Machine (SVM) is recognized as a potent tool for classifier construction. SVM is purposefully designed to establish a robust decision boundary between two classes, facilitating the accurate prediction of labels from one or more feature vectors [21].</p>
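          <p>As a concrete instance of the nearest-neighbor idea in 3.4.4, a minimal KNN classifier can be written in a few lines: the labels of the k nearest training patterns vote for the label of a query pattern. The 2-D features and "real"/"fake" labels below are illustrative only.</p>

```python
import math
from collections import Counter

# Minimal KNN sketch: majority vote among the k nearest training patterns.
def knn_predict(train, labels, query, k=3):
    order = sorted(range(len(train)),
                   key=lambda i: math.dist(train[i], query))  # nearest first
    votes = Counter(labels[i] for i in order[:k])             # majority vote
    return votes.most_common(1)[0][0]

# Toy 2-D profile features: two tight clusters labeled "real" / "fake".
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
y = ["real", "real", "real", "fake", "fake", "fake"]
print(knn_predict(X, y, (0.12, 0.18)))  # "real"
```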
        </sec>
        <sec id="sec-2-11-4">
          <title>3.5. Proposed Algorithm</title>
          <p>3.5.1. Inspiration
Australia's Indigenous people have a rich history of employing fire as a tool for ecosystem management. Controlled burns, whether ignited intentionally or by lightning, play a crucial role in maintaining the balance of the environment. However, a fascinating revelation involves certain bird species, known as Fire Hawks, which include whistling kites, black kites, and brown falcons. These birds have been observed intentionally carrying burning sticks and using them to start fires as part of their predatory tactics. This behavior is strategic, as the induced fires serve to startle and capture prey such as rodents, snakes, and other animals, enhancing the efficiency of their hunting endeavors.
3.5.2. Motivation to choose
This nature-inspired strategy, finely tuned over eons of evolution, equips the Fire Hawk Optimizer (FHO) for intricate optimization tasks. FHO excels in rapid convergence, surpassing alternative methods. Its robust nature allows effective handling of noisy and uncertain data, contributing to enhanced solution exploration diversity.</p>
          <p>The remarkable convergence speed of FHO is valuable in time-sensitive or resource-constrained scenarios. It swiftly reaches optimal solutions through iterations until predefined criteria are met. FHO's computational efficiency is evident as it converges to the global optimum with fewer evaluations [22].</p>
          <p>3.5.3. Operation</p>
          <p>The FHO algorithm, inspired by the foraging behavior of fire hawks, operates through the following steps:
1. Initial Positioning: At the start, solution candidates are defined, representing the positions of fire hawks and prey in the search space. Random initialization places these vectors within the search space, taking into account various parameters.
2. Fire Hawks and Prey: The algorithm categorizes solution candidates into Fire Hawks and prey based on their objective function values. Selected Fire Hawks aim to spread fires around the prey, with the global best solution serving as the primary fire source.
3. Determining Territories: The algorithm calculates the total distance between Fire Hawks and prey to identify the nearest prey to each bird. This step determines the effective territory of the Fire Hawks for hunting. The bird with the best objective function value selects the prey nearest to its territory, while others choose their next nearest prey.
4. Spreading Fires: Fire Hawks collect burning sticks from the main fire and drop them in their territories, causing the prey to flee. Some Fire Hawks may use burning sticks from other territories, contributing to position updates in the search loop.
5. Prey Movements: The prey's movements within</p>
          <p>Fire Hawks' territories are considered. The algorithm simulates various prey actions, such as hiding, running, or approaching Fire Hawks, impacting position updates.
6. Safe Places: Prey may move toward safe places outside Fire Hawk territories. These movements are also included in the position update process.
7. Territory Definition: Fire Hawk territories are represented as circular areas, with the precise territory determined by prey numbers and distances from each Fire Hawk.
8. Boundary Violation and Termination: The algorithm considers boundary control for violating decision variables and employs a termination criterion, such as a predefined number of objective function evaluations or iterations, to conclude the process.</p>
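          <p>The steps above can be condensed into a highly simplified sketch: random initialization, a hawk/prey split by fitness, nearest-hawk territories, fire-spreading and prey-movement position updates, boundary control, and an iteration-count termination criterion. The update rules below are illustrative stand-ins, not the exact equations of the original FHO paper, and the sketch minimizes a toy sphere function rather than the profile-classification fitness used in this study.</p>

```python
import random

# Simplified FHO-style loop (steps 1-8); update equations are illustrative only.
def fho_sketch(f, dim=2, low=-5.0, high=5.0, pop=20, hawks=4, iters=100, seed=1):
    rng = random.Random(seed)
    P = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop)]  # step 1
    best = min(P, key=f)                    # global best = primary fire source
    for _ in range(iters):                  # step 8: iteration-count termination
        P.sort(key=f)                       # step 2: best candidates act as hawks
        H, prey = P[:hawks], P[hawks:]
        new = []
        for h in H:                         # step 4: hawks move using the main fire
            r1, r2 = rng.random(), rng.random()
            new.append([h[j] + r1 * best[j] - r2 * h[j] for j in range(dim)])
        for p in prey:                      # steps 3, 5, 6: prey moves in the
            h = min(H, key=lambda q: sum((q[j] - p[j]) ** 2 for j in range(dim)))
            r1, r2 = rng.random(), rng.random()   # territory of its nearest hawk
            new.append([p[j] + r1 * h[j] - r2 * p[j] for j in range(dim)])
        P = [[min(high, max(low, x)) for x in v] for v in new]  # boundary control
        best = min([best] + P, key=f)       # keep the best-ever fire source
    return best

sphere = lambda v: sum(x * x for x in v)
print(sphere(fho_sketch(sphere)))  # prints the best objective value found
```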
          <p>Figure 2 provides pseudocode which offers a concise overview of the FHO algorithm's operation.
3.5.4. Transition from natural to artificial
This section is devoted to examining the shift from the Fire Hawk's innate behaviors in the wild to its adapted behaviors in an artificial environment, as detailed in Table 4.</p>
          <p>Table 4 delves into a captivating comparison between the natural and the artificial, spotlighting the FHO algorithm's mission of distinguishing genuine from fraudulent profiles in online social networks. It intriguingly parallels the hunting behavior of fire hawks with user suitability assessment.</p>
          <p>By mentioning distance calculations, it hints at the algorithm's quest for the optimal solution, equating to precise user classifications in social networks. This table is a gateway to understanding how nature's wisdom inspires advanced algorithms that address real-world challenges. It embodies the fusion of the natural and artificial realms, demonstrating how algorithmic innovation stems from nature's timeless principles, resolving complex issues in online social networks. Ultimately, it invites exploration of the limitless possibilities born from the fusion of nature and algorithms.</p>
          <p>3.5.5. Fitness function</p>
          <p>The FHO rigorously employs a fitness function, as depicted in Figure 3, to meticulously gauge the performance of solution candidates. This fitness function pivots around the accuracy of a gradient boosting classifier meticulously applied to a thoughtfully selected subset of features sourced from a dataset.</p>
          <p>To elaborate on the computation of the fitness value, the function takes a solution candidate, representing a distinct subset of features. This subset undergoes scrupulous evaluation via a gradient boosting classifier, armed with precisely 100 estimators and a deterministic random state fixed at 42. Notably, this classifier undertakes the dual responsibility of feature selection and classification.</p>
          <p>The inner workings of the fitness function encompass
the formulation of a feature selector. This selector,
entailing sophisticated intricacies, leverages the classifier itself
to discern and pinpoint the paramount features based on
the classifier’s predictive capabilities. This discernment
is crucial in optimizing the classification process.</p>
          <p>Of particular significance is the selector's subsequent fitting to both the input dataset and the target variable.</p>
          <p>This preparatory phase is pivotal for the forthcoming
accuracy evaluation.</p>
          <p>What distinguishes this fitness function is its intrinsic
capacity to bring about a transformation of the input
dataset. This transformation is rendered by carefully
cherry-picking the most pivotal features from the original
dataset. The result is a transformed dataset, which bears
the promise of enhanced accuracy. This transformed
dataset now becomes the testing ground for the classifier.</p>
          <p>It serves as the substrate for the classifier’s extensive
training process, conducted in close tandem with the
target variable.</p>
          <p>As the final step in this intricate dance of precision, the fitness function introduces the crucial concept of the accuracy score. It orchestrates a meticulous comparison between the true labels and the predicted labels that emerge from the classifier's outputs on the transformed dataset. The resultant accuracy score stands as a testament to the chosen subset of features' ability to effectively forecast the target variable.</p>
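          <p>The fitness function described above can be sketched as follows: a candidate encodes a feature subset; a gradient boosting classifier with 100 estimators and random state 42 both drives feature selection and performs classification; and the fitness value is the resulting accuracy score. The use of scikit-learn and `SelectFromModel`, and the toy data, are assumptions made here for concreteness, not details confirmed by the text.</p>

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score

# Sketch of the fitness computation: accuracy of a gradient boosting
# classifier on the features kept by a classifier-driven selector.
def fitness(candidate_mask, X, y):
    X_sub = X[:, np.asarray(candidate_mask, dtype=bool)]   # candidate's feature subset
    clf = GradientBoostingClassifier(n_estimators=100, random_state=42)
    selector = SelectFromModel(clf).fit(X_sub, y)          # selector leverages the classifier
    X_t = selector.transform(X_sub)                        # keep only the pivotal features
    clf.fit(X_t, y)                                        # train on the transformed data
    return accuracy_score(y, clf.predict(X_t))             # fitness = accuracy score

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)                    # toy target for illustration
print(fitness([1, 1, 1, 0, 0, 1], X, y))
```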
          <p>Figure 4 demonstrates the pivotal role of the fitness
function in the FHO. In the third stage of the code, the
fitness values for each solution candidate in the population
are meticulously computed by invoking the fitness
function. This function is systematically applied to every row
(axis=1) within the population array, yielding an array
replete with fitness values, which are more specifically
accuracy scores. These accuracy scores bear significance
as they provide a quantitative assessment of each solution
candidate’s performance accuracy.</p>
          <p>In essence, the fitness function operates as the core
evaluator, discerning and ranking solution candidates
based on their individual performance. In the broader
context, these fitness scores wield substantial influence
in steering the FHO’s pursuit of the optimal solution,
with the overarching goal of optimizing performance
accuracy.
3.5.6. FHO metrics
The FHO algorithm undergoes a comparative analysis against a spectrum of established Machine Learning algorithms, encompassing ID3, SVM, NB, RF, HC, KNN with diverse K values, and K-means. This exhaustive evaluation consists of 100 iterations for each dataset, ensuring robustness and careful examination. Notably, the FHO configuration parameters are as follows: the initial population size is set at 50, and the maximum number of iterations is capped at 100, as summarized in Table 4.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. RESULTS AND DISCUSSION</title>
      <p>Throughout the experimental phase, a 2014 MSI GT70
gaming laptop was employed, featuring an Intel Core
i7-4800MQ CPU, a Nvidia GeForce GTX 770M GPU, and
32 GB of RAM.</p>
      <sec id="sec-3-1">
        <title>4.1. Evaluation Criteria</title>
        <p>The detection of fake accounts can be evaluated using various performance metrics, such as Accuracy, F1-score, Recall, Precision, and Entropy. These metrics provide insights into the model's performance and its ability to classify profiles correctly.</p>
        <p>In addition, the Confusion Matrix is used as a visual representation of fake account detection, offering a comprehensive view of the model's performance across different classes, as shown in Table 5.</p>
        <p>• Accuracy: This metric measures the overall accuracy of the model in correctly classifying profiles.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
• Precision: Calculates the model's accuracy in classifying values correctly by comparing the number of accurately classified profiles to the total classified data points for a given class label.
Precision = TP / (TP + FP) (1)
• Recall: This metric assesses the model's ability to correctly predict positive values, indicating how often it correctly identifies true positives.
Recall = TP / (TP + FN) (2)
• F1-score: The harmonic mean of precision and recall, which balances the trade-off between these two metrics.
F1-score = (2 × TP) / (2 × TP + FP + FN) (3)
• Entropy: This metric quantifies the randomness or disorder in a system, providing valuable information about the data's structure and organization.
Entropy = −Σ p(x) log2 p(x) (4)</p>
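        <p>A worked example makes the relationships between these metrics concrete. The confusion-matrix counts below are invented purely for illustration.</p>

```python
import math

# Worked example of the metrics above from confusion-matrix counts
# (TP, TN, FP, FN), plus Shannon entropy of a label distribution.
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)                     # eq. (1)
    recall = tp / (tp + fn)                        # eq. (2)
    f1 = 2 * tp / (2 * tp + fp + fn)               # eq. (3)
    return accuracy, precision, recall, f1

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)  # eq. (4)

print(metrics(tp=90, tn=85, fp=10, fn=15))  # (0.875, 0.9, ~0.857, ~0.878)
print(entropy([0.5, 0.5]))  # 1.0 for a maximally uncertain binary split
```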
      </sec>
      <sec id="sec-3-2">
        <title>4.2. Results</title>
        <p>Table 6 summarizes the obtained results in comparison to the original work conducted with the same dataset [23]. The results presented in Table 6 showcase the performance metrics of various classifiers, with a particular emphasis on the Fire Hawk Optimizer (FHO).</p>
        <p>FHO stands out prominently, achieving remarkable accuracy, precision, recall, and F1-score values of 99.6%. This outstanding performance suggests that FHO excels in accurately classifying instances, achieving an almost perfect balance between precision and recall. Such high metrics underscore the effectiveness of FHO in the given classification task, highlighting its potential as a robust optimization algorithm.</p>
        <p>Comparatively, traditional machine learning classifiers, such as Support Vector Machine (SVM), Naive Bayes (NB), and Logistic Regression (LR), demonstrate competitive yet comparatively lower performance. SVM, while achieving a respectable accuracy of 90.7%, falls short of FHO's exceptional accuracy. Similarly, NB and LR, with accuracies of 90.4% and 89.9%, respectively, also fall short.</p>
        <p>Precision, recall, and F1-score values further emphasize FHO's dominance, outperforming the other classifiers across all metrics. The precision of 99.6% indicates an incredibly low false positive rate, essential for tasks where misclassification has significant consequences.</p>
        <p>The recall of 99.7% highlights FHO’s ability to capture
the majority of actual positives. The F1-score of 99.6%
reflects the harmonious balance between precision and
recall.</p>
        <p>The outstanding performance of FHO positions it as a formidable tool for classification tasks. Its ability to achieve near-perfect accuracy and balance between precision and recall showcases its potential to outshine traditional machine learning methods in complex optimization scenarios. This reaffirms the significance of bio-inspired algorithms, like FHO, in pushing the boundaries of optimization and classification tasks.</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.3. Discussion</title>
        <p>What distinguishes FHO is its inherent ability to
navigate the unpredictability and noise inherent in real-world
data, illustrating its robustness and adaptability in
handling the often erratic nature of user-generated profile
information.</p>
        <p>FHO’s inclination to diversify the search process,
drawing inspiration from natural systems, is another
noteworthy trait. By concurrently exploring multiple potential
solutions, it enhances the likelihood of discovering
innovative answers, a crucial asset when dealing with the
ever-evolving strategies employed by creators of fake
profiles.</p>
        <p>The results underscore FHO's exceptional computational efficiency, consistently converging to the globally optimal solution within a significantly reduced timeframe. This efficiency proves highly relevant in situations where time sensitivity and the conservation of computational resources are paramount. An additional notable aspect is FHO's ability to converge toward the globally optimal solution in mathematical test functions while requiring fewer objective function evaluations. This underscores its computational efficiency, highlighting its practical applicability across a spectrum of problem-solving scenarios.</p>
        <p>FHO's standout attribute is its remarkable ability to rapidly converge to within a predefined tolerance of the global best solution. This swift convergence, coupled with its resource-efficiency, assumes particular significance in the context of social networks, where timely profile verification is crucial and computational resources often come at a premium.</p>
        <p>5. Conclusion</p>
        <p>Within the online social media landscape, the issue of fake profiles has become a prominent concern, particularly on major platforms such as Instagram, Facebook, and Twitter. The widening gap between registered profiles and genuinely active users signals a troubling increase in counterfeit or inactive accounts, posing risks to platform credibility, security, and privacy. Academic literature has predominantly focused on applying machine learning techniques to discern real from fraudulent profiles by analyzing various attributes and user behavior patterns. However, these traditional methods exhibit limitations, prompting the exploration of more robust and efficient solutions.</p>
        <p>A transformative shift in the fight against fake profiles Big Data, IoT, Web Intelligence and Applications
has emerged, emphasizing the potential of metaheuris- (BIWA), IEEE, 2022, pp. 48–52.
tic algorithms, specifically bio-inspired algorithms. This [4] N. Deshai, B. B. Rao, et al., Deep learning hybrid
apshift acknowledges the constraints of conventional ma- proaches to detect fake reviews and ratings, Journal
chine learning in handling the complexities of online of Scientific &amp; Industrial Research 82 (2022) 120–
social network data. Bio-inspired algorithms, exempli- 127.
ifed by the Fire Hawk Optimizer (FHO), have shown [5] V. Tanniru, T. Bhattacharya, Online fake logo
depromise in fake profile detection, deriving computational tection system (2023).
prowess from their inherent bio-inspired nature, drawing [6] S. Shi, K. Qiao, J. Chen, S. Yang, J. Yang, B. Song,
inspiration from the foraging behavior of fire hawks. L. Wang, B. Yan, Mgtab: A multi-relational
graph</p>
        <p>
          The metaheuristic aspect of FHO enhances its signifi- based twitter account detection benchmark, arXiv
cance. As a member of the metaheuristics family, FHO preprint arXiv:2301.01123 (2023).
belongs to a class of optimization algorithms praised for [
          <xref ref-type="bibr" rid="ref3">7</xref>
          ] A. Saravanan, V. Venugopal, Detection and
verifitheir adaptability and eficiency. FHO distinguishes it- cation of cloned profiles in online social networks
self by pursuing diverse solution candidates, making it using mapreduce based clustering and
classificaadept at addressing multifaceted challenges, particularly tion, International Journal of Intelligent Systems
in fake profile detection. and Applications in Engineering 11 (2023) 195–207.
        </p>
        <p>FHO’s proficiency is evident in performance results [8] S. Bansal, N. Baliyan, Detecting group shilling
prowith Instagram, Facebook, and Twitter datasets. It excels ifles in recommender systems: A hybrid clustering
in promptly and eficiently converging toward the global and grey wolf optimizer technique, in: Design and
best solution, a crucial trait in scenarios where timely Applications of Nature Inspired Optimization:
Conprofile validation and limited computational resources tribution of Women Leaders in the Field, Springer,
are critical. Its resilience in handling unpredictable data 2023, pp. 133–161.
and its ability to diversify the search process are valuable [9] C. Hays, Z. Schutzman, M. Raghavan, E. Walk,
assets when confronting the evolving tactics of fake pro- P. Zimmer, Simplistic collection and labeling
pracifle creators. Furthermore, its computational eficiency, tices limit the utility of benchmark datasets for
twitmarked by a lower number of objective function evalu- ter bot detection, in: Proceedings of the ACM Web
ations while consistently converging to the global best Conference 2023, 2023, pp. 3660–3669.
solution, positions it as a computational prowess exem- [10] G. LaFree, L. Dugan, Introducing the global
terrorplar. ism database, Terrorism and political violence 19</p>
        <p>Looking ahead, refining and advancing FHO’s capa- (2007) 181–204.
bilities for large datasets with heterogeneous data could [11] J. Lutz, B. Lutz, Global terrorism, Routledge, 2019.
be a future perspective. Integrating FHO with other ad- [12] R. Collobert, J. Weston, L. Bottou, M. Karlen,
vanced techniques and exploring hybrid approaches that K. Kavukcuoglu, P. Kuksa, Natural language
proleverage its strengths alongside complementary methods cessing (almost) from scratch, Journal of machine
for even more robust profile validation are compelling learning research 12 (2011) 2493–2537.
avenues for future studies. [13] J. B. Tenenbaum, V. d. Silva, J. C. Langford, A global
geometric framework for nonlinear dimensionality
reduction, science 290 (2000) 2319–2323.</p>
        <p>References [14] G. Sidorov, F. Velasquez, E. Stamatatos, A. Gelbukh,
L. Chanona-Hernández, Syntactic n-grams as
ma[1] R. Bhambulkar, S. Choudhary, A. Pimpalkar, Detect- chine learning features for natural language
proing fake profiles on social networks: A systematic cessing, Expert Systems with Applications 41 (2014)
investigation, in: 2023 IEEE International Students’ 853–860.</p>
        <p>Conference on Electrical, Electronics and Computer [15] S. García, S. Ramírez-Gallego, J. Luengo, J. M.</p>
        <p>Science (SCEECS), IEEE, 2023, pp. 1–6. Benítez, F. Herrera, Big data preprocessing:
meth[2] J. Shamseddine, M. Malli, H. Hazimeh, Survey on ods and prospects, Big Data Analytics 1 (2016) 1–22.
fake accounts detection algorithms on online social [16] B. Charbuty, A. Abdulazeez, Classification based
networks, in: The International Conference on on decision tree algorithm for machine learning,
Innovations in Computing Research, Springer, 2022, Journal of Applied Science and Technology Trends
pp. 375–380. 2 (2021) 20–28.
[3] N. Mahammed, S. Bennabi, M. Fahsi, B. Klouche, [17] K. P. Sinaga, M.-S. Yang, Unsupervised k-means</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>clustering algorithm, IEEE Access 8 (2020) 80716–80727.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[18] F. Murtagh, P. Contreras, Algorithms for hierarchical clustering: an overview, II, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 7 (2017) e1219.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[19] N. M. Abdulkareem, A. M. Abdulazeez, D. Q. Zeebaree, D. A. Hasan, Covid-19 world vaccination progress using machine learning classification algorithms, Qubahan Academic Journal 1 (2021) 100–105.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[20] G. Biau, E. Scornet, A random forest guided tour, Test 25 (2016) 197–227.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[21] M. Tanveer, T. Rajani, R. Rastogi, Y.-H. Shao, Comprehensive review on twin support vector machines, Annals of Operations Research (2022) 1–46.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[22] M. Azizi, S. Talatahari, A. H. Gandomi, Fire hawk optimizer: a novel metaheuristic algorithm, Artificial Intelligence Review 56 (2023) 287–363.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[23] N. E. H. B. Chaabene, A. Bouzeghoub, R. Guetari, […] terrorists behaviour in twitter, in: 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE, 2021, pp. 309–314.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>