=Paper=
{{Paper
|id=Vol-3080/paper13
|storemode=property
|title=A Novel Approach for Fake Comments and Reviews Detection on the Online Social Networks
|pdfUrl=https://ceur-ws.org/Vol-3080/13.pdf
|volume=Vol-3080
|authors=Akshat Gaurav,Brij B. Gupta,Kwok Tai Chui,Dragan Peraković,Priyanka Chaurasia,Ching-Hsien Hsu
}}
==A Novel Approach for Fake Comments and Reviews Detection on the Online Social Networks==
A Novel Approach for Fake Comments and Reviews Detection on the Online Social Networks Akshat Gaurav*1 , B.B. Gupta2 , Kwok Tai Chui*3 , Dragan Peraković4 , Priyanka Chaurasia5 and Ching-Hsien Hsu*6 1 Ronin Institute, Montclair, New Jersey 07043, U.S. 2 National Institute of Technology Kurukshetra, Kurukshetra-136119, Haryana, India & Asia University, Taichung 413, Taiwan & Staffordshire University, Stoke-on-Trent ST4 2DE, UK 3 Hong Kong Metropolitan University (HKMU) , Hong Kong, China, 4 University of Zagreb, Croatia, 5 Ulster University, Magee campus, Londonderry, UK 6 Department of Computer Science and Information Engineering, Asia University, Taiwan & Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan & Department of Medical Research, China Medical University Hospital, China Medical University, Taiwan *Corresponding Author Abstract As the primary source of information dissemination, social media networks have surpassed traditional news organisations for the first time. Nonetheless, as the number of people who use social media web- sites grows, they become more susceptible to the spread of misinformation, making it increasingly diffi- cult to distinguish between real news and false news in real time. In this paper, we proposed a machine learning technique for the detection of fake comments in social networks. According to the results of the experiment, it is clear that the machine learning technique efficiently detects the fake comments. Keywords Deep learning, Fake comments, Machine learning 1. Introduction For many reasons, social media has overtaken email as the most important distribution and consumption medium for news and information. For starters, getting news via social media is usually faster and less expensive than getting news from conventional sources[1]. Further, engaging and interacting with other readers by commenting, debating, and fighting with them is a great method to get one’s points through while also encouraging participation and participation. Despite these developments, spreading real-time information with the use of social media has contributed to the spread of disinformation, often referred to as fake comments and reviews. International Conference on Smart Systems and Advanced Computing (Syscom-2021), December 25–26, 2021 " akshat.gaurav@ronininstitute.org (A. Gaurav*); gupta.brij@gmail.com (B.B. Gupta); jktchui@ouhk.edu.hk (K. T. Chui*); dperakovic@fpz.unizg.hr (D. Peraković); p.chaurasia@ulster.ac.uk (P. Chaurasia); robertchh@gmail.com (C. Hsu*) © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) Recommendations and feedback are becoming more important as internet communication technology advances. In today’s world, people are increasingly relying on internet reviews to assist them make a purchasing choice. For company owners, internet comments are a way to develop and better their enterprises. Product enhancements may be made based on user input via online comments. However, internet remarks are not always honest, and false online comments are common [2]. There are company owners that will pay for nice or poor evaluations regarding their rivals’ items to be written. Consumers are misled by these phoney reviews and end up purchasing inferior goods because of it. Both consumers and sellers depend on honest evaluations to guide their purchasing choices, and false testimonials may have a significant impact on both. Innocent clients may suffer financial losses as a result of this. As a result, many people are interested in learning how to spot fake comments. Most shopping websites, on the other hand, have solely addressed the issue of negative ratings and comments. As a result, the detection of false reviews is critical in both corporate and academic settings[3]. 2. Related Work Several academics have offered a number of methods for identifying various cyber threats [4, 5, 6, 7, 8] as well as phoney reviews and comments[9, 10]. In this part, we discuss some of the most extensively used fraudulent review and comment detection techniques. There are a number of popular social networking sites, like Facebook and Twitter, that are used by internet users to obtain information on the World Wide Web[11]. Spammers and hackers who transmit harmful depicted objects in the form of spam through social networking platforms are examples of content protection[12, 13]. Online social networking has developed as one of the most popular methods to exchange information and communicate with others in the course of everyday life. Online social network- ing is a fun and convenient way to meet new people and keep in touch with old friends. There are, however, worries about user privacy and account security. During events, people use social media sites like Twitter and Facebook to disseminate false information. Author in [14] presented a method to identify phoney accounts by studying the features of dangerous information that spreads in real-time. Fake profiles are created by stealing the personal information of a real user and utilising that information to establish a new profile. As time goes on, the profile gets hacked such that it may send friend requests to a friend of the original account holder. The suggested method outlines our chrome extension-based architecture for detecting bogus Twitter accounts by examining several attributes. Web applications are automatically scanned for XSS attack vectors using XSS-explorer, a universal and automated server-side flexible framework proposed by the author in [8]. Extensive XSS attack investigations are generated for each web application’s injection locations, which may be explored using the built-in XSS-explorer tool. This strategy relies on approaches that allow for the accurate filling of injection locations in forms with relevant information. Identification of these points allows us to search for every possible web page of the application, allowing us to look for more attack vectors and speeding up the process of finding them. 3. Setup The suggested method uses six core machine learning techniques to identify bogus news. Tokenization and stop phrasing are the first steps in the recommended technique. Stastical approaches are used to assess the performance of each ML methodology. It might take a long time and be difficult to determine if a remark is real or not. Because of this, a previously gathered and recognised dataset of bogus news was deployed. We drew inspiration for our project from the Kaggle database. The dataset includes a header, title and text columns, as well as a fake/real comment item flag. It is necessary to remove stop words from text before adding it into machine learning models. Our models will perform better if we can use these methods to find and optimise the most relevant terms. There are a lot of useless words and strange characters in our datasets since we utilise real-world news items. Our data collection was simplified as a consequence of the removal of these extraneous characters. Stop words are removed as the last step in preprocessing. They were thus omitted from all of our testing due to the potential for excessive noise they would have generated. In the last section, we used machine learning models to pre-process the data. In order to get over this limitation, we employ a counter vectorized to translate the text data into vector form. Finally, we evaluate the performance of multiple machine learning models using statistical techniques. 4. Results and Disscusion It is possible to improve the classification accuracy of various techniques by varying the quantity of labelled data used in training . When just a tenth of the labelled data is utilised, the suggested detection methods demonstrate an improvement in performance of up to three percent, while simultaneously improving the accuracy and lowering the false postive. After extracting and transforming the undesired content using tokenization algorithms. Following that, we trained distinct models using machine learning techniques. The performance of these machine learning models is evaluated using following eqations, and result is represented in figure 1. 𝛿𝑃 + 𝛿𝑁 𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (1) 𝛿𝑃 + 𝛿𝑁 + 𝛿𝑃′ + 𝛿𝑁 ′ 𝛿𝑃 𝑅𝑎𝑐𝑎𝑙𝑙 = ′ (2) 𝛿𝑃 + 𝛿𝑁 𝛿𝑃 𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = (3) 𝛿𝑃 + 𝛿𝑃′ 𝛿𝑃 × 𝛿𝑅 𝐹 1 − 𝑆𝑐𝑜𝑟𝑒 = 2 × (4) 𝛿𝑃 + 𝛿𝑅 (a) Accuracy (b) Precision (c) Recall (d) F-1 Score Figure 1: Statistical Parameters Calculation 5. Conclusion We examined social media reviews and comments in this research and created a novel approach for detecting fraudulent reviews and comments. We used machine learning technologies to identify fake reviews and comments. To show our system’s usefulness and efficiency, we compared it to a variety of other temporal outlier detection approaches. Detecting fraudulent reviews using review data involves various challenges. Our study could not conclusively establish when a product is most likely to be the subject of fake reviews and comments, which is an exciting topic for more research. References [1] K. Kumari, Online social media threat and it’s solution, Insights2Techinfo (2021) 1. [2] H. Deng, L. Zhao, N. Luo, Y. Liu, G. Guo, X. Wang, Z. Tan, S. Wang, F. Zhou, Semi- supervised learning based fake review detection, in: 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), IEEE, 2017, pp. 1278–1280. [3] W. Liu, J. He, S. Han, F. Cai, Z. Yang, N. Zhu, A method for the detection of fake reviews based on temporal features of reviews and comments, IEEE Engineering Management Review 47 (2019) 67–79. [4] A. Gaurav, A. K. Singh, Light weight approach for secure backbone construction for manets, Journal of King Saud University-Computer and Information Sciences (2018). [5] D. P. Ivan Cvitić, G. Praneeth, Digital forensics techniques for social media networking, Insights2Techinfo (2021) 1. [6] Z. Zhou, A. Gaurav, B. Gupta, H. Hamdi, N. Nedjah, A statistical approach to secure health care services from ddos attacks during covid-19 pandemic, Neural Computing and Applications (2021) 1–14. [7] Z. Zhou, A. Gaurav, B. B. Gupta, M. D. Lytras, I. Razzak, A fine-grained access control and security approach for intelligent vehicular transport in 6g communication system, IEEE Transactions on Intelligent Transportation Systems (2021). [8] S. Gupta, B. B. Gupta, Robust injection point-based framework for modern applications against xss vulnerabilities in online social networks, International Journal of Information and Computer Security 10 (2018) 170–200. [9] J. Zhao, H. Wang, Detecting fake reviews via dynamic multimode network, International Journal of High Performance Computing and Networking 13 (2019) 408–416. [10] A. Gaurav, B. Gupta, A. Castiglione, K. Psannis, C. Choi, A novel approach for fake news detection in vehicular ad-hoc network (VANET), in: International conference on computational data and social networks, 2020, pp. 386–397. Tex.organization: Springer. [11] S. R. Sahoo, B. B. Gupta, Classification of various attacks and their defence mechanism in online social networks: a survey, Enterprise Information Systems 13 (2019) 832–864. [12] S. R. Sahoo, B. B. Gupta, Classification of spammer and nonspammer content in online social network using genetic algorithm-based feature selection, Enterprise Information Systems 14 (2020) 710–736. [13] S. R. Sahoo, B. B. Gupta, Hybrid approach for detection of malicious profiles in twitter, Computers & Electrical Engineering 76 (2019) 65–81. [14] S. R. Sahoo, B. Gupta, Real-time detection of fake account in twitter using machine-learning approach, in: Advances in computational intelligence and communication technology, Springer, 2021, pp. 149–159.