
Andriy Kovalenko†, Igor Ruban†, Olesia Barkovska*,† and Vladyslav Kholiev†

Kharkiv National University of Radio Electronics, Nauky Ave. 14, Kharkiv, 61166, Ukraine

2023


The analysis of the emotional state of participants in a digital scientific community is a crucial task that significantly impacts the productivity of scientific discussions, the quality of collective decision-making, and the level of researcher engagement. Scientific discourse differs from general communication texts due to its specific characteristics, including formality, structured presentation, specialized terminology, and a predominantly neutral emotional tone. Traditional sentiment analysis and topic modeling methods, which perform effectively in social media contexts, are not always well adapted to the peculiarities of the academic environment. This study explores approaches to analyzing the emotional state of scientific discussions based on textual data using natural language processing (NLP) techniques, including topic modeling, sentiment analysis, and named entity recognition (NER). The OntoNotes dataset was modified and expanded with annotated texts from scientific forums and article comments to better align NLP methods with the characteristics of academic discourse. A comparative analysis of state-of-the-art machine learning models (BERT-Base, RoBERTa, DistilBERT) was conducted for the automatic analysis of scientific texts. The results demonstrate that transformer-based models significantly improve the accuracy of topic modeling, sentiment analysis, and NER in scientific discussions. An integrated system for analyzing the emotional tone and structure of academic discourse is proposed. Future research will focus on multimodal analysis of scientific communication, incorporating audio and video processing to achieve a deeper understanding of the emotional context within academic interactions.

Keywords: natural language processing, sentiment analysis, topic modeling, named entity recognition, scientific discourse, BERT, digital scientific community, text analysis

1. Introduction

The volume of digital scientific communication, including chats, forums, conferences, and platforms for discussing scientific ideas, is increasing significantly. While the use of natural language processing (NLP) for emotion analysis in commercial and social networks (e.g., Twitter, Facebook) is already well-developed [1-2], such approaches remain largely unsystematized in the academic domain. Identifying the emotions of speakers based on textual data in digital scientific communities is a complex task that involves NLP, machine learning, and psycholinguistics. This research direction is crucial for understanding participant engagement levels, discussion productivity, and, ultimately, the success of scientific interactions.

The analysis of the emotional state of participants in digital scientific communities based on textual information relies on named entity recognition (NER), automated topic modeling, and sentiment analysis. Research in this area will enable: tracking the correlation between emotions and the productivity of scientific discussions by measuring the relationship between the emotional background of communication and its efficiency; preventing conflicts and researcher burnout by assessing the emotional state of participants, thereby improving communication and collaboration in scientific communities. Future studies will focus on developing an "emotional profile" of successful discussions, which can be formed through intelligent monitoring of scientific chats, forums, and discussions.

The implementation of emotion analysis methods in text processing is applied across various practical fields, some of which are illustrated in Figure 1. This technology enhances communication, increases team efficiency, prevents conflicts, and optimizes business and social processes.

The proposed review of sentiment, reaction, and emotion analysis based on textual information highlights the relevance of applying these methods to digital scientific communities and academic discussions. Despite the vast body of research on natural language processing (NLP) techniques, their adaptation to academic discourse remains underexplored. This is due to the complexity and specificity of scientific terminology, sentence structures, and the frequent use of modal verbs, all of which may influence the perception of emotions in text.

The relevance of this research topic is further reinforced by the increasing transition of scientific communication to digital formats. This shift is particularly significant for Ukrainian researchers, who are currently operating under the conditions of a full-scale war. Scientists and academics often experience high cognitive workloads, which can lead to professional burnout. At the same time, the productivity of scientific discussions is largely influenced by the emotional state of researchers. Thus, the early detection of negative emotional states, which may enhance engagement, improve the effectiveness of academic communication, and prevent burnout, represents a highly relevant research challenge.

A classification of emotion analysis methods is presented in Figure 2.

The objective of this study is to analyze the emotional reactions of participants in digital scientific communities based on textual information to determine their level of engagement and interest in the discussed topics, as well as to prevent potential conflicts and the pursuit of irrelevant research directions.

To achieve this goal, the following tasks must be addressed: justification of the need to analyze scientific discussions through topic determination, sentiment analysis, and named entity recognition (NER) for further generalization; preparation of a training dataset by updating and annotating an existing textual dataset; evaluation of the impact of fine-tuning neural network models on the accuracy of named entity recognition, topic classification, and sentiment analysis.

A further extension of this research involves the creation of a multimodal dataset that includes not only textual data but also audio and video information, such as recordings of scientific conferences.

2. Related Works

The systematization of key areas of natural language processing (NLP) and their practical applications, presented in Figure 3, is based on the categorization proposed in [4] and expanded upon by the authors of this article.

The development of NLP methods allows the creation of effective applications in computational linguistics [5-7]. In particular, NLP methods have been used in new types of chatbots and translation systems, which support aspect-level text analysis and determination of the emotional mood of texts in social networks and open multi-user communication channels. Methods of analyzing emotions and sentiments in social networks help assess public opinion, improve customer experience, detect fake news [8], prevent cyberbullying, and support users' psycho-emotional well-being. For example, one of the problems addressed by sentiment analysis based on digital text analysis and classification is the detection of depressive states and PTSD (post-traumatic stress disorder) [9-10]. As shown in studies [11-13], one of the key tasks within NLP is text content analysis, the process of discovering, classifying, and interpreting information contained in text data.

However, traditional NLP methods, such as rule-based approaches and statistical models, often struggle with context understanding, semantic nuances, and implicit meanings in textual data. To overcome these limitations, modern deep learning approaches, particularly neural network-based models, have been developed. Deep learning techniques, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, have demonstrated high efficiency in text classification, sentiment analysis, and emotion recognition.

Recent advancements in neural architectures, such as Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Transformer-based models (e.g., BERT, GPT, RoBERTa), allow for a more sophisticated understanding of linguistic structures, capturing dependencies across long text sequences. These models have shown superior performance in sentiment classification, intent recognition, and detection of psychological states, including stress and PTSD. The next section provides an overview of the most widely used neural models for emotion and sentiment analysis.

Existing sentiment analysis systems, such as Amazon Comprehend or Google Cloud Natural Language, often suffer from limitations in flexibility, adaptability to specific tasks, and transparency, making them unsuitable for domains requiring specialized emotional analysis.

Numerous studies have focused on NLP methods in academic and scientific research, particularly in the areas of automated classification of scientific articles, citation analysis, and sentiment evaluation in peer reviews. In particular, the classification of scientific documents using deep learning has been examined in [14]. This study approached the analysis of emotions in text and scientific publications by combining machine learning and deep learning techniques, highlighting the need for more advanced methods for detecting and assessing emotional nuances. A critical review of citation classification methods is presented in [15-16], where the authors explore context-dependent and context-independent citation analysis techniques. These approaches rely on citation placement in specific sections of a document and incorporate deep learning methods and transformer architectures. One of the main challenges of this research is the complexity of dataset annotation, as even with human intelligence, it is often difficult to accurately determine the sentiment of a citation. The findings reported in [17] further emphasize advancements in citation analysis, citation sentiment classification, citation summarization, and citation-based recommendations. These improvements have been facilitated by the availability of citation databases such as Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions, which support machine learning-based citation analysis.

An analysis of recent studies also demonstrates a growing interest in topic modeling (TM) of scientific publications [18-19]. Topic models have been applied to analyze scientific publications, evaluate researcher influence, and track the evolution of research topics over time, which is particularly useful for monitoring trends in academic literature. The review of existing research presented in this section indicates that topic modeling is an effective method for uncovering latent themes in scientific articles. However, traditional models such as Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) have limitations in accurately representing different scientific fields, as they fail to clearly delineate four distinct academic disciplines. At the same time, the findings suggest that BERT-based topic modeling can enhance the analysis of scientific literature, though further validation is required for its broader application in both academic and industrial domains.

These studies highlight the significance of natural language processing (NLP) methods in improving the accessibility, organization, and analysis of scientific information. However, in the reviewed works, these approaches are presented as independent studies, which does not allow for a comprehensive assessment within a unified system for text information analysis.

This research proposes a novel system for a comprehensive understanding of the text by combining multiple dimensions of analysis, integrating emotional classification, named entity recognition, thematic analysis, and sentiment analysis into a unified solution.

3. Methods and Materials

Any machine learning model requires a training dataset, regardless of the specific analysis task—be it topic modeling, named entity recognition (NER), or sentiment analysis. However, dataset creation and annotation present unique challenges across different domains, particularly in labeling emotional states. Among existing datasets, multimodal datasets enable classification based on audio, text (transcription), and visual data (facial expressions, gestures).

An example of a multimodal dataset is MELD (Multimodal EmotionLines Dataset), which provides video-recorded conversations annotated for emotion recognition. Another well-structured dataset is IEMOCAP (Interactive Emotional Dyadic Motion Capture), which includes dyadic session recordings between actors, annotated with emotions such as happiness, anger, sadness, frustration, and neutrality. Additionally, the SEMAINE dataset contains audiovisual recordings of human-agent interactions, with annotations for anger, happiness, fear, disgust, sadness, contempt, and amusement.

The analysis of scientific communication is a less explored area and has distinct characteristics: academic discourse is marked by formality and neutrality. To address this gap, in this study, the OntoNotes dataset was modified by incorporating new data from non-personalized discussions on scientific forums and article comments. During data preparation, OntoNotes was adapted for topic modeling by adding topic annotations to each dialogue fragment and categorizing discussions into major academic fields (e.g., Machine Learning, NLP, Physics, Medicine) (Figure 4a). For sentiment analysis, annotations included positive, negative, and neutral sentiment labels (Figure 4b). The processed data was converted into a machine learning-friendly format (JSON/CSV files) and stored as OntoNotes_mod, a structured dataset ready for further training (Figure 4).
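To make the annotation scheme more concrete, a single annotated fragment could be serialized as shown below. This is only a minimal sketch: the field names (`text`, `topic`, `sentiment`), the example sentence, and the `validate_record` helper are assumptions introduced for illustration and are not the published OntoNotes_mod schema. Only the sentiment labels and the example academic fields come from the description above.

```python
import json

# Example academic fields and sentiment labels named in the text;
# the record layout itself is a hypothetical illustration.
TOPICS = {"Machine Learning", "NLP", "Physics", "Medicine"}
SENTIMENTS = {"positive", "negative", "neutral"}


def validate_record(record: dict) -> dict:
    """Check that a record carries one known topic and one known sentiment label."""
    if record["topic"] not in TOPICS:
        raise ValueError(f"unknown topic: {record['topic']!r}")
    if record["sentiment"] not in SENTIMENTS:
        raise ValueError(f"unknown sentiment: {record['sentiment']!r}")
    return record


# Hypothetical forum fragment, not taken from the dataset.
example = validate_record({
    "text": "The ablation study does not isolate the effect of pre-training.",
    "topic": "Machine Learning",
    "sentiment": "negative",
})

# One JSON object per line is a common layout for training data.
json_line = json.dumps(example, ensure_ascii=False)
restored = json.loads(json_line)
```

Validating labels at write time keeps the topic and sentiment vocabularies closed, which simplifies mapping labels to class indices during fine-tuning.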

To compare different architectures, a comparative analysis of deep learning models used for emotion classification, named entity recognition (NER), and sentiment analysis was carried out on the basis of their reported performance on widely accepted benchmark datasets such as GoEmotions, CoNLL-2003, and SST-2/SST-5 (Table 1).

This analysis demonstrates that Transformer-based models, particularly BERT and DistilBERT, provide state-of-the-art performance across multiple NLP tasks. DistilBERT, in particular, offers a trade-off between accuracy and computational efficiency, making it a suitable choice for applications requiring faster inference. BERT, on the other hand, remains a robust baseline for NER and Sentiment Analysis, ensuring high accuracy and generalization capabilities.

4. Experiments and results

In study [20], the authors present a general model of a system for comprehensive text analysis aimed at understanding the emotions of discussion participants.

4.1. The named entity recognition block research

The NER block aggregates solutions to two subtasks – identifying and classifying entities in the text (NER tags) and tagging the part of speech in each token (POS tags).

This block uses the OntoNotes_mod dataset, derived from a reference dataset for named entity recognition tasks, which has well-annotated entities, including persons (PER), locations (LOC), organizations (ORG), and miscellaneous entities (MISC).

Based on the comparison in Table 1, BERT-base-cased was chosen for named entity recognition because of its high F1-score (94.82%) at a reasonable computational cost. Additionally, BERT-CRF achieved competitive results (92.29%), demonstrating the effectiveness of combining BERT with Conditional Random Fields (CRF) for structured prediction tasks.

The experiments conducted in the paper consisted of configuring the entity classification model using TrainingArguments:
output_dir – specifies the directory where the model checkpoints will be stored;
evaluation_strategy – specifies when to evaluate the model (e.g., after each epoch);
learning_rate – sets the learning rate for the optimizer, which controls how much the model weights are adjusted relative to the gradient;
per_device_train_batch_size – specifies the number of samples per batch for training;
per_device_eval_batch_size – specifies the number of samples per batch for evaluation;
num_train_epochs – sets the number of epochs to train the model;
weight_decay – applies regularization to the model weights to prevent overfitting.
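The parameters listed above can be collected into a small configuration object and varied over a grid of batch sizes and epoch counts. The sketch below uses only the standard library and mirrors the argument names from the list. The output path, learning rate, weight decay, and the batch sizes 8 and 32 are assumptions for illustration, since the text does not report them. Only batch size 16 and the epoch counts 2, 3, and 4 are taken from the paper.

```python
from dataclasses import dataclass, asdict
from itertools import product


@dataclass(frozen=True)
class NerTrainingConfig:
    """Mirror of the TrainingArguments fields discussed in the text."""
    output_dir: str = "./ner_checkpoints"      # hypothetical path
    evaluation_strategy: str = "epoch"         # evaluate after each epoch
    learning_rate: float = 2e-5                # assumed value, not reported
    per_device_train_batch_size: int = 16      # reported best setting
    per_device_eval_batch_size: int = 16
    num_train_epochs: int = 3                  # reported best setting
    weight_decay: float = 0.01                 # assumed value, not reported


def config_grid(batch_sizes, epoch_counts):
    """Yield one configuration per (batch size, epoch count) combination."""
    for batch_size, epochs in product(batch_sizes, epoch_counts):
        yield NerTrainingConfig(
            per_device_train_batch_size=batch_size,
            per_device_eval_batch_size=batch_size,
            num_train_epochs=epochs,
        )


# Batch sizes 8 and 32 stand in for the "smaller" and "larger" settings.
grid = list(config_grid(batch_sizes=(8, 16, 32), epoch_counts=(2, 3, 4)))
baseline = NerTrainingConfig()

# The dictionary form can be unpacked into a trainer configuration.
training_kwargs = asdict(baseline)
```

With the Hugging Face `transformers` library installed, `training_kwargs` could be passed as keyword arguments to `TrainingArguments`. Depending on the installed version, the evaluation argument may be spelled `eval_strategy` instead of `evaluation_strategy`.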

From the graphs, it is clear that Epoch = 3 and Batch Size = 16 give the best result: Accuracy (0.9889), Precision (0.9412), Recall (0.949), F1 Score (0.9443), and a classification time of 353 sec.

For demonstration purposes, we will use the following sentences:

Text 1 – "I was thrilled by the outstanding performance of the new iPhone's camera, but the poor battery life left me frustrated".

Text 2 – "The exhilarating last-minute victory of Manchester United over Chelsea made the entire crowd ecstatic". 

Text 3 – "I am deeply concerned about the lack of strong climate change policies, as they are essential for protecting our environment and ensuring a sustainable future".

The POS tags correctly identify grammatical structure, covering pronouns (PRP), verbs (VBD, VBN), prepositions (IN), determiners (DT), adjectives (JJ), and nouns (NN).

4.2. The sentiment analysis block research

For the sentiment analysis block, the BERT-Base model was selected because of its high accuracy (93.5% on SST-2) with a relatively small number of parameters (110M), which reduces computational costs. The model is stable in operation and widely applicable to NLP tasks.

The experiments consisted of configuring the model for sentiment analysis, starting from the basic configuration: batch size 16 (for both training and evaluation) and 3 training epochs.

For demonstration purposes, we will use the same sentences that were tested in the section "The named entity recognition block research" (Table 3).

4.3. The topic modeling block research

BERTopic provides a sophisticated approach to topic modeling using BERT embeddings that allow for extracting coherent and semantically rich topics from text data. Systematic experiments confirm the model's effectiveness, adaptability, and efficiency for text of short length and complexity.

A comparative analysis of topic modeling approaches shows that classical methods such as LDA and NMF offer good interpretability but have limitations when working with short texts and depend on extensive preprocessing. Unlike these methods, BERTopic leverages transformers and a modified TF-IDF (c-TF-IDF) to generate context-aware topics, making it better suited to dynamic and hierarchical topic modeling. It enables the adaptive extraction of topics from data streams, which is especially important for analyzing unstructured text.
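The class-based TF-IDF weighting mentioned above can be illustrated with a small standard-library sketch. It follows the formulation published with BERTopic, in which the weight of term t in class c is tf(t, c) · log(1 + A / f(t)), where A is the average number of words per class and f(t) is the frequency of t across all classes. The toy token lists and the function name `c_tf_idf` are assumptions for illustration. The library itself additionally normalizes term frequencies and works on sparse matrices.

```python
import math
from collections import Counter


def c_tf_idf(class_docs):
    """Compute c-TF-IDF weights.

    class_docs maps a class (topic) id to the tokens of all documents
    assigned to that class, concatenated into one list.
    """
    term_counts = {c: Counter(tokens) for c, tokens in class_docs.items()}

    # Frequency of every term across all classes.
    total_counts = Counter()
    for counts in term_counts.values():
        total_counts.update(counts)

    # A: average number of words per class.
    avg_words_per_class = sum(total_counts.values()) / len(class_docs)

    return {
        c: {
            term: tf * math.log(1 + avg_words_per_class / total_counts[term])
            for term, tf in counts.items()
        }
        for c, counts in term_counts.items()
    }


# Toy classes (hypothetical tokens): "camera" occurs in both classes.
weights = c_tf_idf({
    "phones": ["iphone", "camera", "battery", "iphone"],
    "football": ["victory", "crowd", "camera", "victory"],
})
top_phone_term = max(weights["phones"], key=weights["phones"].get)
```

In this toy example the term shared by both classes ("camera") receives a lower weight than a class-specific term with the same count ("battery"), which is the effect that lets topic keywords stay distinctive.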

For demonstration purposes, we will use the same sentences that were tested in the section "The named entity recognition block research" (Table 4).

In the future, the visual marker generated by the system could also be applied to problems outside the system, since it serves as a universal indicator of the mood, emotional coloring, and context of a text.

The solved tasks (named entity recognition, sentiment analysis, and topic modeling) are the main structural elements of the system proposed by the authors in [21]. This confirms the relevance and necessity of emotion analysis based on textual information in the context of the development of digital scientific communities (Figure 1). It can be regarded as an example of the practical application of the developed approach.

5. Discussions

Analysis of the obtained result (Figure 5 and Figure 6) shows that batch size 16 balances training time and model performance well. Smaller batch sizes significantly increased the training time without a noticeable increase in performance, while larger batch sizes reduced the training time but slightly worsened the performance. Training on three epochs gave the best overall performance. Although four epochs showed minor improvements, the additional training time did not justify the minor gains. Two epochs were not enough for optimal training.

The NER and POS tagging results show the system's ability to identify named entities and parts of speech in sentences. In Text 1, the system identified "iPhone" as a B-MISC entity (miscellaneous entity category), because the product name does not fit into standard categories such as PERSON, ORG, or LOCATION.
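A tag such as B-MISC follows the BIO scheme, in which "B-" marks the first token of an entity, "I-" marks a continuation, and "O" marks tokens outside any entity. The sketch below shows how token-level tags can be grouped into entity spans. The tokenization and the tags for the Text 2 fragment are assumptions for illustration. Only the B-MISC tag for "iPhone" is reported in the paper.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if it is open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag closes any open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Fragment of Text 1 with the reported B-MISC tag for "iPhone".
text1_entities = decode_bio(
    ["the", "new", "iPhone", "'s", "camera"],
    ["O", "O", "B-MISC", "O", "O"],
)

# Fragment of Text 2; these tags are hypothetical, not reported results.
text2_entities = decode_bio(
    ["Manchester", "United", "over", "Chelsea"],
    ["B-ORG", "I-ORG", "O", "B-ORG"],
)
```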

The experimental results (Figure 7 and Figure 8) show that high model performance can be ensured by adjusting several essential parameters, such as batch size and number of training epochs. Batch size 16 and training for three epochs balance training time and model performance well. The graphs show that Epoch = 3 and Batch Size = 16 provide the best values: Accuracy (0.738), Precision (0.739), Recall (0.742), and F1 Score (0.742), with a classification time of 432 sec. Four epochs showed only insignificant improvements, so the additional training time did not yield significant gains.
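For reference, the four quality indicators reported above can be computed from true and predicted labels as sketched below. The paper does not state which averaging scheme was used for the three sentiment classes. Macro averaging is assumed here, and the label sequences are toy data, not experimental results.

```python
def macro_metrics(y_true, y_pred, labels):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    per_class = []

    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class.append((precision, recall, f1))

    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(m[0] for m in per_class) / n,
        "recall": sum(m[1] for m in per_class) / n,
        "f1": sum(m[2] for m in per_class) / n,
    }


# Toy labels for illustration only.
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
metrics = macro_metrics(y_true, y_pred, labels=["pos", "neg", "neu"])
```

Under macro averaging, the F1 score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.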

The sentiment analysis results demonstrate the system's ability to identify the overall sentiment of each sentence (Table 3). For Text 1, despite the presence of positive elements ("thrilled" and "outstanding performance"), the overall sentiment is dominated by the negative aspect ("poor battery life" and "frustrated"), resulting in a negative classification with a high confidence score. Text 2 is a positive sentence, with several positive words that reinforce the sentiment ("exhilarating," "victory," and "ecstatic"), resulting in a positive classification with a very high confidence score. In Text 3, the primary emotion conveyed is concern about inadequate climate policies, reflected in negative terms such as "deeply concerned" and "lack of strong climate change policies," resulting in a negative classification with a high confidence score.

Let us explain the results from Table 4 using the example of Text 1 for the BERTopic model. The main topic is "apple", with a score of 0.4149593710899353 and a probability of 0.5898942351341248. The system identifies "apple" as the dominant topic with a reasonably high score and probability, reflecting the central focus on the iPhone. Other related terms include "6s", "smartphones", "smartphone", and "phones", which are related to the discussion of Apple devices. The inclusion of the terms "discontinued", "5s", "phone", "devices", and "touchscreen" further confirms the classification, indicating a complete understanding of the context of the Apple product ecosystem.

The main topic, "Apple," and related terms such as "6s", "smartphones," and "smartphone" reflect the focus of the sentence on the iPhone and its features. The system's high probability score indicates a good understanding of the context.

The last step of the proposed system is to generate a visual marker that graphically displays the emotion, mood, and topic of the input text. The aggregated results of emotion classification, NER, sentiment analysis, and topic modeling for each text block serve as the input data for the marker generation module.

The results are explained using the example of Text 1; the aggregation of all results is shown in Figure 9.
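The aggregation step can be pictured as merging the outputs of the separate blocks into one record per text. The sketch below is an assumption about how such a record might look: the `MarkerInput` class, its field names, and the `aggregate` helper are hypothetical and do not reproduce the layout of Figure 9. The values for Text 1 are those reported in the discussion above. The emotion label is not reported there and is left unset.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Tuple


@dataclass
class MarkerInput:
    """Aggregated per-text results handed to the marker generation module."""
    text_id: str
    entities: List[Tuple[str, str]] = field(default_factory=list)  # (entity, type)
    sentiment: Optional[str] = None
    topic: Optional[str] = None
    topic_score: Optional[float] = None
    topic_probability: Optional[float] = None
    emotion: Optional[str] = None  # emotion classification output, if available


def aggregate(text_id, entities, sentiment, topic, emotion=None):
    """Merge the outputs of the separate analysis blocks into one record."""
    return MarkerInput(
        text_id=text_id,
        entities=list(entities),
        sentiment=sentiment,
        topic=topic["label"],
        topic_score=topic["score"],
        topic_probability=topic["probability"],
        emotion=emotion,
    )


# Values reported for Text 1; the dictionary keys are hypothetical.
text1 = aggregate(
    text_id="Text 1",
    entities=[("iPhone", "MISC")],
    sentiment="negative",
    topic={
        "label": "apple",
        "score": 0.4149593710899353,
        "probability": 0.5898942351341248,
    },
)
marker_payload = asdict(text1)
```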

6. Conclusions

This study presents a comprehensive analysis of methods for evaluating the emotional state of participants in digital scientific communities based on textual information. The research primarily focused on topic modeling, sentiment analysis, and named entity recognition (NER).

The following key results were obtained: the OntoNotes dataset was expanded by incorporating and annotating textual data from scientific forums and article comments, with annotations covering thematic categories (e.g., Machine Learning, NLP, Physics, Medicine) and sentiment labels (positive, negative, neutral); the performance of state-of-the-art neural network models for emotion classification, NER, and sentiment analysis was evaluated, and the results demonstrated that BERT-Base and RoBERTa achieved the highest accuracy, while DistilBERT provided a balance between speed and accuracy; the optimal training parameters were identified, with batch size = 16 and three training epochs offering the best trade-off between performance and computational efficiency; BERTopic outperformed traditional topic modeling approaches (LDA, NMF), confirming the effectiveness of transformer-based models in topic analysis; an integrated approach was proposed for combining the results of different NLP tasks (emotion analysis, NER, sentiment analysis, and topic modeling) into a unified system, enabling visualization of results and the generation of graphical markers representing sentiment and discussion topics.

Future research directions include enhancing multimodal analysis by incorporating audio and video data from scientific conferences to gain a deeper understanding of the communication context; integrating the system into digital platforms for the automated analysis of discussions within scientific communities; investigating the impact of emotional tone on the quality and effectiveness of scientific discussions, which could help identify the most productive exchanges.

The obtained results confirm that the combination of modern natural language processing techniques and machine learning enables effective analysis of scientific communication, improving researcher interaction and enhancing the productivity of academic discussions.

Declaration on Generative AI

The authors have not employed any Generative AI tools.

References

[1] Batiuk, Dosyn, Implementation of the intellectual system of sentiment analysis and clusterization of publications in the Twitter social network, Innovative Technologies and Scientific Solutions for Industries 1 (23) (2023) 25-44. doi: 10.30837/ITSSI.2023.23.025.

[2] Shuliak, T. Kondratieva, Intention capability and tonality of English news headlines about Ukraine in the context of medialinguistics, International Humanitarian University Herald. Philology 56 (2022) 163-170. doi: 10.32841/2409-1154.2022.56.36.

[3] Wankhade, A. C. S. Rao, Kulkarni, A survey on sentiment analysis methods, applications, and challenges, Artificial Intelligence Review 55 (2022) 5731-5780. doi: 10.1007/s10462-022-10144-1.

[4] Panchendrarajan, A. Zubiaga, Synergizing machine learning & symbolic methods: A survey on hybrid approaches to natural language processing, Expert Systems with Applications 251 (2024) 124097.

[5] Barkovska et al., Analysis of the impact of the contextual embeddings usage on the text classification accuracy, Radioelectronic and Computer Systems 2024(3) (2024) 67-79. doi: 10.32620/reks.2024.3.05.

[6] Barkovska, Kholiev, Havrashenko, Mohylevskyi, Kovalenko, A conceptual text classification model based on two-factor selection of significant words, in: COLINS (2), 2023, pp. 244-255.

[7] Maksymenko, et al., Improving the machine translation model in specific domains for the Ukrainian language, in: 2022 IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT), IEEE, 2022, pp. 123-129. doi: 10.1109/CSIT56902.2022.10000529.

[8] O. J. Prasad, Nandi, Dogra, D. S. Diwakar, A systematic review of NLP methods for sentiment classification of online news articles, in: 2023 14th International Conference on