<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Misinformation Detection in Social Media Texts and LLM Generated Text using Auxiliary Text Supervised Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kongqiang Wang</string-name>
          <email>wangkongqiang60@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peng Zhang</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Qingli Tan</string-name>
          <email>tanqingli@stu.ynu.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Ecology and Environment, Yunnan University</institution>
          ,
          <addr-line>Kunming 650500, Yunnan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Kunming Academy of Environmental Sciences</institution>
          ,
          <addr-line>Kunming 650032, Yunnan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Information Science and Engineering, Yunnan University</institution>
          ,
          <addr-line>Kunming 650500, Yunnan</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Our team wangkongqiang participated in the Prompt RecOvery For MisInformation Detection (PROMID) competition, contributing to two subtasks. Subtask 2: Misinformation Detection in LLM generated text, where, given a piece of LLM-generated text with misinformation, the objective is to categorize each datapoint into different categories based on the presence of factual incorrectness in the summaries; and Subtask 3: Misinformation Detection in social media texts, where the objective is to classify tweets related to the Russo-Ukrainian conflict as either misinformation (positive class) or non-misinformation (negative class). In the competition experiments, our team explored various methods: Logistic Regression from machine learning, Dense Neural Networks and Recurrent Neural Networks from deep learning, and an approach based on the transformer pre-trained model models-microsoft-deberta-v3-base. Thorough experiments show that these methods achieved solid results on both subtasks. In Subtask 2, the experiment applying the models-microsoft-deberta-v3-base model to the Tamil language achieved the best result with an F1 score of 0.31, and the best result using a Recurrent Neural Network for the Kannada language was an F1 score of 0.34. In Subtask 3, Logistic Regression achieved the best result, with a precision of 0.81, a recall of 0.83 and an F1 score of 0.82.</p>
      </abstract>
      <kwd-group>
        <kwd>Text Multi-classification</kwd>
        <kwd>Binary Classification of Text</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Deep learning</kwd>
        <kwd>Transformer</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The PROMID [1] shared task aims to explore methods for identifying misinformation in human and
LLM generated texts, as well as reconstructing the input prompt that likely led to a given piece of LLM
generated misinformative text. There are several motivations for this shared task. As LLMs become more
prevalent, the rate at which misinformation is generated and disseminated is much higher than in the
past. It is more crucial than ever to identify methods for combating this spread of misinformation. The
PROMID [2] shared task targets two aspects of combating misinformation. The first is identifying
whether a given piece of text contains misinformation or not. The second is understanding how a specific
piece of misinformation was generated using LLMs. If we can trace back or infer the prompt that generated a
suspicious text, it could provide insights into the intent, source, or specific instructions used to create
misleading content. Both these aspects can aid in developing more robust misinformation detection
and mitigation strategies.</p>
      <p>As the task overview shows, the organizers offer three subtasks this year; the two relevant to
our work are:
• Subtask 2 : Misinformation Detection in LLM generated text. Given a piece of LLM-generated
text with misinformation, the objective is to categorize each datapoint into different categories
based on the presence of factual incorrectness in the summaries [3].
• Subtask 3 : Misinformation Detection in social media texts. The objective of this task is to
classify tweets related to the Russo-Ukrainian conflict as either misinformation (positive class) or
non-misinformation (negative class). The dataset comprises manually annotated tweets, collected
using the Twitter API during the first year of the Russia-Ukraine war. The dataset is highly
imbalanced; the goal is to check how the model performs in this setting. The misinfo tweets are
in multiple languages, and the content can be translated or addressed by LLMs. The misinfo tweets
carry some extra information (account age, bot account), which participants can also extract
for non-misinfo tweets if they want. The results are evaluated based on Precision,
Recall and weighted-averaged F1.</p>
      <p>Participants can choose to participate in one or more subtasks. Our group mainly participated
in Subtask 2: Misinformation Detection in LLM generated text [4] and Subtask 3: Misinformation
Detection in social media texts. With the continuous development and progress of artificial intelligence,
we selected representative works of milestone significance as the main research objects: Logistic Regression
from machine learning, Dense Neural Networks and Recurrent Neural Networks from deep learning, and
the pre-trained transformer model models–microsoft-deberta-v3-base. They demonstrated good
performance on these two subtasks.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>2.1. Problem Definition and Machine Learning
In recent years, the influence of misinformation/fake news on social media [5] on public opinion and
political processes has become increasingly serious. Automatic detection of misinformation has therefore
become an important research direction in natural language processing and computational social
science. Research in this area [6] focuses not only on the content of a single text itself (such as language style
and factuality), but also on auxiliary social context (such as dissemination structure and user
reputation) and multimodal evidence (such as images).</p>
      <p>Early methods focused on text-based language features (vocabulary, syntax, sentiment, suspicious
indicator words) and manual/statistical features, using classifiers such as SVM [7] and LR [8] for
discrimination. This type of method is intuitive and well interpretable, but often has limited
ability when dealing with complex semantics, irony and factual inference.
2.2. Network-based Approaches and Deep Learning
Another important type of work uses how information spreads in social networks to assess its
credibility, for example by analyzing the shape of the forward/reply tree, user interaction patterns, and time
series characteristics. This type of method can capture the dynamics of rumor propagation, but it still
faces challenges with cold start (new events) and cross-event generalization.</p>
      <p>In recent years, end-to-end methods based on neural networks (CNN, RNN, attention, and pre-trained
language models represented by BERT) have significantly improved the effectiveness of classifying text as
true or false. Pre-trained models can capture deeper semantic and contextual information and are
often used in combination with meta-information (author, publication time) or structural features to
improve performance.
2.3. Multimodal Approaches and Future Work
As social media content often contains images/videos, many studies have proposed multimodal
architectures that jointly model text and visual information to enhance robustness. Representative works
include EANN (Event Adversarial Network) [9] and MVAE (Multimodal Variational AutoEncoder) [10],
etc., which improve detection by aligning visual and text representations or learning event-invariant
features.</p>
      <p>[Table 1: frequency of each Incorrectness_Type label in the Tamil and Kannada training datasets. Columns: Languages, Incorrectness_Type, Frequency, Comments. For datapoints without factual incorrectness, only the Correct_Summary information is provided.]</p>
      <p>The problems faced by the current evaluation include: inconsistent dataset labeling standards,
sensitivity of evaluation metrics to category imbalance, poor generalization of models in new events/domains,
and insufficient interpretability and auditability of models. When actually deploying, issues such as
latency (online detection), user privacy and ethics also need to be considered.</p>
      <p>The important future directions include: cross-language/cross-cultural misinformation detection, the
combination of evidence retrieval &amp; claim verification with generative models, explainable detection
mechanisms, multimodal and multi-source fusion strategies, and the improvement of the model’s
transfer ability in low-resource or new event situations.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Exploratory Data Analysis</title>
      <p>Subtask 2: Misinformation Detection in LLM generated text contains datasets in four languages,
namely Telugu, Tamil, Kannada and Malayalam. The two languages we mainly worked on
were Tamil and Kannada. Statistical analysis is conducted on the datasets of these two languages.
Among them, the situation of the Incorrectness_Type Column label in the training dataset is shown in
Table 1, and the situation of the IsCorrect Column label in the test dataset is shown in Table 2. They are
crucial for data screening before the model training and for the model to evaluate the test data.</p>
      <p>For Subtask 3: Misinformation Detection in social media texts, it contains datasets in multiple
languages, namely English(en), Spanish(es), German(de), Italian(it), French(fr), Ukrainian(uk), Dutch(nl),
Persian(fa), Polish(pl), Russian(ru), Chinese(zh), Japanese(ja) and so on. The objective of this task is
to classify tweets related to the Russo-Ukrainian conflict as either misinformation (positive class) or
non-misinformation (negative class). The dataset [11] comprises manually annotated tweets, collected
[12] using the Twitter API during the first year of the Russia-Ukraine war. This shared task involves
two phases: Development phase with training and test datasets, Final phase for evaluation. All tasks
are classification tasks. The dataset is highly imbalanced; the goal is to check how the model
performs in this setting. The numbers of misinfo-tagged rows and nonmisinfo-tagged rows
of the training datasets in the Development phase and Final phase are shown in Table 3
below.</p>
      <p>The number of data sample rows in the test datasets of the Development phase and Final phase is
shown in Table 4.</p>
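      <p>As a minimal illustration of this label analysis, the following Python sketch counts the label values with pandas. The file names are hypothetical placeholders, while the Incorrectness_Type and IsCorrect column names come from the task data described above.</p>
      <preformat>
import pandas as pd

# Hypothetical file names; the task CSVs follow this column layout.
train = pd.read_csv("tamil_train.csv")
test = pd.read_csv("tamil_test.csv")

# Label distributions of the kind summarized in Table 1 and Table 2.
print(train["Incorrectness_Type"].value_counts())
print(test["IsCorrect"].value_counts())
      </preformat>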
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>Our methods span machine learning, deep learning and DeBERTaV3: Improving DeBERTa using
ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing [13]. DeBERTa improves
the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. With those
two improvements, DeBERTa outperforms RoBERTa on a majority of NLU tasks with 80GB of training data.</p>
      <p>The data preprocessing pipeline for Subtask 3 (reconstructed from the original flowchart) is:
• Input data: misinfo_train.csv and nonmisinfo_train.csv; reference data: misinfo_test.csv and nonmisinfo_test.csv.
• Remove lines with an empty id, an empty text, or an empty label.
• Use the `clean` function in the `tweet-preprocessor` library to preprocess the text, and replace consecutive spaces and newline characters with a single space.
• Remove english_stopwords.
• convert_label: 0 = nonmisinfo, 1 = misinfo.
• Test sets: test_data_without_label.csv (Development phase) and test_final_merge_withoutlabel.csv (Final phase).</p>
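      <p>A minimal sketch of this preprocessing pipeline is given below. It assumes the CSV files have id and text columns; p.clean is the clean function of the tweet-preprocessor package, and the English stopword list comes from NLTK (which requires a one-time nltk.download("stopwords")).</p>
      <preformat>
import pandas as pd
import preprocessor as p                  # the tweet-preprocessor package
from nltk.corpus import stopwords

EN_STOP = set(stopwords.words("english"))

def preprocess(text):
    text = p.clean(text)                  # strip URLs, mentions, emojis, ...
    text = " ".join(text.split())         # collapse spaces and newlines
    return " ".join(w for w in text.split() if w.lower() not in EN_STOP)

# convert_label: 0 = nonmisinfo, 1 = misinfo
df = pd.concat([
    pd.read_csv("misinfo_train.csv").assign(label=1),
    pd.read_csv("nonmisinfo_train.csv").assign(label=0),
])
df = df.dropna(subset=["id", "text"])     # drop rows with an empty id or text
df["text"] = df["text"].map(preprocess)
      </preformat>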
      <p>4.1. The Principle of Logistic Regression
Logistic Regression is a type of discriminative model widely used in binary and multi-class classification
tasks. Its core idea is to combine the input features linearly and map the result to a probability value
using the Sigmoid function (or the Softmax function), thereby making a classification decision. Although
the name contains "regression", logistic regression is essentially a linear classification model whose
goal is to learn an optimal decision boundary in the feature space.</p>
      <p>Conceptual and Formal Model. For binary classification tasks, logistic regression assumes an
input feature vector x ∈ R^d and computes the linear transformation
z = w^T x + b, (1)
where w is the weight vector and b is the bias. To map the linear output to a classification probability,
logistic regression uses the Sigmoid function:
σ(z) = 1 / (1 + e^(−z)). (2)
Thus, the probability that a sample belongs to the positive class (labeled 1) is
P(y = 1 | x) = σ(w^T x + b). (3)
This probability reflects the model's confidence in the class, and the final classification label
can be obtained by thresholding (usually at 0.5).</p>
      <p>Loss Function and Learning Process. Logistic regression learns the parameters w and b by maximizing
the log-likelihood of the training data. For a single sample (x, y), the log-likelihood is
ℓ(w, b) = y log(σ(z)) + (1 − y) log(1 − σ(z)). (4)
For convenience of optimization, its negative is usually taken as the loss function, namely
the Binary Cross-Entropy loss:
L(w, b) = −[y log(σ(z)) + (1 − y) log(1 − σ(z))]. (5)
By performing gradient descent on this loss (or using other optimization algorithms, such as L-BFGS or
Newton's method), the model gradually updates the parameters so that the predicted probability comes
as close as possible to the true label.</p>
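      <p>The following NumPy sketch mirrors equations (1)–(5): it computes the Sigmoid, the binary cross-entropy loss, and one gradient-descent update of w and b. It is illustrative only; our experiments used library implementations.</p>
      <preformat>
import numpy as np

def sigmoid(z):                            # Eq. (2)
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(y, prob):                     # Eq. (5), averaged over samples
    eps = 1e-12                            # numerical stability
    return -np.mean(y * np.log(prob + eps) + (1 - y) * np.log(1 - prob + eps))

def gd_step(X, y, w, b, lr=0.1):
    prob = sigmoid(X @ w + b)              # Eq. (3): P(y = 1 | x)
    grad_w = X.T @ (prob - y) / len(y)     # gradient of the loss w.r.t. w
    grad_b = np.mean(prob - y)
    return w - lr * grad_w, b - lr * grad_b
      </preformat>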
      <p>Regularization. To prevent overfitting caused by overly large parameters, logistic regression often
incorporates L1 or L2 regularization terms:
• L2 regularization (Ridge) : Encourages smoother weights and reduces model complexity;
• L1 regularization (Lasso) : Performs feature selection and can compress some
weights exactly to zero.</p>
      <p>The addition of regularization can significantly enhance the generalization performance of the model,
especially in high-dimensional feature spaces.</p>
      <p>Multi-class Extension. Although logistic regression is mainly used for binary classification
tasks, it can be extended to multi-class problems through One-vs-Rest decomposition or Softmax
regression (multinomial logistic regression). For K-class classification, Softmax regression uses the
Softmax function to output the probability distribution over the classes:
P(y = k | x) = e^(z_k) / Σ_{j=1}^{K} e^(z_j). (6)
The corresponding loss function is the Categorical Cross-Entropy.</p>
      <p>Characteristics and Advantages. Logistic regression is widely applied in fields such as text
classification, medical prediction, and risk assessment due to its strong interpretability, efficient training,
and robust stability. Its decision boundary is linear. Therefore, after the features undergo reasonable
engineering processing (such as TF-IDF or embedding vectors), good results are often achieved, as in
the scikit-learn sketch below. The main advantages include:
• The model parameters are interpretable;
• It performs stably on small-scale datasets;
• Fast training speed and low computing cost;
• Overfitting can be effectively prevented through regularization;
• The mathematical form is clear and easy to analyze.</p>
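      <p>In practice, such a text classifier can be assembled with scikit-learn: TF-IDF features feeding a regularized Logistic Regression. The texts and hyperparameters here are placeholders, not the exact settings of our submission; the C parameter controls the strength of the L2 regularization discussed above.</p>
      <preformat>
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["first preprocessed tweet", "second preprocessed tweet"]
labels = [0, 1]                            # 0 = nonmisinfo, 1 = misinfo

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word unigrams and bigrams
    LogisticRegression(max_iter=1000, C=1.0, class_weight="balanced"),
)
clf.fit(texts, labels)
print(clf.predict(["an unseen tweet"]))
      </preformat>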
      <p>4.2. The Principle of Dense Neural Network
Dense Neural Network (DNN), also known as Fully Connected Neural Network (FCNN), is a type
of feedforward neural network composed of multiple layers of linear transformations and nonlinear
activation functions. DNN is the most fundamental and classic model structure in deep learning and can
be used for various tasks such as classification, regression, and representation learning. The core idea
is to gradually learn the complex mapping relationship from input features to output labels through
multi-layer abstraction.</p>
      <p>Network Structure. A typical DNN consists of the following parts:
• Input Layer : Receives the input features x ∈ R^d;
• Hidden Layers : Composed of several fully connected layers stacked together, each layer
contains several neurons.
• Output Layer : Provides the final prediction according to the task type (such as classification or
regression).</p>
      <p>
        At each layer, neurons perform a linear transformation on the input and then introduce nonlinearity
through an activation function, enabling the network to fit complex functions. The l-th layer computes:
z^(l) = W^(l) h^(l−1) + b^(l), (7)
h^(l) = f(z^(l)), (8)
where W^(l) and b^(l) are respectively the weight matrix and the bias vector, f(·) is an activation
function such as ReLU, Sigmoid or Tanh, and h^(l) is the output representation of this layer.
      </p>
      <p>
        A multi-layer network can be recursively represented as:
h^(L) = f^(L)(W^(L) f^(L−1)(· · · f^(1)(W^(1) x + b^(1)) · · ·) + b^(L)). (9)
Through multi-layer composition, a DNN can gradually extract high-level abstract features from low-level
features.
      </p>
      <p>Activation Function. The introduction of the activation function is the key for a DNN to represent
complex nonlinear mappings. Common activation functions include:
• ReLU : ReLU(z) = max(0, z). It has the advantages of stable gradient propagation and simple
calculation.
• Sigmoid : It is applicable to the binary output layer.</p>
      <p>• Tanh : It is applicable to some normalized data scenarios.</p>
      <p>Nonlinear activation gives the network the power described by the Universal Approximation Theorem:
in theory, it can approximate any continuous function.</p>
      <p>Forward Propagation. During forward propagation, the input features pass through
each fully connected layer in sequence to generate the final output. For classification tasks, the last
layer typically uses the Softmax function to convert the output into a probability distribution:
P(y = c | x) = e^(z_c) / Σ_{k=1}^{C} e^(z_k), (10)
where z_c is the linear response of the output layer for class c.</p>
      <p>Loss Function and Parameter Learning. A DNN typically learns its network parameters by minimizing
a loss function. The commonly used loss in classification tasks is the Cross-Entropy loss:
L = −Σ_{c=1}^{C} y_c log(ŷ_c). (11)
Parameter updates are accomplished through the Backpropagation algorithm, which calculates the
gradient layer by layer via the chain rule. The optimization algorithm is usually Stochastic Gradient
Descent (SGD), Adam, RMSProp, etc. These optimization methods continuously reduce the training
error, enabling the model to gradually fit the patterns of the training data.</p>
      <p>Regularization and Prevention of Overfitting. To enhance the generalization ability of the model,
DNNs often incorporate regularization techniques:
• L2 regularization : Weight decay.
• Dropout : Randomly discard some neurons to reduce co-adaptation.
• Batch Normalization : Stabilizes training and accelerates convergence.
• Early Stopping : Avoids overfitting caused by overly long training.</p>
      <p>These techniques can effectively prevent the model from overfitting the training data and improve its
generalization performance.</p>
      <p>Model Features and Advantages. Dense Neural Network has the following advantages:
• Be capable of learning complex nonlinear mapping relationships;
• It is easy to implement and expand, and serves as the foundation for many deep models;
• It is efective for multiple tasks (classification, regression, text feature learning, etc.);
• After combining regularization and modern optimization algorithms, the training efficiency is
high and the effect is stable.</p>
      <p>Therefore, DNNs are often used as fundamental modules in many deep learning systems and are widely
applied in fields such as natural language processing, computer vision, and recommendation systems.</p>
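      <p>A minimal PyTorch sketch of such a fully connected classifier follows; the feature dimension, hidden size and dropout rate are illustrative assumptions rather than our exact configuration. The Softmax of equation (10) is folded into CrossEntropyLoss, which expects raw logits.</p>
      <preformat>
import torch
import torch.nn as nn

class DenseNet(nn.Module):
    def __init__(self, in_dim=5000, hidden=256, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(hidden, n_classes),  # logits; Softmax lives in the loss
        )

    def forward(self, x):
        return self.net(x)

model = DenseNet()
criterion = nn.CrossEntropyLoss()          # Eq. (11)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 5000)                   # dummy batch of TF-IDF features
y = torch.randint(0, 2, (8,))
loss = criterion(model(x), y)
loss.backward()                            # backpropagation
optimizer.step()
      </preformat>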
      <p>4.3. The DeBERTa Model Principle
DeBERTa [14] (Decoding-enhanced BERT with Disentangled Attention) is an improved Transformer
language model proposed by Microsoft in 2021. By introducing the disentangled attention mechanism
and the Enhanced Mask Decoder structure, it significantly improves performance on language
understanding tasks while maintaining a similar model parameter scale. As an enhanced version of
BERT and RoBERTa, DeBERTa demonstrates leading performance on multiple NLP benchmark tasks.</p>
      <p>Disentangled Attention Mechanism. The traditional BERT model uses the sum of token embeddings
and position embeddings, that is, it processes content and position information in the same way.
DeBERTa instead separates the content representation of a word from its position representation,
thereby enhancing the model's ability to learn language structure. The i-th token is represented as
h_i = h_i^(c) + h_i^(p), (12)
where h_i^(c) is the content embedding and h_i^(p) is the relative position embedding.</p>
      <p>In the self-attention mechanism, DeBERTa divides the attention score into three parts:
A_ij = Q_i^(c) · K_j^(c) + Q_i^(c) · K_j^(p) + Q_i^(p) · K_j^(c). (13)</p>
      <p>Compared with the simple dot-product form of traditional Transformers, disentangled attention
models the "content-to-content", "content-to-position", and "position-to-content" dependencies
separately, making the model more sensitive to language structure.</p>
      <p>Advantages: it captures sentence structure better, enhances the modeling of long-distance
dependencies, and improves the quality of semantic and grammatical learning. This is also the core reason
why DeBERTa performs significantly better than BERT/RoBERTa.</p>
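      <p>The decomposition in equation (13) can be sketched in NumPy as below. This is a simplified illustration of the idea only: the real DeBERTa implementation gathers relative-position embeddings from a lookup table and applies scaling, which is omitted here.</p>
      <preformat>
import numpy as np

def disentangled_scores(Hc, Hp, Wq_c, Wk_c, Wq_p, Wk_p):
    """Content/position score decomposition, simplified from Eq. (13)."""
    Qc, Kc = Hc @ Wq_c, Hc @ Wk_c          # content projections
    Qp, Kp = Hp @ Wq_p, Hp @ Wk_p          # relative-position projections
    # content-to-content + content-to-position + position-to-content
    return Qc @ Kc.T + Qc @ Kp.T + Qp @ Kc.T

n, d = 4, 8                                # toy sequence length and width
rng = np.random.default_rng(0)
Hc, Hp = rng.normal(size=(n, d)), rng.normal(size=(n, d))
Ws = [rng.normal(size=(d, d)) for _ in range(4)]
print(disentangled_scores(Hc, Hp, *Ws).shape)   # (4, 4) attention scores
      </preformat>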
      <p>Enhanced Mask Decoder. In the Masked Language Modeling (MLM) task of BERT, the model
often fails to make full use of the position encoding information. DeBERTa proposed Enhanced Mask
Decoder (EMD), which introduces stronger relative position information when decoding masked tokens,
enabling the model to make more accurate judgments based on the relative context structure when
predicting mask words. Its improvements include:
• Strengthen the modeling of the Mask position’s dependence on surrounding tokens;
• Improve the position modeling ability by using relative position bias;
• Reduce the information loss of tokens after they are masked.</p>
      <p>Experiments show that EMD enables DeBERTa to achieve higher training efficiency and prediction
accuracy in MLM training.</p>
      <p>
        Removal of Absolute Position Encoding. Unlike the absolute position encoding of BERT, DeBERTa
completely eliminates absolute positional embeddings and instead uses a relative position bias:
B_ij = relative_position_embedding(i − j). (14)
This is more in line with the relative positional structure of language, more suitable for tasks
such as sentence rearrangement and fill-in-the-blank, and generalizes better across sequence
lengths.
      </p>
      <p>Training and Optimization. DeBERTa adopts RoBERTa's training strategies, such as dynamic masking,
training on more data, larger batch sizes, and removal of the Next Sentence Prediction task. This further
enhances the performance of the model.</p>
      <p>Summary and Advantages. Compared with BERT [15] and RoBERTa [16], the main advantages of
DeBERTa are reflected in:
• Disentangled attention : Models content and position information separately to capture stronger
language structure.</p>
      <p>
• Enhanced mask decoder : Improves mask prediction quality and training efficiency.
• Relative position encoding : Better suited to sentences of different lengths and structures.
• Strong overall performance : Achieves leading results in benchmark tasks such as GLUE,
SQuAD, and SuperGLUE.</p>
      <p>Therefore, DeBERTa, as a representative model for language understanding tasks, is widely applied in
fields such as classification, text matching, summarization, and information extraction.</p>
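      <p>A hedged sketch of fine-tuning microsoft/deberta-v3-base for sequence classification with the Hugging Face transformers library is shown below; num_labels, the hyperparameters and the train_ds/dev_ds dataset objects are assumptions for illustration, not our exact training script.</p>
      <preformat>
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

name = "microsoft/deberta-v3-base"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

def encode(batch):                         # tokenize a batch of examples
    return tok(batch["text"], truncation=True, max_length=128,
               padding="max_length")

args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)

# train_ds / dev_ds: datasets.Dataset objects with "text" and "label" columns.
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds.map(encode, batched=True),
#                   eval_dataset=dev_ds.map(encode, batched=True))
# trainer.train()
      </preformat>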
    </sec>
    <sec id="sec-5">
      <title>5. Result</title>
      <p>For Subtask 2: Misinformation Detection in LLM generated text, our group adopted both the pre-trained
transformer model models-microsoft-deberta-v3-base and a text classifier based on the RNN
model. A Recurrent Neural Network (RNN) is a kind of neural network designed specifically for
processing sequential data. Unlike traditional neural networks, RNNs have memory capabilities and
can capture the temporal dependencies in data, as sketched below. The per-category and overall
precision, recall and f1-score of the final test results for the two languages our group worked on,
Tamil and Kannada, are shown in Table 5.</p>
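      <p>The RNN idea can be sketched in PyTorch as follows: embed the token ids, run an LSTM over the sequence, and classify from the final hidden state. The vocabulary size, dimensions and number of classes are illustrative placeholders, not the values used in our experiments.</p>
      <preformat>
import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    def __init__(self, vocab=20000, emb=128, hidden=128, n_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb, padding_idx=0)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, token_ids):          # token_ids: (batch, seq_len)
        _, (h_n, _) = self.rnn(self.emb(token_ids))
        return self.fc(h_n[-1])            # logits from the last hidden state

logits = RNNClassifier()(torch.randint(1, 20000, (4, 50)))
print(logits.shape)                        # (4, 5)
      </preformat>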
      <p>For Subtask 3: Misinformation Detection in Social Media Texts, this subtask of Prompt RecOvery
For MisInformation Detection (PROMID) shared task involves two phases: Development phase with
training and test datasets, Final phase for evaluation. All tasks are classification tasks.</p>
      <p>For Development_phase, the test set test_data_without_label for verification is provided. This
dataset contains a total of 14,802 rows of data content. Our group conducted an initial model development
based on this test set, using models-microsoft-deberta-v3-base as the main model and achieved
the best results. The evaluation of model performance mainly refers to the weighted avg results of
the classification model. The detailed results of each classification model are shown in Table 6 below.</p>
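      <p>The weighted-average metrics used for ranking can be read directly from scikit-learn's classification_report, as in this small sketch with dummy labels:</p>
      <preformat>
from sklearn.metrics import classification_report

y_true = [0, 0, 1, 0, 1]                   # gold labels (dummy values)
y_pred = [0, 1, 1, 0, 1]                   # model predictions (dummy values)

# The "weighted avg" row holds the precision, recall and f1-score
# used for evaluation in this subtask.
print(classification_report(y_true, y_pred,
                            target_names=["nonmisinfo", "misinfo"]))
      </preformat>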
      <p>For Final_phase, the test set test_final_merge_withoutlabel for evaluation is provided. This
dataset contains a total of 2,414 rows of data content.</p>
      <p>[Table 7: weighted-avg precision, recall and f1-score of each model in the Final phase. Rows: models–microsoft–deberta-v3-base trained on the Development and on the Final training dataset, Logistic Regression (Final), and Dense Neural Network (Final).]</p>
      <p>Our group conducted a final model development based on this test set, using Logistic Regression as
the main model, and achieved the best results. The
evaluation of model performance mainly refers to the weighted avg results of the classification model.
The detailed results of each classification model are shown in Table 7 below. In this experiment,
two training sets were used for models-microsoft-deberta-v3-base: the training dataset
(Development) provided by the Development_phase and the training dataset (Final) provided by the
Final_phase. The key difference between these two training datasets is that the reference labels of the
Development_phase test set are added to the latter. The experiments show that these real label data
are very useful: they enhance the learning effect of the model. The other two models, Logistic
Regression and Dense Neural Network, were trained entirely on the training dataset (Final) provided
by the Final_phase.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper, several machine learning and deep learning approaches have been used to detect
misinformation in multilingual content, and the models have been compared. Several techniques have
been employed to increase accuracy. Our proposed models–microsoft–deberta-v3-base model
achieved good results given its simplicity. We believe that proper feature extraction and data
augmentation techniques will further improve our proposed model.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We are very grateful to the organizers of the Shared Task on PROMID: Prompt RecOvery For
MisInformation Detection and the School of Information of Yunnan University for providing the environment
and equipment.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. L.</given-names>
            <surname>Shasirekha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          , G. Pasi, T. Mandl,
          <article-title>Overview of the first shared task on prompt recovery for misinformation detection</article-title>
          (promid
          <year>2025</year>
          ), in: K. Ghosh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Majumdar</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Chakraborty (Eds.), Working Notes of FIRE 2025 -
          <article-title>Forum for Information Retrieval Evaluation, Varanasi, India</article-title>
          .
          <source>December 17-20</source>
          ,
          <year>2025</year>
          , CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hegde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. L.</given-names>
            <surname>Shasirekha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          , G. Pasi, T. Mandl,
          <article-title>Prompt recovery for misinformation detection at fire 2025, in: Proceedings of the 17th Annual Meeting of the Forum for Information Retrieval Evaluation</article-title>
          , FIRE '25,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computing Machinery,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <article-title>Key takeaways from the second shared task on indian language summarization (ILSUM 2023)</article-title>
          , in: K. Ghosh,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          , M. Mitra (Eds.), Working Notes of FIRE 2023 -
          <article-title>Forum for Information Retrieval Evaluation (FIRE-WN</article-title>
          <year>2023</year>
          ), Goa, India,
          <source>December 15-18</source>
          ,
          <year>2023</year>
          , volume
          <volume>3681</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>724</fpage>
          -
          <lpage>733</lpage>
          . URL: https://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>3681</volume>
          /
          <fpage>T8</fpage>
          -1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <article-title>Fighting fire with fire: Adversarial prompting to generate a misinformation detection dataset</article-title>
          ,
          <source>CoRR abs/2401</source>
          .04481 (
          <year>2024</year>
          ). URL: https://doi.org/10. 48550/arXiv.2401.04481. doi:
          <volume>10</volume>
          .48550/ARXIV.2401.04481. arXiv:
          <volume>2401</volume>
          .
          <fpage>04481</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vosoughi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. Aral,</surname>
          </string-name>
          <article-title>The spread of true and false news online</article-title>
          ,
          <source>Science</source>
          <volume>359</volume>
          (
          <year>2018</year>
          )
          <fpage>1146</fpage>
          -
          <lpage>1151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zafarani</surname>
          </string-name>
          ,
          <article-title>A survey of fake news: Fundamental theories, detection methods and opportunities</article-title>
          , ACM Computing Surveys (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cortes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>Support-vector networks</article-title>
          ,
          <source>Machine Learning</source>
          <volume>20</volume>
          (
          <year>1995</year>
          )
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          . URL: https://api.semanticscholar.org/CorpusID:52874011.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <article-title>The regression analysis of binary sequences</article-title>
          ,
          <source>Journal of the royal statistical society series b-methodological 20</source>
          (
          <year>1958</year>
          )
          <fpage>215</fpage>
          -
          <lpage>232</lpage>
          . URL: https://api.semanticscholar.org/CorpusID:125694386.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yuan</surname>
          </string-name>
          , G. Xun,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          , Eann:
          <article-title>Event adversarial neural networks for multi-modal fake news detection</article-title>
          ,
          <source>Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
          (
          <year>2018</year>
          ). URL: https://api.semanticscholar.org/ CorpusID:46990556.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Khattar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Goud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          , Mvae:
          <article-title>Multimodal variational autoencoder for fake news detection</article-title>
          ,
          <source>The World Wide Web Conference</source>
          (
          <year>2019</year>
          ). URL: https://api.semanticscholar.org/CorpusID: 86785940.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mejova</surname>
          </string-name>
          ,
          <article-title>Too little, too late: Moderation of misinformation around the russoukrainian conflict</article-title>
          ,
          <source>Websci '25</source>
          ,
          <year>2025</year>
          . doi:
          <volume>10</volume>
          .1145/3717867.3717876.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Majchrzak</surname>
          </string-name>
          ,
          <string-name>
            <surname>Amused:</surname>
          </string-name>
          <article-title>An annotation framework of multimodal social media data</article-title>
          , in: F.
          <string-name>
            <surname>Sanfilippo</surname>
            ,
            <given-names>O.-C.</given-names>
          </string-name>
          <string-name>
            <surname>Granmo</surname>
            ,
            <given-names>S. Y.</given-names>
          </string-name>
          <string-name>
            <surname>Yayilgan</surname>
            ,
            <given-names>I. S.</given-names>
          </string-name>
          Bajwa (Eds.),
          <source>Intelligent Technologies and Applications</source>
          , Springer International Publishing, Cham,
          <year>2022</year>
          , pp.
          <fpage>287</fpage>
          -
          <lpage>299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          , W. Chen,
          <article-title>Debertav3: Improving deberta using electra-style pre-training with gradientdisentangled embedding sharing</article-title>
          ,
          <year>2021</year>
          . arXiv:
          <volume>2111</volume>
          .
          <fpage>09543</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          , W. Chen, Deberta:
          <article-title>Decoding-enhanced bert with disentangled attention</article-title>
          ,
          <source>in: International Conference on Learning Representations</source>
          ,
          <year>2021</year>
          . URL: https://openreview.net/forum? id=XPZIaotutsD.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in: J.
          <string-name>
            <surname>Burstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Doran</surname>
          </string-name>
          , T. Solorio (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://aclanthology.org/N19-1423/. doi:
          <volume>10</volume>
          .18653/v1/
          <fpage>N19</fpage>
          -1423.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wayne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jun</surname>
          </string-name>
          ,
          <article-title>A robustly optimized BERT pre-training approach with posttraining</article-title>
          , in: S. Li,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          , Y. Liu,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Che</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>He</surname>
          </string-name>
          , G. Rao (Eds.),
          <source>Proceedings of the 20th Chinese National Conference on Computational Linguistics, Chinese Information Processing Society of China</source>
          , Huhhot, China,
          <year>2021</year>
          , pp.
          <fpage>1218</fpage>
          -
          <lpage>1227</lpage>
          . URL: https://aclanthology.org/
          <year>2021</year>
          .ccl-
          <volume>1</volume>
          . 108/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>