HALE Lab NITK at Touché 2024: A Hybrid Approach for Identifying Political Ideology and Power in Multilingual Parliamentary Speeches

Notebook for the Touché Lab at CLEF 2024

Sevitha Simhadri1,*,†, Mauli Mehulkumar Patel1,† and Sowmya Kamath S.1,†
1 Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangalore 575025, India

Abstract
In this article, an approach to determine the political views and stances of speakers, identifying whether they support or oppose the government in parliamentary discussions, is presented. The work was carried out as part of the Touché 2024 Task 2, “Ideology and Power Identification in Parliamentary Debates”. Towards this, two systems were developed: the first employs traditional machine learning methods with TF-IDF embeddings, while the second utilizes advanced NLP techniques with the LASER encoder for multilingual embeddings. Both systems incorporate standard preprocessing techniques and integrate a variety of models, after which a voting classifier combines the predictions from both approaches. Experiments revealed that this comprehensive framework effectively addresses the complexities and nuances of political discourse, providing valuable insights into speakers’ ideologies and governing statuses within parliamentary debates.

Keywords
Parliamentary Debates, Governing Status, Natural Language Processing, Multilingual Embeddings

1. Introduction
In recent years, analyzing parliamentary debates has become a crucial area of study in political science and natural language processing (NLP). Understanding speakers’ political ideologies and governing statuses in these debates can offer profound insights into legislative processes and power dynamics.
The Touché 2024 Task 2, “Ideology and Power Identification in Parliamentary Debates”, addresses these analytical challenges, inviting participants to develop systems that accurately identify parliamentary speakers’ political stances and leadership roles [1]. This task is significant because it advances the field of NLP and provides practical tools for political analysts and researchers. The HALE Lab team participated in this task to contribute to developing these analytical tools. By identifying the underlying ideologies and power structures in parliamentary debates, it is possible to better understand how political narratives are constructed and conveyed, thus offering a deeper understanding of the legislative process and political communication. In this paper, we cover both sub-tasks: first, we provide a high-level overview of the first sub-task and detail our strategy; the same strategy is used for the second sub-task, which is described subsequently. Lastly, we highlight our primary contributions and conclusions, along with suggestions for future work. The paper is organized as follows: Section 2 describes the related work in this area; Section 3 outlines the competition details; Section 4 provides an in-depth explanation of our approach; Section 5 discusses our main findings; and finally, Section 6 draws conclusions and suggests directions for future research.

CLEF 2024: Conference and Labs of the Evaluation Forum, September 09–12, 2024, Grenoble, France
* Corresponding author.
† These authors contributed equally.
Email: munnysimhadri5544@gmail.com (S. Simhadri); maulipatel03@gmail.com (M. M. Patel); sowmyakamath@nitk.edu.in (S. K. S.)
ORCID: 0000-0002-0888-7238 (S. K. S.)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2. Literature Survey
Research on political viewpoint identification encompasses a wide range of approaches and perspectives intended to help interpret the intricacies of political debate. To gain a clear understanding of who is powerful and influential in political systems without conducting practical experiments, Abercrombie and Batista-Navarro [2] worked on sentiment and position-taking analysis of parliamentary debates. To reduce ambiguity and make biases in textual data across the political spectrum more understandable, Doan and Gulla (2022) [3] surveyed viewpoint identification and bias learning techniques. In a similar vein, they developed a language model for Scandinavian languages [4] and tested it using political datasets. Preoţiuc-Pietro et al. (2017) [5], in contrast, provided a multimodal approach that incorporates fine-grained ideology labels from surveys together with linguistic features including unigrams, LIWC, Word2Vec themes, and sentiment analysis. Their work on automatic political orientation prediction from social media posts proved rather successful in differentiating between openly avowed liberals and conservatives in the US. Through language use on Twitter, they sought to identify politically engaged user groups and to develop an improved model that could predict the political ideology of users who do not openly declare it. Multi-task learning (MTL) was investigated by Barnes et al. (2019) [6] as a means of integrating external knowledge into neural networks for sentiment analysis. A straightforward method for identifying ideological leanings in documents based on sentiment expressions toward various topics was presented by Bhatia and P (2018) [7]. The study conducted by Ahmadalinezhad and Makrehchi (2018) [8] centered on identifying points of agreement and disagreement in political discourse.
The importance of identifying agreement and disagreement in political speech is emphasized in their abstract, which also presents their work as a contribution to the field of social and cultural modeling.

3. Overview of Tasks and Dataset

3.1. Task Definition
The task consists of two sub-tasks on identifying two important aspects of a speaker in parliamentary debates: (a) Sub-Task 1: given a parliamentary speech in one of several languages, identify the ideology of the speaker’s party; and (b) Sub-Task 2: given a parliamentary speech in one of several languages, identify whether the speaker’s party is currently governing or in opposition.

3.2. Dataset Specifics
The dataset for this task is provided from ParlaMint [9], a multilingual corpus of parliamentary debates. The data is curated to minimize potential confounding variables, such as speaker identity, to ensure a balanced and unbiased dataset. The dataset is provided as tab-separated text files with the following fields:
- id: a unique ID for each text;
- speaker: a unique ID for each speaker (multiple speeches from the same speaker may be included);
- sex: the binary/biological sex of the speaker, which can be Female, Male, or Unspecified/Unknown;
- text: the transcribed text of the parliamentary speech, which may include line breaks and special sequences;
- text_en: an automatic translation of the text into English (this field may be empty for English speeches or for some non-English speeches where the translation is unavailable);
- label: a binary label indicating political orientation (0 for left, 1 for right) or power identification (0 for coalition/governing party, 1 for opposition).

The training data encompasses parliamentary speeches from 28 countries for the political orientation task and 25 countries for the power identification task. The test files have the same fields except for the label.
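As an illustration, the tab-separated files can be loaded with pandas. The column names follow the task description above, but the file content and file name here are invented placeholders, not real ParlaMint data:

```python
import io
import pandas as pd

# Hypothetical two-row sample mirroring the task's TSV layout
# (id, speaker, sex, text, text_en, label).
sample_tsv = (
    "id\tspeaker\tsex\ttext\ttext_en\tlabel\n"
    "lv-001\tspk-42\tFemale\tRunas teksts\tSpeech text\t0\n"
    "lv-002\tspk-17\tMale\tCits teksts\tAnother speech\t1\n"
)

# A real file would be read the same way, e.g.:
# df = pd.read_csv("orientation-lv-train.tsv", sep="\t")
df = pd.read_csv(io.StringIO(sample_tsv), sep="\t", dtype={"label": int})

print(df.shape)          # (2, 6)
print(list(df.columns))  # the six fields described above
```

Keeping `sep="\t"` explicit avoids pandas guessing the delimiter, which matters because speech texts may themselves contain commas.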
A sample dataset for a single country (e.g., Latvia) for both sub-task 1 (political orientation) and sub-task 2 (power identification) is illustrated in Fig. 1a and 1b.

Figure 1: Data samples from the data provided for the sub-tasks: (a) Sub-Task 1 (Political Orientation); (b) Sub-Task 2 (Power Identification)

4. System Overview

4.1. Data Preprocessing
Before feeding text data into either system, the following preprocessing steps are performed to improve the quality of the analysis. First, the text is broken down into individual words or meaningful units, a process known as tokenization. Next, common words that do not contribute to the analysis, such as “the”, “a”, and “an”, are eliminated using language-specific stopword lists. Finally, words may be reduced to their base form (lemma) to improve consistency, a process called lemmatization; this step is optional and not necessary for all languages. Various language-specific libraries and tools cater to a wide range of languages, facilitating text analysis and processing tasks. SpaCy [10] supports languages such as Catalan, Croatian, Danish, Dutch, English, Finnish, French, German, Greek, Italian, Polish, Portuguese, Romanian, Russian, Slovenian, Spanish, Swedish, and Ukrainian. NLTK [11] provides robust support for languages including Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Italian, Polish, Portuguese, Russian, Slovenian, Spanish, Swedish, and Turkish. Additionally, Stanza [12] (formerly StanfordNLP) specializes in Bulgarian, Croatian, Serbian, and Slovenian. These tools are essential for preprocessing, analyzing, and understanding text across diverse linguistic contexts, enhancing the capability of NLP applications worldwide.

4.2. Approach 1: Text Vectorization using TF-IDF
After preprocessing, the textual data is transformed using the Term Frequency-Inverse Document Frequency (TF-IDF) method [13].
TF-IDF converts text into numerical features by assessing word frequency and importance across documents. This transformation is crucial for identifying ideological leanings within parliamentary speeches and for distinguishing whether a speaker represents an opposition or governing party.

Figure 2: Proposed Approach

The implementation of TF-IDF in this project is pivotal for analyzing parliamentary speeches, as it emphasizes words that are distinctive within specific documents yet less common across the entire corpus. This approach enhances the understanding of the content and context of individual speeches, facilitating nuanced analysis of political discourse. TF-IDF plays a crucial role in ideology detection by identifying key terms and phrases indicative of left-leaning or right-leaning ideologies. Terms frequently associated with specific ideological stances receive higher TF-IDF scores, enabling machine learning models to effectively differentiate between speeches with contrasting ideological orientations. Additionally, TF-IDF assists in distinguishing speeches from opposition members versus those representing governing parties: by analyzing the prevalence and importance of specific terms, it reveals linguistic patterns characteristic of opposition or governing-party discourse.

4.3. Approach 2: Multilingual Sentence Embeddings
Instead of employing the TF-IDF method for text embedding, the LASER (Language-Agnostic SEntence Representations) encoder was utilized to transform the textual data. Developed by Facebook, LASER [14] is designed to enhance performance by providing highly effective multilingual sentence representations. The toolkit supports over 90 languages written in 28 different scripts, embedding all languages jointly in a unified space rather than requiring separate models for each language. This capability makes LASER particularly advantageous for zero-shot transfer learning (Fig.
3), where a model trained on one language can generalize to others, including low-resource languages. The LASER encoder employs a five-layer bidirectional Long Short-Term Memory (BiLSTM) network (Fig. 4) to generate a fixed-size, 1,024-dimensional vector representation of each input sentence. This high-dimensional vector is derived by max-pooling over the output states of the BiLSTM, ensuring that the embeddings encapsulate the semantic essence of sentences irrespective of the language they are written in. This universal, language-agnostic sentence embedding simplifies the comparison of sentence representations and supports their direct application in diverse classifiers.

Figure 3: Zero-shot approach vs. other models
Figure 4: LASER model architecture [14]

4.4. Prediction Models
Features extracted from both TF-IDF and LASER encodings are utilized in various traditional machine learning models to classify speeches based on ideological orientation and political affiliation. The models employed include Logistic Regression, Support Vector Classifier (SVC), Naive Bayes, Random Forest, Gradient Boosting, and XGBoost classifiers. These models leverage the numerical features derived from the embeddings to effectively categorize the speeches, distinguishing between different ideological leanings and political affiliations. To address the limitations encountered with traditional machine learning models, advanced deep learning architectures were also incorporated into the classification process. A multi-layered LSTM [15] architecture with an embedding layer was utilized to convert inputs into denser representations; regular and recurrent dropout were integrated to ensure the model’s ability to generalize well. Additionally, a simple neural network was employed, featuring an input layer, two hidden layers with ReLU activations, and dropout layers.
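Combining such base models through voting, as the abstract describes, can be sketched with scikit-learn. This is an illustrative sketch, not the submitted system: the feature matrix is a random placeholder standing in for TF-IDF/LASER features, XGBoost and the Keras LSTM are omitted to keep the example self-contained, and an MLPClassifier stands in for the simple feed-forward network:

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder features standing in for TF-IDF vectors or 1,024-dim LASER embeddings.
X = rng.normal(size=(200, 32))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic binary labels

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svc", SVC(probability=True)),  # probability=True enables soft voting
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32),
                              max_iter=500, random_state=0)),
    ],
    voting="soft",  # average predicted class probabilities across models
)
ensemble.fit(X, y)
print(ensemble.score(X, y))  # training accuracy on the toy data
```

Soft voting averages each model's predicted probabilities rather than its hard labels, so a confident minority model can still sway the ensemble; with `voting="hard"` the `probability=True` flag on SVC would be unnecessary.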
Furthermore, a voting classifier was employed, combining the predictions of all the above classifiers, including the ML models, the LSTM, and the simple neural network, to enhance classification accuracy.

4.5. Integrating BERT
In an attempt to further enhance performance, we considered integrating BERT [16], a transformer-based model known for its contextual word embeddings. However, due to the demanding computational requirements and the unsatisfactory results obtained during the training phase, we decided not to integrate BERT into the classification pipeline. While BERT holds promise for improving classification accuracy by capturing important contextual information, our preliminary experimentation indicated that computational infrastructure constraints and performance limitations made it impractical for deployment in our project’s context. Additionally, as the base BERT model is language-specific and multilingual variants were not readily usable in our setup, we halted further testing and deployment of BERT.

5. Experimental Results
As the leaderboard results are yet to be released, we currently compare our outcomes solely with the baseline. In the Touché 2024 Task 2, our team, HALE Lab, explored two distinct approaches, and both demonstrated notable performance improvements over the baseline across a range of metrics. System 1 achieved an F1 score of 0.6055 for political orientation and 0.6724 for power identification; System 2 yielded promising results with an F1 score of 0.6154 for political orientation and 0.6983 for power identification, as shown in Table 1. Compared to the baseline, our methodologies consistently showed improved precision, recall, and F1 scores across various countries. These outcomes underscore the effectiveness of our approaches in deciphering political ideologies and power dynamics within parliamentary debates.
Table 1
Touché 2024 Task 2 preliminary results

Model      F1_orientation   F1_power
Baseline   0.569            0.640
System 1   0.6055           0.6724
System 2   0.6154           0.6983

6. Conclusion and Future Work
In this paper, the approaches designed to address the Touché 2024 Task 2 requirements, focusing on the identification of political ideologies and power structures within parliamentary debates, were presented. Our methodology involved leveraging diverse feature sets, including linguistic, contextual, and speaker-related features, and applying advanced classification models to accurately detect the political orientation and power status of speakers. Despite the complexities introduced by the multilingual and heterogeneous nature of the dataset, our experiments yielded significant insights into the ideological and power dynamics of parliamentary discourse. These findings underscore the importance of robust preprocessing and the integration of varied linguistic and contextual features to enhance model performance. As future work, we plan to optimize our model to identify the relationships between speeches, i.e., to determine which speeches are replies to others. This relational context is currently missing from the dataset but is crucial for a comprehensive understanding of parliamentary debates. Techniques such as dialogue act recognition and sequential modeling of the conversational flow between speeches will also be explored, along with including more languages, legislative contexts, and expert feedback to enhance the generalizability of our models.

Acknowledgments
We extend our sincere gratitude to the Touché Lab for providing us with the opportunity to participate in this challenging task and for their support throughout the process. Special thanks to Çağrı Çöltekin for his invaluable assistance.

References
[1] J. Kiesel, Ç. Çöltekin, M. Heinrich, M. Fröbe, M. Alshomary, B. De Longueville, T. Erjavec, N. Handke, M. Kopp, N.
Ljubešić, et al., Overview of Touché 2024: Argumentation systems, in: European Conference on Information Retrieval, Springer, 2024, pp. 466–473.
[2] G. Abercrombie, R. Batista-Navarro, Sentiment and position-taking analysis of parliamentary debates: A systematic literature review, Journal of Computational Social Science 3 (2020) 245–270.
[3] T. M. Doan, J. A. Gulla, A survey on political viewpoints identification, Online Social Networks and Media 30 (2022) 100208. URL: https://www.sciencedirect.com/science/article/pii/S246869642200012X. doi:10.1016/j.osnem.2022.100208.
[4] T. M. Doan, B. Kille, J. A. Gulla, SP-BERT: A language model for political text in Scandinavian languages, in: Natural Language Processing and Information Systems, Lecture Notes in Computer Science, Springer Nature Switzerland, Cham, 2023, pp. 467–477.
[5] D. Preoţiuc-Pietro, Y. Liu, D. Hopkins, L. Ungar, Beyond binary labels: Political ideology prediction of Twitter users, in: R. Barzilay, M.-Y. Kan (Eds.), Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 729–740. URL: https://aclanthology.org/P17-1068. doi:10.18653/v1/P17-1068.
[6] J. Barnes, S. Touileb, L. Øvrelid, E. Velldal, Lexicon information in neural sentiment analysis: A multi-task learning approach, in: M. Hartmann, B. Plank (Eds.), Proceedings of the 22nd Nordic Conference on Computational Linguistics, Linköping University Electronic Press, Turku, Finland, 2019, pp. 175–186. URL: https://aclanthology.org/W19-6119.
[7] S. Bhatia, D. P, Topic-specific sentiment analysis can help identify political ideology, in: A. Balahur, S. M. Mohammad, V. Hoste, R. Klinger (Eds.), Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Association for Computational Linguistics, Brussels, Belgium, 2018, pp. 79–84. URL: https://aclanthology.org/W18-6212. doi:10.18653/v1/W18-6212.
[8] M. Ahmadalinezhad, M. Makrehchi, Detecting agreement and disagreement in political debates, in: Social, Cultural, and Behavioral Modeling: 11th International Conference, SBP-BRiMS 2018, Washington, DC, USA, July 10–13, 2018, Proceedings 11, Springer, 2018, pp. 54–60.
[9] T. Erjavec, M. Ogrodniczuk, P. Osenova, N. Ljubešić, K. Simov, A. Pančur, M. Rudolf, M. Kopp, S. Barkarson, S. Steingrímsson, Ç. Çöltekin, J. de Does, K. Depuydt, T. Agnoloni, G. Venturi, M. Calzada Pérez, L. D. de Macedo, C. Navarretta, G. Luxardo, M. Coole, P. Rayson, V. Morkevičius, T. Krilavičius, R. Darģis, O. Ring, R. van Heusden, M. Marx, D. Fišer, The ParlaMint corpora of parliamentary proceedings, Language Resources and Evaluation 57 (2022) 415–448. doi:10.1007/s10579-021-09574-0.
[10] M. Honnibal, I. Montani, spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing, To appear 7 (2017) 411–420.
[11] S. Bird, E. Klein, E. Loper, Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, O’Reilly Media, Inc., 2009.
[12] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C. D. Manning, Stanza: A Python natural language processing toolkit for many human languages, arXiv preprint arXiv:2003.07082 (2020).
[13] J. Ramos, et al., Using TF-IDF to determine word relevance in document queries, in: Proceedings of the First Instructional Conference on Machine Learning, volume 242, Citeseer, 2003, pp. 29–48.
[14] M. Artetxe, H. Schwenk, Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond, Transactions of the Association for Computational Linguistics 7 (2019) 597–610.
[15] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780.
[16] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).