<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Munibuc at Touché: Generalist Embeddings for Ideology and Populism Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marius Marogel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Silviu Gheorghe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Bucharest</institution>
          ,
          <addr-line>Academiei 14, Bucharest, 010014</addr-line>
          ,
          <country country="RO">Romania</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
<p>Recent generalist text embedding (gte) models with customized instructions are designed to be used in many NLP applications such as classification, clustering, or retrieval. We use Nvidia's NV-Embed-v2, which has a Mistral-7B backbone, with task-based instructions for Sub-Task 1 (ideology detection) and Sub-Task 3 (populism detection) to extract features for classification. Combined with a Support Vector Classifier, our system outperforms the proposed baseline and demonstrates that generalist embedding models can be reliably tailored to tasks on which they were not trained.</p>
      </abstract>
      <kwd-group>
        <kwd>generalist embeddings</kwd>
        <kwd>customized instructions</kwd>
        <kwd>LLMs for feature extraction</kwd>
        <kwd>ideology detection</kwd>
        <kwd>populism detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The political opinion of an individual can be seen as a point in a high-dimensional vector space. People hold beliefs about which public policies are most desirable, such as taxation and redistribution, about ethical matters, spiritual questions, and so on. These possible choices can be seen as separate dimensions of the political space. These separate dimensions, however, are often quite well correlated, so, for example, the position of an individual on taxation is quite a good predictor of their position on other, seemingly unrelated matters, such as planned parenthood. Because of this, the political space can be mapped onto lower-dimensional spaces such as a line or a plane. Examples of such spaces are the left-right spectrum [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] or the left-right plus authoritarian-libertarian plane usually associated with the political compass. This represents a form of dimensionality reduction. The terms left and right were coined, with the meanings we use today, after the French Revolution (1789), based on the seating positions that various factions took in the National Assembly. Another aspect, besides the policies to be implemented, is the strategy employed in taking or holding power. Populism is sometimes seen as a strategy that aims to win votes by dividing society into two opposing groups, the us-vs-them paradigm, usually the ‘common people’ vs ‘the elites’. Political leaning and populism can usually be detected in a discourse due to the different use of language associated with different politics and strategies.
      </p>
      <p><bold>1.1. Text classification</bold></p>
      <p>
        The proposed tasks can therefore be seen as text classification, an activity that is almost as old as written language itself. We know that the library of Alexandria, for example, was organized into sections according to subject, to help scholars find the works relevant to their field of study. Text classification used to be a human task but, with the advance of digital representations of text, it became central to information technology. Early text classification techniques were based on boolean or statistical operations on the terms found in a document. They usually consisted of a 3-step process: feature extraction that digitizes the text (usually into a high-dimensional vector space), optionally followed by dimensionality reduction, and finally some kind of classification algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Most of the early methods were essentially bag-of-words methods, meaning that they did not consider the position of the words or the relationships between them. Using groups of N words (N-grams) was an early attempt to partially address this problem, but only a temporary one. One of the first methods to change this was Long Short-Term Memory (LSTM) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which retains the context of the text seen so far and makes sense of words as they are encountered, with practical applications in text translation, sound processing, and so on, arguably paving the way for the modern attention mechanism some 20 years later.
      </p>
      <p>
        A breakthrough in text processing happened with the invention of the modern quadratic transformer [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and its attention mechanism, which finally allows words to be understood in context. This enabled the emergence of pretrained language models (PLMs): language models trained, usually in an unsupervised fashion, on a large corpus of text, allowing them to obtain a general understanding of language and the world.
      </p>
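      <p>The classic 3-step pipeline described above can be sketched with scikit-learn; the toy corpus, the TruncatedSVD reduction step, and all parameter choices below are illustrative assumptions, not the system described later in this paper.</p>
      <p>
```python
# Illustrative sketch of the classic 3-step text classification pipeline:
# feature extraction, optional dimensionality reduction, classification.
# Corpus and parameters are toy choices, not the shared-task setup.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression

texts = [
    "taxes should be raised to fund public healthcare",
    "we must cut taxes and shrink the government",
    "public healthcare needs more state funding",
    "deregulation and lower taxes help businesses grow",
]
labels = ["left", "right", "left", "right"]

pipeline = Pipeline([
    ("features", TfidfVectorizer()),           # step 1: digitize the text
    ("reduce", TruncatedSVD(n_components=2)),  # step 2: optional reduction
    ("classify", LogisticRegression()),        # step 3: classification
])
pipeline.fit(texts, labels)
print(pipeline.predict(["lower taxes for small businesses"]))
```
      </p>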
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The use of pre-trained models for text classification started when Yin et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposed using the entailment problem (a task for which some PLMs are pre-trained) as a form of text classification. Specifically, to determine whether a text refers to a certain subject, say sports, the authors check whether the text to be classified entails a statement of the form "the previous text is about sports". If the problem is multi-class, such an inference is performed for each class and the one with the highest confidence is presumed to be the correct class. Zhang et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] determined that fine-tuning PLMs on various available tasks, where each task is described in simple language, enables the models to perform new tasks described in the same way. It was also shown [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ] that dividing the available NLP tasks into groups and fine-tuning on some clusters improves the performance on other, unseen and unrelated clusters. Foundation models are PLMs that are pretrained on general data with the specific purpose of being further adapted to various target tasks: the general structure of the language is learned during the initial training and the specific details are learned later. The target task can be specified in two different ways: fine-tuning and instruction tuning. Fine-tuning involves a training step on task-specific examples, which can lead to catastrophic forgetting. Instruction tuning leverages a natural language task description during the learning process.
      </p>
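      <p>The entailment-based classification scheme of Yin et al. can be sketched as follows; the <code>toy_nli</code> scorer is a hypothetical stand-in for a real NLI model, used only to show the argmax-over-hypotheses logic.</p>
      <p>
```python
# Sketch of entailment-based zero-shot classification (Yin et al. [5]).
# For each candidate class we form the hypothesis "the previous text is
# about X" and keep the class whose entailment confidence is highest.

def classify_by_entailment(text, classes, nli_score):
    """Return the class whose hypothesis is most strongly entailed."""
    hypotheses = {c: f"the previous text is about {c}" for c in classes}
    scores = {c: nli_score(premise=text, hypothesis=h)
              for c, h in hypotheses.items()}
    return max(scores, key=scores.get)

# Toy NLI model: scores by keyword overlap, for illustration only.
def toy_nli(premise, hypothesis):
    topic = hypothesis.rsplit(" ", 1)[-1]
    return premise.lower().count(topic)

print(classify_by_entailment(
    "the match ended with a late goal, a great day for sports fans",
    ["sports", "politics", "economy"],
    toy_nli,
))
```
      </p>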
      <p>
        Political ideology identification is a reasonably well-studied task, as shown by a recent survey [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Machine learning is often used to automatically classify news by political leaning. Of special interest are the solutions submitted in the previous year, described in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], because they address the exact same type of problem. In [11], multiple PLMs are used, forming an ensemble model; the work also makes use of data augmentation through back-translation. In [12], the authors use a BERT model fine-tuned on the English translation of the parliamentary texts. Other teams experimented with various classical machine learning models for feature extraction and classification, such as TF-IDF, SVM, and KNN, as well as deep learning methods.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>The solution we propose for the Ideology and Power Identification in Parliamentary Debates 2025 shared task is an automated system consisting of two stages: feature extraction and classification. We customize a generalist embedding model, Nvidia’s NV-Embed-v2 [13], with task-based instructions for orientation and populism detection.</p>
      <p>Generalist embedding models are a recent trend in Representation Learning with many applications
in NLP. The generalist attribute of these models refers to their ability to capture relevant neural
representations for a wide range of NLP tasks and subfields. Their performance is evaluated on massive
benchmarks such as MTEB [14] (Massive Text Embedding Benchmark), which contain tasks such as
text classification, clustering, retrieval, or sentence similarity.</p>
      <p>We choose Nvidia’s NV-Embed-v2 model to extract embeddings for both orientation and populism detection. NV-Embed-v2 is a generalist embedding model that ranked No. 1 on MTEB as of May 2024 and August 2024 and No. 5 as of May 2025. The authors of NV-Embed propose a latent attention layer on top of the pre-trained Mistral-7B [15], a decoder-only LLM, together with a two-stage contrastive training process: in the first stage they apply contrastive learning only on retrieval datasets, with curated hard negatives and in-batch negatives, while in the second stage they blend in the non-retrieval datasets and disable the use of in-batch negatives.</p>
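      <p>The contrastive objective behind this two-stage recipe can be illustrated with a minimal InfoNCE-style loss; this is a didactic sketch with made-up vectors and a toy temperature, not NV-Embed’s implementation.</p>
      <p>
```python
# Minimal numpy illustration of the contrastive objective behind the
# two-stage recipe: stage one scores a query against its positive, its
# hard negatives, and the other in-batch examples; stage two drops the
# in-batch negatives. A didactic sketch, not NV-Embed's code.
import numpy as np

def info_nce(q, pos, hard_negs, in_batch_negs, use_in_batch=True, tau=0.05):
    """Cross-entropy over cosine similarities; lower is better for (q, pos)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    negs = list(hard_negs) + (list(in_batch_negs) if use_in_batch else [])
    logits = np.array([cos(q, pos)] + [cos(q, n) for n in negs]) / tau
    logits = logits - logits.max()            # for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                  # positive sits at index 0

rng = np.random.default_rng(0)
q, pos = rng.normal(size=8), rng.normal(size=8)
hard = [rng.normal(size=8)]
batch = [rng.normal(size=8) for _ in range(3)]
print(info_nce(q, pos, hard, batch, use_in_batch=True))
print(info_nce(q, pos, hard, batch, use_in_batch=False))  # stage-two variant
```
      </p>
      <p>Dropping the in-batch terms shrinks the softmax denominator, so the stage-two variant always yields a smaller loss for the same pair.</p>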
      <p>We use custom instructions to obtain specialized embeddings for each Sub-Task from the English translation of the parliamentary texts, building instructed queries as proposed by the authors, following prompt (1):</p>
      <p>• Orientation Instruction: "Classify the orientation of the political speech as either left or right"
• Populism Instruction: "Identify the position of the speaker’s party on the populist–pluralist scale. Classify it as one of the following: 1 (Strongly Pluralist), 2 (Moderately Pluralist), 3 (Moderately Populist), or 4 (Strongly Populist)."</p>
      <p>q_inst = "Instruct: {task_definition} Query: {q}" (1)</p>
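      <p>A minimal sketch of how the instructed queries of prompt (1) can be assembled; the template string mirrors the format above, while the helper name and the way the strings are later passed to the embedding model are assumptions.</p>
      <p>
```python
# Assembling instructed queries following prompt (1). How the resulting
# strings are fed to NV-Embed-v2 (an encode-style call) is an assumption
# that depends on the model's API.
TEMPLATE = "Instruct: {task_definition} Query: {query}"

ORIENTATION = "Classify the orientation of the political speech as either left or right"
POPULISM = (
    "Identify the position of the speaker's party on the populist-pluralist "
    "scale. Classify it as one of the following: 1 (Strongly Pluralist), "
    "2 (Moderately Pluralist), 3 (Moderately Populist), or 4 (Strongly Populist)."
)

def instructed_query(task_definition, speech):
    # Hypothetical helper: fills the template with one instruction and speech.
    return TEMPLATE.format(task_definition=task_definition, query=speech)

q = instructed_query(ORIENTATION, "We will lower taxes for every family.")
print(q)
```
      </p>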
      <p>We use the resulting embeddings in a Support Vector Classifier (SVC) model for each task. We split the training embeddings into 5 folds to test the effect of varying the C hyperparameter of the SVC implementation from scikit-learn [16] with a linear kernel.</p>
      <p>Hyper-parameter selection. The regularization parameter C is varied logarithmically between 10<sup>-2</sup> and 10<sup>2</sup>, in 20 steps, separately for each parliament. The best value is selected using GridSearchCV with five cross-validation splits.</p>
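      <p>The search described above maps directly onto scikit-learn; the synthetic dataset below is an assumption standing in for the per-parliament embeddings.</p>
      <p>
```python
# Hyper-parameter search as described: C varied logarithmically between
# 1e-2 and 1e2 in 20 steps, linear-kernel SVC, five-fold GridSearchCV.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for one parliament's embedding matrix and labels.
X, y = make_classification(n_samples=200, n_features=32, random_state=0)

param_grid = {"C": np.logspace(-2, 2, 20)}
search = GridSearchCV(
    SVC(kernel="linear"),
    param_grid,
    cv=5,                  # five cross-validation splits
    scoring="f1_macro",    # the shared task's metric
)
search.fit(X, y)
print(search.best_params_["C"])
```
      </p>
      <p>The selected C is then reused to train the final per-parliament classifier.</p>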
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussions</title>
      <p>We evaluate the embeddings with SVC on a held-out validation set before training the final model for each parliament. Using five-fold cross-validation and grid search, we determine the best hyperparameter C for each country. We then compare the predictions on the held-out set with those of the baseline model, weighted tf-idf features with logistic regression (tf-idf+lr), trained and evaluated on the same split. Table 1 compares our system (nv-embed+svc) with the baseline on the held-out validation set for the orientation task (Sub-Task 1). Our system outperforms the baseline tf-idf+lr on all parliaments, with the exception of es-ct, where performance is equal, and es-pv, where there is a very large difference caused by the absence of left-wing samples in the held-out validation set: the model achieves an F1 of 0.98 on right-wing and 0.00 on left-wing, hence the 0.49 macro F1.</p>
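      <p>The 0.49 macro F1 on es-pv follows directly from macro averaging, which takes the unweighted mean of the per-class F1 scores:</p>
      <p>
```python
# Why es-pv scores 0.49 macro F1: macro averaging takes the unweighted
# mean of per-class F1, so 0.98 on right-wing and 0.00 on the absent
# left-wing class average to 0.49.
per_class_f1 = {"right": 0.98, "left": 0.00}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(round(macro_f1, 2))  # 0.49
```
      </p>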
      <p>We present the results on the held-out validation set for the populism task in Table 2. As we can see, for this Sub-Task there are multiple parliaments for which our method does not beat the baseline, sometimes with a very large difference in performance. For example, in the pl parliament the difference is 30% and in es-ga the difference is 20% in our favor, while in gb the difference is 29% in favor of the baseline.</p>
      <p>In Table 3, we present the results of the Orientation Sub-Task on the test set. NV-Embed-v2 features with SVC obtain an average of 0.66 macro F1, outperforming the logistic regression baseline at 0.57 macro F1. The same difference in performance is seen on the Populism Sub-Task in Table 4, where our system reaches 0.496 macro F1 compared to 0.418 macro F1 for the baseline. With an increase in performance of 8-9 points on both Sub-Tasks, we see the benefit of tackling NLP tasks with customizable representations based on generalist text embedding models.</p>
      <p>Future research. Given the results of NV-Embed-v2 with SVC on the test set, we consider generalist text embedding models relevant feature extractors for both Sub-Tasks. Future experiments with different top-ranking models from MTEB are needed to assess the proposed methodology over a range of LLM-based embedding models. While we use pre-trained embedding models, fine-tuning them directly on the task or using contrastive learning techniques to retrieve political documents could uncover the full potential of generalist LLM-based text representations.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we present the submission of team Munibuc for Sub-Tasks 1 and 3 (orientation and populism) of the Ideology and Power Identification in Parliamentary Debates 2025 shared task. Our approach is to use generalist text embedding models as feature extractors, thus evaluating the generalization capabilities of LLM-based embeddings on specific datasets. We extract task-based embeddings with customized instructions from NV-Embed-v2, a model based on Mistral-7B. We then feed the extracted embeddings to a Support Vector Classifier with hyperparameters tuned for each parliament, resulting in an automated detection system which outperforms the baseline by a considerable margin on both the Orientation and Populism Sub-Tasks.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgments</title>
      <p>This work was supported by a grant of the Ministry of Research, Innovation and Digitization, CCCDI
UEFISCDI, project number PN-IV-P6-6.3-SOL-2024-0090, within PNCDI IV.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors did not use any Generative AI tool and take full responsibility for the publication’s content.</p>
      <p>[11] O. Palmqvist, J. Jiremalm, P. Picazo-Sanchez, Policy parsing panthers at Touché: Ideology and power identification in parliamentary debates, in: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), CEUR Workshop Proceedings, CEUR-WS.org, 2024.
[12] D. Chandar, D. Seshan, A. Koushik, P. Mirunalini, Trojan horses at Touché: Logistic regression for classification of political debates, 2024.
[13] C. Lee, R. Roy, M. Xu, J. Raiman, M. Shoeybi, B. Catanzaro, W. Ping, NV-Embed: Improved techniques for training LLMs as generalist embedding models, 2025. URL: https://arxiv.org/abs/2405.17428. arXiv:2405.17428.
[14] N. Muennighoff, N. Tazi, L. Magne, N. Reimers, MTEB: Massive text embedding benchmark, 2023. URL: https://arxiv.org/abs/2210.07316. arXiv:2210.07316.
[15] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, W. E. Sayed, Mistral 7B, 2023. URL: https://arxiv.org/abs/2310.06825. arXiv:2310.06825.
[16] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (2011) 2825-2830.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Faye</surname>
          </string-name>
          ,
          <article-title>Théorie du récit: introduction aux langages totalitaires: critique de la raison, l'économie narrative</article-title>
          (
          <year>1972</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kowsari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jafari Meimandi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Heidarysafa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mendu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Barnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <article-title>Text Classification Algorithms: A Survey</article-title>
          ,
          <source>Information</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>150</fpage>
          . URL: https://www.mdpi.com/2078-2489/10/4/150. doi:10.3390/info10040150.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural computation 9</source>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          , N. Parmar, et al.,
          <article-title>Attention is All you Need</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>30</volume>
          ,
          <year>2017</year>
          . URL: https://proceedings.neurips.cc/paper/2017/ hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <article-title>Benchmarking zero-shot text classification: Datasets, evaluation and entailment approach</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>3914</fpage>
          -
          <lpage>3923</lpage>
          . doi:10.48550/arXiv.1909.00161.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Instruction tuning for large language models: A survey</article-title>
          , 2024. URL: http://arxiv.org/abs/2308.10792. doi:10.48550/arXiv.2308.10792.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Guu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <article-title>Finetuned Language Models Are Zero-Shot Learners</article-title>
          , in: International Conference on Learning Representations,
          <year>2022</year>
          . URL: http://arxiv.org/abs/2109.01652. doi:10.48550/arXiv.2109.01652.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tan</surname>
          </string-name>
          , et al.,
          <article-title>NaturalSpeech 2: Latent diffusion models are natural and zero-shot speech and singing synthesizers</article-title>
          ,
          <source>in: ICLR</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Doan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Gulla</surname>
          </string-name>
          ,
          <article-title>A survey on political viewpoints identification</article-title>
          ,
          <source>Online Social Networks and Media</source>
          <volume>30</volume>
          (
          <year>2022</year>
          )
          <fpage>100208</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiesel</surname>
          </string-name>
          , Ç. Çöltekin,
          <string-name>
            <given-names>M.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alshomary</surname>
          </string-name>
          , B. De Longueville,
          <string-name>
            <given-names>T.</given-names>
            <surname>Erjavec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Handke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kopp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ljubešić</surname>
          </string-name>
          , et al.,
          <article-title>Overview of Touché 2024: Argumentation systems</article-title>
          ,
          <source>in: International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>308</fpage>
          -
          <lpage>332</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>