<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SEBD</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Representation of Linguistic Properties in Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elisabetta Rocchetti</string-name>
          <email>elisabetta.rocchetti@unimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alfio Ferrara</string-name>
          <email>alfio.ferrara@unimi.it</email>
        </contrib>
        <kwd-group>
          <kwd>Explainable AI</kwd>
          <kwd>Large Language Models</kwd>
          <kwd>Probing</kwd>
          <kwd>Linguistic Properties</kwd>
        </kwd-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Università degli Studi di Milano, Department of Computer Science</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>33</volume>
      <fpage>16</fpage>
      <lpage>19</lpage>
      <abstract>
        <p>This paper reviews studies evaluating the linguistic performance of large language models (LLMs), focusing on how they process and represent linguistic properties. We explore methods like probing classifiers and Iterative Nullspace Projection (INLP) to assess whether LLMs actively use encoded knowledge during inference, and how shifting representations can help evaluate model performance. We report findings showing that LLMs can encode formal properties, such as syntax and morphology, but are less proficient with functional phenomena like semantics, with monolingual models outperforming multilingual ones. We also highlight gaps in current evaluations, such as the limited testing of recent large models on a small set of tasks [1]. We suggest that future research could integrate different perspectives on what linguistic competence is.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>State-of-the-art Large Language Models (LLMs) have demonstrated an extraordinary ability to generate
human-like text. Their fluency suggests that they do more than just predict the next word based
on surface-level patterns observed during training. Instead, these models appear to capture deeper,
more abstract linguistic properties, enabling them to construct grammatically correct and contextually
appropriate sentences. However, the precise nature of these learned representations remains an open
question. Researchers in the field of Explainable AI (XAI) have been actively investigating which
linguistic properties LLMs encode, how they use them, and whether these properties are merely
incidental patterns or core components of their reasoning process.</p>
      <p>One fundamental observation is that LLMs go beyond simple word association. If a model merely
memorized sequences of words, it would struggle to generalize grammatical rules to new contexts.
However, research suggests that LLMs implicitly learn various linguistic structures that guide their
predictions. For example, models can often correctly conjugate verbs according to the subject of the
sentence, a phenomenon known as subject-verb agreement. Consider the sentence:
(1) The cat that chased the dog is sleeping.</p>
      <p>Even though the noun “dog” is closer to the verb “is sleeping”, the model correctly associates “is
sleeping” with “cat”, demonstrating an understanding of syntactic dependencies rather than relying
solely on proximity.</p>
      <p>
        XAI research has explored which linguistic properties LLMs encode and how this information is
structured within their internal components. In particular, researchers have investigated whether
specific neurons or layers store linguistic knowledge and how this knowledge contributes to the model’s
behavior. Identifying where particular information is stored is valuable for various applications, such
as controlling the model’s output [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or reducing model size by retaining only its most important
components [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. By analyzing internal representations—either through embeddings or neuron
activations—researchers have found evidence that LLMs capture linguistic phenomena in structured
ways.
      </p>
      <p>CEUR Workshop Proceedings, ISSN 1613-0073</p>
      <p>In this work, we present a comprehensive literature review of state-of-the-art methods for
investigating whether large language models (LLMs) encode knowledge of various linguistic properties. We begin
by formalizing the types of language models under consideration, as well as the linguistic phenomena
that can be analyzed. We then examine existing research on how LLMs represent linguistic information
within their internal layers, with a focus on studies that employ probing classifiers to uncover specific
linguistic features. Additionally, we explore approaches for evaluating whether the linguistic
information present in a model is actually utilized during inference. Finally, we synthesize empirical findings
from the literature, highlighting key results and validating the discussed methodologies.</p>
      <p>The paper is structured as follows: Section 2 introduces the notation that will be used to explain
key concepts; Section 3 presents examples of the linguistic properties that can be analyzed in LLMs;
Section 4 describes probing classifiers as a methodology for interpreting and analyzing the internal
representations of LLMs; Section 5 discusses methods for evaluating whether an LLM encodes specific
linguistic phenomena in its hidden representations. It also introduces Iterative Nullspace Projection
(INLP) as a method for selectively identifying and removing specific features from hidden representations;
Section 6 summarizes key insights regarding the linguistic competence of language models. It also
explores how language models’ ability to learn and encode linguistic properties varies when considering
different languages, particularly in the context of multilingual versus monolingual models; Section 7
concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Language models</title>
      <p>In this section, we introduce the notation that will be used to refer to LLMs throughout the rest of the
paper.</p>
      <p>Let ℳ be an LLM with L hidden layers, each containing n neurons. Each neuron in layer ℓ ∈ {1, …, L} is
connected to neurons in layer ℓ+1 through weighted connections with biases. ℳ refers to an LLM with
any architecture, including LSTM, RNN, or Transformer. Let W^(ℓ) ∈ ℝ^(n×n) be the weight matrix storing
the connection weights between neurons in layer ℓ and those in layer ℓ+1. Similarly, let b^(ℓ) ∈ ℝ^n be the
bias vector associated with layer ℓ. The model ℳ processes an input sequence represented as a matrix
X ∈ ℝ^(t×|V|), where t is the sequence length and |V| is the vocabulary size. Each row of X corresponds to a
one-hot encoded representation of a token from the input string.</p>
      <p>As the input propagates through the model, it is transformed into a series of latent representations.
Specifically, at each layer ℓ ∈ {1, …, L}, the model computes a hidden representation H^(ℓ) ∈ ℝ^(t×n), where
each row corresponds to a transformed token embedding in a latent space of dimension n. We denote
by h^(ℓ)_{i,j} the activation of the i-th neuron at the ℓ-th layer for the j-th token in the sequence.</p>
      <p>Finally, the model generates an output matrix Y ∈ ℝ^(t×|V|), where each row y_j represents a probability
distribution over the vocabulary, indicating the likelihood of each token being the next in the sequence
at position j. The final row, y_t, corresponds to the probability distribution for the next token following
the last input token. This output distribution is obtained by applying a final linear transformation to
the last hidden representation, followed by a softmax function.</p>
      <p>During decoding, the next token is typically selected by choosing the most probable token from y_t,
or by sampling from the distribution, depending on the inference strategy used.</p>
      <p>Table 1 summarizes the notation discussed in this section.</p>
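      <p>To make the notation concrete, the following minimal NumPy sketch mirrors the forward pass described above: one-hot inputs X, per-layer weights W^(ℓ) and biases b^(ℓ), hidden states H^(ℓ), and a softmax output Y. All sizes, the input projection, and the tanh activation are illustrative assumptions, not properties of any specific LLM.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
t, V, n, L = 4, 10, 8, 2  # sequence length, vocab size, neurons per layer, layers

X = np.eye(V)[rng.integers(0, V, size=t)]         # t x V one-hot token rows
W_in = rng.normal(size=(V, n))                    # input projection (assumed)
Ws = [rng.normal(size=(n, n)) for _ in range(L)]  # W^(l) between layers
bs = [rng.normal(size=n) for _ in range(L)]       # b^(l) bias vectors

H = X @ W_in
for W, b in zip(Ws, bs):
    H = np.tanh(H @ W + b)                        # hidden representation H^(l)

W_out = rng.normal(size=(n, V))                   # final linear transformation
logits = H @ W_out
Y = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # row-wise softmax

next_token = int(Y[-1].argmax())                  # greedy decoding from y_t
```

      <p>Each row of Y sums to one, and the final row y_t drives next-token selection, either greedily via argmax or by sampling.</p>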
    </sec>
    <sec id="sec-3">
      <title>3. Linguistic properties</title>
      <p>
        In this section, we present examples of the linguistic properties that can be analyzed in LLMs. Prior
work, such as [
        <xref ref-type="bibr" rid="ref1 ref5">1, 5</xref>
        ], categorizes these properties into two broad types: formal and functional linguistic
phenomena.
      </p>
      <p>Formally, we define the set of all possible linguistic properties as P = {p_1, …, p_k}, where each property
p_i ∈ P is associated with a set of possible values or labels y(p_i). For instance, consider the linguistic
property of subject-verb agreement: this phenomenon can take two possible values, {correct, incorrect},
indicating whether the verb is correctly conjugated to match its corresponding subject. Additional
examples of linguistic properties and their respective labels can be found in Table 2.</p>
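      <p>As a small illustrative sketch (the property names and label sets below are hypothetical examples in the spirit of Table 2, not the paper's exact inventory), the set P and its label sets y(p_i) can be represented as a simple mapping:</p>

```python
# Hypothetical set P of linguistic properties, each with its label set y(p_i)
properties = {
    "subject_verb_agreement": {"correct", "incorrect"},
    "grammatical_number": {"singular", "plural"},
    "part_of_speech": {"noun", "verb", "adjective", "other"},
}

# An annotated example pairs an input with a property and one of its labels
sentence = "The cat that chased the dog is sleeping."
prop, label = "subject_verb_agreement", "correct"
assert label in properties[prop]
```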
    </sec>
    <sec id="sec-4">
      <title>4. Probing classifiers</title>
      <p>
        Probing classifiers represent a powerful methodology for detecting linguistic knowledge in internal
representations of LLMs. The core idea behind probing is to train an external classifier to predict a
specific linguistic property based on the representations extracted from the model [
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9 ref10">6, 7, 8, 9, 10</xref>
        ].
      </p>
      <p>Formally, a probing classifier can be defined as a function g_θ ∶ H^(ℓ) → y(p_i), which maps a hidden
representation H^(ℓ) from layer ℓ to a label y(p_i) associated with a linguistic property p_i. The classifier’s
weights θ are trained on an annotated dataset, where the inputs consist of hidden representations
extracted from the LLM, and the outputs are labels corresponding to a specific linguistic phenomenon.
The performance of g_θ provides insight into how well the internal representations encode the target
property. If a probe g_θ successfully predicts the property (achieving high accuracy), it suggests that the
linguistic feature in question is encoded within the model’s hidden states. Conversely, if the probe fails,
this may indicate that the information is either absent or not accessible in the given representation.</p>
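      <p>A minimal sketch of such a probe, using synthetic data in place of real hidden representations H^(ℓ) (in practice these would be extracted from an LLM) and a least-squares linear model standing in for the classifier g_θ:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 32                        # number of tokens, hidden dimension

# Synthetic stand-in for H^(l): the binary property (e.g. singular vs.
# plural subject) is linearly encoded along one coordinate.
y = rng.integers(0, 2, size=n)        # labels y(p_i) in {0, 1}
H = rng.normal(size=(n, d))
H[:, 0] += 3.0 * y                    # inject the property into neuron 0

# Train a linear probe on 400 examples, evaluate on the held-out 100
A_tr = np.c_[H[:400], np.ones(400)]   # add a bias column
w, *_ = np.linalg.lstsq(A_tr, y[:400], rcond=None)
pred = (np.c_[H[400:], np.ones(100)] @ w > 0.5).astype(int)
acc = (pred == y[400:]).mean()        # held-out probing accuracy
```

      <p>High held-out accuracy suggests the property is (linearly) present in the representations; chance-level accuracy suggests it is absent or not linearly accessible.</p>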
      <p>To illustrate, consider the task of subject-verb agreement—a fundamental grammatical rule in many
languages. A probing classifier can be trained to predict whether a verb correctly agrees in number
with its subject based on the hidden states of an LLM. Suppose we extract the representation H^(ℓ) from
a given layer of the model while processing a variant of sentence (1): “The cat that chased the dogs is sleeping.” The
correct verb form must agree with the singular subject cat, despite the presence of the plural noun
dogs intervening between them. A well-trained probe should be able to infer that the subject is singular
based on H^(ℓ) and correctly predict this agreement pattern. If the probe achieves high accuracy across
various test cases, this suggests that the LLM’s internal representations contain information about
subject-verb agreement. However, if the probe struggles to distinguish correct from incorrect verb
forms, it may imply that this grammatical rule is not explicitly encoded in the model’s representations
or is only accessible in a more complex, non-linear manner.</p>
      <p>
        Probing classifiers have been used extensively to investigate a variety of linguistic properties,
including syntactic structures and part-of-speech information. For instance, previous studies have shown
that linear transformations of LLM representations can encode syntax tree structures [
        <xref ref-type="bibr" rid="ref9 ref7">9, 7</xref>
        ] and
part-of-speech tags [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Specifically, in cases like those mentioned, where the probing classifier is trained
as a linear classifier, it can be inferred that the information it learns is also represented linearly in the
model. By systematically applying probes to different layers and architectures, researchers can gain
deeper insights into how linguistic information is distributed within LLMs and how different levels of
abstraction are captured across layers.
      </p>
      <p>In the following sections (4.1 and 4.2), we illustrate how LLMs’ internal representations can be used
to train a probing classifier for detecting and localizing linguistic properties.</p>
      <sec id="sec-4-1">
        <title>4.1. Probing linguistic properties via embeddings</title>
        <p>One way to analyze the linguistic knowledge stored in an LLM is by evaluating its internal
representations, specifically the embeddings produced at different layers. This approach considers all neurons in a
layer collectively, working with the hidden representation h^(ℓ) rather than isolating the contribution of
individual neurons. This analysis does not attempt to pinpoint where linguistic information is stored
but instead assesses whether the model encodes specific linguistic properties at all.</p>
        <p>
          A common method involves extracting the latent representations of tokens from the last layer of
the network, H^(L), and using them as input to probing classifiers. These classifiers are trained to
predict linguistic features from the embeddings, providing insight into the extent to which the model
ℳ captures syntactic or semantic information [
          <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9 ref10">6, 7, 8, 9, 10</xref>
          ].
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Neuron ranking</title>
        <p>To go beyond embedding-level analysis, some studies focus on ranking individual neurons based on
their relevance to a specific linguistic property. Instead of treating the entire hidden representation as a
single feature vector, this method evaluates the contribution of individual neurons by analyzing their
activations and how they influence external classifiers.</p>
        <p>This approach involves training a probing classifier to predict a linguistic feature while using neuron
activations as input features. By examining how much each neuron contributes to the classifier’s
prediction, researchers can rank neurons according to their importance. If this ranking is meaningful,
we should be able to retain only the top-k neurons and still achieve high accuracy with the probe. If
the probe maintains strong performance using only a small subset of neurons, this indicates that the
ranking effectively identifies the most informative neurons.</p>
        <p>Focusing on linear probes, Dalvi et al. [11] propose training a linear classifier to classify the values y(p_i)
of a linguistic property p_i (e.g., subject-verb agreement, part-of-speech tagging, morphological analysis,
or semantic tagging). The importance of a neuron for classifying p_i can be inferred from the magnitude
of its associated classifier weight. Specifically, to rank neurons by their relevance, the authors extract
the weight vector w^(ℓ) ∈ ℝ^n from the trained probe and sort its elements by absolute value in descending
order.</p>
        <p>By ranking neurons, [11, 12, 13] have demonstrated that linguistic knowledge is distributed across
neurons, though not uniformly. Instead, it is often concentrated in a subset of highly ranked neurons,
rather than being evenly spread across all neurons in the model. This finding supports the idea that
some neurons specialize in encoding specific linguistic properties, making it possible to extract and
interpret this information with targeted analysis.</p>
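        <p>The ranking procedure can be sketched as follows, again on synthetic activations where, by construction, only three neurons carry the property (a stand-in for the concentration effect reported in [11]; the least-squares probe and all sizes are illustrative assumptions):</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 600, 64, 5                  # tokens, neurons, retained top-k

y = rng.integers(0, 2, size=n)
H = rng.normal(size=(n, d))
H[:, :3] += 2.5 * y[:, None]          # only neurons 0-2 encode the property

# Linear probe; neuron importance = |weight|, as in the ranking of [11]
w, *_ = np.linalg.lstsq(np.c_[H, np.ones(n)], y, rcond=None)
ranking = np.argsort(-np.abs(w[:d]))  # neurons sorted by descending |weight|
top_k = ranking[:k]

# Refit using only the top-k neurons: accuracy should remain high
w2, *_ = np.linalg.lstsq(np.c_[H[:, top_k], np.ones(n)], y, rcond=None)
pred = (np.c_[H[:, top_k], np.ones(n)] @ w2 > 0.5).astype(int)
acc_topk = (pred == y).mean()
```

        <p>If accuracy with only the top-k neurons stays close to the full-probe accuracy, the ranking has isolated the most informative neurons.</p>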
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Assessing and controlling the usage of linguistic features in LLMs</title>
      <p>So far, we have discussed methods for evaluating whether an LLM encodes specific linguistic phenomena
in its hidden representations. However, probing classifiers and similar analysis techniques primarily
reveal whether linguistic properties are present in the model’s internal representations—they do not
indicate whether and how these properties are actually used during inference [14].</p>
      <p>To determine whether an LLM actively uses a certain linguistic property during inference and to
modify the model’s behavior accordingly, Iterative Nullspace Projection (INLP) [15] provides a method
for selectively identifying and removing specific features from hidden representations. In the original
paper, the authors successfully identified gender information encoded in the model’s contextual
representations. By removing this information, they observed that the model’s predictions were no longer
biased, indicating that the original model had indeed relied on gender-related features in its hidden
representations.</p>
      <sec id="sec-5-1">
        <title>5.1. Removing linguistic information via nullspace projection</title>
        <p>Formally, let h^(ℓ)_j ∈ ℝ^n be the hidden representation of the j-th token in the input sequence at layer
ℓ, and let y(p_i) represent the corresponding value of a linguistic property p_i. Following [15], consider
gender as an example of such a linguistic property, where y(p_i) can take values from {male, female}.</p>
        <p>The objective is to find a transformation f ∶ ℝ^n → ℝ^n such that the transformed representation f(h^(ℓ)_j)
no longer encodes information about p_i. In other words, after applying f(⋅), it should not be possible to
predict p_i from the transformed representation.</p>
        <p>At the same time, f(h^(ℓ)_j) should preserve as much other information as possible to avoid degrading
the overall performance of the language model. For instance, in the case of gender bias removal, the
transformed representation f(h^(ℓ)_j) should still be useful for the original task (e.g., text generation or
classification) but should no longer encode gender-related information.</p>
        <p>To construct such a transformation f(⋅), the authors first train a linear probing classifier g to predict
the linguistic property p_i from the hidden representations. This classifier has an associated weight vector
w ∈ ℝ^n, which captures the most predictive direction for p_i in the representation space. This weight
vector is used to construct a projection matrix P ∈ ℝ^(n×n) such that w⊤(P h^(ℓ)_j) = 0 for all j. This ensures that
the transformed representation P h^(ℓ)_j lies in the null space of w, effectively removing the information
related to p_i.</p>
        <p>The projection matrix P is obtained as follows:</p>
        <p>1. Compute the null space of w, defined as N(w) = {h^(ℓ) ∣ w⊤h^(ℓ) = 0}.</p>
        <p>2. Construct the projection matrix P_N(w) using the basis vectors of N(w).</p>
        <p>3. Obtain the final transformation by projecting onto the orthogonal complement: P_N(w) h^(ℓ).</p>
        <p>Figure 1: In the first execution (left), the LLM predicts “is” for the masked position, recognizing the singular subject. After
shifting the hidden representation of “cat” to the plural subspace using INLP, the model is asked to predict again
(right) and now predicts “are”, aligning with the plural subject. This figure is inspired by [17].</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.2. Iterative removal and counterfactual generation</title>
        <p>To ensure that all traces of p_i are removed, this procedure is applied iteratively. Specifically, additional
probing classifiers g′ are trained to detect any remaining information about p_i in the transformed
representations. If a new classifier g′ still predicts p_i with above-random accuracy, a new weight vector
w′ is obtained and used in the next iteration of the null-space projection process. This iterative procedure
continues until no further linear information about p_i can be extracted from the representations.</p>
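        <p>A compact sketch of this iterative procedure on synthetic data (a least-squares probe stands in for the linear classifiers; the stopping threshold and all sizes are illustrative assumptions, not values from [15]):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 16
y = rng.integers(0, 2, size=n)        # binary property, e.g. gender
H = rng.normal(size=(n, d))
H[:, 0] += 2.0 * y                    # two directions carry the property
H[:, 1] -= 1.5 * y

def probe(H, y):
    """Least-squares linear probe; returns weight vector and accuracy."""
    Hc = H - H.mean(axis=0)
    w, *_ = np.linalg.lstsq(Hc, y - y.mean(), rcond=None)
    acc = ((Hc @ w > 0) == (y == 1)).mean()
    return w, acc

_, acc_before = probe(H, y)           # well above chance initially

H_proj = H.copy()
for _ in range(10):
    w, acc = probe(H_proj, y)
    if acc > 0.55:                    # probe still succeeds: remove its direction
        w = w / np.linalg.norm(w)
        H_proj = H_proj @ (np.eye(d) - np.outer(w, w))
    else:                             # probe near chance: p_i linearly removed
        break

_, acc_after = probe(H_proj, y)       # close to chance level
```

        <p>Re-running the model (or a fresh probe) on the projected representations and observing a change in behavior is what licenses the conclusion that the removed feature was actually used during inference.</p>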
        <p>INLP not only enables the complete removal of specific information from hidden representations but
also provides a principled framework for controlling how linguistic features are encoded and utilized
by LLMs. Crucially, by applying INLP and observing changes in model predictions, one can empirically
verify that a given piece of information was indeed being used by the model during inference.</p>
        <p>Beyond its application to mitigating bias, INLP has been employed to analyze other linguistic
phenomena, such as subject-verb agreement in relative clauses [16, 17] and a broader range of syntactic
structures [18]. In these studies, rather than simply removing number information from hidden
representations, the authors used INLP to generate counterfactual examples. For example, consider
an LLM’s performance in predicting the correct verb agreement with the subject in the sentence “The
cat (SUBJECT) that chased the dog [MASK] (VERB) sleeping” as shown in Figure 1. The model is
considered to perform well if it predicts ℙ(is) &gt; ℙ(are) for the masked position, recognizing that the
subject is singular. To test this, the projected hidden representation of cat (h(c)at) is shifted within the
number subspace identified by INLP, specifically towards the plural subspace (illustrated by the red
line in Figure 1). This shift generates a counterfactual example, represented by the blue dot in the
plural subspace. If the probabilities change upon re-evaluation using the counterfactual representation
instead of the original, it suggests that the model was indeed using number information to predict the
correct agreement. This demonstrates that LLMs encode and utilize grammatical number in a structured
manner.</p>
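        <p>The counterfactual shift itself reduces to moving a representation along the probe-identified direction. A hedged sketch (w and h_cat are random stand-ins; in the cited studies they come from a trained probe and the LLM, respectively, and the step size is a tunable assumption):</p>

```python
import numpy as np

rng = np.random.default_rng(2)
d = 16
w = rng.normal(size=d)
w = w / np.linalg.norm(w)             # unit "number" direction from a probe (hypothetical)

h_cat = rng.normal(size=d)            # hidden representation of "cat" (synthetic)
score = h_cat @ w                     # sign encodes singular vs. plural (assumed)

alpha = 2.0                           # how far into the plural subspace to move
h_counterfactual = h_cat + (alpha - score) * w   # now lies on the plural side

# Only the component along w changes; everything orthogonal is preserved
residual = (h_counterfactual - h_cat) - (alpha - score) * w
```

        <p>Feeding h_counterfactual back into the model in place of h_cat and checking whether ℙ(are) overtakes ℙ(is) is the test for whether number information is causally used.</p>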
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Results on the linguistic competence of language models</title>
      <p>
        Waldis et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] conduct an evaluation using the classifier-based probing approach described in Section 4,
analyzing classification performance across various linguistic tasks by treating the model’s final-layer
embeddings as input to the classifiers. The authors measured classification performance by using
different metrics according to the analyzed task (e.g., in subject-verb agreement, the task metric is
the accuracy of predicting the correct verb given the subject). Building on their findings, here we
summarize key insights regarding the linguistic competence of language models; Figure 2 is reported
here from Waldis et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to depict the linguistic task performance.
      </p>
      <p>While language models excel at capturing local syntactic properties, such as part-of-speech tagging,
they face greater challenges when tasked with understanding more complex linguistic phenomena like
semantic roles and reasoning. For example, consider the sentence The cat that chased the dog is sleeping.
In this case, the part-of-speech tag of the word cat is highly dependent on its surrounding context.
The presence of the definite article the before cat helps the model predict that cat is a noun. This is
a relatively simple syntactic phenomenon where word dependencies within nearby contexts can be
easily captured.</p>
      <p>However, when moving to more complex semantic tasks, the challenges become more pronounced.
Consider again the sentence The cat that chased the dog is sleeping. Now, if we are to assign a semantic
role to the words cat and dog, the task becomes significantly more complicated. The appropriate
semantic role for cat would be Agent (the entity performing the action), while dog would be the
Theme (the entity affected by the action). While models can easily handle word co-occurrences within
the immediate context, they struggle with more distant and intricate relations, such as rhetorical
connections or understanding semantic roles. In this regard, language models show a tendency to
approximate simpler word dependencies in nearby contexts well, but their performance diminishes
when these dependencies extend over larger portions of text or involve more abstract relationships.</p>
      <p>An additional consideration is that the linguistic performance of language models can be
significantly improved by augmenting them with an encoder module. This is especially true when models
are equipped with additional parameters, which allow them to better approximate simpler word
co-occurrences. However, when dealing with more complex co-occurrences, such as those found in
rhetorical relations or higher-order semantic structures, language models still encounter difficulties.</p>
      <sec id="sec-6-1">
        <title>6.1. Linguistic performance in multilingual vs monolingual models</title>
        <p>While the studies discussed earlier primarily focus on monolingual (English) language models, a growing
body of literature explores how language models’ ability to learn and encode linguistic properties varies
when considering different languages, particularly in the context of multilingual versus monolingual
models.</p>
        <p>In their work, [19] propose a framework for assessing syntactic properties in both monolingual
and multilingual models, with a particular focus on the subject-verb agreement task. Their findings
suggest that monolingual models typically achieve high accuracy on syntactic dependencies without
attractors, while showing poorer performance on agreement in object relative clauses. Furthermore,
they observe that languages with richer morphology tend to have higher agreement accuracy across
various syntactic constructions. In contrast, multilingual models, particularly those in their current
form, do not exhibit evidence of positive grammar transfer across languages. Instead, these models often
experience harmful interference, which negatively affects their ability to learn linguistic properties
effectively across languages.</p>
        <p>Regarding the potential for grammar transfer in multilingual models, Mueller et al. [20] explore
whether syntactic neurons are shared across a set of high-resource, grammatically similar languages.
Their findings reveal significant overlap in neurons responsible for syntactic agreement across languages
in autoregressive multilingual models, but not in masked language models. Notably, two distinct patterns
of layerwise effects and sets of neurons corresponding to syntactic agreement were found, depending
on whether the subject and verb were separated by other tokens. These results suggest that multilingual
models may be capable of sharing some syntactic knowledge across grammatically similar languages,
but this is not necessarily the case for all model architectures or language pairs.</p>
        <p>While multilingual models exhibit some shared knowledge across languages, these results raise
important questions about the impact of multilingual training on grammar proficiency. The evidence
suggests that while multilingual models may benefit from shared syntactic structures in some cases, they
often struggle to maintain the same level of proficiency in grammar and syntax as their monolingual
counterparts. This is especially true when languages with vastly different syntactic structures or
morphological richness are included in the training data, as the model may struggle to reconcile
conflicting grammatical patterns across languages. Therefore, it appears that multilingual training may
not always enhance grammar and syntax capabilities but can instead lead to interference that hinders
the model’s ability to learn and generalize linguistic properties effectively.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>This paper provides an overview of key studies on evaluating the linguistic performance of language
models, covering essential concepts in this area. We discussed how language models process text, the
linguistic properties that can be studied, and the tools available for assessing the presence of these
properties in model representations, with a particular focus on probing classifiers. Furthermore, we
explored methods to evaluate whether the linguistic knowledge encoded in a model is actively used
during inference. Additionally, we highlighted techniques like INLP for guiding the model to disregard
certain learned information (e.g., gender bias) and examined how shifting representations of linguistic
properties (e.g., number) in the opposite direction within their respective subspaces can help evaluate
linguistic performance.</p>
      <p>Language models are generally better at encoding formal linguistic phenomena, such as morphology
and syntax, compared to functional linguistic phenomena like semantics. This trend is especially true
for monolingual language models, while multilingual models appear to struggle more with transferring
and applying grammatical knowledge across languages.</p>
      <p>
        An important observation made by [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is that the current evaluations of linguistic performance in
language models are limited in scope. Notably, only three language models were probed on more
than 20% of the tasks, and only one task (part-of-speech tagging) was evaluated for more than 20% of
the models. Furthermore, recent large language models are significantly underrepresented in these
evaluations, suggesting a gap in how we assess the linguistic capabilities of state-of-the-art models.
      </p>
      <p>Looking forward, future work can expand the understanding of linguistic competence in language
models by considering alternative perspectives. For instance, Chomsky [21] and de Saussure [22] offer
distinct views on linguistic competence. Chomsky defines linguistic competence as the unconscious
knowledge of language and linguistic performance as the application of this knowledge in actual
utterances. In contrast, de Saussure distinguishes between the structured rules and lexicon of language
(langue) and its dynamic, social usage (parole), emphasizing the ongoing negotiation and evolution of
language within society. The majority of studies covered in this paper follow Chomsky’s interpretation,
treating language models as static representations of a particular moment in time. However, de
Saussure’s framework invites a more dynamic perspective on linguistic competence, where the evolution
of language knowledge and societal shifts in communication practices are considered. Future research
could apply the methods outlined in this paper, but with an eye toward Saussure’s view of language.
This would help us model the relationship between the language model’s interaction with the language
it is trained on and whether this interaction aligns with the evolving nature of language as a societal
process. Such an approach could offer valuable insights into how language models might better capture
the dynamics of language use and evolution over time.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work was supported in part by project SERICS (PE00000014) under the NRRP MUR program funded
by the EU - NGEU. Views and opinions expressed are however those of the authors only and do not
necessarily reflect those of the European Union or the Italian MUR. Neither the European Union nor
the Italian MUR can be held responsible for them.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used GPT-4 and NotebookLM to draft content, paraphrase and reword text, improve the writing style, draft the abstract, and check grammar and spelling. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content.</p>
      <p>Linguistics, ACL 2019, Florence, Italy, July 28–August 2, 2019, Volume 1: Long Papers, Association for Computational Linguistics, 2019, pp. 4593–4601. URL: https://doi.org/10.18653/v1/p19-1452. doi:10.18653/v1/P19-1452.
[9] J. Hewitt, C. D. Manning, A structural probe for finding syntax in word representations, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4129–4138. URL: https://aclanthology.org/N19-1419/. doi:10.18653/v1/N19-1419.
[10] Y. Belinkov, Probing classifiers: Promises, shortcomings, and advances, Computational Linguistics 48 (2022) 207–219. URL: https://aclanthology.org/2022.cl-1.7/. doi:10.1162/coli_a_00422.
[11] F. Dalvi, N. Durrani, H. Sajjad, Y. Belinkov, A. Bau, J. Glass, What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models, Proceedings of the AAAI Conference on Artificial Intelligence 33 (2019) 6309–6317. doi:10.1609/aaai.v33i01.33016309.
[12] L. Torroba Hennigen, A. Williams, R. Cotterell, Intrinsic probing through dimension selection, in: B. Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 197–216. URL: https://aclanthology.org/2020.emnlp-main.15/. doi:10.18653/v1/2020.emnlp-main.15.
[13] N. Durrani, H. Sajjad, F. Dalvi, Y. Belinkov, Analyzing individual neurons in pre-trained language models, in: B. Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 4865–4880. URL: https://aclanthology.org/2020.emnlp-main.395/. doi:10.18653/v1/2020.emnlp-main.395.
[14] O. Antverg, Y. Belinkov, On the Pitfalls of Analyzing Individual Neurons in Language Models, 2021. URL: https://arxiv.org/abs/2110.07483v3.
[15] S. Ravfogel, Y. Elazar, H. Gonen, M. Twiton, Y. Goldberg, Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection, 2020. doi:10.48550/arXiv.2004.07667. arXiv:2004.07667.
[16] S. Ravfogel, G. Prasad, T. Linzen, Y. Goldberg, Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction, in: A. Bisazza, O. Abend (Eds.), Proceedings of the 25th Conference on Computational Natural Language Learning, Association for Computational Linguistics, Online, 2021, pp. 194–209. doi:10.18653/v1/2021.conll-1.15.
[17] S. Hao, T. Linzen, Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number, in: H. Bouamor, J. Pino, K. Bali (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2023, Association for Computational Linguistics, Singapore, 2023, pp. 4531–4539. doi:10.18653/v1/2023.findings-emnlp.300.
[18] K. Lasri, T. Pimentel, A. Lenci, T. Poibeau, R. Cotterell, Probing for the Usage of Grammatical Number, in: S. Muresan, P. Nakov, A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 8818–8831. doi:10.18653/v1/2022.acl-long.603.
[19] A. Mueller, G. Nicolai, P. Petrou-Zeniou, N. Talmina, T. Linzen, Cross-linguistic syntactic evaluation of word prediction models, in: D. Jurafsky, J. Chai, N. Schluter, J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 5523–5539. URL: https://aclanthology.org/2020.acl-main.490/. doi:10.18653/v1/2020.acl-main.490.
[20] A. Mueller, Y. Xia, T. Linzen, Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models, 2022. doi:10.48550/arXiv.2210.14328. arXiv:2210.14328.
[21] N. Chomsky, Aspects of the Theory of Syntax, 50th ed., The MIT Press, 1965. URL: http://www.jstor.org/stable/j.ctt17kk81z.
[22] F. De Saussure, Cours de linguistique générale, volume 1, Payot, Paris, 1916.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] A. Waldis, Y. Perlitz, L. Choshen, Y. Hou, I. Gurevych, Holmes: A Benchmark to Assess the Linguistic Competence of Language Models, Transactions of the Association for Computational Linguistics 12 (2024) 1616–1647. doi:10.1162/tacl_a_00718.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] A. Bau, Y. Belinkov, H. Sajjad, N. Durrani, F. Dalvi, J. Glass, Identifying and controlling important neurons in neural machine translation, in: International Conference on Learning Representations, 2019. URL: https://openreview.net/forum?id=H1z-PsR5KX.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] E. Voita, D. Talbot, F. Moiseev, R. Sennrich, I. Titov, Analyzing multi-head self-attention: Specialized heads do the heavy lifting, the rest can be pruned, in: A. Korhonen, D. Traum, L. Màrquez (Eds.), Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, 2019, pp. 5797–5808. URL: https://aclanthology.org/P19-1580/. doi:10.18653/v1/P19-1580.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] H. Sajjad, F. Dalvi, N. Durrani, P. Nakov, On the effect of dropping layers of pre-trained transformer models, Comput. Speech Lang. 77 (2023). URL: https://doi.org/10.1016/j.csl.2022.101429. doi:10.1016/j.csl.2022.101429.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] K. Mahowald, A. A. Ivanova, I. A. Blank, N. Kanwisher, J. B. Tenenbaum, E. Fedorenko, Dissociating language and thought in large language models: a cognitive perspective, CoRR abs/2301.06627 (2023). URL: https://doi.org/10.48550/arXiv.2301.06627. doi:10.48550/ARXIV.2301.06627. arXiv:2301.06627.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] Y. Adi, E. Kermany, Y. Belinkov, O. Lavi, Y. Goldberg, Fine-grained analysis of sentence embeddings using auxiliary prediction tasks, in: International Conference on Learning Representations, 2017. URL: https://openreview.net/forum?id=BJh6Ztuxl.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] A. Conneau, G. Kruszewski, G. Lample, L. Barrault, M. Baroni, What you can cram into a single $&amp;!#* vector: Probing sentence embeddings for linguistic properties, in: I. Gurevych, Y. Miyao (Eds.), Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp. 2126–2136. URL: https://aclanthology.org/P18-1198/. doi:10.18653/v1/P18-1198.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] I. Tenney, D. Das, E. Pavlick, BERT rediscovers the classical NLP pipeline, in: A. Korhonen, D. R. Traum, L. Màrquez (Eds.), Proceedings of the 57th Conference of the Association for Computational</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>