<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Years</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alie Lassche</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ruben Ros</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joris Veerbeek</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Leiden University, Institute of History</institution>
          ,
          <addr-line>Doelensteeg 16, 2311 VL Leiden</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Utrecht University, Department of Media and Culture Studies</institution>
          ,
          <addr-line>Drift 13, 3512 BR Utrecht</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>1991</year>
      </pub-date>
      <volume>2020</volume>
      <fpage>6</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>Binary oppositions, since their introduction by Claude Lévi-Strauss and other structuralists in the seventies, are under pressure, especially because they legitimize societal power structures. Deconstruction of binary oppositions such as man/woman, black/white, left/right, and rich/poor is therefore increasingly encouraged. The question arises of what kind of effect the debate about binary oppositions has had on their linguistic use. We have therefore detected antonyms in a corpus of Dutch newspaper articles from the period 1990-2020, to study the development of binarism in journalism. Our method consists of two parts: the use of a good old lexicon, and the finetuning of a BERT model for antonym detection. In this paper, we not only present our results regarding the (de)construction of binary oppositions in Dutch journalism, but we also reflect on the two methodological stages and discuss their gain.</p>
      </abstract>
      <kwd-group>
        <kwd>antonym detection</kwd>
        <kwd>BERT</kwd>
        <kwd>binary opposition</kwd>
        <kwd>journalism</kwd>
        <kwd>newspapers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>How has the discourse surrounding the critique and deconstruction of binary
thinking influenced the utilization of binary oppositions in texts?</p>
      <p>This paper aims to detect binary oppositions in a corpus of Dutch newspapers and analyse
patterns in their use in Dutch journalism between 1990 and 2020. We chose this genre and
period because we expect a change here in the use of binary oppositions, as a reflection of the
developments in its public debate. To operationalize the detection of binary oppositions, we
look for antonyms in our corpus. Equating these two concepts has important consequences:
while an antonym pair consists of two words that have opposite meanings, a binary opposition
consists of two opposing words that often have a connotation of contrast, conflict, or tension.
Therefore, not every antonym pair is a binary opposition. In the remainder of this paper, we
will further reflect on this methodological choice. The method we propose consists of two
layers, and can thus be considered a two-stage rocket: we start with creating a good old lexicon
of antonym pairs in the Dutch language. Afterwards, we use this as training data to finetune a
BERT model for automatic antonym detection. In what follows, we will discuss relevant related
work concerning the extraction of antonyms from text. We will then discuss the
methodological pipeline we propose in more detail. We will not only present our results regarding the use
of binary oppositions in Dutch journalism, but we will also reflect on the two methodological
stages, and discuss the gain of each of them.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>Detecting antonyms – words with an opposite meaning – is a task often undertaken by
linguists. However, differentiating antonyms from synonyms is challenging due to similar usage.
Linguists use pattern-based and co-occurrence models to distinguish them. Pattern-based
models assume that antonymous word pairs co-occur in some antonym-indicating lexico-syntactic
patterns. Examples are patterns such as from A to B, between A and B, and either A or B. Roth and
Schulte im Walde combined patterns with discourse markers for classifying paradigmatic
relations between words such as synonymy and antonymy [10]. Schwartz, Reichart, and Rappoport
presented a symmetric pattern-based model for word vector representations in which antonyms
were assigned to dissimilar vector representations. More recently, a novel pattern-based neural
method, AntSynNET, to distinguish antonyms from synonyms was presented [9].</p>
      <p>In co-occurrence models, each word is represented by a weighted feature vector, where
features typically correspond to words that co-occur in particular contexts. Yih, Zweig, and Platt
introduced a vector space representation where antonyms were positioned on opposite sides
of a sphere. Scheible, Schulte im Walde, and Springorum showed that the differences in the
contexts of synonymous and antonymous pairs could be identified with a simple word space
model. Santus, Lu, Lenci, and Huang introduced an Average-Precision-based measure for the
unsupervised discrimination of antonymy from synonymy. They argued that synonyms are
expected to have a broader and more salient intersection of their top-K salient contexts than
antonyms.</p>
      <p>
        The recent introduction of pre-trained large language models such as BERT has largely
improved how the task of antonym detection is addressed. In a recent paper by Church, Cai,
and Bian, an earlier proposed mixture of experts (MoE) method [14] was combined with dLCE
embeddings [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Its performance was compared with the performance of a BERT model that
was finetuned using different datasets. The highest performance (0.947) was gained with a
model that was trained and tested on a subset of Samuel Fallows’ thesaurus of synonyms and
antonyms [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Corpora and methods</title>
      <p>
        The methodology we use is inspired by the work of Church, Cai, and Bian on finetuning a
BERT model for antonym detection. However, since we want to discuss the gain of using a
large language model in this task, we will compare its outcome with the more basic approach
of using a lexicon of antonym pairs, in order to find binary oppositions. We thus propose the
following methodology to detect antonyms in Dutch text corpora:¹
1. Preparing a word list with antonyms and synonyms. We extracted antonymous
and synonymous word pairs from the website www.mijnwoordenboek.nl and the online
dictionary Van Dale.² To ensure a balanced representation of these two classes, we
downsampled the synonym pairs. Additionally, to enable the model to discern when two words
are unrelated in both antonymous and synonymous senses, we introduced random word
pairs. These random pairs form the majority class, as we assume that the majority of
word pairs in our data are not related. To achieve this, we sampled these random pairs
at a rate ten times the size of our synonym and antonym pairs, as shown in Table 1.
2. Finetuning a BERT model for antonym detection. We tested five different BERT
models, both Dutch and multilingual.³ After following Devlin, Chang, Lee, and Toutanova
for standard hyperparameter tuning, which entails optimizing the learning rate, epochs,
and batch size, the multilingual model mdeberta-v3-base [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] yielded the highest
performance, hence our choice to continue with this model. Overall, the model achieved an
accuracy of 0.90 on the test set. On the antonym class specifically, the model achieved
an F1-score of 0.79.
3. Preparing word pairs from the newspaper data set. Our dataset consists of articles from
the Dutch newspaper NRC from the period 1990-2020, which are all available on their
¹ Please refer to the git repository for the full code: https://github.com/rubenros1795/antonym-detection.
² www.vandale.nl. Van Dale does not provide any information on selection criteria for their synonym and antonym
dictionary. Mijn Woordenboek states on its website that synonyms are licensed from Van Dale, Kernerman
Dictionaries, and Interglot, unless otherwise specified. An active group of volunteers and users continuously contributes
and verifies words.
³ These include: bert-base-dutch-cased, robbert-v2-dutch-base, xlm-roberta-base, mdeberta-v3-base,
and bert-base-multilingual-cased.
      </p>
      <p>Table 2. Corpus characteristics: 589,739 articles; 346,909,375 words; 93,278,902
filtered words (nouns and adjectives); 25,248,589 word pairs.</p>
      <p>website.⁴ NRC is among the major newspapers in the Netherlands, with a liberal
orientation. Formerly known as NRC Handelsblad, it has a strong focus on business and
international news. We preprocessed the data by excluding advertisements, job postings,
and news index pages. We kept only nouns and adjectives that occur more than 10 times
in the full corpus. We then created pairs for every possible combination between two
words within one sentence. This resulted in 25,248,589 unique word pairs. To reduce the
number of word pairs that had to be annotated by the model, we used a threshold of 0.4
for the cosine similarity between the words in a pair, leaving 2,471,340 to be annotated
by the model. We trained a fastText model on our dataset to define these similarities.⁵</p>
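<p>A minimal sketch of the cosine-similarity filter described above; filter_pairs and the plain-list vectors are illustrative stand-ins for lookups in the trained fastText model:</p>

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (plain lists)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def filter_pairs(pairs, vectors, threshold=0.4):
    """Keep only word pairs whose embedding cosine similarity exceeds the
    threshold (0.4 in the text); `vectors` maps a word to its embedding."""
    return [(w1, w2) for w1, w2 in pairs
            if w1 in vectors and w2 in vectors
            and cosine(vectors[w1], vectors[w2]) > threshold]
```

<p>In the pipeline itself the vectors would come from the fastText model trained on the corpus; the threshold trades annotation cost against recall.</p>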
      <p>Characteristics of the corpus and the word pairs can be found in Table 2.
4. Annotating antonyms in newspaper word pairs. We applied our finetuned BERT
model to the newspaper word pairs. We considered two given words antonyms if the
probability was higher than 50%. Although our model distinguishes three classes, we are
only interested in antonyms in our analysis. To further limit the number of false positives,
we opted for this threshold of 50%, which is conventional in a binary classification task.</p>
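<p>The 50% decision rule in step 4 amounts to thresholding the softmax probability of the antonym class. In this sketch the class ordering is our own assumption for illustration, not the repository's actual label encoding:</p>

```python
import math

def classify_pair(logits, threshold=0.5):
    """Label a pair 'antonym' only when the softmax probability of the
    antonym class exceeds the threshold (50% in step 4); all other pairs
    are ignored in the analysis. Assumes logits are ordered as
    (antonym, synonym, random)."""
    exps = [math.exp(x) for x in logits]
    p_antonym = exps[0] / sum(exps)
    return "antonym" if p_antonym > threshold else "other"
```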
      <p>This resulted in 128,294 word pairs that were classified as antonyms by our model.</p>
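<p>Looking back at step 1, the class balancing can be sketched as follows; the function and its sampling details (deduplication, a fixed seed) are our own illustrative choices, not code from the paper's repository:</p>

```python
import random

def build_pair_dataset(antonyms, synonyms, vocab, ratio=10, seed=0):
    """Balance the three classes as in step 1: downsample synonyms to the
    number of antonym pairs, then add `ratio` (10 in the paper) times as
    many random pairs, which form the majority class of unrelated words."""
    rng = random.Random(seed)
    synonyms = rng.sample(synonyms, min(len(synonyms), len(antonyms)))
    known = set(antonyms) | set(synonyms)
    randoms = set()
    while len(randoms) < ratio * len(antonyms):
        pair = (rng.choice(vocab), rng.choice(vocab))
        if pair[0] != pair[1] and pair not in known:
            randoms.add(pair)
    return ([(a, b, "antonym") for a, b in antonyms]
            + [(a, b, "synonym") for a, b in synonyms]
            + [(a, b, "random") for a, b in randoms])
```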
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Binary oppositions in newspapers</title>
        <p>In Figure 1, we present the averaged adjusted PMI for the antonym pairs extracted from the
corpus of newspaper articles. There appears to be a modest decline in the co-occurrence of
antonym pairs between 1990 and 2020. We use Pointwise Mutual Information (PMI) as a target
metric for estimating the joint probability of an antonym pair appearing together [2]. We use
the sentence as the context for establishing whether the antonym words are used together.
However, as shown in Figure 2, sentence length decreases monotonically over time. The figure
also shows a steep decrease in article length until 2012, and a slow increase in the number of
sentences within an article. In other words: sentences become shorter over time, while articles
first become shorter, but longer after 2012, due to more sentences within articles.⁶ The decrease
in sentence length is likely to affect antonym PMI scores. To address this issue, we augment
the PMI values with a decay function, which can be found in the appendix.
⁴ www.nrc.nl.
⁵ The average cosine similarity for antonyms and synonyms was 0.44.
⁶ The peaking article length in 2020 is largely caused by the appearance of the so-called corona live blogs, a daily
live blog in which all COVID-related news was collected.</p>
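<p>A minimal sketch of the sentence-level PMI computation with the appendix's decay adjustment folded in. The decay constant `lam` here is illustrative; the paper estimates it by curve fitting:</p>

```python
import math
from collections import Counter
from itertools import combinations

def adjusted_pmi(sentences, w1, w2, lam=0.05):
    """Sentence-level PMI of a word pair, damped by exp(-lam * mean
    sentence length) as in the appendix. `sentences` is a list of token
    lists; the pair must co-occur in at least one sentence."""
    word_counts, pair_counts, total_len = Counter(), Counter(), 0
    for sent in sentences:
        total_len += len(sent)
        words = set(sent)
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(words, 2))
    n = len(sentences)
    p_joint = pair_counts[frozenset((w1, w2))] / n
    pmi = math.log(p_joint / ((word_counts[w1] / n) * (word_counts[w2] / n)))
    return pmi * math.exp(-lam * total_len / n)
```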
        <p>The antonym pairs with our classifier's highest score are included in the appendix in Table 3.
Three dominant clusters can be detected when looking at these most frequent antonym pairs
and pairs lower on the frequency list. There is a clear cluster of economic/financial word pairs,
including euro-procent, euro-dollar, procent-totaal, and jaar-kwartaal. Secondly, there are word
pairs related to geopolitics: europees-amerikaans, amerikaans-nederlands, amerikaans-iraaks,
russisch-oekraïens, turks-syrisch, and amerikaans-russisch. A third cluster includes antonym
pairs related to social topics and relations: vrouw-man, vader-moeder, jong-oud, zoon-vader,
meisje-jongen, and kind-ouder.</p>
        <p>Figure 3 shows the adjusted PMI of the 15 highest-scoring antonym pairs that appear in
at least ten years.⁷ By filtering on word pairs occurring over a minimum period of ten
years, we aim to exclude word pairs pertinent to brief or ephemeral events within the
domains of politics, economics, or society. Almost all of these word pairs are from the
political domain. Several refer to continuous political relations of the United States
(amerikaans-koreaans, amerikaans-italiaans, amerikaans-japans, amerikaans-mexicaans). Other word pairs
are more conceptual, yet undeniably stem from the realm of political discourse: illegaal-legaal,
tegenstander-voorstander, meerderheid-minderheid, binnenlands-buitenlands,
integratie-immigratie, conservatief-progressief, and internationaal-regionaal. This demonstrates that the
employment of antonyms predominantly prevails in political contexts.</p>
        <p>We are furthermore interested in antonym pairs that show a clear development over time.
We therefore applied the Mann-Kendall test to detect the pairs with the most consistent
monotonic upward and downward trend. In Figure 4, the development of the 15 antonym pairs
with the clearest decrease is shown.⁸ Two of these pairs belong to the earlier defined
geopolitical cluster of word pairs: amerikaans-israëlisch and amerikaans-japans. The decrease of the
word pair zwart-blank (black-white) represents the shift of the last years to replace the word
blank with wit (white). The word blank sounds more positive in comparison to zwart, while
wit and zwart are considered neutral. Two other pairs similarly show the decline of
political concepts frequently juxtaposed in the twentieth century, such as katholiek-protestant and
werkgever-werknemer.</p>
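<p>The core of the Mann-Kendall trend test used here can be sketched without external libraries; the full test also derives a variance and p-value from this statistic, which we omit in the sketch:</p>

```python
def mann_kendall_s(series):
    """Mann-Kendall S statistic: concordant minus discordant pairs over
    all i < j. Strongly positive S indicates a consistent monotonic upward
    trend, strongly negative S a downward trend, so ranking the antonym
    time series by S surfaces the clearest risers and decliners."""
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    return s
```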
        <p>The 15 antonym pairs with the most pronounced upward trend are shown in Figure 5.⁹
Selecting the top 15 thus results in trends marked by limited increase. The decrease of one pair
does not seem to directly increase another pair. This reflects the declining trend as depicted in
Figure 1. What stands out is that several of these word pairs are related to the third cluster we
distinguished earlier, with word pairs concerning social topics and relations: school-ziekenhuis,
baby-ouder, and eigen-collectief.</p>
        <p>Apart from the (trans)national events and developments to which these patterns can be
linked, they might also reflect a change in the journalistic style and scope of the newspaper
NRC. In 2006, as an addition to the evening newspaper NRC Handelsblad, the morning
newspaper nrc.next was launched, which targeted younger readers. In 2017, NRC Handelsblad and
nrc.next became two editions of the same newspaper, respectively a morning and afternoon
edition. Since 2022, only the morning edition exists, bearing the name NRC. The website from
which our corpus originates includes articles from nrc.next, NRC Handelsblad, and NRC. The
sharp decrease in sentence length (Figure 2) coincides with the launch of nrc.next in 2006. This,
together with the earlier decline in co-occurrence of antonym pairs (Figure 1), suggests that
targeting a younger audience results in a change in journalism that is reflected in both style
and content. Based on the qualitative review of the declining pairs, we suspect that nrc.next
was part of a more general decline of political-economic coverage relative to issues around
lifestyle and social issues, which resulted in a decline of antonym pairs in the former area.
⁷ The translations of these antonym pairs are listed in Table 4.
⁸ The translations of these antonym pairs are listed in Table 5.
⁹ The translations of these antonym pairs are listed in Table 5.</p>
        <p>Finally, in Figure 6, we have visualized how the development of a certain word pair relates to
the development in frequencies of the distinctive words in that pair. We normalize the Adjusted
PMI scores, as well as the relative frequencies for both words in the pair, between 0 and 1 for
comparability. For most antonym pairs, these time series assume the shape of the eigen-ander
pair, with a high positive correlation between P(w1)/P(w2) and PMI. The example above of a
decreasing PMI between zwart and blank occurs in the context of the general decrease in the
frequency of both terms. There is only one pair that stands out with a clear negative correlation:
the vrouw-man pair. The words in this pair show an upward trend. Their joint appearance,
however, decreases over time, which means that both terms are increasingly used separately,
as visible in a Pearson correlation coefficient of -.49 between P(w1) and PMI. A similar but
weaker divergence is visible in the rechts-links pair and the vader-kind pair. Although this
does not show the end of binary oppositions, specific cases like these do clearly decline over
time.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Methodological considerations</title>
        <p>The utilization of BERT models enabled us to discover a broad spectrum of antonyms. At
the same time, we also observed a notable incidence of false positives during our analysis.
Therefore, in this paragraph, we reflect on the methodological gains and pitfalls of using a
machine-learning approach to discover antonyms in texts and contrast that with the use of a
simple lexicon.</p>
        <p>Our initial list of antonyms consisted of 2,109 pairs. Using the methodology we presented
in this paper, we found 128,294 unique pairs of antonyms in our corpus.</p>
        <p>Surprisingly, only
603 of these pairs overlapped with the ones from our initial list, indicating that the majority of
antonym pairs in the initial list were not present in the corpus. With the discovery of 127,691
new antonyms, the use of BERT models significantly expanded our analysis scope.</p>
        <p>When we examine the antonyms with the highest probability (see Table 6 for the top 15),
we observe that the model can identify a wide range of antonyms. Firstly, there are quite
a few antonyms where one of the words begins with a negative prefix (in Dutch: non-, on-,
anti-, dis-, niet-). 7,680 (6%) of the discovered pairs contain such a negative prefix.
Incorporating all these words into a lexicon would be a labour-intensive task. However, a possible
solution might involve combining lexicons with rule-based systems. Secondly, surprisingly,
we also observe a lot of antonyms that refer to abstract ideological or artistic movements,
such as naturalistisch-surrealistisch (naturalistic-surrealistic), communisme-individualisme
(communism-individualism), and anarchistisch-kapitalistisch (anarchist-capitalist).
Additionally, during our analysis, we found numerous antonyms opposing two countries,
such as duits-nederlands (german-dutch) or amerikaans-russisch (american-russian). The utilization of BERT
models thus results in a significant number of pairs that would be considered false negatives
in a lexicon approach. These antonyms describe new forms of thinking or emerging societal
developments, which is of particular importance in a journalism context.</p>
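<p>The rule-based complement suggested above could look like the following sketch; the prefix list mirrors the Dutch negative prefixes named in the text, and the function itself is our own illustration:</p>

```python
def prefix_antonym(w1, w2):
    """Rule-based check: one word is the other with a negative prefix
    attached, with or without a hyphen (e.g. begrip/onbegrip,
    juistheid/onjuistheid). Prefixes follow the text: non-, on-,
    anti-, dis-, niet-."""
    prefixes = ("non", "on", "anti", "dis", "niet")
    for a, b in ((w1, w2), (w2, w1)):
        for pre in prefixes:
            if b == pre + a or b == pre + "-" + a:
                return True
    return False
```

<p>A rule like this would catch the 6% of prefix-based pairs without enumerating them in a lexicon.</p>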
        <p>At the same time, the use of a machine learning approach also produces a considerable
number of false positives, which contaminate the analysis. Ideally, these false positives would
be manually removed from the list, but at this scale, it would be quite time-consuming. The
most evident form of false positives we found during our analysis are pairs where the words
are not actually opposites but frequently occur near each other, such as euro-procent
(euro-percentage), ministerie-volksgezondheid (ministry-public health), and eeneiig-tweeling
(identical-twin). In other cases, the false positives included misclassified synonyms (monetair-economisch,
monetary-economic) and, surprisingly, even different spellings of the same
word (amerikaans-amerikaanse, american-american).</p>
        <p>In the examples provided above, the classification of these pairs as antonyms is clearly
incorrect. However, we also encountered numerous borderline cases that challenge the boundaries
of how we define an antonym. For instance, is a geologist the opposite of a psychologist? Is
a tomato antonymous to a bell pepper? Or a cyclist to a pedestrian? In all these cases, the
answer depends on the context; on whether these words were used in an antonymous or
complementary manner (‘Cyclists are becoming a growing concern for pedestrians’ vs. ‘Drivers
must be vigilant about pedestrians and cyclists.’). Although we utilized contextual language
models, our setup did not fully incorporate the contextual aspect since we trained our models
at the word level. To unlock the full potential of BERT models, a possible improvement to our
setup would be to move beyond the word level and train models on a token sequence level,
which would require a training dataset where words are tagged as antonyms in their context.
This way, we can better capture the nuances and contextuality of antonymous relationships
between words. However, curating such a dataset was beyond the scope of this paper.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Are binary oppositions constructed or deconstructed in Dutch newspaper articles of the last
thirty years? The question does not have a straightforward answer, as our analysis has shown. It is
challenging to determine when something can be considered an antonym pair, let alone a binary
opposition. Nevertheless, our initial exploration yielded some noteworthy results. We have
observed a modest decline in the use of antonym pairs in our corpus. Detected patterns in
frequency changes could not only be linked to (trans)national events, but also to developments
in the journalistic style, scope, and target groups of the newspaper NRC. Moreover, we have
shown that using a BERT model in this task has led to promising results. Intriguing binary
oppositions such as man-woman, black-white, and employer-employee emerged through the use
of the trained classifier and our subsequent analysis. Utilizing an LLM has also confronted us
with many new challenges. Pursuing and obtaining high performance scores in training and
finetuning large language models is no guarantee for success. Nevertheless, we are optimistic
that further exploration of the potential of these models can lead to a more profound insight
into the use of binary oppositions in Dutch newspapers and beyond.</p>
    </sec>
    <sec id="sec-6">
      <title>A. Methods</title>
      <sec id="sec-6-1">
        <title>We use a decay function of the form:</title>
      </sec>
      <sec id="sec-6-2">
        <title>Where:</title>
        <p>Adjusted PMI(w1, w2) = PMI(w1, w2) ⋅ exp(−λ ⋅ Sentence Length)</p>
        <p>Adjusted PMI(w1, w2) represents the adjusted PMI score for word pair w1 and w2.
PMI(w1, w2) is the standard PMI score for the word pair w1 and w2.
λ is the decay constant that controls the rate of decay.</p>
        <p>Sentence Length refers to the length of the sentence in which the word pair is found.
The decay constant (λ) is estimated through a curve-fitting process. We fit the decay function
to the PMI scores and sentence lengths in the dataset using a nonlinear curve fitting. The
estimated constant is determined based on this fitting process. This adjusted approach enhances
the reliability of PMI-based analyses, offering a more consistent representation of word
associations even as sentence lengths vary.</p>
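<p>The fitting step can be approximated in closed form by ordinary least squares on the log-transformed values; this log-linear shortcut is our simplification of the nonlinear fit described above, not the paper's actual procedure:</p>

```python
import math

def fit_decay_constant(lengths, pmis):
    """Estimate lambda from log(PMI) = log(a) - lambda * length by
    ordinary least squares, where `a` is a free scale term. Assumes
    positive PMI values; the nonlinear fit avoids that restriction."""
    ys = [math.log(p) for p in pmis]
    n = len(lengths)
    mx, my = sum(lengths) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(lengths, ys))
    den = sum((x - mx) ** 2 for x in lengths)
    return -num / den  # slope is -lambda
```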
      </sec>
    </sec>
    <sec id="sec-7">
      <title>B. Antonym pairs and their translations</title>
      <p>Translations of the most decreasing and increasing antonym pairs in Figure 4 and Figure 5
(Table 4 and Table 5), and the top 15 antonyms with the highest probability (Table 6).</p>
      <p>Table 4. Most decreasing pairs: katholiek - protestants (catholic - protestant); belasting -
premie (tax - premium); geestelijk - lichamelijk (mental - physical); leven - dood (life - death);
algemeen - bijzonder (general - particular); economisch - sociaal (economic - social); lang -
kort (long - short); televisie - radio (television - radio); twee - één (two - one); zwart - blank
(black - white); vrouw - man (woman - man); amerikaans - israëlisch (american - israeli);
snel - tempo (fast - pace); amerikaans - japans (american - japanese); werkgever - werknemer
(employer - employee).</p>
      <p>Table 5. Most increasing pairs: jong - ziek (young - ill); politie - burger (police - citizen);
uiteindelijk - aanvankelijk (ultimately - initially); school - ziekenhuis (school - hospital);
los - vast (loose - fixed); ver - dicht (far - close); amerikaans - braziliaans (american -
brazilian); muziek - beeldend (music - visual); schrijver - schilder (writer - painter); president -
vicepresident (president - vice president); ver - precies (far - precise); baby - ouder (baby -
parent); zwart - grijs (black - grey); wedstrijd - rust (game - break); eigen - collectief (own -
collective).</p>
      <p>Table 6. Top 15 antonyms with the highest probability: globalisering - individualisering
(globalization - individualization); geoloog - psycholoog (geologist - psychologist); helder -
onhelder (clear - unclear); individualisering - mondialisering (individualization - globalization);
burgerschap - ministerschap (citizenship - ministership); degelijk - ongelijk (solid - unequal);
excommunistische - socialistisch (ex-communist - socialist); doordacht - ondoordacht
(well-considered - ill-considered); democratisch - militaristisch (democratic - militaristic); begrip -
onbegrip (understanding - incomprehension); juistheid - onjuistheid (correctness -
incorrectness); biologisch - homeopathisch (biological - homeopathic); bioloog - criminoloog
(biologist - criminologist); desintegratie - integratie (disintegration - integration);
onrechtvaardigheid - rechtvaardigheid (injustice - justice).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Church</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bian</surname>
          </string-name>
          . “Training on Lexical Resources”.
          <source>In: Proceedings of the Thirteenth Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>6290</fpage>
          -
          <lpage>6299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Church</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Hanks</surname>
          </string-name>
          . “
          <article-title>Word association norms, mutual information, and lexicography”</article-title>
          .
          <source>In: Computational linguistics 16.1</source>
          (
          <year>1990</year>
          ), pp.
          <fpage>22</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Derrida</surname>
          </string-name>
          . Positions. Chicago, Ill.: University of Chicago Press,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          . “Bert:
          <article-title>Pre-training of deep bidirectional transformers for language understanding”</article-title>
          . In: arXiv preprint arXiv:
          <volume>1810.04805</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Fallows</surname>
          </string-name>
          .
          <source>A Complete Dictionary Of Synonyms And Antonyms</source>
          .
          <year>1898</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          . “Debertav3:
          <article-title>Improving deberta using electra-style pre-training with gradient-disentangled embedding sharing”</article-title>
          .
          <source>In: arXiv preprint arXiv:2111.09543</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Klages</surname>
          </string-name>
          .
          <article-title>Literary Theory: A Guide for the Perplexed</article-title>
          . London: Bloomsbury Publishing Plc,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lévi-Strauss</surname>
          </string-name>
          .
          <source>The Raw and the Cooked</source>
          . New York: Harper and Row,
          <year>1969</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulte im Walde</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N. T.</given-names>
            <surname>Vu</surname>
          </string-name>
          . “
          <article-title>Distinguishing Antonyms and Synonyms in a Pattern-based Neural Network</article-title>
          ”. In:
          <source>Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers</source>
          . Valencia, Spain: Association for Computational Linguistics,
          <year>2017</year>
          , pp.
          <fpage>76</fpage>
          -
          <lpage>85</lpage>
          . url: https://aclanthology.org/E17-1008.
        </mixed-citation>
      </ref>
      <ref id="ref9a">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Roth</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulte im Walde</surname>
          </string-name>
          . “
          <article-title>Combining Word Patterns and Discourse Markers for Paradigmatic Relation Classification</article-title>
          ”
          .
          In:
          <source>Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          . Baltimore, Maryland: Association for Computational Linguistics,
          <year>2014</year>
          , pp.
          <fpage>524</fpage>
          -
          <lpage>530</lpage>
          .
          doi: 10.3115/v1/P14-2086.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Santus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.-R.</given-names>
            <surname>Huang</surname>
          </string-name>
          . “
          <article-title>Taking Antonymy Mask off in Vector Space”</article-title>
          .
          <source>In: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing</source>
          . Phuket, Thailand: Department of Linguistics, Chulalongkorn University,
          <year>2014</year>
          , pp.
          <fpage>135</fpage>
          -
          <lpage>144</lpage>
          . url: https://aclanthology.org/Y14-1018.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Scheible</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulte im Walde</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Springorum</surname>
          </string-name>
          . “
          <article-title>Uncovering Distributional Differences between Synonyms and Antonyms in a Word Space Model”</article-title>
          .
          <source>In: Proceedings of the Sixth International Joint Conference on Natural Language Processing</source>
          . Nagoya, Japan: Asian Federation of Natural Language Processing
          ,
          <year>2013</year>
          , pp.
          <fpage>489</fpage>
          -
          <lpage>497</lpage>
          . url: https://aclanthology.org/I13-1056.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Reichart</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rappoport</surname>
          </string-name>
          . “
          <article-title>Symmetric Pattern Based Word Embeddings for Improved Word Similarity Prediction”</article-title>
          .
          In:
          <source>Proceedings of the Nineteenth Conference on Computational Natural Language Learning</source>
          . Beijing, China: Association for Computational Linguistics,
          <year>2015</year>
          , pp.
          <fpage>258</fpage>
          -
          <lpage>267</lpage>
          . doi: 10.18653/v1/K15-1026.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xie</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Zeng</surname>
          </string-name>
          .
          <article-title>“A Mixture-of-Experts Model for Antonym-Synonym Discrimination”</article-title>
          .
          <source>In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</source>
          (Volume 2: Short Papers).
          Online: Association for Computational Linguistics
          ,
          <year>2021</year>
          , pp.
          <fpage>558</fpage>
          -
          <lpage>564</lpage>
          . doi: 10.18653/v1/2021.acl-short.71.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zweig</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Platt</surname>
          </string-name>
          . “
          <article-title>Polarity Inducing Latent Semantic Analysis”</article-title>
          .
          In:
          <source>Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning</source>
          . Jeju Island, Korea: Association for Computational Linguistics,
          <year>2012</year>
          , pp.
          <fpage>1212</fpage>
          -
          <lpage>1222</lpage>
          . url: https://aclanthology.org/D12-1111.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>