Proceedings of the conference Terminology and Artificial Intelligence 2015 (Granada, Spain)

Tracing Research Paradigm Change Using Terminological Methods
A Pilot Study on "Machine Translation" in the ACL Anthology Reference Corpus

Anne-Kathrin Schumann† and Behrang QasemiZadeh
† Applied Linguistics, Translation and Interpreting, Universität des Saarlandes, Campus A2.2, 66123 Saarbrücken, Germany
anne.schumann@mx.uni-saarland.de
behrang.qasemizadeh@insight-centre.org

Abstract

This paper explores the use of terminology extraction methods for detecting paradigmatic changes in scientific articles. We use a statistical method for identifying salient nouns and adjectives that signal these paradigmatic changes. We then employ the extracted lexical units for discovering terms that are assumed to be central in characterising paradigm shifts. To assess the method's performance, in this pilot study, we work on "machine translation" (MT) research articles sampled from the ACL Anthology Reference Corpus. We analyse this corpus to check whether the proposed approach can trace the dramatic changes that machine translation research has experienced in the last decades: from transformational rule-based methods to statistical machine learning-based techniques.

1 Introduction

Research in computational terminology traditionally focuses on static models of knowledge acquisition and representation. Corpus-based approaches have led to an increased interest in the automatic extraction and semantic categorisation of terms, with many successful applications. However, progress in the empirical description and computational modelling of terminological dynamics has been rather slow.

This paper suggests that terminological methods and principles can be employed in empirical investigations of diachronic knowledge evolution. In particular, terminological methods can provide new insights into problems of diachrony since they can be used to trace (a) how terminologies come into being, and (b) how they develop over time as the scientific field itself evolves. Empirical work on the creation and development of terminologies is especially relevant for investigations into the history of science. Furthermore, studies of this kind are also likely to benefit terminology as a discipline, since they might provide insights into the driving forces of terminological development and knowledge organization.

The method proposed here identifies lexical units the importance of which increases or decreases upon the transition from an earlier period to a more recent one. In other words, we approach the history of science in the form of a trend analysis task. Formally, this task consists of two sub-tasks, namely:

(a) the detection of those periods in time when a paradigm change is taking place (e.g., as signalled by terminological dynamics in a domain);

(b) the extraction of terms that are indicative of a declining or rising paradigm.

The pilot study described in this paper relates only to the extraction of terms signalling paradigm shift (i.e., sub-task (b)). The material for our analysis consists of research articles dealing with "machine translation". These articles are sampled from the ACL Anthology Reference Corpus (ACL ARC) introduced in Bird et al. (2008).

Linguistically, the proposed method is inspired by studies on register.[1] Register linguistics approaches linguistic variation as the description of changing configurations of linguistic features on the textual level. One of the relevant dimensions for this type of study certainly is the lexicon. Accordingly, we hypothesise that paradigmatic changes in a field of knowledge are the cause of terminological dynamics. These dynamics are expressed in the form of the rise or decline of not just isolated terms but whole groups of terms.

[1] See Cabré (1998) for an elaboration of terminological aspects of register. Also, see Teich et al. (2015) for an applied perspective.
We conclude that terms extracted by our method are salient if they are able to depict the paradigmatic change that the MT field has undergone in the last decades—that is, the advent of statistical methods in contrast to symbolic approaches that were in use earlier. The remainder of this paper is structured as follows. Section 2 briefly summarises relevant previous work. Section 3 outlines our extraction method. Section 4 reports the results of our pilot study, followed by an evaluation in Section 5. Section 6 discusses the obtained results and concludes this paper.

2 Related Work

The term "paradigm" in the sense intended here goes back to Kuhn (1962). According to Kuhn, a paradigm emerges from a generally acknowledged scientific contribution to a research field. The significance of the paradigm consists in its ability to propose research problems and solutions to these problems to the relevant community. Some of Kuhn's arguments can be traced back to Fleck (1935). Fleck describes scientific communities as communities of thought ("Denkkollektive") who share habits in their way of perceiving and solving scientific problems ("Denkstil", literally "style of thought"). What is important here for our research question is that paradigms are coupled not only with specific types of problems and research methods, but also with terminologies: they constitute the inventory of lexical units used to refer to concepts that are central for a given paradigm. Consequently, they are subject to change whenever the conceptual outline of the discipline changes.

Terminological dynamics have been approached by terminology proper from various perspectives. Relevant to our study are the articles by Kristiansen (2011) and Picton (2011). Kristiansen (2011) provides a detailed account of external motivating factors of conceptual and, eventually, terminological dynamics. Picton (2011) elaborates a typology for the description of short-term term evolution patterns such as neology–necrology (i.e., appearance–disappearance of terms), term migration, and topic centrality–disappearance. Unfortunately, neither paper provides a methodology for the automatic detection of these dynamics.

In computational linguistics, trend analysis is usually approached by computing topic centrality and/or community influence measures and plotting them on a timeline. An example is the work by Hall et al. (2008), who try to trace the "development of research ideas over time". They employ the standard Latent Dirichlet Allocation (LDA) algorithm (Blei et al., 2003)—a term-by-document model—for identifying "topic clusters". The method involves manual selection of relevant topics and seed words in multiple runs of the LDA algorithm. Probabilities derived from the LDA model are then used for the identification of rising and declining topics. Similar to our work, the authors report experiments over the ACL ARC, using publications from 1978–2006.

A term-based approach to topic and trend analysis is proposed by Mariani et al. (2014). The analysis is conducted on the ELRA Anthology of LREC publications starting in 1998. A term extraction method, namely TermoStat (Drouin, 2004), is employed to extract "topic keywords". For each year, terms and their variants are grouped into synsets and the most frequent terms are found. Finally, the authors study the rank development of the 50 most frequent terms in order to extract information on whether the topics designated by these terms have risen, declined, or stayed stable over the period under analysis.
Relevant co-occurrences of terms are also listed.

Gupta and Manning (2011) stress that, for the purpose of detailed investigations into the history of science, ". . . an understanding of more than just the 'topics' of discussion . . . " is necessary. They extract semantic information for the categories FOCUS (i.e., the main contribution of an article), TECHNIQUE, and DOMAIN from the title and abstract sentences of research papers using a set of bootstrapped patterns. They then identify communities using the LDA algorithm. An influence measure is defined and calculated for communities based on the number of times their FOCUS, DOMAIN, or TECHNIQUE have been adopted by other communities. Finally, results obtained from the ACL ARC are projected onto a timeline.

The work listed above has a number of shortcomings, amongst them:

• Approaches based on topic modeling do not always provide readily interpretable topics. While many of the induced topics are convincing in terms of their lexical outline, we believe that the use of terminology, as proposed by Mariani et al. (2014), can provide more targeted information.

• For any detailed understanding of the history of a given discipline, it is insufficient to measure how "central" or "popular" certain topics were at different periods in time. Instead, the internal, fine-grained dynamics of the field, such as paradigms and paradigm shifts, need to be understood. To our knowledge, the work by Mariani et al. (2014) is the only one that includes a study of the lexical context of terminological units; however, this analysis is not carried out systematically. We believe that a systematic study of how groups of terms change over time can provide rich information for users that are interested in the history of a given scientific discipline (e.g., see Figure 2).

3 Detection of Lexical Rank Shifts: The Method

Our work differs from previous studies in that we exploit the notion of rank shifts for detecting fine-grained shifts rather than measuring topic centrality or popularity. The comparison of rank shifts between two lists of sorted lexical items is an established research method in the field of quantitative historical linguistics (cf. Arapov and Cherc (1974)), and we believe that it can be adapted to our purposes.

In essence, our approach to the detection of terminological dynamics revealing a paradigm change is two-fold. Firstly, we extract lemmas that experience a change in their ranks upon the transition from older publications to more recent ones. We believe that these lemmas are either paradigmatic terms themselves or can be used to extract paradigmatic terms. We restrict word classes to nouns and adjectives since we believe that they are the most characteristic units for a given research paradigm. Secondly, we use the extracted lemmas for identifying paradigmatic terms.

The first step (i.e., extraction of lemmas) consists of three sub-processes:

1. extraction of frequency-per-document information for all nouns and adjectives in the two sub-corpora under analysis, and removal of strings containing non-alphanumeric characters;

2. ranking of the lexemes obtained for the two time periods using the method explained below;

3. comparison of the two ranked lists in order to identify those lexemes that have undergone relevant rank shifts.

Frequency and document-related information is extracted using the IMS Open Corpus Workbench (CWB) loaded with our data (Evert and Hardie, 2011).
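For illustration, the following minimal sketch shows how the first sub-process could be realised outside of CWB, assuming that every document is already available as a list of (lemma, PoS) pairs with Penn-Treebank-style tags; the function name and the exact filtering expression are our own illustrative choices, not the tooling used in the experiments.

    import re
    from collections import defaultdict

    def per_document_counts(documents):
        # documents: iterable of documents, each a list of (lemma, pos) pairs.
        # Returns one {lemma: frequency} dictionary per document, keeping only
        # nouns and adjectives and dropping strings that contain characters
        # other than letters, digits, and hyphens (so that items such as
        # "n-gram" survive the clean-up).
        counts = []
        for doc in documents:
            freq = defaultdict(int)
            for lemma, pos in doc:
                if not (pos.startswith("N") or pos.startswith("JJ")):
                    continue
                if not re.fullmatch(r"[A-Za-z0-9-]+", lemma):
                    continue
                freq[lemma.lower()] += 1
            counts.append(dict(freq))
        return counts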
For ranking, we employ the measure for calculating domain consensus proposed by Sclano and Velardi (2007). This measure, DC_{D_i}(t), is defined as follows:

    DC_{D_i}(t) = - \sum_{d_k \in D_i} nf(t, d_k) \log(nf(t, d_k)),    (1)

where d_k denotes the kth document in domain D_i, and nf(t, d_k) is the normalised frequency of term t in d_k \in D_i. DC_{D_i}(t) goes beyond the use of raw frequencies (e.g., as used by Mariani et al. (2014)). Instead, DC_{D_i}(t) favors lexemes that are evenly distributed over all the texts in the two sub-corpora, as opposed to candidates that are frequent in just a small number of texts. The process results in ranked lists of lexemes for the two time periods that we want to compare. Each lexeme occurs either in only one of the two lists or in both of them. To detect major rank shifts RS for a lexeme t that occurs in both lists, we use the following formula:

    RS(t) = \frac{1}{R_{New}(t)} - \frac{1}{R_{Old}(t)},    (2)

where R(t) denotes the rank of t in the two ranked lists New (recent publications) and Old (early publications).
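A minimal sketch of Equations 1 and 2, operating on per-document frequency dictionaries such as those produced by the previous sketch, might look as follows. It assumes that nf(t, d_k) normalises the frequency of t in d_k by the total frequency of t in the sub-corpus, so that DC_{D_i}(t) amounts to the entropy of the term's distribution over documents; the function names are ours and do not reflect the actual implementation.

    import math
    from collections import defaultdict

    def domain_consensus_ranks(per_doc_counts):
        # per_doc_counts: list of {lemma: frequency} dicts, one per document
        # of a sub-corpus (domain) D_i, e.g. the output of the sketch above.
        # Returns {lemma: rank}, where rank 1 carries the highest DC score.
        totals = defaultdict(int)
        for doc in per_doc_counts:
            for lemma, f in doc.items():
                totals[lemma] += f
        dc = defaultdict(float)
        for doc in per_doc_counts:
            for lemma, f in doc.items():
                nf = f / totals[lemma]          # normalised frequency nf(t, d_k)
                dc[lemma] -= nf * math.log(nf)  # summand of Equation 1
        ranked = sorted(dc, key=dc.get, reverse=True)
        return {lemma: rank for rank, lemma in enumerate(ranked, start=1)}

    def rank_shifts(ranks_new, ranks_old):
        # Equation 2 for lexemes that occur in both ranked lists; positive
        # values indicate rising (Up) lemmas, negative values falling (Down).
        shared = ranks_new.keys() & ranks_old.keys()
        return {t: 1.0 / ranks_new[t] - 1.0 / ranks_old[t] for t in shared}

Sorting the resulting shift scores in descending and ascending order then yields candidates for the Up and Down lists, respectively.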
In the next step, the lemmas with the highest rank shifts are employed to build partly lexicalised term extraction patterns for identifying paradigmatic terms. The PoS sequence patterns are taken from the multilingual term extraction tool TTC TermSuite (Daille and Blancafort, 2013).[2] Table 1 provides examples of these patterns.

[2] http://code.google.com/p/ttc-project/

    Pattern                        CWB query
    adjective + noun               [pos="JJ.*"] [lemma="lexicon"]
    past participle + noun         [pos="VVN"] [lemma="lexicon"]
    noun + noun                    [pos="N.*"] [lemma="lexicon"]
    noun + noun + noun             [pos="N.*"] [pos="N.*"] [lemma="lexicon"]
    noun + preposition + noun      [pos="N.*"] [pos="IN"] [lemma="lexicon"]
    adjective + adjective + noun   [pos="JJ.*"] [pos="JJ.*"] [lemma="lexicon"]

Table 1: Examples of partly lexicalised term extraction patterns.
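To illustrate how such patterns might be instantiated, the sketch below expands the templates of Table 1 into CQP-style query strings for a given list of candidate lemmas, on the assumption that "lexicon" in the example queries marks the lexicalised slot filled by a candidate lemma. The attribute names (pos, lemma) depend on how the corpus was encoded in CWB, and the helper function is a hypothetical convenience rather than part of TTC TermSuite or of our pipeline.

    # Query templates mirroring Table 1; "{lemma}" marks the lexicalised slot
    # that is filled with one of the top-ranked Up or Down lemmas.
    PATTERN_TEMPLATES = [
        '[pos="JJ.*"] [lemma="{lemma}"]',               # adjective + noun
        '[pos="VVN"] [lemma="{lemma}"]',                # past participle + noun
        '[pos="N.*"] [lemma="{lemma}"]',                # noun + noun
        '[pos="N.*"] [pos="N.*"] [lemma="{lemma}"]',    # noun + noun + noun
        '[pos="N.*"] [pos="IN"] [lemma="{lemma}"]',     # noun + preposition + noun
        '[pos="JJ.*"] [pos="JJ.*"] [lemma="{lemma}"]',  # adjective + adjective + noun
    ]

    def lexicalised_queries(lemmas):
        # Instantiate every template for every candidate lemma; the resulting
        # strings can be issued as queries in a CQP session against the New
        # or Old sub-corpus to retrieve candidate term occurrences.
        return [t.format(lemma=l) for l in lemmas for t in PATTERN_TEMPLATES]

For example, lexicalised_queries(["model"]) yields, among others, the query [pos="JJ.*"] [lemma="model"].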
4 Experiment

As stated earlier, we used the ACL ARC as our dataset. The corpus contains research articles on the topic of human language technology dating back as far as 1965. In our experiments, we use the preprocessed, segmented version of the ACL ARC (i.e., the ACL RD-TEC) provided by QasemiZadeh and Handschuh (2014). Our pilot study is limited to the research publications in the domain of MT. Given our knowledge that MT research has undergone a major paradigm shift since the late 1980s, we want to examine whether our method is able to capture and characterise this paradigm shift.

To prepare the data for the experiments, we extract nouns and adjectives from papers containing either the string "machine translation" or "automatic translation". We divide the corpus into two sets of articles: Old (1960s–70s) and New (1980s onwards). Since New is substantially larger than Old, we randomly reduce the size of the New set in order to make it more comparable to Old. Despite this effort, the two sub-corpora still differ in size and structure: New contains 290,337 nouns and adjectives, whereas Old contains only 79,247.

The extracted lemmas are weighted using Equations 1 and 2. Consequently, four sets of words are generated:

• words that occur only in New (ONLY NEW);
• words that occur only in Old (ONLY OLD);
• words whose rank increases upon the transition from Old to New (UP);
• words whose rank decreases upon the transition from Old to New (DOWN).

The first set—items that occur only in New—is comparatively large and contains 14,347 adjectives and nouns. Old, on the other hand, has 7,094 unique adjectives and nouns. 1,023 lemmas have an increased rank over time, and 2,880 words are subject to a rank decrease. Table 2 details the results by showing the top 15 items in each set of generated words. Table 2a shows words that occur only in New or only in Old. Table 2b, however, shows common words with the largest rank shifts. Note that ONLY NEW and ONLY OLD have been ranked by their assigned DC score (Equation 1), whereas Up and Down are sorted according to the score computed using Equation 2.

(a)
    ONLY NEW: alignment, tag, annotation, database, baseline, ontology, threshold, monolingual, multilingual, search, learning, architecture, engine, n-gram, decoder, tagger
    ONLY OLD: periphrasing, cannonical, transcodage, transcoded, pidgin, sjstem, descri ption, versinn, periphrasin, paragrapher, subroutine, Noninclusive, inclusiveness, quelques

(b)
    UP: word, translation, corpus, model, result, text, method, information, feature, system, approach, set, training, pair, source
    DOWN: language, sentence, structure, analysis, rule, form, problem, semantic, grammar, computer, program, theory, way, possible, dictionary

Table 2: The result obtained from processing and comparing the Old and New sub-corpora. Note that, due to the presence of noise introduced during pre-processing (e.g., OCR), the extracted lists of lexemes also contain invalid lexical units, as can be seen in Table 2a.

In the second step, we select the top 30 plausible noun lemmas from the UP list (shown in Table 2b) and use them for building term extraction patterns (as exemplified in Table 1). This process is also repeated for the top 30 nouns from the DOWN list. The two obtained sets of patterns are employed to extract terms from the New and the Old sub-corpora, respectively. Table 3 provides an overview of the 15 most frequent candidate terms extracted by this method.

    Up terms                          Down terms
    machine translation               natural language
    language model                    deep structure
    translation system                phrase structure
    word sense                        transformational rule
    training datum                    syntactic analysis
    test set                          surface structure
    mt system                         sentence structure
    translation model                 physics problem
    sentence pair                     semantic theory
    statistical machine translation   transformational grammar
    machine translation system        phrase structure grammar
    bleu score                        average number
    parallel corpus                   linguistic theory
    training set                      conversion rule
    english word                      source language

Table 3: Most frequent paradigmatic term candidates extracted using the proposed lexicalised PoS sequence patterns. We consider Up terms and Down terms as indicators of topics that are trending and un-trending, respectively.

Figure 1 reports the precision for the first 300 Up and Down paradigmatic term candidates, obtained by automatically comparing them to the terms annotated in the ACL RD-TEC by QasemiZadeh and Handschuh (2014).

Figure 1: Precision at n for the extracted list of terms using the lexicalised patterns for the Up and Down lemmas (y-axis: precision, 0–1; x-axis: top n terms, n = 50–300).
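The precision curve in Figure 1 can, in essence, be reproduced by checking the ranked candidate lists against a gold term inventory. The following is a minimal sketch, assuming the ACL RD-TEC annotations are available as a set of strings; the function name is ours.

    def precision_at_n(candidates, gold_terms, max_n=300):
        # candidates: extracted candidate terms, ordered by decreasing frequency.
        # gold_terms: set of terms annotated in the reference (here: ACL RD-TEC).
        # Returns a list of (n, precision) pairs for n = 1 .. max_n, i.e. the
        # fraction of the top-n candidates that are attested in the gold list.
        hits, curve = 0, []
        for n, term in enumerate(candidates[:max_n], start=1):
            if term in gold_terms:
                hits += 1
            curve.append((n, hits / n))
        return curve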
5 Evaluation

The 15 lemmas listed in Table 2b (i.e., the DC_{D_i}(t)-ranked lemmas) are presented to 5 researchers in the area of machine translation. The evaluators are asked whether

(a) the individual lemmas in Table 2b are salient for the period they are supposed to represent (New and Old); and,

(b) the lists as a whole contain words that are typical for the mainstream research paradigms in the respective periods.

To investigate (a), participants make binary distinctions (i.e., in each of the Up and Down lists, a lemma is marked either as relevant or irrelevant). To investigate (b), participants are asked to provide a grade indicating the relevance of the lists of terms on a scale from 1 ("list is irrelevant") to 5 ("relevant").

In order to assess whether the DC_{D_i}(t)-based ranking mechanism proposed in this paper (i.e., Equations 1 and 2) outperforms simpler ranking methods, we also construct a baseline data-set: nouns and adjectives in New and Old are sorted by their frequency and then evaluated by the differences in their ranks. The resulting baseline data is given in Table 4. The evaluators are asked to repeat the above-mentioned assessment for this baseline as well, without being aware of how either data-set was produced. Table 5 summarises the results of this evaluation.

    UP            DOWN
    training      transformational
    corpus        routine
    score         force
    probability   picture
    target        location
    pair          numeral
    evaluation    title
    task          reverse
    statistical   geometric
    source        physics
    performance   decimal
    bilingual     personal
    feature       intension
    error         Russian
    sense         storage

Table 4: The baseline lemma list: top 15 lemmas sorted by frequency and rank shifts.

Each row of the sub-tables of Table 5 summarises the input from one of the expert evaluators. The first and the second column in each sub-table show the sum of positively marked Up and Down items—that is, the sum of those lemmas (out of 15) that were found salient for either the 1960s–1970s or the 1980s–2000s (sub-task (a)). The third column presents the overall evaluation of the lists (i.e., sub-task (b)). Table 5a provides the results for the list of lexical items that are ranked using the DC_{D_i}(t) score (i.e., listed in Table 2b). Table 5b provides the assessments for the baseline list (i.e., listed in Table 4).

    (a)                      (b)
    Up   Down   Overall      Up   Down   Overall
    12   10     4:5          15   5      3:5
    12   10     4:5          11   2      3:5
    13   12     4:5          14   11     3:5
    10   10     3:5          11   6      4:5
    3    4      2:5          6    2      3:5

Table 5: Each row summarises the assessment of one of the evaluators. Table 5a shows the results for the sets of lexical items ranked by DC_{D_i}(t) (listed in Table 2b). Table 5b, in contrast, provides the results for the sets of lexical items that are sorted by their raw frequencies (listed in Table 4).

As can be observed in Table 5, the evaluators tend to prefer the DC_{D_i}(t)-ranked lexical items over the baseline data-set. Except for one of the annotators, who suggests that the baseline method provides more informative output (i.e., the last row of Tables 5a and 5b), the evaluators consistently prefer the ranking mechanism proposed in this paper, assigning an overall grade of 3–4 (out of 5) points to its output. However, the difference remains slight.

Table 6 shows the 15 most frequent terms in the Old and the New corpus, respectively. These terms were collected using the manual annotations in the ACL RD-TEC by QasemiZadeh and Handschuh (2014). By comparing these terms to the output of our method (Table 3), we observe considerable differences. Evidently, for the detection of paradigm shifts, terms extracted using semi-lexicalised part-of-speech (PoS) patterns based on our DC_{D_i}(t) method are better indicators of the paradigm shift than terms ranked by their raw frequencies.

    Sub-Corpus Old                Sub-Corpus New
    natural language              machine translation
    machine translation           natural language
    computational linguistics     language processing
    data base                     translation system
    artificial intelligence       target language
    language processing           computational linguistics
    phrase structure              natural language processing
    syntactic analysis            training data
    translation system            source language
    automatic translation         test set
    natural languages             information retrieval
    information retrieval         machine translation system
    noun phrase                   language model
    language understanding        training corpus
    noun phrases                  noun phrase

Table 6: The 15 most frequent terms (two tokens or longer) in the Old and the New sub-corpora. This list was collected using the manual annotations in the ACL RD-TEC and from the documents in the two Old and New sub-corpora.

Figure 2 exemplifies some of the dynamics detected by our method. For each year, the plot shows the frequencies of terms normalised by the sum of all term frequencies extracted from the publications in that year. All plotted terms were among the top items in our Up and Down lists. Up paradigmatic terms are given in blue, whereas Down paradigmatic terms are plotted in black.

Figure 2: Terms mapped onto a timeline. For each year, the y-axis shows the frequencies of terms normalised by the sum of the frequencies of all the terms extracted in that year (x-axis: publication year, 1965–2005; plotted terms: statistical machine translation, bleu score, automatic evaluation, test set, generative grammar, phrase structure grammar, linguistic theory).

Figure 2 illustrates what types of information can be drawn from the analysis conducted here. For example, we observe that "automatic evaluation" rises synchronously with "Bleu score" and is only slightly preceded by "statistical machine translation" itself. We also find that, during the 1980s, references to "linguistic theory" were rather frequent, but they have largely vanished since 1990. Themes such as generative grammar or phrase structure grammar were not dominant even in the earlier decades, but they exhibit a constant decline, at least since the 1990s. Evidently, the plot confirms that our attribution of terms to the categories Up and Down is justified. Moreover, this plot supports our hypothesis that paradigm shifts are lexically expressed by the dynamics of whole groups of related terms.
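The normalisation underlying Figure 2 can be expressed compactly; the following is a minimal sketch, assuming term frequencies have already been aggregated per publication year (the data structures and the function name are illustrative assumptions).

    from collections import defaultdict

    def relative_frequencies_by_year(term_counts_per_year, tracked_terms):
        # term_counts_per_year: {year: {term: frequency}} over all extracted terms.
        # tracked_terms: the Up and Down paradigmatic terms to be plotted.
        # Returns {term: {year: relative frequency}}, where every frequency is
        # divided by the summed frequency of all terms extracted in that year.
        series = defaultdict(dict)
        for year, counts in term_counts_per_year.items():
            total = sum(counts.values())
            if total == 0:
                continue
            for term in tracked_terms:
                series[term][year] = counts.get(term, 0) / total
        return dict(series)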
6 Discussion and future work

For a detailed understanding of the dynamics of science, it is insufficient to measure how "central" or "popular" certain topics are at different periods of time. Instead, those groups of terms that signal paradigm changes must be detected—this is the key idea that motivates the research presented in this paper. The pilot study described here, therefore, aims at showing that terminological methods can be employed to serve this purpose and to provide information for understanding what is going on in a scientific field at a given moment in time.

An inspection of our method's output indicates that the renewal of vocabulary (happening by some words falling from use and others being introduced) is considerable, given the relatively short time span under analysis in our experiments. We observe that the content words shared by the two data sets are, in fact, a minority. However, we also observe that Only New (Table 2a) clearly contains items that are indicative of more recent MT research, such as "alignment", "n-gram" or "decoder". The items that are specific to Only Old, on the other hand, seem to be rather spurious and of low frequency. These lexical units, rather unsurprisingly, disappear upon the transition from Old to New.

Our evaluation also indicates that the lemmas extracted by our method (Table 2b) are indicative of the respective time periods, at least as far as the top ranks are concerned. MT experts prefer the output of our proposed method over the output of the baseline method, perhaps due to the improved coverage of the relevant Down lemmas.

Moreover, the terminological evaluation of the extracted paradigmatic terms (Figure 1) shows that Up lemmas indeed help to extract valid computational linguistics terms. Performance for Down lemmas, however, is consistently worse. This difference in performance, in our opinion, is related to the higher productivity of the Up lemmas from Table 2b: Up lemmas are used in a growing number of more specific and more frequent terms, whereas Down lemmas do not experience a similar increase in frequency and specificity. That is, it is harder to distinguish irrelevant collocations containing Down terms from collocations with terminological value. Hence, term extraction performance for Down terms is worse. We believe that, if this property can be shown to hold in general, it is highly relevant, as it can be used for the extraction of emergent and semantically related terms. Term extraction performance itself can be further improved by integrating standard practices such as stop-word filtering.

Last but not least, a timeline plot of Up and Down paradigmatic terms indicates that Down terms, as expected, do not exhibit the same exponential growth as Up paradigmatic terms. However, we also observe that many relevant terms do not simply fall from use (e.g., the term "linguistic theory"). They may even increase their absolute frequency or become salient again in new or unforeseen contexts.

The local context of terms therefore remains an unexplored factor in trend analysis research. If we look more closely into our data, we find unexpected formulations such as "the language model in the human" or "translation model based on semantic interpretation". Future work will need to address these kinds of dynamics in superficially identical terms, which are even more fine-grained than the rank shifts observed in this pilot study.
Several measures can be taken into consideration for improving our current evaluation method. Future work will also strive for a comparison of multiple sub-corpora that represent time slices of different granularity, perhaps of more similar size and structure. The detection of time periods in which paradigm shifts occur, and a more precise modelling of their interplay with terminological dynamics, are also important topics for future research.

Finally, we would like to mention that an important observation about the dependence of lexical dynamics on frequency has already been made by Arapov and Cherc (1974), who explicitly refer to Zipf:

    The speed of decay . . . can, in a way, be understood as the probability of decay. The higher the ordinal number (rank) of a [word] group . . . , the lower the frequency of the words belonging to that group, the higher is the speed of decay of this group.[3]

[3] Translated from Russian.

It is no surprise that term frequency does play a role in term necrology. However, the formula that we currently use for rank comparison (i.e., Equation 2) does not account for this aspect. Furthermore, the question of how to compare terms whose frequencies differ by orders of magnitude is also as yet unresolved. Future work will address these shortcomings.

Acknowledgements

We thank Mihael Arcan, Iacer Calixto, Peyman Passban, Liling Tan and colleagues for evaluating our data. We would also like to thank Prof. Elke Teich for her comments and advice. This research has been supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through the Cluster of Excellence 'Multimodal Computing and Interaction'.
References

M. V. Arapov and M. M. Cherc. 1974. Matematičeskie metody v istoričeskoj lingvistike. Nauka.

Steven Bird, Robert Dale, Bonnie Dorr, Bryan Gibson, Mark Joseph, Min-Yen Kan, Dongwon Lee, Brett Powley, Dragomir Radev, and Yee Fan Tan. 2008. The ACL Anthology Reference Corpus: A reference dataset for bibliographic research in computational linguistics. In Proceedings of LREC'08, Marrakech, Morocco, May. ELRA.

David Blei, Andrew Ng, and Michael Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research, (3).

Maria Teresa Cabré. 1998. Do we need an autonomous theory of terms? Terminology, 2(5).

Béatrice Daille and Helena Blancafort. 2013. Knowledge-poor and Knowledge-rich Approaches for Multilingual Terminology Extraction. In CICLing.

Patrick Drouin. 2004. Detection of Domain Specific Terminology Using Corpora Comparison. In LREC.

Stefan Evert and Andrew Hardie. 2011. Twenty-first century Corpus Workbench: Updating a query architecture for the new millennium. In Corpus Linguistics.

Ludwik Fleck. 1935. Entstehung und Entwicklung einer wissenschaftlichen Tatsache: Einführung in die Lehre vom Denkstil und Denkkollektiv. Schwabe.

Sonal Gupta and Christopher D. Manning. 2011. Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers. In IJCNLP.

David Hall, Daniel Jurafsky, and Christopher D. Manning. 2008. Studying the History of Ideas Using Topic Models. In EMNLP.

Marita Kristiansen. 2011. Domain dynamics in scholarly areas: How external pressure may cause concept and term changes. Terminology, 17(1).

Thomas S. Kuhn. 1962. The Structure of Scientific Revolutions. University of Chicago.

Joseph Mariani, Patrick Paroubek, Gil Francopoulo, and Olivier Hamon. 2014. Rediscovering 15 Years of Discoveries in Language Resources and Evaluation: The LREC Anthology Analysis. In LREC.

Aurelie Picton. 2011. Picturing short-period diachronic phenomena in specialised corpora: A textual terminology description of the dynamics of knowledge in space technologies. Terminology, 17(1).

Behrang QasemiZadeh and Siegfried Handschuh. 2014. The ACL RD-TEC: A Dataset for Benchmarking Terminology Extraction and Classification in Computational Linguistics. In Computerm.

Francesco Sclano and Paola Velardi. 2007. TermExtractor: A Web Application to Learn the Shared Terminology of Emergent Web Communities. In Enterprise Interoperability II: New Challenges and Approaches. Springer.

Elke Teich, Stefania Degaetano-Ortlieb, Peter Fankhauser, Hannah Kermes, and Ekaterina Lapshinova-Koltunski. 2015. The Linguistic Construal of Disciplinarity: A Data-Mining Approach Using Register Features. J. Assoc. Inf. Sci. Technol.