<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Solving Morphological Analogies Through Generation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Kevin</forename><surname>Chan</surname></persName>
							<email>chan3@etu.univ-lorraine.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Université de Lorraine</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">LORIA</orgName>
								<address>
									<postCode>F-54000</postCode>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Shane</forename><forename type="middle">P</forename><surname>Kaszefski-Yaschuk</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Université de Lorraine</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">LORIA</orgName>
								<address>
									<postCode>F-54000</postCode>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Camille</forename><surname>Saran</surname></persName>
							<email>camille.saran5@etu.univ-lorraine.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Université de Lorraine</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">LORIA</orgName>
								<address>
									<postCode>F-54000</postCode>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Esteban</forename><surname>Marquer</surname></persName>
							<email>esteban.marquer@loria.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Université de Lorraine</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">LORIA</orgName>
								<address>
									<postCode>F-54000</postCode>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Miguel</forename><surname>Couceiro</surname></persName>
							<email>miguel.couceiro@loria.fr</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Université de Lorraine</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">LORIA</orgName>
								<address>
									<postCode>F-54000</postCode>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Solving Morphological Analogies Through Generation</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">BE5E43A86DB9957D0E9EEC1561949F16</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:38+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Morphological analogy</term>
					<term>Analogy solving</term>
					<term>Representation learning</term>
					<term>Word generation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This contribution is a first attempt at solving morphological analogies through generation, instead of relying on retrieval approaches. Our preliminary experiments show promising results for some languages and demonstrate the feasibility of the approach in generating solutions of analogical equations in the morphological setting.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Analogical proportions are understood as statements of the form "𝐴 is to 𝐵 as 𝐶 is to 𝐷", denoted 𝐴 : 𝐵 :: 𝐶 : 𝐷, and they are the basis of analogical inference. Analogical inference is a remarkable capability of human reasoning that has been used to solve hard reasoning tasks. To some extent, it can be thought of as transferring knowledge from a source domain to a different, but somewhat similar, target domain by relying simultaneously on similarities and dissimilarities. Analogy-based reasoning (AR) is closely related to case-based reasoning; it has gained increasing interest from the artificial intelligence (AI) community and has shown its potential in multiple machine learning (ML) tasks such as classification, decision making and recommendation, with competitive results <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4]</ref>. Furthermore, analogical inference can support data augmentation through analogical extension and extrapolation for model learning, especially in environments with few labeled examples <ref type="bibr" target="#b4">[5]</ref>. It has also been successfully applied to several classical NLP tasks such as machine translation <ref type="bibr" target="#b5">[6]</ref>, several semantic and morphological tasks <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>, as well as (visual) question answering and solving puzzles and scholastic aptitude tests <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>.</p><p>There are two basic tasks associated with AR. The first is analogy detection, i.e., the task of deciding whether a quadruple 𝐴, 𝐵, 𝐶, 𝐷 constitutes a valid analogical proportion. This task calls for a common theoretical framework. However, the notion of analogy is not consensual, and there have been several efforts that follow different axiomatic and logical approaches <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref>. For instance, <ref type="bibr" target="#b13">[14]</ref> introduces the following four postulates in the linguistic context as a guideline for formal models of analogical proportions: symmetry (if 𝐴 : 𝐵 :: 𝐶 : 𝐷, then 𝐶 : 𝐷 :: 𝐴 : 𝐵), central permutation (if 𝐴 : 𝐵 :: 𝐶 : 𝐷, then 𝐴 : 𝐶 :: 𝐵 : 𝐷), strong inner reflexivity (if 𝐴 : 𝐴 :: 𝐶 : 𝐷, then 𝐷 = 𝐶), and strong reflexivity (if 𝐴 : 𝐵 :: 𝐴 : 𝐷, then 𝐷 = 𝐵). Such postulates appear reasonable in the word domain, but they can be criticized in other application domains <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref>.</p><p>The second basic task is analogy solving, which refers to the task of extrapolating or generating, for a given triple 𝐴, 𝐵, 𝐶, a value 𝑋 such that 𝐴 : 𝐵 :: 𝐶 : 𝑋 is a valid analogy. One approach to this task is retrieval and adaptation, i.e., selecting an 𝑋 from a pool of retrieved candidate solutions and suitably adapting it. In fact, analogy solving is somewhat related to case-based reasoning (CBR) <ref type="bibr" target="#b16">[17]</ref> where, given a set 𝑃 of problems, a set 𝑆 of solutions and a set 𝒞 of cases (𝑥, 𝑦) ∈ 𝑃 × 𝑆, the CBR task is to find a solution 𝑦 𝑡 to a given target problem 𝑥 𝑡 . 
CBR basically consists in (1) selecting 𝑘 source cases from the case base according to some criteria related to the target problem (retrieval step), and (2) reusing the 𝑘 retrieved cases to propose a target solution (adaptation step). Despite being a reasonable approach in controlled settings, it suffers from several drawbacks: it requires a suitable choice of examples and is intrinsically limited to the available cases, which prevents creative inference and innovation.</p><p>More recent approaches to analogy solving take advantage of deep neural network frameworks that rely on vector representations and on the structure of the underlying multidimensional space. Essentially, analogical proportions are formalized in terms of the parallelogram rule, by which four vectors 𝑒 𝐴 , 𝑒 𝐵 , 𝑒 𝐶 , and 𝑒 𝐷 (representing four elements 𝐴, 𝐵, 𝐶, and 𝐷) are in analogical proportion if 𝑒 𝐷 − 𝑒 𝐶 = 𝑒 𝐵 − 𝑒 𝐴 (see the sketch at the end of this section). This arithmetic view of analogical proportions has been used since the first works on analogy <ref type="bibr" target="#b17">[18]</ref>, and it was the key element in the methodology employed by earlier neural-based approaches <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref>. In the absence of a decoder, these approaches implicitly generate a representation 𝑒 𝑋 and then retrieve the closest candidate 𝐷 from the vocabulary to solve the analogical equation 𝐴 : 𝐵 :: 𝐶 : 𝑋 (see the brief discussion in Subsection 2.2). However, Chen et al. <ref type="bibr" target="#b20">[21]</ref> argue that the latter two methods fall significantly short of human performance.</p><p>In the case of sentence analogies (i.e., where 𝐴, 𝐵, 𝐶 are sentences), <ref type="bibr" target="#b21">[22]</ref> overcomes this issue by training a decoder that is then used to decode 𝑒 𝑋 . In this paper, we employ a similar approach in the setting of word analogies. More precisely, following <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24,</ref><ref type="bibr" target="#b24">25]</ref>, we focus on word morphology and tackle the problem of solving morphological analogies. Inspired by the approach of <ref type="bibr" target="#b21">[22]</ref> for solving sentence analogies, the novelty of our contribution is the use of autoencoders to solve morphological analogies on words. The main contributions of this paper are as follows: (i) we propose a model that generates words at the character level from word embeddings with high reconstruction performance, and (ii) we achieve encouraging results in solving morphological analogies by generation, thus indicating the feasibility of the approach. Nonetheless, this constitutes ongoing research that requires further investigation.</p><p>The paper is organized as follows. We first briefly survey previous work on the two main tasks dealing with morphological analogies in Section 2. We then describe the key components of the deep learning architecture as well as the analogy solving procedure we use in Section 3. The empirical setting is presented in Section 4, where we also discuss the experimental results. We conclude with a general overview of this contribution in Section 5 and propose directions for future research.</p></div>
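To make the arithmetic view concrete, the parallelogram rule and the associated equation solving can be stated in a few lines of code. This is a minimal sketch; the function names, the tolerance, and the use of NumPy are ours, and actual systems apply this in a learned embedding space:

```python
import numpy as np

def is_parallelogram(e_a, e_b, e_c, e_d, tol=1e-6):
    # Arithmetic view: A : B :: C : D holds when the difference
    # vectors coincide, i.e., e_D - e_C = e_B - e_A.
    return np.allclose(e_d - e_c, e_b - e_a, atol=tol)

def parallelogram_solution(e_a, e_b, e_c):
    # The corresponding solution of A : B :: C : X is e_X = e_B - e_A + e_C;
    # retrieval approaches then look up the vocabulary item closest to e_X.
    return e_b - e_a + e_c
```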
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Approaches</head><p>In this paper, we focus on morphological analogies, i.e., analogies on words 𝐴, 𝐵, 𝐶, and 𝐷 that capture morphological transformations of words (e.g., conjugation or declension). In this section we introduce key approaches to analogy detection and solving in morphology. The main trend follows the seminal work of <ref type="bibr" target="#b25">[26]</ref> by exploiting the postulates of analogical proportions mentioned in the introduction, but some works, including ours, take a slightly different route. As deep learning approaches to morphological analogies are strongly related to approaches for semantic word analogies, the latter are also discussed here.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Analogy Detection</head><p>As mentioned above, the analogy detection task corresponds to classifying quadruples 𝐴, 𝐵, 𝐶, 𝐷 into valid or invalid analogies. The tools in <ref type="bibr" target="#b26">[27]</ref> detect morphological analogies using the number of character occurrences and the length of the longest common subword. Their approach is designed to generate analogical grids, i.e., matrices of transformations of various words, similar to paradigm tables in linguistics <ref type="bibr" target="#b6">[7]</ref>. A data-driven alternative was implemented by <ref type="bibr" target="#b7">[8]</ref> for semantic word analogies. Using a dataset of semantic analogies, they train a neural network to classify quadruples 𝐴, 𝐵, 𝐶, 𝐷 into valid or invalid analogies based on their embeddings 𝑒 𝐴 , 𝑒 𝐵 , 𝑒 𝐶 , and 𝑒 𝐷 . This approach was applied to morphological analogies in <ref type="bibr" target="#b23">[24]</ref> by replacing the GloVe <ref type="bibr" target="#b27">[28]</ref> semantic embeddings used by Lim et al. with a morphology-oriented word embedding model.</p></div>
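To illustrate this data-driven detection setting, a quadruple classifier over concatenated embeddings could be sketched as follows. This is a generic sketch, not the exact architecture of [8] or [24]; the embedding dimension and layer choices are our assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

EMB_DIM = 64  # assumed embedding dimension

# Input: the concatenated embeddings e_A, e_B, e_C, e_D of a quadruple.
quad_in = keras.Input(shape=(4 * EMB_DIM,))
hidden = layers.Dense(128, activation="relu")(quad_in)
# Output: probability that the quadruple is a valid analogical proportion.
valid = layers.Dense(1, activation="sigmoid")(hidden)

classifier = keras.Model(quad_in, valid)
classifier.compile(optimizer="adam", loss="binary_crossentropy")
```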
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Analogy Solving</head><p>Approaches to analogy solving usually generate the fourth element of the analogy, but it is also possible to leverage a list of candidates and retrieve the most fitting fourth term. Key retrieval approaches are described in Subsubsection 2.2.1, and key generation approaches for morphological analogies in Subsubsection 2.2.2. Many approaches in embedding spaces rely on retrieval, because generation from an embedding space can be challenging. However, such retrieval approaches are limited to the available vocabulary and are unable to perform analogical innovation, despite it being a key mechanism in the evolution of languages <ref type="bibr" target="#b28">[29,</ref><ref type="bibr" target="#b29">30]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">Retrieval</head><p>Analogy solving on word embeddings has been around since early works on Latent Semantic Analysis <ref type="bibr" target="#b30">[31]</ref> and word embeddings <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b22">23]</ref>, in which examples like 𝑘𝑖𝑛𝑔 − 𝑚𝑎𝑛 + 𝑤𝑜𝑚𝑎𝑛 = 𝑞𝑢𝑒𝑒𝑛 have been used to demonstrate the ability of word representations to encode semantic features. These examples can be formulated as analogical equations 𝑚𝑎𝑛 : 𝑤𝑜𝑚𝑎𝑛 :: 𝑘𝑖𝑛𝑔 : 𝑋, for which the solution is retrieved from a vocabulary of candidate words. In <ref type="bibr" target="#b22">[23]</ref>, the authors use morphological <ref type="foot" target="#foot_0">1</ref> analogies to demonstrate that some word embedding models encode a degree of morphological information. Two of the most widely used methods for solving analogies in embedding spaces by retrieval are 3CosAdd <ref type="bibr" target="#b19">[20]</ref> and 3CosMul <ref type="bibr" target="#b31">[32]</ref>. In 3CosAdd, the solution 𝑋 is retrieved from the vocabulary by maximizing the cosine similarity 𝑐𝑜𝑠(𝑒 𝑤𝑜𝑟𝑑 , 𝑒 𝑋 ), with 𝑒 𝑋 = 𝑒 𝐶 − 𝑒 𝐴 + 𝑒 𝐵 and 𝑒 𝐴 , 𝑒 𝐵 , 𝑒 𝐶 , and 𝑒 𝑋 the embeddings of 𝐴, 𝐵, 𝐶, and 𝑋 (a sketch is given at the end of this subsection). 3CosMul follows a similar intuition; we refer the reader to <ref type="bibr" target="#b31">[32]</ref> for a detailed description. However, the quality of the solutions produced by the methods described above has been criticized by <ref type="bibr" target="#b18">[19]</ref> for being far from human performance in some cases. Nonetheless, frameworks based on analogy datasets like those mentioned in <ref type="bibr" target="#b7">[8]</ref> appear to bridge this gap in performance. By replacing the fixed arithmetic formula with a learned estimator, Lim et al. significantly improved performance on solving semantic word analogies. This latter approach was adapted to morphological word analogies in <ref type="bibr" target="#b24">[25]</ref> and outperforms the generative methods described in Subsubsection 2.2.2. These two approaches rely on the postulates of analogical proportions and achieve high analogy solving performance.</p><p>While the approach of Marquer et al. achieves state-of-the-art performance on solving analogical equations in morphology, it suffers from the limitations of retrieval approaches: the solutions are retrieved from a de facto finite vocabulary, and analogical innovation is impossible. By using a generative deep learning model, the present work aims to maintain state-of-the-art performance while removing this limitation of retrieval approaches.</p></div>
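A minimal sketch of 3CosAdd-style retrieval over a matrix of vocabulary embeddings follows. The variable names are ours, and excluding 𝐴, 𝐵, and 𝐶 from the candidates, as is common practice, is omitted for brevity:

```python
import numpy as np

def three_cos_add(e_a, e_b, e_c, vocab_emb):
    """Return the index of the vocabulary word whose embedding is the
    closest (in cosine similarity) to e_X = e_C - e_A + e_B."""
    e_x = e_c - e_a + e_b
    sims = vocab_emb @ e_x / (
        np.linalg.norm(vocab_emb, axis=1) * np.linalg.norm(e_x))
    return int(np.argmax(sims))
```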
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">Generation</head><p>In <ref type="bibr" target="#b32">[33]</ref>, the author uses the postulates of <ref type="bibr" target="#b25">[26]</ref> to address multiple characteristics of words, such as their length and the occurrence of letters and patterns. Based on these features, Lepage proposes an algorithm to solve analogies between character strings. Following the results of <ref type="bibr" target="#b33">[34]</ref> on closed-form solutions, the Alea algorithm <ref type="bibr" target="#b5">[6]</ref> proposes a Monte-Carlo estimation of the solutions of an analogical equation by sampling among multiple sub-transformations. Those sub-transformations are obtained by considering the words as bags of characters and generating permutations of the characters that are present in 𝐵 but not in 𝐴 on one side, and of the characters of 𝐶 on the other. Intuitively, if we denote by 𝑏𝑎𝑔(𝐴) the bag of characters in 𝐴, Alea considers 𝑏𝑎𝑔(𝐷) = (𝑏𝑎𝑔(𝐵) − 𝑏𝑎𝑔(𝐴)) + 𝑏𝑎𝑔(𝐶), and thus 𝐷 is a permutation of the characters of 𝑏𝑎𝑔(𝐷) (see the sketch at the end of this subsection). Recently, a more empirical approach, Kolmo, was proposed by <ref type="bibr" target="#b8">[9]</ref>; it does not rely on the axioms of analogical proportions. The generation model proposed by the authors considers some transformation 𝑓 such that 𝐵 = 𝑓 (𝐴) and 𝑓 (𝐶) is computable. The simplest transformation 𝑓 is usually the one humans use to solve analogies <ref type="bibr" target="#b8">[9]</ref>, and it is found by minimizing the Kolmogorov complexity of 𝑓 . This complexity is estimated by first expressing 𝑓 in a language of operations (insertion, deletion, etc.), and then computing the length of the resulting program. Unlike Alea, Kolmo is able to handle mechanisms like reduplication (repeating part of a word).</p><p>Recently, <ref type="bibr" target="#b21">[22]</ref> proposed a generation framework for solving sentence analogies. They use an autoencoder model (named ConRNN) trained to reconstruct sentences, and perform simple vector arithmetic on the resulting sentence embeddings. First, a sentence (as a sequence of words) is used as input to an encoder RNN, and the last hidden state of the RNN is used as the sentence embedding. The latter is then fed to a decoder RNN that tries to predict the words of the input sentence. The use of a generative model achieves significantly better results than previous retrieval approaches on the same embedding space. The current work aims to extend that of <ref type="bibr" target="#b24">[25]</ref> by replacing retrieval with generation of the solution of morphological analogical equations. To do so, it is necessary to generate words at the character level from fixed-size embeddings; however, to the best of our knowledge, no approach in the literature tackles this specific issue. Inspired by the success of <ref type="bibr" target="#b21">[22]</ref>, we propose a character-level autoencoder for words and report its performance in solving morphological analogies.</p></div>
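The bag-of-characters computation at the core of Alea can be sketched with Python's Counter. Only the multiset construction is shown; the Monte-Carlo sampling of permutations and sub-transformations from [6] is omitted:

```python
from collections import Counter

def alea_bag(a: str, b: str, c: str) -> Counter:
    # bag(D) = (bag(B) - bag(A)) + bag(C), with words seen as multisets
    return Counter(b) - Counter(a) + Counter(c)

# For dog : dogs :: cat : X, alea_bag yields {'c', 'a', 't', 's'},
# and the expected solution "cats" is one of its permutations.
```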
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Our Approach</head><p>In this section we present our approach, illustrated in Figure <ref type="figure" target="#fig_0">1</ref>. The architecture of our model is a character-level sequence-to-sequence autoencoder, based on the model described in <ref type="bibr" target="#b34">[35]</ref>. In order to properly decode the final vector solution, the model is trained to encode words and then decode the resulting vector back into the same word. Each character of a word 𝑤 is encoded into a one-hot vector and fed into the encoder, which uses a Bidirectional Long Short-Term Memory (BiLSTM) layer. This layer outputs four vectors: the last hidden state ℎ 𝑓 and cell state 𝑐 𝑓 in the forward direction, and similarly ℎ 𝑏 and 𝑐 𝑏 in the backward direction. The concatenation of these vectors, 𝑒 𝑤 = concat(ℎ 𝑓 , ℎ 𝑏 , 𝑐 𝑓 , 𝑐 𝑏 ), is the embedding of the word. The decoder is a regular LSTM layer, followed by a dense layer with softmax activation. The input for the first step of the decoder is the above-mentioned embedding, split into two states ℎ = concat(ℎ 𝑓 , ℎ 𝑏 ) and 𝑐 = concat(𝑐 𝑓 , 𝑐 𝑏 ). During training, we use teacher forcing: (i) the characters of the word 𝑤 to predict are used as input, with a beginning-of-word (BOW) character added at the beginning; (ii) the prediction targets are the characters of 𝑤, shifted ahead by one time-step and with an end-of-word (EOW) character at the end.</p><p>To compute the solution of an analogy 𝐴 : 𝐵 :: 𝐶 : 𝑋, the embeddings 𝑒 𝐴 , 𝑒 𝐵 , and 𝑒 𝐶 are computed by the encoder and used to compute 𝑒 𝑋 = 𝑒 𝐵 − 𝑒 𝐴 + 𝑒 𝐶 . Then, 𝑒 𝑋 is decoded into a word 𝑋 by the decoder. Beginning with the BOW character, at each time-step the character with the highest predicted probability is appended to the word, until either the EOW character is predicted or the word reaches the length of the longest word in the dataset.</p></div>
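The encoder-decoder just described can be sketched in Keras, in the spirit of [35]. This is a minimal sketch under our assumptions on sizes; with the hyperparameters of Subsection 4.2, a per-direction hidden size of 32 yields the 128-dimensional embedding:

```python
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 40  # assumed number of distinct characters, incl. BOW and EOW
HIDDEN = 32      # per-direction LSTM size, so e_w is 4 * 32 = 128-dimensional

# Encoder: one-hot characters -> BiLSTM -> last states (h_f, c_f, h_b, c_b).
enc_in = keras.Input(shape=(None, VOCAB_SIZE))
_, h_f, c_f, h_b, c_b = layers.Bidirectional(
    layers.LSTM(HIDDEN, return_state=True))(enc_in)
e_w = layers.Concatenate()([h_f, h_b, c_f, c_b])  # the word embedding e_w

# Decoder: teacher-forced input (BOW + characters of w), with initial state
# obtained by splitting e_w into h = concat(h_f, h_b) and c = concat(c_f, c_b).
dec_in = keras.Input(shape=(None, VOCAB_SIZE))
h0 = layers.Concatenate()([h_f, h_b])
c0 = layers.Concatenate()([c_f, c_b])
dec_seq = layers.LSTM(2 * HIDDEN, return_sequences=True,
                      dropout=0.1)(dec_in, initial_state=[h0, c0])
probs = layers.Dense(VOCAB_SIZE, activation="softmax")(dec_seq)

autoencoder = keras.Model([enc_in, dec_in], probs)
```

At inference time, the encoder part yields 𝑒 𝐴 , 𝑒 𝐵 , and 𝑒 𝐶 , the arithmetic 𝑒 𝑋 = 𝑒 𝐵 − 𝑒 𝐴 + 𝑒 𝐶 is applied, and the decoder greedily emits characters from 𝑒 𝑋 until EOW or the maximum word length is reached.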
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head><p>In this section we present our experimental setup. First, the dataset we use is described in Subsection 4.1. We then report the performance of our model in the autoencoder setting in Subsection 4.2. The analogy solving performance of our approach is compared with baselines in Subsection 4.3. Finally, we discuss the overall performance of the model in Subsection 4.4.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Datasets</head><p>For our experiments, we used the analogies of 8 languages available in the Siganalogies dataset <ref type="bibr" target="#b35">[36]</ref>: Arabic, English, French, German, Hungarian, Portuguese, Russian, and Spanish, extracted from the high-resource languages of Sigmorphon2019 <ref type="bibr" target="#b36">[37]</ref>. These languages were chosen so that, in later stages of the work, the authors would have enough linguistic knowledge to interpret the model outputs. To obtain train and test sets, non-overlapping random subsets of analogies are taken from the entire Sigmorphon dataset for a given language, ensuring that no analogy is seen in both the train and test sets.</p><p>The Siganalogies dataset also provides a method for data augmentation by permuting the four words of a given analogy. These permutations are obtained using the symmetry and central permutation postulates of analogy. From a base form 𝐴 : 𝐵 :: 𝐶 : 𝐷, we generate 7 permutations (see also the sketch at the end of this subsection):</p><p>• 𝐴 : 𝐶 :: 𝐵 : 𝐷; • 𝐷 : 𝐵 :: 𝐶 : 𝐴; • 𝐶 : 𝐴 :: 𝐷 : 𝐵; • 𝐶 : 𝐷 :: 𝐴 : 𝐵; • 𝐵 : 𝐴 :: 𝐷 : 𝐶;</p><p>• 𝐷 : 𝐶 :: 𝐵 : 𝐴;</p><p>• 𝐵 : 𝐷 :: 𝐴 : 𝐶.</p><p>In Siganalogies, the base forms 𝐴 : 𝐵 :: 𝐶 : 𝐷 are such that 𝐵 is an inflected form of 𝐴 and 𝐷 is inflected from 𝐶. In addition, base forms 𝐴 : 𝐴 :: 𝐵 : 𝐵 derived from the identity postulate (𝐴 : 𝐴 :: 𝐵 : 𝐵 is true for all 𝐴 and 𝐵) are present. An example analogy in English is dog : dogs :: cat : cats, another in French is revérifier : revérifiasse :: tormenter : tormentasse, and in German there is Donor : Donor :: Herstellungsverfahren : Herstellungsverfahren (identity, but also accusative singular declension of the noun).</p></div>
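The augmentation amounts to a simple reordering function; a minimal sketch whose enumeration matches the list above:

```python
def augment(a, b, c, d):
    """The 7 additional quadruples derived from the base form A : B :: C : D
    by composing the symmetry and central permutation postulates."""
    return [
        (a, c, b, d),  # central permutation
        (d, b, c, a),
        (c, a, d, b),
        (c, d, a, b),  # symmetry
        (b, a, d, c),
        (d, c, b, a),
        (b, d, a, c),
    ]
```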
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Autoencoder performance</head><p>As shown in Table <ref type="table" target="#tab_0">1</ref>, our autoencoder achieves very high accuracy in decoding vectors back into words, meaning that wrong solutions result from the operations performed on the encoded vectors rather than from the decoding process. In our experiments, the model encodes words as 128-dimensional vectors, and a dropout of 0.1 is applied to the decoder LSTM layer. The loss function is categorical cross-entropy, since a probability distribution over the characters is required at each time-step. An 80/20 train/validation split is used, with validation loss as the early-stopping criterion. If early stopping is not triggered, the model is trained for 100 epochs.</p></div>
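This training setup could look as follows in Keras, reusing the autoencoder sketched in Section 3. The optimizer and the early-stopping patience are our assumptions, as they are not stated above, and enc_x, dec_x, and dec_y are hypothetical names for the one-hot inputs and shifted targets:

```python
from tensorflow import keras

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",           # validation loss as the stopping criterion
    patience=5,                   # assumed patience value
    restore_best_weights=True)

autoencoder.compile(optimizer="adam",  # assumed optimizer
                    loss="categorical_crossentropy")

autoencoder.fit([enc_x, dec_x], dec_y,
                validation_split=0.2,  # the 80/20 train/validation split
                epochs=100,            # upper bound if early stopping never fires
                callbacks=[early_stop])
```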
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Analogy solving performance</head><p>Two metrics are used to assess the performance of our model. The first one is based on the Levenshtein distance, which is the minimum number of edits required to change one sequence into another using insertions, deletions, and substitutions. To measure how close the decoded analogy solutions are to the expected ones, the Levenshtein distance 𝐿 is normalized by the length of the manipulated words into a percentage 𝐿 𝑝 as follows:</p><formula xml:id="formula_0">𝐿 𝑝 (𝑒𝑥𝑝𝑒𝑐𝑡𝑒𝑑, 𝑑𝑒𝑐𝑜𝑑𝑒𝑑) = 1 − 𝐿(𝑒𝑥𝑝𝑒𝑐𝑡𝑒𝑑, 𝑑𝑒𝑐𝑜𝑑𝑒𝑑) / max(𝑙𝑒𝑛 𝑒𝑥𝑝𝑒𝑐𝑡𝑒𝑑 , 𝑙𝑒𝑛 𝑑𝑒𝑐𝑜𝑑𝑒𝑑 )</formula><p>Table 2: Results for 8 languages for 10,000 base analogies and all of their permutations (80,000 analogies in total). We report 𝐿 𝑝 in % and the accuracy (Acc.) in %. Our autoencoder was trained for 100 epochs on 40,000 random words per language. The baselines Alea <ref type="bibr" target="#b5">[6]</ref> and Kolmo <ref type="bibr" target="#b8">[9]</ref> were tested in the same setting. The accuracy of the retrieval model ANNr <ref type="bibr" target="#b24">[25]</ref> is reported as mean ± standard deviation over 10 random initializations; note that these results are not fully comparable with our approach, as they were obtained in a closed setting.</p><p>The resulting percentage measures the rate of correctly decoded characters per word: when it equals 1 (i.e., 100%), the decoded solution matches the expected solution perfectly. The second metric, accuracy, is computed by dividing the number of correctly decoded analogies by the total number of analogies for each language. We report the results of decoding a test set of 10,000 base analogies and their permutations in Table <ref type="table">2</ref>. We compare our approach with Alea <ref type="bibr" target="#b5">[6]</ref> and Kolmo <ref type="bibr" target="#b8">[9]</ref>, described in Subsubsection 2.2.2. We also report the retrieval accuracy of ANNr <ref type="bibr" target="#b24">[25]</ref>; however, as ANNr is a retrieval approach, it is not directly comparable to our model and the other baselines. Instead, it indicates the performance one can reach when bypassing the issue of generation. Our model reaches performance comparable to the generation baselines in terms of 𝐿 𝑝 for all languages, and comparable accuracy for half of the languages.</p></div>
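For reference, 𝐿 𝑝 can be computed with a textbook dynamic-programming edit distance; a minimal sketch with function names of our choosing:

```python
def levenshtein(s: str, t: str) -> int:
    # Classic DP edit distance: insertions, deletions, substitutions.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

def l_p(expected: str, decoded: str) -> float:
    # Normalized Levenshtein similarity: 1.0 means a perfect match.
    return 1 - levenshtein(expected, decoded) / max(len(expected), len(decoded))
```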
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Discussion</head><p>The performance of all generation models on Arabic is very low, while the retrieval model does not appear to suffer from the same effect. Further analysis of the data reveals that the character encoding used for Arabic decomposes each character into multiple encoded characters, resulting in longer and more complex character sequences than expected. We suspect this makes generation harder and is the cause of the low performance.</p><p>There is a significant difference in the performance of the model depending on which permutations are used. Due to the high accuracy of the decoder and the nature of the parallelogram rule, the model performs very well on analogies where the solution 𝐷 is identical to another element of the analogical equation. Permutations of this form include those corresponding to strong reflexivity, strong inner reflexivity, and identity.</p><p>As Table <ref type="table">2</ref> shows, the raw accuracy is often lower than that of the baselines Alea and Kolmo, but the Levenshtein percentage is often on par or higher. This suggests that, on average, our model correctly decodes more individual characters than the baselines, but decodes entire words perfectly less often. This is to be expected, as the model is not trained to solve analogies, but rather to properly decode words after vector arithmetic is performed on the encoded vectors.</p><p>When applied to the encoded vectors, the parallelogram rule is highly accurate for regular morphology and for certain permutations, but it often struggles when the morphology is more irregular. Since the model decodes individual words with high accuracy, the problem lies with the operations performed on the three encoded vectors of the analogical equation. Given that the model reaches its current performance without explicitly encoding any morphological features or properties of analogical equations during training, we expect that better performance can be obtained by including such features in future iterations of the autoencoder.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion and Perspectives</head><p>In this paper we proposed an autoencoder framework for solving morphological analogies by generating solutions. This partially addresses the limitations of previous works relying on case-based approaches, which prevent creative inference and innovation. Our adaptation to the morphology setting was illustrated on several languages with promising results that reveal new potential directions for future work.</p><p>However, this is a preliminary proposal that will benefit from further training and from combination with state-of-the-art retrieval approaches such as ANNr from <ref type="bibr" target="#b24">[25]</ref>. Moreover, we will also explore its transferability and its generalization across multiple modalities and data contexts.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Character-based word autoencoder and vector arithmetic to solve analogies</figDesc><graphic coords="5,109.99,79.19,392.01,261.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Autoencoder accuracy at the word level for 8 languages, trained for 100 epochs on 40,000 random words.</figDesc><table><row><cell>Language</cell><cell>Accuracy (%)</cell></row><row><cell>Arabic</cell><cell>99.99</cell></row><row><cell>English</cell><cell>99.98</cell></row><row><cell>French</cell><cell>99.99</cell></row><row><cell>German</cell><cell>99.98</cell></row><row><cell>Hungarian</cell><cell>99.97</cell></row><row><cell>Portuguese</cell><cell>99.99</cell></row><row><cell>Russian</cell><cell>99.96</cell></row><row><cell>Spanish</cell><cell>99.98</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">In<ref type="bibr" target="#b22">[23]</ref> the authors refer to morphological transformations as syntactic transformations, because they refer to the syntactic role of the word (e.g., past participle) and not the arrangement of its morphemes (e.g., the addition of the suffix "-ed").</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This research work was partially supported by TAILOR, a project funded by the EU Horizon 2020 research and innovation program under GA No. 952215, and by the Inria Project Lab "Hybrid Approaches for Interpretable AI" (HyAIAI).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Learning to rank based on analogical reasoning</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Fahandar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hüllermeier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2951" to="2958" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Analogical embedding for analogy-based learning to rank</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Fahandar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hüllermeier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IDA</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">12695</biblScope>
			<biblScope unit="page" from="76" to="88" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Analogical proportion-based methods for recommendation -first investigations</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Richard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Serrurier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Fuzzy Sets Systems</title>
		<imprint>
			<biblScope unit="volume">366</biblScope>
			<biblScope unit="page" from="110" to="132" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Abstraction and analogy-making in artificial intelligence</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mitchell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Ann. N.Y. Acad. Sci</title>
		<imprint>
			<biblScope unit="volume">1505</biblScope>
			<biblScope unit="page" from="79" to="101" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Analogy-preserving functions: A way to extend boolean samples</title>
		<author>
			<persName><forename type="first">M</forename><surname>Couceiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Hug</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Richard</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">26th IJCAI</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1575" to="1581" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Improvements in analogical learning: Application to translating multi-terms of the medical domain</title>
		<author>
			<persName><forename type="first">P</forename><surname>Langlais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Yvon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Zweigenbaum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">12th EACL, ACL</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="487" to="495" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Morphological predictability of unseen words using computational analogy</title>
		<author>
			<persName><forename type="first">R</forename><surname>Fam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">24th ICCBR workshops</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="51" to="60" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Solving word analogies: A machine learning perspective</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Richard</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">15th ECSQARU</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">11726</biblScope>
			<biblScope unit="page" from="238" to="250" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Solving analogies on words based on minimal complexity transformation</title>
		<author>
			<persName><forename type="first">P.-A</forename><surname>Murena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Al-Ghossein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-L</forename><surname>Dessalles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cornuéjols</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">29th IJCAI</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1848" to="1854" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Visalogy: Answering visual analogy questions</title>
		<author>
			<persName><forename type="first">F</forename><surname>Sadeghi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L</forename><surname>Zitnick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Farhadi</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
			<publisher>NeurIPS</publisher>
			<biblScope unit="page" from="1882" to="1890" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Detecting unseen visual relations using analogies</title>
		<author>
			<persName><forename type="first">J</forename><surname>Peyre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Laptev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Schmid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sivic</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE ICCV</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1981" to="1990" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Analogy and formal languages</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">6th CFG and 7th CML</title>
				<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="page" from="180" to="191" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Analogical dissimilarity: Definition, algorithms and two experiments in machine learning</title>
		<author>
			<persName><forename type="first">L</forename><surname>Miclet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bayoudh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Delhay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">JAIR</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="793" to="824" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
		<title level="m">De l&apos;analogie rendant compte de la commutation en linguistique</title>
				<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
		<respStmt>
			<orgName>Université Joseph-Fourier -Grenoble I</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Habilitation à diriger des recherches</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Antic</surname></persName>
		</author>
		<title level="m">Analogical proportions</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Analogy between concepts</title>
		<author>
			<persName><forename type="first">N</forename><surname>Barbot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Miclet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">275</biblScope>
			<biblScope unit="page" from="487" to="539" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">When Revision-Based Case Adaptation Meets Analogical Extrapolation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lieber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Nauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Prade</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">29th ICCBR</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">12877</biblScope>
			<biblScope unit="page" from="156" to="170" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A model for analogical reasoning</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Rumelhart</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Abrahamson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cognitive Psychology</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="1" to="28" />
			<date type="published" when="1973">1973</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Drozd</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gladkova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Matsuoka</surname></persName>
		</author>
		<title level="m">Word embeddings, analogies, and machine learning: Beyond king -man + woman = queen</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="3519" to="3530" />
		</imprint>
	</monogr>
	<note>26th COLING</note>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Efficient estimation of word representations in vector space</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Corrado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Dean</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">1st ICLR, Workshop Track</title>
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Evaluating vector-space models of analogy</title>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Peterson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Griffiths</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">39th CogSci, Cognitive Science Society</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1746" to="1751" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Vector-to-sequence models for sentence analogies</title>
		<author>
			<persName><forename type="first">L</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ICACSIS</title>
		<imprint>
			<biblScope unit="page" from="441" to="446" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Linguistic regularities in continuous space word representations</title>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-T</forename><surname>Yih</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zweig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NAACL</title>
		<imprint>
			<biblScope unit="page" from="746" to="751" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A neural approach for detecting morphological analogies</title>
		<author>
			<persName><forename type="first">S</forename><surname>Alsaidi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Decker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Marquer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-A</forename><surname>Murena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Couceiro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 8th DSAA</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">A Deep Learning Approach to Solving Morphological Analogies</title>
		<author>
			<persName><forename type="first">E</forename><surname>Marquer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Alsaidi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Decker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-A</forename><surname>Murena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Couceiro</surname></persName>
		</author>
		<ptr target="https://hal.inria.fr/hal-03660625" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Saussurian analogy: a theoretical account and its application</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ando</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">16th COLING</title>
				<imprint>
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Tools for the production of analogical grids and a resource of n-gram analogical grids in 11 languages</title>
		<author>
			<persName><forename type="first">R</forename><surname>Fam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">11th LREC</title>
				<imprint>
			<publisher>ELRA</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1060" to="1066" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Glove: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EMNLP</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<title level="m" type="main">Analogy and morphological change</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">L</forename><surname>Fertig</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>Edinburgh University Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Analogy in Word-formation</title>
		<author>
			<persName><forename type="first">E</forename><surname>Mattiello</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>De Gruyter Mouton</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Using latent semantic to improve access to textual information</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Dumais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">W</forename><surname>Furnas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Landauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Deerwester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Harshman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGCHI</title>
		<imprint>
			<biblScope unit="page" from="281" to="285" />
			<date type="published" when="1988">1988</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Dependency-based word embeddings</title>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Goldberg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Short Papers), ACL</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="302" to="308" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Character-position arithmetic for analogy questions between word forms</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lepage</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">25th ICCBR workshops</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">2028</biblScope>
			<biblScope unit="page" from="23" to="32" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Yvon</surname></persName>
		</author>
		<title level="m">Finite-state transducers solving analogies on words</title>
				<imprint>
			<publisher>Rapport GET/ENST&amp;LTCI</publisher>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Character-level recurrent sequence-to-sequence model</title>
		<author>
			<persName><forename type="first">F</forename><surname>Chollet</surname></persName>
		</author>
		<ptr target="https://keras.io/examples/nlp/lstm_seq2seq/" />
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<title level="m" type="main">Siganalogies -morphological analogies from Sigmorphon</title>
		<author>
			<persName><forename type="first">E</forename><surname>Marquer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Couceiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Alsaidi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Decker</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016. 2019. 2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">The SIGMORPHON 2019 shared task: Morphological analysis in context and cross-lingual transfer for inflection</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Mccarthy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Vylomova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Malaviya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Wolf-Sonkin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nicolai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Kirov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Silfverberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Mielke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Heinz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cotterell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hulden</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">16th CRPPM workshops, ACL</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="229" to="244" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
