Applying MIPVU Metaphor Identification Procedure on Czech
                                   Dalibor Pavlas, Ondřej Vrabeľ, Jiří Kozmér
                                                Palacký University Olomouc
                                           Křížkovského 511/8, 771 47 Olomouc
                         dalibor.pavlas@gmail.com, ondra.vrabel@seznam.cz, jiri.kozmer@gmail.com

                                                              Abstract
This paper presents the current state of a research project aimed at adapting the MIPVU protocol for metaphor annotation to Czech-language texts. Three annotators were trained in the MIPVU metaphor identification procedure and annotated two short text excerpts of about 600 tokens each; the reliability of the annotation was then measured using Fleiss’ kappa. The resulting inter-annotator agreement of 0.70 was below the kappa values reported by the annotators of the VU Amsterdam Metaphor Corpus (Steen et al., 2010) and very similar to the agreement that Badryzlova et al. (2013) obtained in their first reliability test, in which the unmodified MIPVU procedure was applied to Russian texts. Some modifications of the annotation procedure are proposed to make it more suitable for Czech. The modifications are based on observations made by the annotators during error analysis and by the authors of similar projects that transferred the MIPVU procedure to Slavic or other inflected languages. The effect of these refinements now has to be verified in a second reliability test.

Keywords: Metaphor, MIPVU, MIP, annotation, Metaphor Identification Procedure, inter-annotator agreement, Fleiss’ kappa, Czech
language
1. Introduction

This paper presents the current state of a research project aimed at adapting the MIPVU protocol for metaphor annotation to Czech-language texts. It is the initial stage in the creation of a Czech metaphor corpus, which could be a valuable resource for several fields of linguistic research (such as computational, cognitive and corpus linguistics).

This initial stage includes:
     1) Modification of the MIPVU protocol for reliable linguistic metaphor identification in Czech
     2) Introduction of an alternative tag (placed in parallel to the original MIPVU tags) which, if needed, will allow us to filter out highly conventionalized cases of metaphor.

The process of modifying the MIPVU procedure is described in the following parts of this work. The addition of the alternative tag for highly conventionalized metaphors is motivated by the desire to use the resulting corpus for training systems for automatic metaphor identification.

Lexicalized cases of metaphor can be successfully interpreted using standard word sense disambiguation techniques (Shutova, 2015), which means that labelling them as metaphorical in the training data may make a metaphor identification system less effective.

Our goal is to keep the data for metaphor usage statistics directly comparable with the statistics available for English and, at the same time, to make the resulting corpus more suitable for computational approaches to metaphor.

2. Related work

2.1 MIP and MIPVU

Since the early 1980s, when conceptual metaphor theory (CMT; Lakoff and Johnson, 1980) was introduced, there has been great interest in metaphor research. At the same time, metaphor is a very complex phenomenon, even if we take into account only its manifestation in language. It ranges from novel and very creative expressions to extremely lexicalized ones whose metaphoricity is almost unnoticeable. This created a need for clearly defined guidelines for metaphor identification in text, but due to the complexity of the task such a procedure was not established until 2007. It was developed by a group of researchers who called themselves the Pragglejaz Group.

Their method, called MIP (Metaphor Identification Procedure; Pragglejaz Group, 2007), was then refined in several ways and applied to data from the British National Corpus. The refined procedure is called MIPVU and the resulting annotated resource is the VU Amsterdam Metaphor Corpus (VUAMC; Steen et al., 2010). It consists of approximately 200,000 words taken from the BNC’s Baby Corpus and is divided into four genres: academic, news, fiction, and conversation.

In MIPVU, lexical units (words) whose contextual meanings contrast with their basic meanings are considered metaphor-related words (MRWs). Annotators establish the basic and the contextual meaning of each word in the corpus using a dictionary.
If the basic meaning of a word, compared with its contextual meaning, is:
a) more concrete; what it evokes is easier to imagine, see, hear, feel, smell and taste;
b) related to bodily action;
c) more precise (as opposed to vague);
the word is marked as an MRW.
The history of a lexical unit is usually not taken into account, which is one of the differences between MIP and MIPVU.

2.2 Applications of MIPVU to different languages

Yulia Badryzlova and her colleagues (2013) modified the MIPVU protocol for Russian-language texts and attempted to extend the annotation to the level of conceptual mappings (“deep annotation”).
They measured inter-annotator agreement on texts annotated with the original MIPVU and with their modified version, and compared it with the results of the same tests carried out by Steen and his colleagues (2010) while establishing the MIPVU procedure. In the second test, their inter-annotator agreement exceeded the agreement reported for VUAMC. The project was then discontinued, but recently Badryzlova and Lyashevskaya
(2017) renewed the effort to create a Russian metaphor corpus. They used an annotation procedure based on MIPVU but modified in several ways. In their project, linguistic metaphor annotation is added as a new layer to SynTagRus, the Russian syntactic dependency treebank.

Justina Urbonaitė (2015) examined metaphors of law-related concepts in English and Lithuanian using the MIPVU procedure for annotation. Although she could not report inter-annotator agreement, being the only annotator, her work offered very useful remarks on applying MIPVU to an inflected language.

For the current stage of our project we use a model similar to the work of Badryzlova et al. (2013) and try to make use of the findings and observations from all three of the above-mentioned sources.

3. Reliability test

We annotated two text excerpts of about 600 tokens each. The first excerpt (598 tokens) belongs to the fiction genre and was taken from the short story “Zasraný vánoce” by Michal Viewegh. The second one (611 tokens) was taken from a transcription of proceedings of the European Parliament. These transcriptions are available from the parallel corpus InterCorp (Rosen et al., 2017), which is a part of the Czech National Corpus project.

The Dictionary of the Standard Czech Language (Vácha et al., 1971; commonly abbreviated SSJČ) and the Dictionary of Standard Czech (Kroupová et al., 2005; SSČ) were used to establish basic meanings.

Two of the three annotators were Ph.D. students and the third was a Master's student, all of them in the field of linguistics and with prior experience in conceptual metaphor studies.

The reliability of the annotation was measured using Fleiss' kappa, a statistical measure of inter-annotator agreement which corrects for chance agreement between analysts (Artstein and Poesio, 2008).

In this first reliability test, the annotators were trained in the MIPVU protocol and instructed to follow it. The annotation was performed in a manner similar to the reliability tests carried out during the making of VUAMC, which means that the annotators worked only with plain text and marked each lexical unit with either 1 (MRW) or 0 (non-MRW). The Fleiss' kappa calculation and the identification of the cases of disagreement were carried out by a Python program designed specifically for this task.
The results can be seen in Tab. 1.

                       Percentage unanimous
Text        Tokens     Not MRW    MRW      Total     Fleiss’ κ
Viewegh     598        87.46      4.85     92.31     0.65
Europarl    611        76.76      10.97    87.73     0.72
Total Fleiss’ κ                                      0.70

       Table 1: Resultant inter-annotator agreement

The minimum acceptable thresholds for Fleiss' kappa are commonly stated to be 0.67, 0.7 or 0.8 (Artstein and Poesio, 2008; Badryzlova et al., 2013). More important is the comparison of our inter-annotator agreement with the agreement observed for VUAMC and in the work of Badryzlova et al. (2013). See the comparison in Tab. 2.

Project                                  Source                      Annotators   Tokens          Reliability test   Agreement
Applying MIPVU on Czech                  (this paper)                3            1209            test 1             0.70
Russian corpus of conceptual metaphor    (Badryzlova et al. 2013)    3            approx. 2000    test 1             0.68
Russian corpus of conceptual metaphor    (Badryzlova et al. 2013)    3            approx. 2000    test 2             0.90
VU Amsterdam Metaphor Corpus             (Steen et al. 2010)         4            1921            test 6             0.85

       Table 2: Comparison of inter-annotator agreement in other MIPVU projects

The comparison shows that our kappa is still below the desired values and very similar to the agreement that Badryzlova and her colleagues obtained in their first reliability test with the unmodified MIPVU procedure.

4. Error analysis and proposed modifications

4.1 Cases of disagreement

Table 3 shows the number of disagreements for both annotated texts, in total and broken down by part of speech. In both excerpts, the part of speech with the most disagreement was verbs, followed by prepositions in the fiction text by Michal Viewegh and by nouns in the European Parliament proceedings.

POS             Viewegh    Europarl    Sum of disagreement
Nouns           6          18          24
Verbs           18         30          48
Adjectives      6          6           12
Adverbs         5          4           9
Prepositions    11         16          27
Conjunctions    0          1           1
All POS         46         75          121

       Table 3: Disagreement count

It is noteworthy that although the excerpt from the European Parliament proceedings shows more disagreements in annotation, it nevertheless shows higher inter-annotator agreement (as seen in Tab. 1). This is because more than twice as many metaphors are present in that text as in the other excerpt. This corresponds with the finding of Steen and his colleagues (2010) that of the four registers (academic, news, fiction, and conversation), only conversation had a lower frequency of MRWs than fiction.

Part of the disagreement in verb annotation seems to be caused by the bias of individual annotators rather than by a systematic pattern in the annotation protocol. In the case of the European Parliament proceedings, one of the annotators did not mark several metaphorically used lexical units as MRWs. The reason was that, for some verbs, the annotator overlooked the personifying connection between the verb and its subject if the latter was highly abstract (e.g. luck, possibility, right or
freedom); the annotator realized this omission immediately after the annotation round was finished.
The approach we have chosen for dealing with disagreements in preposition annotation is shown in Section 4.2.

4.2 Prepositions

In English, and presumably in many other languages, prepositions are the most metaphor-rich part of speech, as they are reported to account for 38.5-46.9% of metaphor-related words in VUAMC (Steen et al., 2010). Czech prepositions are more homonymous than prepositions in English, and there was substantial disagreement between the annotators.

Just as Badryzlova and her colleagues (2013) did, we made a list of the basic meanings of the major prepositions. We followed the Czech linguistic tradition in which the meanings of prepositions are distinguished by grammatical case (Veselková, 1986; Štícha et al., 2013). This helped to filter out homonymy and made it possible to choose just one basic meaning.

Take for example these expressions containing the preposition “za”. While it is clear that in sentences 3) and 4) “za” is an MRW, in 1) and 2) the two meanings are clearly distinct but equally concrete and bodily related.
      1) Petr stojí za mnou; Petr stands behind me
      2) Chytil jsem ho za nohu; I caught him by the leg
      3) Za 2 roky to bude hotové; It will be done in 2 years
      4) Vyměnil jsem kolo za auto; I traded the bike for the car
If we distinguish between “za” with the instrumental (sentence 1)) and “za” with the accusative (sentence 2)), each can have its own basic meaning. The “accusative za” of sentence 2) then serves as the basic meaning for the preposition in sentences 3) and 4), both of which are MRWs.

4.3 Reflexive pronouns “se/si” and auxiliary verbs

Reflexive pronouns “se/si” are used either when the subject and object of the sentence are identical, as in 5), or as an integral part of a reflexive verb whose lexical meaning they often determine. The presence of a reflexive pronoun “se/si” can result in a complete shift of meaning, as illustrated in 6).
     5) umyji se; I will wash myself
     6) rozvést / rozvést se; to develop (an idea) / to divorce
As might be expected, the original MIPVU procedure does not account for this phenomenon. Table 4 shows its effect on an actual annotated sentence.

Annotated sentence    Když    se    před    třemi    lety    rozvedl    [...]
Original MIPVU        0       0     1       0        0       1
Modified MIPVU        0       0     1       0        0       0

       Table 4: Annotation of a sentence where reflexive pronoun causes a shift of meaning

The highlighted tokens (“se” and “rozvedl”), when treated as separate lexical units, give the word “rozvedl” the basic meaning “he developed/expanded (something)”, so that the contextual meaning “he got divorced” should be marked as an MRW. On the other hand, the expression “se” + “rozvedl”, when counted as one lexical unit distinct from “rozvedl”, has a basic meaning equal to the contextual one, so it is annotated as non-MRW, which better matches the general sense of the sentence.¹

Similarly, Czech auxiliary verbs such as “bych” are considered integral parts of the full verb’s conjugation forms.
Therefore, for reflexive pronouns “se/si” and auxiliary verbs we applied the same policy that the annotators of VUAMC used for phrasal verbs in English, which means that they count as one lexical unit together with the full verb.

On the other hand, meanings commonly expressed by phrasal verbs in English tend to be expressed in Czech by prefixes, which are already part of the word, as seen in 7).
     7) zesílit; turn up

4.4 Set expressions

In dealing with set expressions, we followed the remarks on MIPVU made recently by the main author of VUAMC (Steen, 2017), namely to treat each word of a set expression as a lexical unit in its own right. This blurs the demarcation line between metaphor and idiom. On the other hand, using dictionaries to determine set expressions, as Badryzlova et al. (2013) did, seemed problematic because, unlike the dictionaries used in the original MIPVU procedure, the dictionaries available for Czech are neither corpus-based nor contemporary.

5. Summary

So far, we have applied MIPVU to Czech texts and tested inter-annotator agreement. Direct transfer of the MIPVU procedure to Czech turned out to be problematic, which we expected, as the same complications were reported by researchers applying the procedure to Russian (Badryzlova et al., 2013) and Lithuanian (Urbonaitė, 2015).

After the error analysis, we proposed several minor modifications of the guidelines to make them more suitable for Czech, and we plan to conduct a second reliability test as soon as possible.

The next step, after MIPVU has been successfully transferred to Czech texts, would be to annotate the data with an additional tag for highly lexicalized metaphors. The tag is meant to work not by asking whether the contextual meaning differs from the basic one, but rather whether there is a literal word in use which can express the given contextual meaning. If there is not, the word is probably a highly conventionalized metaphor.

Nevertheless, several questions regarding this approach remain unanswered, the most important being whether annotators will agree sufficiently on those cases.

6. Acknowledgements

This work was funded by the Ministry of Education, Youth and Sport of the Czech Republic as a part of the project “Počáteční fáze tvorby korpusu metafory v češtině” (Grant number IGA_FF_2018_026).

¹ In the first round of annotation, the interconnection of words is realized simply by always giving the reflexive pronoun (or an auxiliary verb) the same value of metaphoricity as its corresponding verb. This naive method is justifiable because this stage of the project only serves to refine the annotation manual. It is not suitable for actual corpus generation, as it would influence the metaphor usage statistics.
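
The two computational steps mentioned in this paper, the label propagation of footnote 1 and the Fleiss' kappa calculation of Section 3, can be illustrated with a minimal Python sketch. It is not the program used in the reliability test, which is not reproduced here. The function names, the input formats (a list of 0/1 labels per token, and a mapping from the index of a reflexive pronoun or auxiliary to the index of its verb) and the example labels are assumptions made for illustration only.

```python
from collections import Counter


def propagate_verb_labels(labels, dependents):
    """Give each reflexive pronoun or auxiliary the label of its verb.

    labels: list of 0/1 labels, one per token (1 = MRW, 0 = non-MRW).
    dependents: dict {index of pronoun/auxiliary: index of its verb};
    this input format is an assumption of the sketch.
    """
    merged = list(labels)
    for dependent_idx, verb_idx in dependents.items():
        merged[dependent_idx] = labels[verb_idx]
    return merged


def fleiss_kappa(items):
    """Fleiss' kappa for a list of items, each a sequence of category labels.

    Every item must carry the same number of ratings (at least two).
    """
    n_items = len(items)
    n_raters = len(items[0])
    if n_raters < 2 or any(len(item) != n_raters for item in items):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in items:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of annotator pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        agreement_sum += agreeing_pairs / (n_raters * (n_raters - 1))

    observed = agreement_sum / n_items
    total_ratings = n_items * n_raters
    # Agreement expected by chance from the overall category proportions.
    expected = sum((t / total_ratings) ** 2 for t in category_totals.values())
    if expected == 1.0:
        return float("nan")  # only one category was ever used
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels of three annotators for the six tokens of
# "Když se před třemi lety rozvedl"; "se" (index 1) belongs to "rozvedl" (index 5).
annotators = [
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
merged = [propagate_verb_labels(labels, {1: 5}) for labels in annotators]
items = list(zip(*merged))  # one tuple of three labels per token
kappa = fleiss_kappa(items)
print(f"Fleiss' kappa: {kappa:.2f}")
```

In this sketch the propagation is applied to each annotator's labels before agreement is measured, so a pronoun and its verb can never disagree with each other within one annotation. As footnote 1 notes, such a shortcut would distort metaphor usage statistics and is only meant for the manual-refinement stage.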

Bibliographical References

Artstein, R. and Poesio, M. (2008). Inter-coder agreement for computational linguistics. Computational Linguistics, 34(4): 554–596.
Badryzlova, Y., Lyashevskaya, O. (2017). Metaphor Shifts in Constructions: the Russian Metaphor Corpus. In AAAI Spring Symposium Series, pp. 127–130.
Badryzlova, Y., Shekhtman, N., Isaeva, Y., Kerimov, R. (2013). Annotating a Russian corpus of conceptual metaphor: a bottom-up approach. In Proceedings of the First Workshop on Metaphor in NLP. Atlanta, GA: Association for Computational Linguistics, pp. 77–86.
Kroupová, L., et al. (2005). Slovník spisovné češtiny pro školu a veřejnost: s Dodatkem Ministerstva školství, mládeže a tělovýchovy České republiky. Praha: Academia.
Lakoff, G., Johnson, M. (1980). Metaphors We Live By. University of Chicago Press.
Pragglejaz Group (2007). MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol, 22(1): 1–39.
Shutova, E. (2015). Design and Evaluation of Metaphor Processing Systems. Computational Linguistics, 41(4): 579–623.
Steen, G. (2017). Identifying metaphors in language. In Semino, E., Demjén, Z. (Eds.) The Routledge Handbook of Metaphor and Language. London: Routledge, chapt. 5.
Steen, G. J., Dorst, A. G., Herrmann, J. B., Kaal, A. A., Krennmayr, T., Pasma, T. (2010). A method for linguistic metaphor identification: From MIP to MIPVU. Amsterdam: John Benjamins.
Štícha, F., et al. (2013). Akademická gramatika spisovné češtiny. Praha: Academia.
Urbonaitė, J. (2015). Metaphor identification procedure MIPVU: an attempt to apply it to Lithuanian. Taikomoji kalbotyra, (7): 1–26.
Vácha, J., editor, et al. (1971). Slovník spisovného jazyka českého. Praha: Academia.
Veselková, J., et al. (1986). Mluvnice češtiny. 2, Tvarosloví. Praha: Academia.

Language Resource References

Rosen, A., Vavřín, M., Zasina, A. J. (2017): InterCorp, 10.0, Institute of the Czech National Corpus, Charles University, Prague. Available from: http://www.korpus.cz

