<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Morphological Priming in German: The Word is Not Enough (Or Is It?)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sebastian Padó</string-name>
          <email>pado@ims.uni-stuttgart.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Britta D. Zeller</string-name>
          <email>zeller@ims.uni-stuttgart.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Šnajder</string-name>
          <email>jan.snajder@fer.hr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Copyright c by the paper's authors. Copying permitted for private and academic purposes. In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference</institution>
          ,
          <addr-line>Pisa</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Stuttgart University, Institut für maschinelle Sprachverarbeitung, Pfaffenwaldring 5b</institution>
          ,
          <addr-line>70569 Stuttgart</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Zagreb, Faculty of Electrical Engineering and Computing, Unska 3</institution>
          ,
          <addr-line>10000 Zagreb</addr-line>
          ,
          <country country="HR">Croatia</country>
        </aff>
      </contrib-group>
      <fpage>42</fpage>
      <lpage>45</lpage>
      <abstract>
        <p>Studies across multiple languages show that overt morphological priming leads to a speed-up only for transparent derivations but not for opaque derivations. However, in a recent experiment for German, Smolka et al. (2014) show comparable speed-ups for transparent and opaque derivations, and conclude that German behaves unlike other Indo-European languages and organizes its mental lexicon by morphemes rather than lemmas. In this paper we present a computational analysis of the German results. A distributional similarity model, extended with knowledge about morphological families and without any notion of morphemes, is able to account for all main findings of Smolka et al. We believe that this puts into question the call for German-specific mechanisms. Instead, our model suggests that cross-lingual differences between morphological systems underlie the experimentally observed differences.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Priming is a general property of human language
processing: it refers to the speed-up effect that
a stimulus can have on subsequent processing
        <xref ref-type="bibr" rid="ref10">(Meyer and Schvaneveldt, 1971)</xref>
        . This effect is
assumed to result from an activation (in a broad
sense) of mental representations, and priming is
a popular method to investigate properties of the
mental lexicon. The original study by Meyer and
Schvaneveldt established lexical priming (nurse →
doctor), but priming effects have also been
identified on other linguistic levels, such as syntactic
priming
        <xref ref-type="bibr" rid="ref1">(Bock, 1986)</xref>
        and morphological priming
        <xref ref-type="bibr" rid="ref6">(Kempley and Morton, 1982)</xref>
        .
      </p>
      <p>A recent study by Smolka et al. (2014) investigated overt morphological priming on prefix verbs in German, where the base verb and the derived verb can be semantically related (transparent derivation: schließen – abschließen (close – lock)) or not (opaque derivation: führen – verführen (lead – seduce)). Experiment 1, an overt visual priming experiment (300 ms SOA), involved 40 six-tuples that paired up a base verb with five prefix verbs of five prime types (see Figure 1). The verbs were carefully normed, e.g., for association, to exclude confounding factors. The authors reported three main findings: (a) no priming for Form and Unrelated; (b) no priming for Synonymy; (c) significant priming of the same strength for both Transparent and Opaque Derivation.</p>
      <p>
        These findings suggest that morphological priming of German prefix verbs uses a mechanism that is different from lexical priming, which assumes that the strength of semantic relatedness is the main determinant of priming – i.e., lexical priming would predict finding (a), but neither (b) nor (c). The findings by Smolka et al. are also at odds with the overt priming patterns found in similar experimental setups for other languages such as French
        <xref ref-type="bibr" rid="ref9">(Meunier and Longtin, 2007)</xref>
        and Dutch
        <xref ref-type="bibr" rid="ref13">(Schriefers
et al., 1991)</xref>
        , where the patterns were indeed found to be consistent with lexical priming. Smolka et
al. (2014) interpret this divergence as evidence for
a German Sonderweg: the typological properties
of German (separable prefixes, morphological
richness, many opaque derivations) are taken to suggest
a morpheme-based organization of the mental
lexicon more similar to Semitic languages like Hebrew
or Arabic than to other Indo-European languages.
      </p>
      <p>Our paper investigates this claim on the computational level. We present a simple model of corpus-based word similarity, extended with a database of morphological families, that is able to predict the three main findings by Smolka et al. outlined above. The ability of the model to do so, even though it operates completely at the word level without any notion of morphemes, may put into question Smolka et al.’s call for novel morpheme-level mechanisms for German.</p>
      <p>Figure 1: Example six-tuple for the target binden (bind): 1 Transparent Derivation – zubinden (tie); 2 Opaque Derivation – entbinden (give birth); 3 Synonym – zuschnüren (tie); 4 Form – abbilden (depict); 5 Unrelated – abholzen (log).</p>
    </sec>
    <sec id="sec-2">
      <title>Modeling Priming</title>
      <p>We model the priming effects shown in Smolka et
al. by combining two computational information
sources: a distributional semantic model and a derivational lexicon.</p>
      <p>
        Distributional Semantics and Priming.
Distributional semantics builds on the distributional
hypothesis
        <xref ref-type="bibr" rid="ref5">(Harris, 1968)</xref>
        , according to which the
similarity of lemmas correlates with the
similarity of their linguistic contexts. The meaning of a
lemma is typically represented as a vector of its
contexts in large text collections
        <xref ref-type="bibr" rid="ref15 ref3">(Turney and Pantel,
2010; Erk, 2012)</xref>
        , and semantic similarity is
operationalized by using a vector similarity measure
such as cosine similarity. Traditional models
construct vectors directly from context co-occurrences,
while more recent models learn distributed
representations with neural networks
        <xref ref-type="bibr" rid="ref11">(Mikolov et al.,
2013)</xref>
        , which can be seen as advanced forms of
dimensionality reduction.
      </p>
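      <p>As a concrete illustration of this operationalization, the following minimal Python sketch computes the cosine similarity between two lemma vectors; the vectors shown are hypothetical toy examples, not values from our model.</p>
      <preformat>
import numpy as np

def cosine(u, v):
    """Cosine similarity between two distributional vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy vectors standing in for corpus-derived representations.
vectors = {
    "schliessen":   np.array([0.9, 0.1, 0.3]),   # close
    "abschliessen": np.array([0.8, 0.2, 0.4]),   # lock
}
print(cosine(vectors["schliessen"], vectors["abschliessen"]))
      </preformat>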
      <p>
        A classical test case for distributional models is lexical priming itself, which has been modeled successfully in a number of studies
        <xref ref-type="bibr" rid="ref7 ref8">(McDonald and
Lowe, 1998; Lowe and McDonald, 2000)</xref>
        . The assumption of this model family, which we call DISTSIM, is that the cosine similarity between a prime vector $\vec{p}$ and a target vector $\vec{t}$ is a direct predictor of lexical priming:
      </p>
      <p>
        \[ \mathrm{priming}_{\mathrm{DISTSIM}}(p, t) \propto \cos(\vec{p}, \vec{t}) \]
        Regarding morphological priming, this model predicts the result patterns for French and Dutch, but it should not be able to explain the German results.
      </p>
      <p>
        Derivational Morphology in a Distributional Model. In Padó et al. (2013), we proposed to extend distributional models with morphological knowledge in the form of derivational families D, that is, sets of lemmas that are derivationally (either transparently or opaquely) related
        <xref ref-type="bibr" rid="ref2">(Daille et al., 2002)</xref>
        , such as: knien_V (to kneel_V), beknien_V (to beg_V), Kniende_N (kneeling person_N), kniend_A (kneeling_A), Knie_N (knee_N). While our motivation was primarily computational (we aimed at improving similarity estimates for infrequent words by taking advantage of the shared meaning within derivational families), these families can be reinterpreted in the current context as driving morphological generalization in priming. More specifically, consider the following model family, which we call MORGEN and which is an asymmetrical version of the “Average Similarity” model from Padó et al. (2013):
        \[ \mathrm{priming}_{\mathrm{MORGEN}}(p, t) \propto \frac{1}{|D(p)|} \sum_{p' \in D(p)} \cos(\vec{p'}, \vec{t}) \]
        This model predicts priming as the average similarity between the target t and all lemmas p' within the derivational family D(p) of the prime p. It operationalizes the intuition that the prime “activates” its complete derivational family, no matter whether transparently or opaquely related. Each of the family members then contributes to the priming effect just as in standard lexical priming.
      </p>
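      <p>To make the two predictors concrete, the sketch below implements them in Python; the vector dictionary and the family lookup are placeholders for the SdeWaC vectors and DERIVBASE families described in the next section, not the original implementation.</p>
      <preformat>
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def priming_distsim(prime, target, vectors):
    """DISTSIM: priming strength is the cosine of the prime and target vectors."""
    return cosine(vectors[prime], vectors[target])

def priming_morgen(prime, target, vectors, families):
    """MORGEN: average cosine between the target and every member of the
    prime's derivational family; reduces to DISTSIM for singleton families."""
    family = families.get(prime, {prime})
    sims = [cosine(vectors[m], vectors[target]) for m in family if m in vectors]
    return sum(sims) / len(sims) if sims else 0.0
      </preformat>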
      <p>The MORGEN model should have a better
chance of modeling Smolka et al.’s results than
the DISTSIM model. Note, however, that it
remains completely at the word level, with
derivational families as its only source of morphological
knowledge.
</p>
    </sec>
    <sec id="sec-3">
      <title>Experiment</title>
      <p>
        Setup. We compute a DISTSIM model by
running word2vec
        <xref ref-type="bibr" rid="ref11">(Mikolov et al., 2013)</xref>
        , a system
to extract distributional vectors from text, with its
default parameters, on the lemmatized 800M-token
German web corpus SdeWaC
        <xref ref-type="bibr" rid="ref12 ref16 ref4">(Faaß and Eckart,
2013)</xref>
        . To build MORGEN, we use the
derivational families from DERIVBASE v1.4, a
semi-automatically induced large-coverage German
lexicon of derivational families
        <xref ref-type="bibr" rid="ref16">(Zeller et al., 2013)</xref>
        . DERIVBASE defines derivational families through a set of about 270 surface form transformation rules; MORGEN does not use information about the rules, only family membership. Nevertheless, it is a question for future research to assess the potential criticism that the rule-based induction method implicitly introduces morpheme-level information into the families.
      </p>
      <p>
        Table 1: Reaction times, model predictions (cosine similarities), and significance of differences for the five prime types: 1 Transparent Derivation, 2 Opaque Derivation, 3 Synonym, 4 Form, 5 Unrelated.
      </p>
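      <p>As a rough illustration of this setup, the sketch below uses gensim’s Word2Vec implementation as a stand-in for the original word2vec tool and assumes a hypothetical whitespace-separated dump of DERIVBASE families; file names, formats, and parameters are illustrative, not the exact original configuration.</p>
      <preformat>
from gensim.models import Word2Vec

# DISTSIM vectors: train on the lemmatized corpus (one sentence per line,
# lemmas separated by whitespace). File names and parameters are illustrative.
sentences = [line.split() for line in open("sdewac.lemmatized.txt", encoding="utf-8")]
w2v = Word2Vec(sentences, vector_size=100, window=5, min_count=5, workers=4)
vectors = {lemma: w2v.wv[lemma] for lemma in w2v.wv.index_to_key}

# MORGEN families: assume a hypothetical dump of DERIVBASE with one
# whitespace-separated family per line; the derivation rules are ignored.
families = {}
for line in open("derivbase_families.txt", encoding="utf-8"):
    members = set(line.split())
    for lemma in members:
        families[lemma] = members
      </preformat>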
      <p>Following Smolka et al., we analyze the
predictions with a series of one-way ANOVAs (factor
Prime Type with reference level Unrelated). As
appropriate for multiple comparisons, we adopt a
more conservative significance level (p=0.01).
Results. Table 1 reports the experimental results
and model predictions (average experimental
reaction times, cosine model predictions, and
significance of differences). Model contrasts that match
experiment contrasts are marked in bold.</p>
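      <p>A minimal sketch of this analysis, assuming the per-item model predictions are grouped by prime type; scipy’s one-way ANOVA is used here for each contrast against the Unrelated baseline.</p>
      <preformat>
from scipy.stats import f_oneway

ALPHA = 0.01  # conservative significance level for multiple comparisons

def contrasts_against_unrelated(predictions):
    """predictions: dict mapping prime type -> list of per-item priming scores."""
    baseline = predictions["Unrelated"]
    for prime_type, scores in predictions.items():
        if prime_type == "Unrelated":
            continue
        f_stat, p_value = f_oneway(scores, baseline)
        significant = p_value &lt; ALPHA
        print(f"{prime_type}: F = {f_stat:.2f}, p = {p_value:.4f}, significant: {significant}")
      </preformat>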
      <p>As expected, DISTSIM predicts the patterns of
classical lexical priming: we observe significant
priming effects for Transparent Derivation and
Synonymy, and no priming for Opaque Derivation.
This is contrary to Smolka et al.’s experimental
results.</p>
      <p>Our instance of the MORGEN model does a much better job: it predicts highly significant priming effects for both Transparent and Opaque Derivations (p&lt;0.001), while priming is not significant at p&lt;0.01 for Synonyms (p=0.04). These predictions
correspond very well to Smolka et al.’s findings (cf.
Table 1). We tested for two additional contrasts
analyzed by Smolka et al.: the difference in priming
strength between Transparent and Opaque
Derivation (not significant in either experiment or model)
and the difference between Transparent Derivation
and Synonym (highly significant in both
experiment and model).
</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>In sum, we find a very good match between MORGEN and the experimental results, while the DISTSIM model cannot account for the experimental evidence. Recall that the main difference between the two models is that MORGEN includes all members of the prime’s derivational family in the prediction of the priming strength. This leads to the following changes compared to DISTSIM:
1. For Opaque Derivation, MORGEN typically
predicts stronger priming than DISTSIM,
since prime and target are typically members
of the same derivational family (assuming that
there are no coverage gaps in DERIVBASE),
and the average similarity between the target
and the words in the family is higher than the
similarity to the prime itself. Taking Figure 1
as an example, the Opaque Derivation pair
entbinden (give birth) – binden (bind) is
relatively dissimilar, and the similarity increases
when other pairs like binden (bind) – zubinden
(tie) are taken into consideration.
2. For Synonymy, MORGEN typically predicts
weaker priming than DISTSIM, since the
average similarity between target and all
members of the prime’s family tends to be lower
than the similarity between target and original
prime. Again considering Figure 1, the
Synonym pair binden (bind) – zuschnüren (tie) is relatively similar, while including terms derivationally related to the prime zuschnüren (tie), like schnurlos (cordless), introduces low-similarity pairs like schnurlos (cordless) – binden (bind).</p>
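      <p>The direction of these two effects can be illustrated with purely hypothetical cosine values for the Figure 1 items; the numbers below are illustrative, not values produced by our model.</p>
      <preformat>
# Hypothetical cosines between the target binden (bind) and single lemmas,
# chosen only to show the direction of the two effects, not measured values.
opaque_prime   = {"entbinden": 0.15}                     # opaque prime alone: weak
opaque_family  = {"entbinden": 0.15, "zubinden": 0.65, "Binder": 0.40}
synonym_prime  = {"zuschnueren": 0.60}                   # synonym prime alone: strong
synonym_family = {"zuschnueren": 0.60, "schnurlos": 0.05, "Schnur": 0.20}

avg = lambda d: sum(d.values()) / len(d)
print(avg(opaque_prime), "->", avg(opaque_family))    # 0.15 -> 0.40: MORGEN strengthens opaque priming
print(avg(synonym_prime), "->", avg(synonym_family))  # 0.60 -> ~0.28: MORGEN weakens synonym priming
      </preformat>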
      <p>MORGEN is not the only model that takes a
distributional stance towards morphological derivation.
Marelli and Baroni (2014) propose a compositional
model that computes separate distributional
representations for the meanings of stems and affixes
and is able to compute representations for novel,
unseen derived terms. The morpheme-level approach
of Marelli and Baroni’s model corresponds more
directly to Smolka et al.’s claims and might also be
able to account for the experimental patterns.</p>
      <p>However, our considerably simpler model, which only has knowledge about derivational families, is also able to do so. This at the very least
means that morpheme-level processing is not an
indispensable property of any model that explains
Smolka et al.’s experimental results and that the
evidence for a special organization of the German
mental lexicon, in contrast to other languages, must
be examined more carefully.</p>
      <p>In fact, our model suggests a possible alternative explanation for the cross-lingual differences: since the MORGEN predictions are
directly influenced by the size and members of
the derivational families, German opaque
morphological priming may simply result from the high
frequency of opaque derivations. In the future, we
plan to apply the model to Dutch and French to
check this alternative explanation.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We gratefully acknowledge funding by Deutsche
Forschungsgemeinschaft through
Sonderforschungsbereich 732, project B9.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>J. Kathryn</given-names>
            <surname>Bock</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>Syntactic persistence in language production</article-title>
          .
          <source>Cognitive Psychology</source>
          ,
          <volume>18</volume>
          :
          <fpage>355</fpage>
          -
          <lpage>387</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Béatrice</given-names>
            <surname>Daille</surname>
          </string-name>
          , Cécile Fabre, and Pascale Sébillot
          .
          <year>2002</year>
          .
          <article-title>Applications of computational morphology</article-title>
          . In Paul Boucher, editor,
          <source>Many morphologies</source>
          , pages
          <fpage>210</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Katrin</given-names>
            <surname>Erk</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Vector space models of word meaning and phrase meaning: A survey</article-title>
          .
          <source>Language and Linguistics Compass</source>
          ,
          <volume>6</volume>
          (
          <issue>10</issue>
          ):
          <fpage>635</fpage>
          -
          <lpage>653</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Gertrud</given-names>
            <surname>Faaß</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kerstin</given-names>
            <surname>Eckart</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>SdeWaC - a corpus of parsable sentences from the web</article-title>
          . In Iryna Gurevych, Chris Biemann, and Torsten Zesch, editors,
          <source>Language Processing and Knowledge in the Web</source>
          , volume
          <volume>8105</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>61</fpage>
          -
          <lpage>68</lpage>
          . Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Zellig</given-names>
            <surname>Harris</surname>
          </string-name>
          .
          <year>1968</year>
          . Mathematical Structures of Language. Wiley.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Steve T.</given-names>
            <surname>Kempley</surname>
          </string-name>
          and John Morton.
          <year>1982</year>
          .
          <article-title>The effects of priming with regularly and irregularly related words in auditory word recognition</article-title>
          .
          <source>British Journal of Psychology</source>
          , pages
          <fpage>441</fpage>
          -
          <lpage>445</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Will</given-names>
            <surname>Lowe</surname>
          </string-name>
          and
          <string-name>
            <given-names>Scott</given-names>
            <surname>McDonald</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>The direct route: Mediated priming in semantic space</article-title>
          .
          <source>In Proceedings of the 22nd Annual Conference of the Cognitive Science Society</source>
          , pages
          <fpage>675</fpage>
          -
          <lpage>680</lpage>
          , Philadelphia, PA.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Marco</given-names>
            <surname>Marelli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marco</given-names>
            <surname>Baroni</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Dissecting semantic transparency effects in derived word processing: A new perspective from distributional semantics</article-title>
          .
          <source>In 9th International Conference on the Mental Scott McDonald and Will Lowe</source>
          .
          <year>1998</year>
          .
          <article-title>Modelling functional priming and the associative boost</article-title>
          .
          <source>In Proceedings of the 20th Annual Conference of the Cognitive Science Society</source>
          , pages
          <fpage>675</fpage>
          -
          <lpage>680</lpage>
          , Madison, WI.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Fanny</given-names>
            <surname>Meunier</surname>
          </string-name>
          and
          <string-name>
            <surname>Catherine-Marie Longtin</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Morphological decomposition and semantic integration in word processing</article-title>
          .
          <source>Journal of Memory and Language</source>
          ,
          <volume>56</volume>
          :
          <fpage>457</fpage>
          -
          <lpage>471</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>David E.</given-names>
            <surname>Meyer</surname>
          </string-name>
          and Roger W. Schvaneveldt.
          <year>1971</year>
          .
          <article-title>Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations</article-title>
          .
          <source>Journal of Experimental Psychology</source>
          ,
          <volume>90</volume>
          (
          <issue>2</issue>
          ):
          <fpage>227</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , Ilya Sutskever, Kai Chen, Greg S. Corrado, and
          <string-name>
            <given-names>Jeff</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , pages
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Padó</surname>
          </string-name>
          , Jan Šnajder, and Britta Zeller
          .
          <year>2013</year>
          .
          <article-title>Derivational smoothing for syntactic distributional semantics</article-title>
          .
          <source>In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>731</fpage>
          -
          <lpage>735</lpage>
          , Sofia, Bulgaria.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Herbert</given-names>
            <surname>Schriefers</surname>
          </string-name>
          , Pienie Zwitserlood, and
          <string-name>
            <given-names>Ardi</given-names>
            <surname>Roelofs</surname>
          </string-name>
          .
          <year>1991</year>
          .
          <article-title>The identification of morphologically complex spoken words: Continuous processing or decomposition</article-title>
          ?
          <source>Journal of Memory and Language</source>
          ,
          <volume>30</volume>
          :
          <fpage>26</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Eva</given-names>
            <surname>Smolka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Katrin H.</given-names>
            <surname>Preller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Eulitz</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>'verstehen' ('understand') primes 'stehen' ('stand'): Morphological structure overrides semantic compositionality in the lexical representation of German complex verbs</article-title>
          .
          <source>Journal of Memory and Language</source>
          ,
          <volume>72</volume>
          :
          <fpage>16</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Peter D.</given-names>
            <surname>Turney</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Pantel</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>From frequency to meaning: Vector space models of semantics</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          ,
          <volume>37</volume>
          (
          <issue>1</issue>
          ):
          <fpage>141</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Britta</given-names>
            <surname>Zeller</surname>
          </string-name>
          , Jan Šnajder, and Sebastian Padó.
          <year>2013</year>
          .
          <article-title>DErivBase: Inducing and evaluating a derivational morphology resource for German</article-title>
          .
          <source>In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>1201</fpage>
          -
          <lpage>1211</lpage>
          , Sofia, Bulgaria.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>