<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>OFAI–UKP at HAHA@IberLEF2019: Predicting the Humorousness of Tweets Using Gaussian Process Preference Learning</article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author">
          <string-name>Tristan Miller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Erik-Lân Do Dinh</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edwin Simpson</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Austrian Research Institute for Artificial Intelligence (OFAI)</institution>
          ,
          <addr-line>Freyung 6, 1010 Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
<institution>Ubiquitous Knowledge Processing Lab (UKP-TUDA), Department of Computer Science, Technische Universität Darmstadt</institution>
          ,
          <addr-line>Hochschulstraße 10, 64289 Darmstadt</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>180</fpage>
      <lpage>190</lpage>
      <abstract>
        <p>Most humour processing systems to date make at best discrete, coarse-grained distinctions between the comical and the conventional, yet such notions are better conceptualized as a broad spectrum. In this paper, we present a probabilistic approach, a variant of Gaussian process preference learning (GPPL), that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations. We apply our system, which had previously shown good performance on English-language one-liners annotated with pairwise humorousness annotations, to the Spanish-language data set of the HAHA@IberLEF2019 evaluation campaign. We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF2019 data and the pairwise judgment annotations required for our method.</p>
      </abstract>
      <kwd-group>
        <kwd>Computational humour</kwd>
        <kwd>Humour</kwd>
        <kwd>Gaussian process preference learning</kwd>
        <kwd>GPPL</kwd>
<kwd>Best–worst scaling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        Humour is an essential part of everyday communication, particularly in social
media [
        <xref ref-type="bibr" rid="ref21 ref33">21,33</xref>
        ], yet it remains a challenge for computational methods. Unlike
conventional language, humour requires complex linguistic and background knowledge
to understand, which are difficult to integrate with NLP methods [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        An important step in the automatic processing of humour is to recognize its
presence in a piece of text. However, humour may be present, or perceived by
its human audience, to varying degrees [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This level of appreciation (i.e.,
humorousness or equivalently funniness) can vary according to the text's content
and structural features, such as nonsense or disparagement [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] or, in the case of
puns, contextual coherence [
        <xref ref-type="bibr" rid="ref25">25</xref>
] and the cognitive effort required to recover the
target word [18, pp. 123–124].
      </p>
      <p>
While previous work has considered mainly binary classification approaches
to humorousness, the HAHA@IberLEF2019 shared task [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] also focuses on its
gradation. This latter task is important for downstream applications such as
conversational agents or machine translation, which must choose the correct
tone in response to humour, or find appropriate jokes and wordplay in a target
language. The degree of creativeness may also inform an application whether the
semantics of a joke can be inferred from similar examples.
      </p>
<p>This paper describes the OFAI–UKP system that participated in both subtasks
of the HAHA@IberLEF2019 evaluation campaign: binary classification of tweets
as humorous or not humorous, and the quantification of humour in those tweets.
Our system employs a Bayesian approach, namely a variant of Gaussian process
preference learning (GPPL) that infers humorousness scores or rankings on the
basis of manually annotated pairwise preference judgments and automatically
annotated linguistic features. In the following sections, we describe and discuss
the background and methodology of our system, our means of adapting the
HAHA@IberLEF2019 data to work with our system, and the results of our
system evaluation on this data.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>
        Pairwise comparisons can be used to infer rankings or ratings by assuming a
random utility model [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ], meaning that the annotator chooses an instance with
probability p, where p is a function of the utility of the instance. Therefore,
when instances in a pair have similar utilities, the annotator selects one with
a probability close to 0.5, while for instances with very different utilities, the
instance with higher utility will be chosen consistently. The random utility model
forms the core of two popular preference learning models, the Bradley–Terry
model [
        <xref ref-type="bibr" rid="ref26 ref31 ref6">6,26,31</xref>
], and the Thurstone–Mosteller model [
        <xref ref-type="bibr" rid="ref28 ref37">37,28</xref>
        ]. Given this model
and a set of pairwise annotations, probabilistic inference can be used to retrieve
the latent utilities of the instances.
      </p>
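<p>The choice probabilities implied by the random utility model can be sketched as follows. This is an illustrative example of ours, not code from our system: the logistic and probit link functions below correspond to the Bradley–Terry and Thurstone–Mosteller models, respectively.</p>
        <preformat>
```python
import math

def bradley_terry_p(u_a, u_b):
    # Probability that the annotator prefers instance a over instance b:
    # a logistic function of the utility difference (Bradley-Terry).
    return 1.0 / (1.0 + math.exp(u_b - u_a))

def thurstone_mosteller_p(u_a, u_b):
    # The same idea with a probit link: the standard normal CDF of the
    # utility difference (Thurstone-Mosteller).
    return 0.5 * (1.0 + math.erf((u_a - u_b) / math.sqrt(2.0)))
```
        </preformat>
<p>With equal utilities both functions return 0.5; as the utility gap grows, the higher-utility instance is chosen almost deterministically.</p>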
      <p>
Besides pairwise comparisons, a random utility model is also employed by
MaxDiff [
        <xref ref-type="bibr" rid="ref27">27</xref>
], a model for best–worst scaling (BWS), in which the annotator
chooses the best and worst instances from a set. While the term "best–worst
scaling" originally applied to the data collection technique [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], it now also refers
to models such as MaxDiff that describe how annotators make discrete choices.
Empirical work on BWS has shown that MaxDiff scores (instance utilities) can
be inferred using either maximum likelihood or a simple counting procedure that
produces linearly scaled approximations of the maximum likelihood scores [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
The counting procedure defines the score for an instance as the fraction of times
the instance was chosen as best, minus the fraction of times the instance was
chosen as worst, out of all comparisons including that instance [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. From this
point on, we refer to the counting procedure as BWS, and apply it to the tasks
of inferring scores from both best–worst scaling annotations for metaphor novelty
and pairwise annotations for funniness.
      </p>
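<p>The counting procedure described above can be sketched as follows; this is a minimal illustration of ours, and the function name and input format are our own assumptions:</p>
        <preformat>
```python
from collections import Counter

def bws_scores(annotations):
    # Counting-procedure BWS: an instance's score is the fraction of its
    # appearances in which it was chosen as best, minus the fraction in
    # which it was chosen as worst. Each annotation is a tuple
    # (chosen_best, chosen_worst, shown), where `shown` lists the
    # instances presented to the annotator.
    best, worst, seen = Counter(), Counter(), Counter()
    for chosen_best, chosen_worst, shown in annotations:
        best[chosen_best] += 1
        worst[chosen_worst] += 1
        for item in shown:
            seen[item] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}
```
        </preformat>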
      <p>
        Gaussian process preference learning (GPPL) [
        <xref ref-type="bibr" rid="ref11">11</xref>
], a Thurstone–Mosteller-
based model that accounts for the features of the instances when inferring their
scores, can make predictions for unlabelled instances and copes better with sparse
pairwise labels. GPPL uses Bayesian inference, which has been shown to cope
better with sparse and noisy data [
        <xref ref-type="bibr" rid="ref24 ref38 ref39 ref4">39,38,4,24</xref>
        ], including disagreements between
multiple annotators [
        <xref ref-type="bibr" rid="ref12 ref14 ref22 ref36">12,36,14,22</xref>
        ]. Through the random utility model, GPPL is
able to handle disagreements between annotators as noise, since no label has a
probability of one of being selected.
      </p>
      <p>
        Given a set of pairwise labels, and the features of labelled instances, GPPL
can estimate the posterior distribution over the utilities of any instances given
their features. Relationships between instances are modelled by a Gaussian
process, which computes the covariance between instance utilities as a function
of their features [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. Since typical methods for posterior inference [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] are not
scalable (O(n³), where n is the number of instances), some of the present authors
introduced a scalable method for GPPL that permits arbitrarily large numbers
of instances and pairs [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. This method uses stochastic variational inference [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ],
which limits computational complexity by substituting the instances for a fixed
number of inducing points during inference.
      </p>
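<p>As an illustration of how a Gaussian process ties instance utilities together through their features, consider a squared-exponential covariance function, one common choice (the function name and hyperparameter values here are our own, arbitrary assumptions):</p>
        <preformat>
```python
import math

def rbf_cov(x, y, lengthscale=1.0, variance=1.0):
    # Covariance between the utilities of two instances as a function of
    # their feature vectors: close to the prior variance when the
    # features are similar, decaying towards zero as they grow apart.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)
```
        </preformat>
<p>Instances with identical features receive the full prior variance as covariance, so their inferred utilities are strongly tied; dissimilar instances are left nearly independent.</p>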
      <p>
        Our GPPL method has already been applied with good results to ranking
arguments by convincingness (which, like funniness, is an abstract linguistic
property that is hard to quantify directly) and to ranking English-language
one-liners by humorousness [
        <xref ref-type="bibr" rid="ref34 ref35">35,34</xref>
        ]. In these two tasks, GPPL was found to
outperform SVM and BiLSTM regression models that were trained directly on
gold-standard scores, and to outperform BWS when given sparse training data,
respectively. We therefore elect to use GPPL on the Spanish-language Twitter
data of the HAHA@IberLEF2019 shared task.
      </p>
<p>In the interests of replicability, we will be freely releasing the code for running
our GPPL system, including the code for the data conversion and subsampling
process detailed in §3.2, at https://github.com/UKPLab/haha2019-GPPL.</p>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <sec id="sec-3-1">
        <title>Tasks</title>
<p>The HAHA@IberLEF2019 evaluation campaign consists of two tasks. Task 1 is
humour detection, where the goal is to predict whether or not a given tweet is
humorous, as determined by a gold standard of binary, human-sourced annotations.
Systems are scored on the basis of accuracy, precision, recall, and F-measure.
Task 2 is humorousness prediction, where the aim is to assign each funny tweet a
score approximating the average funniness rating, on a five-point scale, assigned
by a set of human annotators. Here system performance is measured by
root-mean-squared error (RMSE). For both tasks, the campaign organizers provide a
collection of 24 000 manually annotated training examples. The test data consists
of a further 6000 tweets whose gold-standard annotations were withheld from
the participants.</p>
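<p>The two scoring functions can be sketched as follows; this is a minimal illustration of ours of the campaign's metrics, not the organizers' evaluation code:</p>
        <preformat>
```python
import math

def f_measure(tp, fp, fn):
    # Task 1 scoring: F-measure is the harmonic mean of precision
    # (tp / (tp + fp)) and recall (tp / (tp + fn)).
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def rmse(predicted, gold):
    # Task 2 scoring: root-mean-squared error between predicted and
    # gold-standard funniness scores.
    return math.sqrt(
        sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold)
    )
```
        </preformat>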
      </sec>
      <sec id="sec-3-2">
        <title>Data Preparation</title>
        <p>
          For each of the 24 000 tweets in the HAHA@IberLEF2019 training data, the
task organizers asked human annotators to indicate whether the tweet was
humorous, and if so, how funny they found it on a scale from 1 ("not funny") to
5 ("excellent"). (This is essentially the same annotation scheme used for the first
version of the corpus [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] which was used in the previous iteration of HAHA [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].)
As originally distributed, the training data gives the text of each tweet along
with the number of annotators who rated it as "not humour", "1", "2", "3", "4",
and "5". For the purposes of Task 1, tweets in the positive class received at least
three numerical annotations and at least five annotations in total; tweets in the
negative class received at least three "not humour" annotations, though possibly
fewer than five annotations in total. Only those tweets in the positive class are
used in Task 2.
        </p>
        <p>This ordinal data cannot be used as-is with our GPPL system, which requires
as input a set of preference judgments between pairs of instances. To work around
this, we converted the data into a set of ordered pairs of tweets such that the
first tweet has a lower average funniness score than the second. (We consider
instances in the negative class to have an average funniness score of 0.) While
an exhaustive set of pairings would contain 575 976 000 pairs (minus the pairs in
which both tweets have the same score), we produced only 10 730 229 pairs, which
was the minimal set necessary to accurately order the tweets. For example, if
the original data set contained three tweets A, B, and C with average funniness
scores 5.0, 3.0, and 1.0, respectively, then our data would contain the pairs (C, B)
and (B, A) but not (C, A). To save memory and computation time in the training
phase, we produced a random subsample such that the number of pairs where
a given tweet appeared as the funnier one was capped at 500. This resulted in
a total of 485 712 pairs. In a second configuration, we subsampled up to 2500
pairs per tweet. We used a random 60% of this set to meet memory limitations,
resulting in 686 098 pairs.</p>
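<p>The conversion and subsampling described above can be sketched as follows. The function name, input format, and use of a seeded shuffle are our own assumptions; the key point is that only adjacent score levels are paired, since orderings across non-adjacent levels follow by transitivity:</p>
          <preformat>
```python
import random

def score_pairs(scores, cap=500, seed=0):
    # `scores` maps tweet IDs to average funniness (0.0 for the
    # negative class). Group tweets by score and sort the levels.
    by_score = {}
    for tweet, score in scores.items():
        by_score.setdefault(score, []).append(tweet)
    levels = sorted(by_score)

    # Pair each tweet only with tweets on the adjacent score level:
    # (C, B) and (B, A) suffice to order A over B over C.
    pairs = []
    for lower, higher in zip(levels, levels[1:]):
        for lo in by_score[lower]:
            for hi in by_score[higher]:
                pairs.append((lo, hi))

    # Subsample so that each tweet appears as the funnier member of a
    # pair at most `cap` times.
    random.seed(seed)
    random.shuffle(pairs)
    funnier_count = {}
    kept = []
    for lo, hi in pairs:
        if funnier_count.get(hi, 0) >= cap:
            continue
        funnier_count[hi] = funnier_count.get(hi, 0) + 1
        kept.append((lo, hi))
    return kept
```
          </preformat>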
<p>With regard to the tweets' textual data, we do only basic tokenization as
preprocessing. For lookup purposes (synset lookup; see §3.3), we also lemmatize
the tweets.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Experimental Setup</title>
        <p>
          As we adapt an existing system that works on English data [
          <xref ref-type="bibr" rid="ref34">34</xref>
], we generally
reuse the features employed there, but use Spanish resources instead. Each tweet
is represented by the vector resulting from a concatenation of the following:
– The average of the word embedding vectors of the tweet's tokens, for which
we use 200-dimensional pretrained Spanish Twitter embeddings [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
– The average frequency of the tweet's tokens, as determined by a Wikipedia
dump (https://dumps.wikimedia.org/eswiki/20190420/eswiki-20190420-pages-articles.xml.bz2;
last accessed on 2019-06-15).
– The average word polysemy, i.e., the number of synsets per lemma of the
tweet's tokens, as given by the Multilingual Central Repository (MCR 3.0,
release 2016) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>
          Using the test data from the HAHA@IberLEF2018 task [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] as a development
set, we further identified the following features from the UO UPV system [
          <xref ref-type="bibr" rid="ref30">30</xref>
] as
helpful:
– The heuristically estimated turn count (i.e., the number of tokens beginning
with - or --) and binary dialogue heuristic (i.e., whether the turn count is
greater than 2).
– The number of hashtags (i.e., tokens beginning with #).
– The number of URLs (i.e., tokens beginning with www or http).
– The number of emoticons (per the Western list at
https://en.wikipedia.org/wiki/List_of_emoticons#Western; last accessed on 2019-06-15).
– The character and token count, as well as mean token length.
– The counts of exclamation marks and other punctuation (.,;?).
        </p>
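<p>The feature vector described above can be sketched in pure Python as follows. This is an illustration of ours with a stand-in dictionary for the pretrained embeddings; the word-frequency, polysemy, and emoticon features are omitted for brevity:</p>
        <preformat>
```python
def tweet_features(tokens, embeddings, dim=200):
    # Averaged token embeddings (a zero vector if no token is covered).
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if vecs:
        avg_emb = [sum(col) / len(vecs) for col in zip(*vecs)]
    else:
        avg_emb = [0.0] * dim

    # Surface features in the spirit of the UO UPV system.
    turns = sum(1 for t in tokens if t.startswith("-"))
    surface = [
        float(turns),                                # heuristic turn count
        1.0 if turns > 2 else 0.0,                   # dialogue heuristic
        float(sum(1 for t in tokens if t.startswith("#"))),  # hashtags
        float(sum(1 for t in tokens
                  if t.startswith("www") or t.startswith("http"))),  # URLs
        float(sum(len(t) for t in tokens)),          # character count
        float(len(tokens)),                          # token count
        sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0,
        float(sum(t.count("!") for t in tokens)),    # exclamation marks
        float(sum(t.count(c) for c in ".,;?" for t in tokens)),  # other punct.
    ]
    return avg_emb + surface
```
        </preformat>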
        <p>
We adapt the existing GPPL implementation
(https://github.com/UKPLab/tacl2018-preference-convincing) using the authors'
recommended hyperparameter defaults [
          <xref ref-type="bibr" rid="ref35">35</xref>
]: batch size |P_i| = 200, scale
hyperparameters α₀ = 2 and β₀ = 200, and the number of inducing points (i.e., the smaller
number of data points that act as substitutes for the tweets in the dataset)
M = 500. The maximum number of iterations was set to 2000. Using these
feature vectors, hyperparameter settings, and data pairs, we require a training
time of roughly two hours running on a 24-core cluster with 2 GHz CPU cores.
        </p>
<p>After training the model, an additional step is necessary to transform the
GPPL output values to the original funniness range (0, 1–5). For this purpose,
we train a Gaussian process regressor which we supply with the output values
of the GPPL system as features and the corresponding HAHA@IberLEF2018
test data values as targets. However, this model can still yield results outside the
desired range when applied to the GPPL output of the HAHA@IberLEF2019
test data. Thus, we afterwards map too-large and too-small values onto the
range boundaries. We further set an empirically determined threshold for binary
funniness estimation.</p>
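<p>The post-processing step can be sketched as follows, with regress standing in for the trained Gaussian process regressor and the threshold value being a hypothetical placeholder rather than the one we actually used:</p>
        <preformat>
```python
def to_task_outputs(raw_scores, regress, low=0.0, high=5.0, threshold=1.0):
    # `regress` maps a raw GPPL output onto the funniness scale;
    # `threshold` is the cutoff for the binary humour decision.
    results = []
    for raw in raw_scores:
        score = regress(raw)
        # Clamp regressor outputs that fall outside the funniness range.
        score = max(low, min(high, score))
        is_funny = 1 if score > threshold else 0
        results.append((is_funny, score))
    return results
```
        </preformat>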
      </sec>
      <sec id="sec-3-4">
        <title>Results and Discussion</title>
<p>The official campaign results cover the performance of our own system, as
well as those of the top-performing system and a naïve baseline. For Task 1, the
naïve baseline makes a random classification for each tweet (with uniform
distribution over the two classes); for Task 2, it assigns a funniness score of 3.0 to
each tweet.</p>
        <p>
In the binary classification setup, our system achieved an F-measure of 0.660
on the test data, representing a precision of 0.588 and a recall of 0.753. In the
regression task, we achieved an RMSE of 1.810. The results are based on the second
data subsample (up to 2500 pairs), with the results for the first (up to 500 pairs)
being slightly lower. Our results for both tasks, while handily beating those of the
naïve baseline, are significantly worse than those reported by some other systems
in the evaluation campaign, including of course the winner. This is somewhat
surprising given GPPL's very good performance in our previous English-language
experiments [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ].
        </p>
<p>Unfortunately, our lack of fluency in Spanish and lack of access to the
gold-standard scores for the test set tweets precludes us from performing a detailed
qualitative error analysis. However, we speculate that our system's less than
stellar performance can partly be attributed to the information loss in converting
between the numeric scores used in the HAHA@IberLEF2019 tasks and the
preference judgments used by our GPPL system. In support of this explanation,
we note that the output of our GPPL system is rather uniform; the scores
occur in a narrow range with very few outliers. (Figure 1 shows this outcome
for the HAHA@IberLEF2018 test data.) Possibly this effect would have been
less pronounced had we used a much larger subsample, or even the entirety, of
the possible training pairs, though as discussed in §3.2, technical and temporal
limitations prevented us from doing so. We also speculate that the Gaussian
process regressor we used may not have been the best way of mapping our
GPPL scores back onto the task's funniness scale (albeit still better than a linear
mapping).</p>
        <p>
Apart from the difficulties posed by the differences in the annotation and
scoring, our system may have been affected by the mismatch between its language
resources and the language of the test data. That is, while we relied on language
resources like Wikipedia and MCR that reflect standardized registers and prestige
dialects, the HAHA@IberLEF2019 data is drawn from unedited social media,
whose language is less formal, treats a different range of topics, and may reflect a
wider range of dialects and writing styles. Twitter data in particular is known to
present problems for vanilla NLP systems, at least without extensive cleaning and
normalization [
          <xref ref-type="bibr" rid="ref1">1</xref>
]. This is reflected in our choice of word embeddings: while we
achieved a Spearman rank correlation of ρ = 0.52 with the HAHA@IberLEF2018
test data using embeddings based on Twitter data [
          <xref ref-type="bibr" rid="ref13">13</xref>
], the same system using
more "standard" Wikipedia-/news-/Web-based embeddings [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] resulted in a
correlation near zero.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
<p>This paper has presented the OFAI–UKP system for predicting both binary
and graded humorousness. It employs Gaussian process preference learning, a
Bayesian system that learns to rank and rate instances by exploiting pairwise
preference judgments. By providing additional feature data (in our case, shallow
linguistic features), the method can learn to predict scores for previously unseen
items.</p>
      <p>Though our system had previously achieved good results with rudimentary,
task-agnostic linguistic features on two English-language tasks (including one
involving the gradation of humorousness), its performance on the Spanish-language
Twitter data of HAHA@IberLEF2019 was less impressive. We tentatively
attribute this to the information loss involved in the (admittedly artificial)
conversion between the numeric annotations used in the task and the preference
judgments required as input to our method, and to the fact that we do not
normalize the Twitter data to match our linguistic resources.</p>
<p>
        Possible future work would include mitigating the above two problems (for
example, by normalizing the language of the tweets, by coming up with a
better way of converting between humour annotation formats, or by sourcing new
preference judgments from Spanish-speaking annotators) and using additional,
humour-specific features, including some of those used in past work as well as
those inspired by the prevailing linguistic theories of humour [
        <xref ref-type="bibr" rid="ref3">3</xref>
]. The benefits
of including word frequency also point to possible further improvements using
n-grams, tf–idf, or other task-agnostic linguistic features.
      </p>
      <sec id="sec-4-1">
        <title>Acknowledgments</title>
        <p>This work has been supported by the German Federal Ministry of Education and
Research (BMBF) under the promotional reference 01UG1816B (CEDIFOR),
by the German Research Foundation (DFG) as part of the QA-EduInf project
(grants GU 798/18-1 and RI 803/12-1), by the DFG-funded research training group
"Adaptive Preparation of Information from Heterogeneous Sources" (AIPHES;
GRK 1994/1), and by the Austrian Science Fund (FWF) under project
M 2625-N31. The Austrian Research Institute for Artificial Intelligence is supported by
the Austrian Federal Ministry for Science, Research and Economy.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <source>Proceedings of the ACL 2015 Workshop on Noisy User-generated Text. Association for Computational Linguistics (Jul</source>
          <year>2015</year>
          ). https://doi.org/10.18653/v1/
          <fpage>W15</fpage>
          -43, https://www.aclweb.org/anthology/W15-4300
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Almeida</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bilbao</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <source>Spanish 3B words word2vec embeddings (version 1.0)</source>
          (
          <year>2018</year>
          ). https://doi.org/10.5281/zenodo.1410403
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Attardo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Linguistic Theories of Humor. Mouton de Gruyter, Berlin (
          <year>1994</year>
          ). https://doi.org/10.1515/9783110219029
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Beck</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Specia</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Joint emotion analysis via multi-task Gaussian processes</article-title>
          .
          <source>In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing</source>
          . pp.
          <volume>1798</volume>
–
          <year>1803</year>
          .
          <article-title>Association for Computational Linguistics (</article-title>
          <year>2014</year>
          ). https://doi.org/10.3115/v1/
          <fpage>D14</fpage>
          -1190
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bell</surname>
          </string-name>
          , N.D.:
          <article-title>Failed humor</article-title>
          . In: Attardo,
          <string-name>
            <surname>S</surname>
          </string-name>
          . (ed.)
          <source>The Routledge Handbook of Language and Humor</source>
          , pp.
          <volume>356</volume>
–
          <fpage>370</fpage>
          . Routledge Handbooks in Linguistics, Routledge, New York (
          <year>Feb 2017</year>
          ), https://www.routledgehandbooks.com/doi/10.4324/9781315731162. ch25
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bradley</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Terry</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          :
          <article-title>Rank analysis of incomplete block designs: I. The method of paired comparisons</article-title>
          .
          <source>Biometrika</source>
          <volume>39</volume>
          (
          <issue>3</issue>
          /4),
          <volume>324</volume>
–345 (Dec
          <year>1952</year>
          ). https://doi.org/10.2307/2334029
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Carretero-Dios</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buela-Casal</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Assessing the appreciation of the content and structure of humor: Construction of a new scale</article-title>
          .
          <source>Humor: International Journal of Humor Research</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <volume>307</volume>
–325 (Aug
          <year>2010</year>
          ). https://doi.org/10.1515/humr.
          <year>2010</year>
          .014
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of the HAHA task: Humor analysis based on human annotation at IberEval 2018</article-title>
          . In: Rosso,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Gonzalo</surname>
          </string-name>
          ,
<string-name>
            <given-names>J.</given-names>
            ,
            <surname>Martínez</surname>
          </string-name>
          , R.,
          <string-name>
            <surname>Montalvo</surname>
          </string-name>
          , S., de Albornoz, J.C. (eds.)
          <source>Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages. CEUR Workshop Proceedings</source>
          , vol.
          <volume>2150</volume>
          , pp.
          <volume>187</volume>
–
          <fpage>194</fpage>
          .
          <source>Spanish Society for Natural Language Processing (Sep</source>
          <year>2018</year>
          ), http://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>2150</volume>
          /overview-HAHA.pdf
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moncecchi</surname>
          </string-name>
          , G.:
          <article-title>A crowd-annotated Spanish corpus for humor analysis</article-title>
          .
          <source>In: Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media</source>
          . pp.
          <volume>7</volume>
–
          <fpage>11</fpage>
          .
          <article-title>Association for Computational Linguistics (</article-title>
          <year>2018</year>
          ), http://aclweb.org/anthology/W18-3502
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castro</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etcheverry</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prada</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of HAHA at IberLEF 2019: Humor analysis based on human annotation</article-title>
          .
          <source>In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019). CEUR Workshop Proceedings</source>
          (Sep
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Chu</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghahramani</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Preference learning with Gaussian processes</article-title>
          .
          <source>In: Proceedings of the 22nd International Conference on Machine Learning</source>
          . pp.
          <fpage>137</fpage>
          –
          <lpage>144</lpage>
          . ACM (
          <year>2005</year>
          ). https://doi.org/10.1145/1102351.1102369
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Cohn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Specia</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Modelling annotator bias with multi-task Gaussian processes: An application to machine translation quality estimation</article-title>
          .
          <source>In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics</source>
          . vol.
          <volume>1</volume>
          , pp.
          <fpage>32</fpage>
          –
          <lpage>42</lpage>
          .
          Association for Computational Linguistics (
          <year>2013</year>
          ), http://aclweb.org/anthology/P13-1004
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Deriu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucchi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Luca</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Severyn</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Muller,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Cieliebak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            , Ho mann, T.,
            <surname>Jaggi</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Leveraging large amounts of weakly supervised data for multi-language sentiment classification</article-title>
          .
          <source>In: Proceedings of the 26th International World Wide Web Conference</source>
          . pp.
          <fpage>1045</fpage>
          –
          <lpage>1052</lpage>
          .
          International World Wide Web Conferences Steering Committee
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Felt</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ringger</surname>
            ,
            <given-names>E.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seppi</surname>
            ,
            <given-names>K.D.</given-names>
          </string-name>
          :
          <article-title>Semantic annotation aggregation with conditional crowdsourcing models and word embeddings</article-title>
          .
          <source>In: Proceedings of the 26th International Conference on Computational Linguistics</source>
          . pp.
          <fpage>1787</fpage>
          –
          <lpage>1796</lpage>
          (
          <year>2016</year>
          ), http://aclweb.org/anthology/C16-1168
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Finn</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Louviere</surname>
            ,
            <given-names>J.J.:</given-names>
          </string-name>
          <article-title>Determining the appropriate response to evidence of public concern: The case of food safety</article-title>
          .
          <source>Journal of Public Policy &amp; Marketing</source>
          <volume>11</volume>
          (
          <issue>2</issue>
          ),
          <fpage>12</fpage>
          –
          <lpage>25</lpage>
          (Sep
          <year>1992</year>
          ). https://doi.org/10.1177/074391569201100202
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Flynn</surname>
            ,
            <given-names>T.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marley</surname>
            ,
            <given-names>A.A.J.:</given-names>
          </string-name>
          <article-title>Best–worst scaling: Theory and methods</article-title>
          . In:
          <string-name>
            <surname>Hess</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (eds.)
          <source>Handbook of Choice Modelling</source>
          , pp.
          <fpage>178</fpage>
          –
          <lpage>201</lpage>
          . Edward Elgar Publishing, Cheltenham, UK (
          <year>2014</year>
          ). https://doi.org/10.4337/9781781003152.00014
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Gonzalez-Agirre</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laparra</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rigau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Multilingual Central Repository version 3.0</article-title>
          .
          <source>In: Proceedings of the 8th International Conference on Language Resources and Evaluation</source>
          . pp.
          <fpage>2525</fpage>
          –
          <lpage>2529</lpage>
          . European Language Resources Association
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Hempelmann</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          :
          <article-title>Paronomasic Puns: Target Recoverability Towards Automatic Generation</article-title>
          .
          <source>Ph.D. thesis</source>
          , Purdue University, West Lafayette, IN, USA (Aug
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Hempelmann</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          :
          <article-title>Computational humor: Beyond the pun?</article-title>
          In:
          <string-name>
            <surname>Raskin</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (ed.)
          <source>The Primer of Humor Research</source>
          , pp.
          <fpage>333</fpage>
          –
          <lpage>360</lpage>
          . No. 8 in Humor Research, Mouton de Gruyter, Berlin (
          <year>2008</year>
          ). https://doi.org/10.1515/9783110198492.333
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Hoffman</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paisley</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          :
          <article-title>Stochastic variational inference</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>14</volume>
          ,
          <fpage>1303</fpage>
          –
          <lpage>1347</lpage>
          (May
          <year>2013</year>
          ), http://jmlr.org/papers/v14/hoffman13a.html
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Holton</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S.C.</given-names>
          </string-name>
          :
          <article-title>Journalists, social media, and the use of humor on Twitter</article-title>
          .
          <source>Electronic Journal of Communication</source>
          <volume>21</volume>
          (
          <issue>1&amp;2</issue>
          ) (
          <year>2011</year>
          ), http://www.cios.org/EJCPUBLIC/021/1/021121.html
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Kido</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Okamoto</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A Bayesian approach to argument-based reasoning for attack estimation</article-title>
          . In:
          <string-name>
            <surname>Sierra</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (ed.)
          <source>Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence</source>
          . pp.
          <fpage>249</fpage>
          –
          <lpage>255</lpage>
          .
          <source>International Joint Conferences on Artificial Intelligence</source>
          (
          <year>2017</year>
          ). https://doi.org/10.24963/ijcai.2017/36
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Kiritchenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S.M.:</given-names>
          </string-name>
          <article-title>Capturing reliable fine-grained sentiment associations by crowdsourcing and best–worst scaling</article-title>
          .
          <source>In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          . pp.
          <fpage>811</fpage>
          –
          <lpage>817</lpage>
          .
          Association for Computational Linguistics (
          <year>2016</year>
          ). https://doi.org/10.18653/v1/N16-1095
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Lampos</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aletras</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Preotiuc-Pietro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Predicting and characterising user impact on Twitter</article-title>
          .
          <source>In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics</source>
          . pp.
          <fpage>405</fpage>
          –
          <lpage>413</lpage>
          .
          Association for Computational Linguistics (
          <year>2014</year>
          ). https://doi.org/10.3115/v1/E14-1043
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Lippman</surname>
            ,
            <given-names>L.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunn</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          :
          <article-title>Contextual connections within puns: Effects on perceived humor and memory</article-title>
          .
          <source>Journal of General Psychology</source>
          <volume>127</volume>
          (
          <issue>2</issue>
          ),
          <fpage>185</fpage>
          –
          <lpage>197</lpage>
          (Apr
          <year>2000</year>
          ). https://doi.org/10.1080/00221300009598578
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Luce</surname>
            ,
            <given-names>R.D.</given-names>
          </string-name>
          :
          <article-title>On the possible psychophysical laws</article-title>
          .
          <source>Psychological Review</source>
          <volume>66</volume>
          (
          <issue>2</issue>
          ),
          <fpage>81</fpage>
          –
          <lpage>95</lpage>
          (
          <year>1959</year>
          ). https://doi.org/10.1037/h0043178
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Marley</surname>
            ,
            <given-names>A.A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Louviere</surname>
            ,
            <given-names>J.J.:</given-names>
          </string-name>
          <article-title>Some probabilistic models of best, worst, and best–worst choices</article-title>
          .
          <source>Journal of Mathematical Psychology</source>
          <volume>49</volume>
          (
          <issue>6</issue>
          ),
          <fpage>464</fpage>
          –
          <lpage>480</lpage>
          (
          <year>2005</year>
          ). https://doi.org/10.1016/j.jmp.2005.05.003
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Mosteller</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Remarks on the method of paired comparisons: I. The least squares solution assuming equal standard deviations and equal correlations</article-title>
          .
          <source>Psychometrika</source>
          <volume>16</volume>
          (
          <issue>1</issue>
          ),
          <fpage>3</fpage>
          –
          <lpage>9</lpage>
          (Mar
          <year>1951</year>
          ). https://doi.org/10.1007/BF02313422
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Nickisch</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rasmussen</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          :
          <article-title>Approximations for binary Gaussian process classification</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>9</volume>
          ,
          <fpage>2035</fpage>
          –
          <lpage>2078</lpage>
          (Oct
          <year>2008</year>
          ), http://www.jmlr.org/papers/volume9/nickisch08a/nickisch08a.pdf
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Ortega-Bueno</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muñiz Cuza</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medina Pagola</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>UO UPV: Deep linguistic humor detection in Spanish social media</article-title>
          . In:
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martínez</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montalvo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Albornoz</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          (eds.)
          <source>Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages. CEUR Workshop Proceedings</source>
          , vol.
          <volume>2150</volume>
          , pp.
          <fpage>203</fpage>
          –
          <lpage>213</lpage>
          .
          <source>Spanish Society for Natural Language Processing</source>
          (Sep
          <year>2018</year>
          ), http://ceur-ws.org/Vol-2150/HAHA_paper2.pdf
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Plackett</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          :
          <article-title>The analysis of permutations</article-title>
          .
          <source>Journal of the Royal Statistical Society, Series C (Applied Statistics)</source>
          <volume>24</volume>
          (
          <issue>2</issue>
          ),
          <fpage>193</fpage>
          –
          <lpage>202</lpage>
          (
          <year>1975</year>
          ). https://doi.org/10.2307/2346567
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Rasmussen</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>C.K.I.</given-names>
          </string-name>
          :
          <article-title>Gaussian Processes for Machine Learning</article-title>
          .
          <source>Adaptive Computation and Machine Learning</source>
          , MIT Press, Cambridge, MA, USA (
          <year>2006</year>
          ), http://www.gaussianprocess.org/gpml/
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Shifman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Memes in Digital Culture</article-title>
          .
          <source>Essential Knowledge</source>
          , MIT Press, Cambridge, MA, USA (Oct
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Do Dinh</surname>
            ,
            <given-names>E.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurevych</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Predicting humorousness and metaphor novelty with Gaussian process preference learning</article-title>
          .
          <source>In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)</source>
          . Association for Computational Linguistics (Jul
          <year>2019</year>
          ), to appear
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          35.
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurevych</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Finding convincing arguments using scalable Bayesian preference learning</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>6</volume>
          ,
          <fpage>357</fpage>
          –
          <lpage>371</lpage>
          (
          <year>2018</year>
          ), http://aclweb.org/anthology/Q18-1026
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          36.
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>E.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venanzi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reece</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kohli</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guiver</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jennings</surname>
            ,
            <given-names>N.R.</given-names>
          </string-name>
          :
          <article-title>Language understanding in the wild: Combining crowdsourcing and machine learning</article-title>
          .
          <source>In: Proceedings of the 24th International Conference on World Wide Web</source>
          . pp.
          <fpage>992</fpage>
          –
          <lpage>1002</lpage>
          .
          International World Wide Web Conferences Steering Committee
          (
          <year>2015</year>
          ). https://doi.org/10.1145/2736277.2741689
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          37.
          <string-name>
            <surname>Thurstone</surname>
            ,
            <given-names>L.L.:</given-names>
          </string-name>
          <article-title>A law of comparative judgment</article-title>
          .
          <source>Psychological Review</source>
          <volume>34</volume>
          (
          <issue>4</issue>
          ),
          <fpage>273</fpage>
          –
          <lpage>286</lpage>
          (
          <year>1927</year>
          ). https://doi.org/10.1037/h0070288
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          38.
          <string-name>
            <surname>Titov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klementiev</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A Bayesian approach to unsupervised semantic role induction</article-title>
          .
          <source>In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics</source>
          . pp.
          <fpage>12</fpage>
          –
          <lpage>22</lpage>
          .
          Association for Computational Linguistics (
          <year>2012</year>
          ), http://aclweb.org/anthology/E12-1003
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          39.
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>H.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barash</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frey</surname>
            ,
            <given-names>B.J.:</given-names>
          </string-name>
          <article-title>Bayesian prediction of tissue-regulated splicing using RNA sequence and cellular context</article-title>
          .
          <source>Bioinformatics</source>
          <volume>27</volume>
          (
          <issue>18</issue>
          ),
          <fpage>2554</fpage>
          –
          <lpage>2562</lpage>
          (Sep
          <year>2011</year>
          ). https://doi.org/10.1093/bioinformatics/btr444
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>