<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Bi-ISCA: Bidirectional Inter-Sentence Contextual Attention Mechanism for Detecting Sarcasm in User Generated Noisy Short Text</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Prakamya</forename><surname>Mishra</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution" key="instit1">https</orgName>
								<orgName type="institution" key="instit2">dictionary.cambridge.org</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Saroj</forename><surname>Kaushik</surname></persName>
							<email>saroj.kaushik@snu.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Shiv Nadar University</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kuntal</forename><surname>Dey</surname></persName>
							<email>kuntal.dey@accenture.com</email>
							<affiliation key="aff1">
								<orgName type="institution">Accenture Technology Labs</orgName>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution" key="instit1">https</orgName>
								<orgName type="institution" key="instit2">dictionary.cambridge.org</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Bi-ISCA: Bidirectional Inter-Sentence Contextual Attention Mechanism for Detecting Sarcasm in User Generated Noisy Short Text</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">7457E880F7E7E1E173318661F0326F85</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T01:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Many online comments on social media platforms are hateful, humorous, or sarcastic. The sarcastic nature of these comments (especially the short ones) alters their actual implied sentiments, which leads to misinterpretations by the existing sentiment analysis models. A lot of research has already been done to detect sarcasm in the text using user-based, topical, and conversational information but not much work has been done to use inter-sentence contextual information for detecting the same. This paper proposes a new deep learning architecture that uses a novel Bidirectional Inter-Sentence Contextual Attention mechanism (Bi-ISCA) to capture intersentence dependencies for detecting sarcasm in the user-generated short text using only the conversational context. The proposed deep learning model demonstrates the capability to capture explicit, implicit, and contextual incongruous words &amp; phrases responsible for invoking sarcasm. Bi-ISCA generates results comparable to the state-of-the-art on two widely used benchmark datasets for the sarcasm detection task (Reddit and Twitter). To the best of our knowledge, none of the existing models use an intersentence contextual attention mechanism to detect sarcasm in the user-generated short text using only conversational context.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Sentiment analysis is one of the most important natural language processing (NLP) applications. Its goal is to identify, extract, quantify, and study subjective information. The sudden rise in the usage of social media platforms as a means of communication has led to a vast amount of data being shared between its users on a wide range of topics. This type of data is very helpful to several organizations for analyzing the sentiments of people towards products, movies, political events, etc. Understanding the unique intricacies of the human language remains one of the most important pending NLP problems of this time. Humans regularly use sarcasm as a crucial part of the day-to-day conversations when venting, arguing, or * Contact Author maybe engaging on social media platforms. Sarcastic remarks on these platforms inflict problems on the existing sentiment analysis systems in identifying the true intentions of the users.</p><p>The Cambridge Dictionary 1 describes sarcasm as an irony conveyed hilariously or amusingly to criticize something. Sarcasm may not show criticism on the surface but instead might have a criticizing implied meaning. Such a figurative aspect of sarcasm makes it difficult to be detected in the modern micro texts <ref type="bibr" target="#b3">[Ghosh and Veale, 2016]</ref>. Several linguistic research has been done to analyze different aspects of sarcasm. Kind of responses evoked because of comments has been considered a major indicator of sarcasm <ref type="bibr" target="#b3">[Eisterhold et al., 2006]</ref>. <ref type="bibr" target="#b14">[Wilson, 2006]</ref> states that circumstantial incongruity between a comment and its corresponding contextual information plays an important role in implying sarcasm.</p><p>Previous research works have used policy-based, statistical, and deep-learning-based methods for detecting sarcasm. 
The use of contextual information like conversational context, author personality features, or prior knowledge of the topic, have proved to be very useful. <ref type="bibr" target="#b6">[Khattri et al., 2015]</ref> used sentiments of the author's historical tweets as context. <ref type="bibr" target="#b11">[Rajadesingan et al., 2015]</ref> used personality features like the author's familiarity with twitter, language (structure and word usage), and the author's familiarity with sarcasm (history of previous sarcastic tweets) for consolidating context. <ref type="bibr" target="#b0">[Bamman and Smith, 2015]</ref> explored the use of historical terms, topics, and sentiments along with profile information as the author's context. They also exploited the use of conversational context like the immediate previous tweets in the thread. <ref type="bibr" target="#b6">[Joshi et al., 2015]</ref> demonstrated that concatenation of preceding comment with the objective comment in a discussion forum led to an increase in the precision score.</p><p>Overall in recent years a lot of work has been done to use different types of contextual information for sarcasm detection but none of them have used inter-sentence dependencies. In this paper, we propose a novel Bidirectional Inter-Sentence Contextual Attention mechanism (Bi-ISCA) based deep learning neural network for sarcasm detection. The main contribution of this paper can be summarised as follows:</p><p>• We propose a new deep learning architecture that uses a novel Bidirectional Inter-Sentence Contextual attention mechanism (Bi-ISCA) for detecting sarcasm in short texts (short texts are more difficult to analyze due to shortage of contextual information). 
• Bi-ISCA focuses on only using the conversational contextual comment/tweet for detecting sarcasm rather than using any other topical/personality-based features, as using only the contextual information enriches the model's ability to capture syntactical and semantical textual properties responsible for invoking sarcasm. • We also explain model behavior and predictions by visualizing attention maps generated by Bi-ISCA, which helps in identifying significant parts of the sentences responsible for invoking sarcasm. The rest of the paper is organized as follows. Section 2 describes the related work. Then section 3, explains the proposed model architecture for detecting sarcasm. Section 4 will describe the datasets used, pre-processing pipeline, and training details for reproducibility. Then experimental results are explained in section 5 and section 6 illustrates model behavior and predictions by visualizing attention maps. Finally we conclude in section 7.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>A diverse spectrum of approaches has been used to detect sarcasm. Recent sarcasm detection approaches have either mainly focused on using machine learning based approaches that leverage the use of explicitly declared relevant features or they focus on using neural network based deep learning approaches that do not require handcrafted features. Also, the recent advances in using deep learning for preforming natural language processing tasks have led to a promising increase in the performance of these sarcasm detection systems.</p><p>A lot of research has been done using bag of words as features. However, to improve performance, scholars started to explore the use of several other semantic and syntactical features like punctuations <ref type="bibr" target="#b13">[Tsur et al., 2010]</ref>; emotion marks and intensifiers <ref type="bibr" target="#b9">[Liebrecht et al., 2013]</ref>; positive verbs and negative phrases <ref type="bibr" target="#b11">[Riloff et al., 2013]</ref>; polarity skip grams <ref type="bibr" target="#b11">[Reyes et al., 2013]</ref>; synonyms &amp; ambiguity <ref type="bibr" target="#b1">[Barbieri et al., 2014]</ref>; implicit and explicit incongruity-based <ref type="bibr" target="#b6">[Joshi et al., 2015]</ref>; sentiment flips <ref type="bibr" target="#b11">[Rajadesingan et al., 2015]</ref>; affect-based features derived from multiple emotion lexicons <ref type="bibr" target="#b3">[Farías et al., 2016]</ref>.</p><p>Every day an enormous amount of short text data is generated by users on popular social media platforms like Twitter 2 and Reddit 3 . Easy accessibility of such data sources has enticed researchers to use them for extracting user-based and discourse-based features. <ref type="bibr" target="#b5">[Hazarika et al., 2018]</ref> utilized contextual information by making user-embeddings for capturing indicative behavioral traits. 
These user-embeddings incorporated personality features along with the author's writing style (using historical posts). They also used discourse comments along with background cues and topical information for detecting sarcasm. They performed their experiments on the largest Reddit dataset SARC <ref type="bibr" target="#b6">[Khodak et al., 2018]</ref>. Many have only used the target text for classification purposes, where a target 2 www.twitter.com/ 3 www.reddit.com/ text is a textual unit that has to be classified as sarcastic or not. Simply using gated recurrent units (GRU) <ref type="bibr" target="#b1">[Cho et al., 2014]</ref> or long short term memory (LSTM) <ref type="bibr" target="#b5">[Hochreiter and Schmidhuber, 1997]</ref> do not capture in between interactions of word pairs which makes it difficult to model contrast and incongruity. <ref type="bibr" target="#b12">[Tay et al., 2018]</ref> were able to solve this problem by looking in-between word pairs using a multi-dimensional intra-attention recurrent network. They focused on modeling the intra-sentence relationships among the words. <ref type="bibr">[Kumar et al., 2020]</ref> exploited the use of a multi-head attention mechanism <ref type="bibr" target="#b13">[Vaswani et al., 2017]</ref> which could capture dependencies between different representations subspaces in different positions. Their model consisted of a word encoder for generating new word representations by summarizing comment contextual information in a bidirectional manner. On top of that, they used multi-head attention for focusing on different contexts of a sentence, and in the end, a simple multi-layer perceptron was used for classification.</p><p>There has not been much work done in conversation dependent (comment and reply) approaches for sarcasm detection. 
<ref type="bibr" target="#b3">[Ghaeini et al., 2018]</ref> proposed a model that not only used information from the target utterance but also used its conversational context to perceive sarcasm. They aimed to detect sarcasm by just using the sequences of sentences, without any extra knowledge about the user and topic. They combined the predictions from utterance-only and conversation-dependent parts for generating its final prediction which was able to capture the words responsible for delivering sarcasm. <ref type="bibr" target="#b4">[Ghosh and Veale, 2017</ref>] also modeled conversational context for sarcasm detection. They also attempted to derive what parts of the conversational context triggered a sarcastic reply. Their proposed model used sentence embeddings created by taking an average of word embeddings and a sentence-level attention mechanism was used to generate attention induced representations of both the context and the response which was later concatenated and used for classification.</p><p>Among all the previous works, [ <ref type="bibr" target="#b3">Ghaeini et al., 2018]</ref> and <ref type="bibr" target="#b4">[Ghosh and Veale, 2017]</ref> share similar motives of detecting sarcasm using only the conversational context. However, we introduce a novel Bidirectional Inter-Sentence Contextual Attention mechanism (Bi-ISCA) for detecting sarcasm. Unlike previous works, our work considers short texts for detecting sarcasm, which is far more challenging to detect when compared to long texts as long texts provide much more contextual information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Model</head><p>This section will introduce the proposed Bi-ISCA: Bidirectional Inter Sentence Contextual Attention based neural network for sarcasm detection (as shown in Figure <ref type="figure" target="#fig_0">1</ref>). Sarcasm detection is a binary classification task that tries to predict whether a given comment is sarcastic or not. The proposed model uses comment-reply pairs for detecting sarcasm. The input to the model is represented by U</p><formula xml:id="formula_0">= [W u 1 , W u 2 , ...., W u n ] and V = [W v 1 , W v 2 , ...., W v n ],</formula><p>where U represents the comment sentence and V represents the reply sentence (both sentences padded to a length of n). Here, W u i , W v j ∈ R d are d−dimensional word embedding vectors. The objective is to predict label y which indicates whether the reply to the corresponding comment was sarcastic or not.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Intra-Sentence Word Encoder Layer</head><p>The primary purpose of this layer is to summarize intrasentence contextual information from both directions in both the sentences (comment &amp; reply) using Bidirectional Long Short Term Memory Networks (Bi-LSTM). A Bi-LSTM <ref type="bibr" target="#b11">[Schuster and Paliwal, 1997]</ref> processes information in both the directions using a forward LSTM [Hochreiter and Schmidhuber, 1997] − → h , that reads the sentence S = [w 1 , w 2 , ...., w n ] from w 1 to w n and a backward LSTM ← − h that reads the sentence from w n to w 1 . Hidden states from both the LSTMs are added to get the final hidden state representations of each word. So the hidden state representation of the t th word (h t ) can be represented by the sum of t th hidden representations of the forward and backward LSTMs (</p><formula xml:id="formula_1">− → h t , ← − h t ) as show in equations below. − → h t = −−−−→ LST M (w t , − − → h t−1 ); ← − h t = ←−−−− LST M (w t , ← − − h t−1 )<label>(1)</label></formula><formula xml:id="formula_2">h t = ← − h t + − → h t (2)</formula><p>This Intra-Sentence Word Encoder Layer consists of two independent Bidirectional LSTMs for both comment (BiLST M c ) and reply (BiLST M r ). Apart from the hidden states, both these Bi-LSTMs also generate separate (forward &amp; backward) final cell states represented by</p><formula xml:id="formula_3">← − C &amp; − → C .</formula><p>The comment sentence U is given as an input to BiLST M c and the reply sentence V is given as an input to BiLST M r . 
The outputs of both the Bi-LSTMs are represented by the equations 3 and 4.</p><formula xml:id="formula_4">− → C u , h u , ← − C u = BiLST M c (U ) (3) − → C v , h v , ← − C v = BiLST M r (V ) (4) Here, − → C u , − → C v ∈ R d are the final cell states of the for- ward LSTMs corresponding to BiLST M c &amp; BiLST M r ; ← − C u , ← − C v ∈ R d are the final cell states of the backward LSTMs corresponding to BiLST M c &amp; BiLST M r ; h u = [h u 1 , h u 2 , ...., h u n ] and h v = [h v 1 , h v 2 , ...., h v n ] are the hidden state representations of BiLST M c &amp; BiLST M r respec- tively, where h u i , h v j ∈ R d and h u , h v ∈ R n×d .</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Bi-ISCA: Bidirectional Inter-Sentence Contextual Attention Mechanism</head><p>Sarcasm is context-dependent in nature. Even humans sometimes have a hard time understanding sarcasm without having any contextual information. The hidden states generated by both the Bi-LSTMs (BiLST M c &amp; BiLST M r ) captures the intra-sentence bidirectional contextual information in comment &amp; reply respectively, but fails to capture the intersentence contextual information between them. This paper introduces a novel Bidirectional Inter-Sentence Contextual Attention mechanism (Bi-ISCA) for capturing the inter-sentence contextual information between both the sentences. Bi-ISCA uses hidden state representations of U &amp; V along with the auxiliary sentence's cell state representations ( − → C &amp; ← − C ) to capture the inter-sentence contextual information. At first, the attention mechanism captures four sets of attentions scores namely, (α   ). In the equations below (×) represents multiplication between a scalar and a vector.</p><formula xml:id="formula_5">− → Cu , α ← − Cu , α − → Cv , α ← − Cv ∈ R n ).</formula><formula xml:id="formula_6">− → Cu = [α − → Cu 1 , α − → Cu 2 , ...., α − → Cu n ]; α − → Cu i = − → C u • h v i (5) α ← − Cu = [α ← − Cu 1 , α ← − Cu 2 , ...., α ← − Cu n ]; α ← − Cu i = ← − C u • h v i (6) α − → Cv = [α − → Cv 1 , α − → Cv 2 , ...., α ← − Cv n ]; α − → Cv i = − → C v • h u i (7) α ← − Cv = [α ← − Cv 1 , α ← − Cv 2 , ...., α ← − Cv n ]; α ← − Cv i = ← − C v • h u i (<label>8</label></formula><formula xml:id="formula_7">h − → Cu v = [h − → Cu v,1 , h − → Cu v,2 , ...., h − → Cu v,n ], ; h − → Cu v,i = α − → Cu i × h v i (9) h ← − Cu v = [h ← − Cu v,1 , h ← − Cu v,2 , ...., h ← − Cu v,n ], ; h ← − Cu v,i = α ← − Cu i × h v i (10) h − → Cv u = [h − → Cv u,1 , h − → Cv u,2 , ...., h − → Cv u,n ], ; h − → Cv u,i = α − → Cv i × h u i (11) h ← − Cv u = [h ← − Cv u,1 , h ← − Cv u,2 , ...., h ← − Cv u,n 
], ; h ← − Cv u,i = α ← − Cv i × h u i (12)</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Integration and Final Prediction</head><p>The proposed model uses Convolutional Neural Networks (CNN) <ref type="bibr" target="#b8">[Lecun et al., 1998]</ref> for capturing location-invariant local features from the newly obtained contextualized hidden representations h</p><formula xml:id="formula_8">← − Cv u , h − → Cv u , h ← − Cu v , h − → Cu v . Four independent CNN blocks (CN N 1 , CN N 2 , CN N 3 , CN N 4</formula><p>) are used, corresponding to each of the newly obtained contextualized hidden representations. Each CN N block consists two convolutional layers. Both the convolution layer consist of k filters of height h. The role of these filters is to detect particular features at different locations of the input. The output c l i of the l th layer consists of k l feature maps of height h. The i th feature map (c l i ) is calculated as:</p><formula xml:id="formula_9">c l i = b l i + j=1 k l−1 K l i,j * c l−1 j (13)</formula><p>In the above equation, b l i is a bias matrix and K l i,j is a filter connecting j th feature map of layer (l − 1) to the i th feature map of layer (l). The output of each convolution layer is passed through a activation function f . The proposed model uses LeakyReLu as its activation function.</p><formula xml:id="formula_10">f = a * x, for x ≥ 0; a ∈ R (14)</formula><p>x, for x &lt; 0 (15)</p><p>For each of the CNN blocks, the corresponding contextualized hidden representations are first concatenated (⊕) and then given as input. The outputs of all the CNN blocks are flattened (F 1 , F 2 , F 3 , F 4 ∈ R dk ) and concatenated to generate a new vector (p ∈ R 4dk ), where d represents the dimension of the hidden representation and k represents number of convolutional filters used. This concatenated (p) vector is then given as input to a dense layer having 4dk neurons and is followed by the final sigmoid prediction layer. 
</p><formula xml:id="formula_11">F 1 = CN N 1 ([h − → Cv u,1 ⊕ h − → Cv u,2 ⊕ .... ⊕ h − → Cv u,n ])<label>(16)</label></formula><formula xml:id="formula_12">F 2 = CN N 2 ([h ← − Cv u,1 ⊕ h ← − Cv u,2 ⊕ .... ⊕ h ← − Cv u,n ])<label>(17)</label></formula><formula xml:id="formula_13">F 3 = CN N 3 ([h − → Cu v,1 ⊕ h − → Cu v,2 ⊕ .... ⊕ h − → Cu v,n ])<label>(18)</label></formula><formula xml:id="formula_14">F 4 = CN N 4 ([h ← − Cu v,1 ⊕ h ← − Cu v,2 ⊕ .... ⊕ h ← − Cu v,n ]) (19) p = [F 1 ⊕ F 2 ⊕ F 3 ⊕ F 4 ]<label>(20)</label></formula><formula xml:id="formula_15">ŷ = σ(W p + b), W ∈ R 4dk ; b ∈ R<label>(</label></formula><formula xml:id="formula_16">L = − 1 N N i=1 y i • log(ŷ i ) + (1 − y i ) • log(1 − ŷi ) (22)</formula><p>4 Evaluation Setup comments from Reddit containing the \s (sarcasm) tag. It contains replies, their parent comment (acts as context), and a label that shows whether the reply was sarcastic/non-sarcastic to their corresponding parent comment. To compare the performance of the model on a different dataset (latest), the proposed model was also evaluated on the Twitter dataset provided in the FigLang<ref type="foot" target="#foot_3">5</ref> 2020 workshop <ref type="bibr" target="#b5">[Ghosh et al., 2020]</ref> for the "sarcasm detection shared task". This consists of sarcastic/nonsarcastic tweets and their corresponding contextual parent tweets. The sarcastic tweets were collected using hashtags like #sarcasm, #sarcastic, and #irony, similarly non-sarcastic tweets were collected using hashtags like #happy, #sad, and #hate. This dataset sometime contains more than one contextual parent tweet, so in those cases, all of the contextual tweets are considered independently with the target tweet.</p><p>In both the datasets, replies are the target comment/tweet to be classified as sarcastic/non-sarcastic, and their corresponding parent comment/tweet acts as context. 
Both the datasets constitute of comments/tweets of varying lengths, but because this paper only focuses on detecting sarcasm in the short text, only the short comment/reply pairs were used. Comment/reply sentences of length (no. of words) less than 20, 40 were used in the case of SARC and Twitter dataset respectively. In both cases, the balanced datasets contain equal proportions of sarcastic/non-sarcastic comment/reply pairs, and the imbalanced datasets maintain a 20:80 ratio (approximately) between sarcastic and non-sarcastic comment/reply pairs. Testing was done on 10% of the dataset and the rest was used for training. 10% of the training set was used for validation purposes. Statistics of both the datasets are shown in Table <ref type="table" target="#tab_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Data Preprocessing</head><p>The preprocessing of the textual data was done by first lowercasing all the sentences and separating punctuations from the words. We do not remove the stop-words because we believe that sometimes stop-words play a major role in making a sentence sarcastic e.g., "is it?" and "am I?". The problem with social media platforms is that, users use a lot of abbreviations, shortened words and slang words like, "IMO" for "in my opinion", lmk" for "let me know ", "fr" for "for", etc. These words are challenging to taken care of in the NLP tasks, particularly in the automatic discovery of flexible word usages. So to solve this problem, these words are converted to their corresponding full-forms using abbreviation/slang word dictionaries obtained from urban dictionary<ref type="foot" target="#foot_4">6</ref> . After this, all the sentences were tokenized into a list of words. The proposed model had a fixed input size for both comment and reply, but not all the sentences were of the same length. So all the sentences were padded to the length of the longest sentence (20 in the case of the Reddit dataset and 40 in the case of the Twitter dataset). Word embeddings are used to give semantically-meaningful dense representations to the words. Word-based embeddings are constructed using contextual words whereas character-based embeddings are constructed from character n-grams of the words. Character-based in contrast to the Word-based embeddings solves the problem of out of vocabulary words and performs better in the case of infrequent words by creating word embeddings based only on their spellings. So for generating proper representations for words we have used FastText<ref type="foot" target="#foot_5">7</ref> , a character-based word embedding. 
This would not only give words better representation compared to the word-based model but also incorporate slang/shortened/infrequent words (which commonly appear in social media platforms).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Training Details</head><p>We have used macro-averaged (F1) and accuracy (Acc) scores as the evaluation metric, as it is standard for the sarcasm detection task. We have also reported Precision (P) and Recall (R) scores in the case of the Twitter dataset as well as for the Reddit dataset (wherever available). Hyperparameter tuning was used to find optimum values of the hyperparameters. The FastText embeddings used were of size d = 30 and were trained for 30 iterations having window size of 3, 5 in the case of SARC, and Twitter dataset respectively. The number of filters in all the convolutional blocks were <ref type="bibr">[64,</ref><ref type="bibr">64]</ref> of height <ref type="bibr">[2,</ref><ref type="bibr">2]</ref>. The learning optimizer used is Adam with an initial learning rate of 0.01. The value of α in all the LeakyReLu layers was set to 0.3. All the models were trained for 20 epochs. L2 regularization set to 10 −2 is applied to all the feed-forward connections along with early stopping having the patience of 5 to avoid overfitting. The mini-batch size was tuned amongst {100, 500, 1000, 2000, 3000, 4000} and was observed that mini-batch size of 2000, 500 gave the best performance for the SARC and Twitter dataset respectively.</p><p>The recent success of transformer-based language models has led to their wide usage in sentiment analysis tasks. They are known for generating high quality high dimensional word representations (768-dimensional for BERT). Their only drawback is that they require high processing power and memory to train. The above-mentioned configuration of the proposed model generates ≈1120K trainable parameters, and increasing either the embedding size or the number of tokens in a sentence led to an exponential increase in the number of trainable parameters. So due to computational resource limitations, we limited our experiments to lower-dimensional word embeddings.  
Bi-ISCA focuses on only using the contextual comment/tweet for detecting sarcasm rather than using any other topical/personality-based features. Using only the contextual information enriches the model's ability to capture syntactical and semantical textual properties responsible for invoking sarcasm in any type of conversation. Table <ref type="table" target="#tab_2">2</ref> reports performance results on the SARC datasets. For comparison purposes, F1score (F1), Accuracy score (Acc), Precision (P) and Recall (R) were used.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Models</head><p>When compared with the existing works, Bi-ISCA was able to outperform all the models (only ‡) that use only conversational context for sarcasm detection (Improvement of ∆ 7.9% in F1 score when compared to <ref type="bibr" target="#b4">[Ghosh and Veale, 2017]</ref>; ∆ 6.2% in F1 score and ∆ 2.8% in accuracy when compared to AMR <ref type="bibr" target="#b3">[Ghaeini et al., 2018]</ref>), and was even able to perform better than the models ( † ) that use personality-based features along with the target sentence for detecting sarcasm (improvement of ∆ 7.7% in F1 and ∆ 4.3% in accuracy score when compared to CNN-SVM <ref type="bibr" target="#b10">[Poria et al., 2016]</ref>; ∆ 6.7% in F1 score and ∆ 2.3% in accuracy when compared to CUE-CNN <ref type="bibr" target="#b0">[Amir et al., 2016]</ref>). MHA-BiLSTM <ref type="bibr" target="#b7">[Kumar et al., 2020]</ref> had a ∆ 1.8% higher F1 score in the balanced dataset but Bi-ISCA was able to show drastic improvement of ∆ 17.6% in the imbalanced dataset, which demonstrated the ability of Bi-ISCA to handle class imbalance.</p><p>The current state-of-the-art on the SARC dataset is achieved by CASCADE. Even though CASCADE uses personalitybased features and contextual information along with large sentences of average length ≈55-62 (very large compared to our dataset, which gives them the advantage of using a lot more contextual information), Bi-ISCA was able to achieve an F1 score comparable to it (despite using relatively short text). In comparison with CASCADE that only uses discoursebased features, Bi-ISCA performed drastically better with an increase of ∆ 9.7% in F1 and ∆ 4.3% in accuracy score for the balanced dataset.</p><p>Bi-ISCA clearly demonstrated its capabilities to robustly handle an imbalance in the dataset, although it was unable to outperform both the CASCADE models. 
This slightly poor performance in the imbalanced dataset can be explained by the length of sentences used by CASCADE, which are significantly (≈5 times) greater than the ones on which Bi-ISCA was tested. Longer sentences result in increased contextual information which improves performance especially in the case of imbalance where little extra information can lead to a drastic increase in performance. Models P R F1 Baseline (LST M a ttn) 70.0 66.9 68.0 BERT-Large+BiLSTM+SVM <ref type="bibr" target="#b1">[Baruah et al., 2020]</ref> 73.4 73.5 73.4 BERT+CNN+LSTM <ref type="bibr" target="#b11">[Srivastava et al., 2020]</ref> 74.2 74.6 74.1 RoBERTa+LSTM <ref type="bibr" target="#b6">[Kumar and Anand, 2020]</ref> 77 Table <ref type="table">3</ref>: Results on the FigLang 2020 workshop Twitter dataset.</p><p>Table <ref type="table">3</ref> reports Precision (P), Recall (R), and F1-score (F1) of different models from the leaderboard of FigLang 2020 sarcasm detection shared task using the Twitter dataset. In this case, not only Bi-ISCA was able to outperform the baseline model <ref type="bibr" target="#b5">[Ghosh et al., 2020]</ref> (improvement of ∆ 19.4%, ∆ 27.9% &amp; ∆ 23.7% in precision, recall, and F1 score respectively), but was also able to perform comparably to the state-of-the-art [Lee et al., 2020] with a ∆ 1.2% increase in recall, which further validates the performance of the proposed model. Even though all the models other than the baseline in Table <ref type="table">3 are</ref>  The attention scores generated by the attention mechanism makes the proposed model highly interpretable. Table <ref type="table">4</ref> showcases the distribution of the attention scores over four sarcastic (correctly predicted by Bi-ISCA) comment-reply pairs from the SARC dataset. 
Not only was the proposed model able to correctly detect sarcasm in these pairs of sentences, but it was also able to correctly identify the words responsible for the contextual, explicit, or implicit incongruity that invokes sarcasm.</p><p>For example, in Pair 1, Bi-ISCA correctly identified explicitly incongruous words like "amazing" and "force" in the reply sentence, which were responsible for the sarcastic nature of the reply. Interestingly, the word "traumatized" in the parent comment also had a high attention weight, which shows that the proposed attention mechanism was able to learn the contextual incongruity between opposite-sentiment words like "traumatized" &amp; "amazing" in the comment-reply pair. Pair 2 demonstrates the model's ability to capture words that invoke sarcasm by making sentences implicitly incongruous. Sarcasm due to implicit incongruity is usually the toughest to perceive. Despite this, Bi-ISCA was able to give high attention weights to words like "announces" and "crashes &amp; security holes". Moreover, the proposed intra-sentence attention mechanism was able to learn a link between "microsoft" and "m" (slang for Microsoft) without any prior knowledge of slang. Pair 3 is also an example of an explicitly and contextually incongruous comment-reply pair, where the model successfully captured opposite-sentiment words &amp; phrases like "blind drunk", "cautious", and "behind the wheel" that made the reply sarcastic in nature. Pair 4 is an example of sarcasm due to implicit incongruity between the words "pause" &amp; "watch" and, simultaneously, contextual incongruity between "reported" &amp; "enjoyable", both of which were successfully captured by Bi-ISCA.</p></div>
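As an illustration of how such attention scores support interpretability, the following minimal sketch ranks the words of a reply by their attention weight; the tokens, weights, and the helper `top_attended` are illustrative assumptions of ours, not values from the paper's experiments.

```python
def top_attended(tokens, weights, k=3):
    """Return the k tokens with the highest attention weight."""
    ranked = sorted(zip(tokens, weights), key=lambda tw: -tw[1])
    return [tok for tok, _ in ranked[:k]]

# Hypothetical reply and attention weights (illustrative only).
reply = ["what", "an", "amazing", "use", "of", "force"]
alpha = [0.05, 0.03, 0.42, 0.10, 0.04, 0.36]
print(top_attended(reply, alpha, k=2))  # → ['amazing', 'force']
```

Inspecting the highest-weighted words in this way is how the attention maps in Table 4 were read: the words carrying the incongruity receive visibly larger weights.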
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusion</head><p>In this paper, we introduce a novel Bidirectional Inter-Sentence Contextual Attention mechanism based model (Bi-ISCA) for detecting sarcasm. The proposed model was not only able to capture both intra- and inter-sentence dependencies but also achieved state-of-the-art results in detecting sarcasm in user-generated short text using only the conversational context. Further investigation of the attention maps illustrated Bi-ISCA's ability to capture the explicitly, implicitly, and contextually incongruous words &amp; phrases responsible for invoking sarcasm. The success of the proposed model stems from the use of character-based embeddings, which handle slang/shortened &amp; out-of-vocabulary words; Bi-LSTMs, which capture intra-sentence dependencies between words in the same sentence; and Bi-ISCA, which captures inter-sentence dependencies between words of different sentences.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Bi-ISCA: Bi-Directional Inter-Sentence Contextual Attention Mechanism for Sarcasm Detection.</figDesc><graphic coords="3,155.99,53.84,300.02,241.12" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>These sets of inter-sentence attention scores are used to generate new inter-sentence contextualized hidden representations. Then (α_{→Cu}, α_{←Cu}) are calculated using the hidden state representations of BiLSTM_r along with the forward and backward final states (→C_u, ←C_u) of BiLSTM_c (as shown in equations 5 &amp; 6); similarly, (α_{→Cv}, α_{←Cv}) are calculated using the hidden state representations of BiLSTM_c along with the forward and backward final states (→C_v, ←C_v) of BiLSTM_r (as shown in equations 7 &amp; 8). In the equations below, (•) represents the dot product between two vectors.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>In the next step, the above-calculated sets of inter-sentence attention scores (α_{→Cu}, α_{←Cu}) are multiplied back with the hidden state representations of BiLSTM_r to generate two new sets of hidden representations h^{→Cu}_v, h^{←Cu}_v ∈ R^{n×d} of the reply sentence, namely reply contextualized on comment (forward) &amp; reply contextualized on comment (backward) respectively (as shown in equations 9 &amp; 10). Similarly, (α_{→Cv}, α_{←Cv}) are multiplied back with the hidden state representations of BiLSTM_c to generate two new sets of hidden representations h^{→Cv}_u, h^{←Cv}_u ∈ R^{n×d} of the comment sentence, namely comment contextualized on reply (forward) &amp; comment contextualized on reply (backward) respectively (as shown in equations 11 &amp; 12).</figDesc></figure>
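A minimal NumPy sketch of this inter-sentence attention step, under the assumption that the scores are a softmax over dot products between each hidden state and the other sentence's final state (the function name `inter_sentence_attention` and the softmax normalization are our assumptions, not the paper's exact equations 5-12):

```python
import numpy as np

def inter_sentence_attention(H, c_final):
    """Score each hidden state of one sentence against the final
    state of the other sentence, then reweight the hidden states.

    H       : (n, d) hidden states of one BiLSTM direction
    c_final : (d,)   forward or backward final state of the other BiLSTM
    returns : attention scores (n,) and contextualized states (n, d)
    """
    scores = H @ c_final                   # dot product per time step
    alpha = np.exp(scores - scores.max())  # numerically stable softmax
    alpha /= alpha.sum()
    return alpha, alpha[:, None] * H       # scores, reweighted hidden states

rng = np.random.default_rng(0)
H_r = rng.normal(size=(6, 4))  # reply hidden states (n=6, d=4)
c_u = rng.normal(size=4)       # comment forward final state
alpha, h_ctx = inter_sentence_attention(H_r, c_u)
print(alpha.shape, h_ctx.shape)  # (6,) (6, 4)
```

Calling this once per direction of each BiLSTM yields the four contextualized representations described above.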
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head></head><label></label><figDesc>The proposed model uses binary cross-entropy as the training loss function, as shown in equation 22. Here L is the cost function, ŷ_i ∈ R represents the output of the proposed model, y_i ∈ R represents the true label, and N ∈ ℕ represents the number of training samples.</figDesc></figure>
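The binary cross-entropy objective described above can be written out directly. A small sketch (the eps clipping is a numerical-safety detail we add; the paper does not specify it):

```python
import math

def bce_loss(y_true, y_pred, eps=1e-12):
    """L = -(1/N) * sum_i [ y_i*log(yhat_i) + (1-y_i)*log(1-yhat_i) ]"""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

print(round(bce_loss([1, 0], [0.9, 0.1]), 4))  # → 0.1054 (confident predictions)
```

A prediction of 0.5 on a positive label gives exactly log 2 ≈ 0.693, the loss of random guessing, which is why the baseline F1 scores in Table 3 correspond to much lower training losses than chance.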
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head></head><label></label><figDesc>Attention weight distribution in Reddit comment-reply pairs. Here CcR represents "Comment contextualized on Reply" whereas RcC represents "Reply contextualized on Comment"; (R) &amp; (L) represent forward &amp; backward attention.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Statistics of the SARC dataset and FigLang 2020 workshop Twitter dataset.</figDesc><table><row><cell>4.1 Dataset</cell></row><row><cell>This paper focuses on detecting sarcasm in user-generated short text using only the conversational context. Social media platforms like Reddit and Twitter are widely used for posting opinions and replying to others' opinions, and have proved to be a great source of conversational data. The experiments were therefore conducted on two publicly available benchmark datasets (Reddit &amp; Twitter) used for the sarcasm detection task. Both datasets consist of comment-reply pairs. SARC 4 Reddit [Khodak et al., 2018] is the largest dataset available for sarcasm detection, containing millions of sarcastic/non-sarcastic comment-reply pairs from the social media site Reddit. This dataset was generated by scraping</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2 :</head><label>2</label><figDesc>Results on the SARC dataset. Models marked only with ‡ use only conversational context for detecting sarcasm.</figDesc><table /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Twelfth International Workshop Modelling and Reasoning in Context (MRC) @IJCAI 2021</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1">Copyright c 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">https://nlp.cs.princeton.edu/SARC/2.0/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">sites.google.com/view/figlang2020</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">https://www.urbandictionary.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_5">https://fasttext.cc/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Modelling context with user embeddings for sarcasm detection in social media</title>
		<author>
			<persName><surname>Amir</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Ninth International AAAI Conference on Web and Social Media</title>
				<imprint>
			<publisher>CoNLL</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note>Proceedings of the Conference on Natural Language Learning</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Learning phrase representations using RNN encoder-decoder for statistical machine translation</title>
		<author>
			<persName><surname>Barbieri</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</title>
				<meeting>the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis<address><addrLine>Baltimore, Maryland; Doha, Qatar</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014-06">2014. June 2014. 2020. July 2020. 2014. October 2014</date>
			<biblScope unit="page" from="1724" to="1734" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Transformer-based context-aware sarcasm detection in conversation threads from social media</title>
		<author>
			<persName><forename type="first">Dong</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Workshop on Figurative Language Processing</title>
				<meeting>the Second Workshop on Figurative Language Processing</meeting>
		<imprint>
			<date type="published" when="2020-07">2020. July 2020</date>
			<biblScope unit="page" from="276" to="280" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Reactions to irony in discourse: evidence for the least disruption principle</title>
		<author>
			<persName><surname>Eisterhold</surname></persName>
		</author>
		<idno>CoRR, abs/1809.03051</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</title>
				<editor>
			<persName><forename type="first">Prasad</forename><surname>Tadepalli</surname></persName>
		</editor>
		<meeting>the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis<address><addrLine>San Diego, California</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2006">2006. 2006. 2016. July 2016. 2018. 2018. June 2016</date>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="161" to="169" />
		</imprint>
	</monogr>
	<note>Attentional multi-reading sarcasm detection</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Magnets for sarcasm: Making sarcasm detection timely, contextual and very personal</title>
		<author>
			<persName><forename type="first">Aniruddha</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tony</forename><surname>Veale</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2017 Conference on Empirical Methods in Natural Language Processing<address><addrLine>Copenhagen, Denmark</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2017-09">2017. September 2017</date>
			<biblScope unit="page" from="482" to="491" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">CASCADE: Contextual sarcasm detection in online discussion forums</title>
		<author>
			<persName><surname>Ghosh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 27th International Conference on Computational Linguistics</title>
				<editor>
			<persName><forename type="first">Schmidhuber</forename><surname>Hochreiter</surname></persName>
		</editor>
		<meeting>the 27th International Conference on Computational Linguistics<address><addrLine>Santa Fe, New Mexico, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="1997">2020. July 2020. 2018. August 2018. 1997. 1997. 2020. July 2020</date>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="77" to="82" />
		</imprint>
	</monogr>
	<note>Proceedings of the Second Workshop on Figurative Language Processing</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Your sentiment precedes you: Using an author&apos;s historical tweets to predict sarcasm</title>
		<author>
			<persName><surname>Joshi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing</title>
				<meeting>the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing<address><addrLine>Beijing, China</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2015-07">2015. July 2015. 2015. 2015. 2018. 2020. July 2020</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="88" to="92" />
		</imprint>
	</monogr>
	<note>Proceedings of the Second Workshop on Figurative Language Processing</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Sarcasm detection using multi-head attention based bidirectional lstm</title>
		<author>
			<persName><surname>Kumar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="6388" to="6397" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Augmenting data for sarcasm detection with unlabeled conversation context</title>
		<author>
			<persName><surname>Lecun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Workshop on Figurative Language Processing</title>
				<meeting>the Second Workshop on Figurative Language Processing</meeting>
		<imprint>
			<date type="published" when="1998">1998. 1998. 2020. July 2020</date>
			<biblScope unit="volume">86</biblScope>
			<biblScope unit="page" from="12" to="17" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The perfect solution for detecting sarcasm in tweets #not</title>
		<author>
			<persName><surname>Liebrecht</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis</title>
				<meeting>the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis<address><addrLine>Atlanta, Georgia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013-06">2013. June 2013</date>
			<biblScope unit="page" from="29" to="37" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A deeper look into sarcastic tweets using deep convolutional neural networks</title>
		<author>
			<persName><surname>Poria</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers</title>
				<meeting>COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers<address><addrLine>Osaka, Japan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016-12">2016. December 2016</date>
			<biblScope unit="page" from="1601" to="1612" />
		</imprint>
	</monogr>
	<note>The COLING 2016 Organizing Committee</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Sarcasm as contrast between a positive sentiment and negative situation</title>
		<author>
			<persName><forename type="first">Rajadesingan</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013</title>
				<meeting>the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013<address><addrLine>New York, NY, USA; Grand Hyatt Seattle, Seattle, Washington, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="1997-07">2015. 2015. 2013. 2013. 18-21 October 2013. 2013. 1997. 1997. July 2020</date>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="page" from="93" to="97" />
		</imprint>
	</monogr>
	<note>Proceedings of the Second Workshop on Figurative Language Processing</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Reasoning with sarcasm by reading in-between</title>
		<author>
			<persName><surname>Tay</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</title>
				<meeting>the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)<address><addrLine>Melbourne, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018-07">2018. July 2018</date>
			<biblScope unit="page" from="1010" to="1020" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Icwsm-a great catchy name: Semi-supervised recognition of sarcastic sentences in online product reviews</title>
		<author>
			<persName><surname>Tsur</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">fourth international AAAI conference on weblogs and social media</title>
				<editor>
			<persName><forename type="first">I</forename><surname>Guyon</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">U</forename><forename type="middle">V</forename><surname>Luxburg</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bengio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Wallach</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Fergus</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Vishwanathan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Garnett</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2010">2010. 2010. 2017. 2017</date>
			<biblScope unit="page" from="5998" to="6008" />
		</imprint>
	</monogr>
	<note>Advances in Neural Information Processing Systems 30</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">The pragmatics of verbal irony: Echo or pretence?</title>
		<author>
			<persName><forename type="first">Deirdre</forename><surname>Wilson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Lingua</title>
		<imprint>
			<biblScope unit="volume">116</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page" from="1722" to="1743" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
	<note>Language in Mind: A Tribute to Neil Smith on the Occasion of his Retirement</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
