DANKMEMES @ EVALITA 2020: The Memeing of Life: Memes, Multimodality and Politics

Martina Miliani (1,2), Giulia Giorgi (3), Ilir Rama (3), Guido Anselmi (3), Gianluca E. Lebani (4)

1 University for Foreigners of Siena
2 CoLing Lab, Department of Philology, Literature, and Linguistics, University of Pisa
3 Department of Social and Political Sciences, University of Milan
4 Department of Linguistics and Comparative Cultural Studies, Ca' Foscari University of Venice

martina.miliani@fileli.unipi.it, giulia.giorgi@unito.it, ilir.rama@unimi.it, guido.anselmi@unimi.it, gianluca.lebani@unive.it

Abstract

DANKMEMES is a shared task proposed for the 2020 EVALITA campaign, focusing on the automatic classification of Internet memes. Providing a corpus of 2,361 memes on the 2019 Italian Government Crisis, DANKMEMES features three tasks: A) Meme Detection, B) Hate Speech Identification, and C) Event Clustering. Overall, 5 groups took part in the first task, 2 in the second and 1 in the third. The best system was proposed by the UniTor group and achieved an F1 score of 0.8501 for Task A, 0.8235 for Task B and 0.2657 for Task C. In this report, we describe how the task was set up, we report the system results and we discuss them.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Internet memes are understood as "pieces of culture, typically jokes, which gain influence through online transmission" (Davison, 2012). Specifically, a meme is a multimodal artefact manipulated by users, who merge intertextual elements to convey an ironic message. Featuring a visual format that includes images, texts or a combination of them, memes combine references to current events or relatable situations with pop-cultural references to music, comics and movies (Ross and Rivers, 2017).

The pervasiveness of meme production and circulation across different platforms increases the necessity to handle massive quantities of visual data (Tanaka et al., 2014) by leveraging automated approaches. Efforts in this direction have focused on the generation of memes (Peirson V and Tolunay, 2018; Gonçalo Oliveira et al., 2016) and on automated sentiment analysis (French, 2017), while stressing the need for a multimodal approach able to contextually consider both visual and textual information (Sharma et al., 2020; Smitha et al., 2018).

As manual labelling becomes unfeasible on a large scale, scholars require tools able to classify the huge amount of memetic content continuously produced on the web. The main goal of our shared task is to evaluate a range of technologies that can be used to automatize the process of meme recognition and sorting with an acceptable degree of reliability.
2 Task Description

The DANKMEMES task, presented at the 2020 EVALITA campaign (Basile et al., 2020), encompasses three subtasks, aimed at detecting memes (Task A), detecting hate speech in memes (Task B) and clustering memes according to events (Task C). Participants could decide to take part in one or more of these tasks, with the only recommendation that Task A functions as the compulsory preliminary step for the other two tasks.

Task A: Meme Detection. The lack of consensus around what defines a meme (Shifman, 2013) led to different definitions, focusing on circulation (Davison, 2012; Dawkins, 2016), formal features (Milner, 2016), or content (Gal et al., 2016; Knobel and Lankshear, 2007). For this dataset, manual coding focused both on formal aspects (such as layout, multimodality and manipulation) and on content, e.g. ironic intent (Giorgi and Rama, 2019); the exponential increase in visual production, however, warrants an automated approach, which might be able to further tap into stable and generalizable aspects of memes, considering form, content and circulation. Given the dataset minus the variable strictly related to memetic status, participants must provide a binary classification, distinguishing memes (1) from non-memes (0).

Task B: Hate Speech Identification. Hate speech has become a relevant issue for social media platforms. Even though the automatic classification of posts may lead to censorship of non-offensive content (Gillespie, 2018), the use of machine learning techniques has become more and more crucial, since manual filtering is a very time-consuming task for the annotators (Zampieri et al., 2019b). Recent studies have also shown that multimodal analysis is fundamental in such a task (Sabat et al., 2019). In this direction, SemEval 2020 proposed the "Memotion Analysis" task to classify sarcastic, humorous, and offensive memes (Sharma et al., 2020). This kind of analysis assumes a specific relevance when applied to political content: memes about political topics are a powerful tool of political criticism (Plevriti, 2014). For these reasons, the proposed task aims at detecting memes with offensive content. Following the definition of Zampieri et al. (2019a), an offensive meme contains any form of profanity or a targeted offense, veiled or direct, such as insults, threats, profane language or swear words. Thus, the second task consists in a binary classification, where systems have to predict whether a meme is offensive (1) or not (0).

Task C: Event Clustering. Social media react to the real world by commenting in real time on mediatised events, in a way that disrupts traditional usage patterns (Al Nashmi, 2018). The ability to understand which events are represented and how, then, becomes relevant in the context of a hyperproductive Internet. The goal of the third subtask is to cluster a set of memes, which may or may not be related to the 2019 Italian government crisis, into five event categories (see Table 1). Participants' goal is to apply supervised techniques to cluster the memes, so that memes pinpointing the same events are classified in the same cluster.

Label | Description
0 | Residual category
1 | Beginning of the government crisis
2 | Conte's speech and beginning of consultations
3 | Conte is called to form a new government
4 | 5SM holds a vote on the platform Rousseau

Table 1: Categories for Task C: Event Clustering.

3 Dataset

3.1 Composition of the dataset

The DANKMEMES dataset comprises 2,361 images (for each subtask a specific dataset was provided), automatically extracted from Instagram through a Python script targeting the hashtag related to the Italian government crisis ("#crisidigoverno"). The corpus also includes 367 offensive political memes unrelated to the government crisis, aimed at augmenting and balancing the dataset for Task B.
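The collection script itself is not part of the release. Purely as an illustration, a comparable hashtag-driven download could be sketched with the instaloader library; the library choice, options and target folder below are our assumptions, not the authors' actual code:

```python
# A minimal sketch of hashtag-driven collection from Instagram; the
# original script is not published, so the library (instaloader) and
# all options here are illustrative assumptions.
import instaloader

loader = instaloader.Instaloader(download_videos=False, save_metadata=False)
hashtag = instaloader.Hashtag.from_name(loader.context, "crisidigoverno")

engagement = {}
for post in hashtag.get_posts():
    # "Engagement" in the dataset is the number of comments and likes.
    engagement[post.shortcode] = post.likes + post.comments
    loader.download_post(post, target="crisidigoverno")
```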
3.2 Annotation of the dataset

For each image of the dataset we provide the name of the .jpg image file, the date of publication and the engagement, i.e. the number of comments and likes of the post. The dataset also includes image embeddings. The vector representations are computed employing ResNet (He et al., 2016), a state-of-the-art model for image recognition based on Deep Residual Learning. Providing such image representations allows the participants to approach these multimodal tasks focusing primarily on their NLP aspects (Kiela and Bottou, 2014).
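The paper does not spell out the extraction procedure; a common recipe, sketched below with torchvision under the assumption that the pooled activations of the penultimate layer of a ResNet-50 are used as the embedding, looks as follows (the file name is a placeholder):

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Load a ResNet-50 pre-trained on ImageNet and drop the final
# classification layer, keeping the 2048-d pooled features.
resnet = models.resnet50(pretrained=True)
resnet.fc = torch.nn.Identity()
resnet.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("1.jpg").convert("RGB")  # placeholder file name
with torch.no_grad():
    embedding = resnet(preprocess(image).unsqueeze(0)).squeeze(0)
print(embedding.shape)  # torch.Size([2048])
```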
The annotation process involved two Italian native speakers, who study memes at an academic level, and focused on detecting and labelling 7 relevant categories:

• Macro status: refers to meme layouts and their relation to diffused, conventionalised formats called macros. The category has 0 and 1 as labels, where the value 1 represents well-known memetic frames, characters and layouts (e.g. Pepe the Frog). The identification of macros relied both on external sources (e.g. the website "Know Your Meme") and on the annotators' literacy on memes.

• Picture manipulation: entails the degree of visual modification of the images. Non-manipulated images or low-impact changes (e.g. the addition of a text or a logo) are labeled 0. Heavily manipulated, impactful changes (e.g. images edited to include political actors) are labeled 1.

• Visual actors: the political actors (i.e. politicians, parties' logos) portrayed visually, regardless of whether they were edited into the picture or appeared in the original image.

• Text: the textual content of the image, extracted through optical character recognition (OCR) using Google's Tesseract-OCR Engine and further manually corrected.

• Meme: binary feature, where 0 represents non-meme images and 1 meme images. This is the target label for Task A.

• Hate Speech: binary feature only for memes. It differentiates memes with offensive language (1) from non-offensive memes (0). This is the target label for Task B.

• Event: a feature only for meme images, categorizing them according to 4 events (described in Table 1), plus a residual category labeled as 0. This is the target label for Task C.

Figure 1: Two examples from the dataset for Meme Detection: the image at the top is a meme, whereas the image at the bottom is not a meme.

The final inter-annotator agreement (IAA) has been calculated by two of the authors on a subset of the dataset through Krippendorff's alpha (Krippendorff, 2018). Four features have been considered: Macro status (α = 0.755), Picture manipulation (α = 0.930), Hate Speech (α = 0.741) and Meme (α = 0.884). Other features were either objective (i.e. visual and textual actors) or inferred from external data (i.e. events).
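Agreement values such as those reported above can be reproduced with the krippendorff Python package; the sketch below uses invented labels for two annotators, purely to show the call:

```python
import krippendorff
import numpy as np

# Each row holds one annotator's labels for the same items; the
# binary labels below are invented, purely for illustration.
annotator_1 = [1, 0, 1, 1, 0, 1, 0, 0]
annotator_2 = [1, 0, 1, 0, 0, 1, 0, 1]
reliability_data = np.array([annotator_1, annotator_2])

alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha: {alpha:.3f}")
```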
Participants were allowed to use external resources, lexicons or independently annotated data. Accordingly, although we provided ResNet image embeddings, participants could make use of any other image representation.

3.3 Training and Test Data

The initial dataset was split into three datasets, one for each task, structured as follows:

Dataset for Meme Detection (Task A). The whole dataset counts 2,000 images, half memes and half not (see Figure 1 for an example). We split the dataset into training and test sets, in a proportion of 80-20% of items. Table 2 shows the format of the training dataset. The test dataset has been provided without gold labels, i.e. without the "Meme" attribute.

File | Engagement | Date | Manip. | Visual | Text | Meme
1.jpg | 21,053 | 22/08/19 | 1 | Conte | aiuto | 0
56.jpg | 114 | 22/08/19 | 0 | Salvini | alle solite | 1

Table 2: An excerpt from the dataset for Task A, Meme Detection.

Dataset for Hate Speech Identification (Task B). The whole dataset counts 1,000 memes (see Figure 2 for an example). We split the dataset into training and test sets, in a proportion of 80-20% of items. Table 3 shows the format of the training dataset. The test dataset has been provided without the gold label "Hate Speech" for testing purposes.

File | Engagement | Manip. | Visual | Text | Hate Speech
62.jpg | 21,053 | 1 | Conte | aiuto | 0
114.jpg | 12,572 | 1 | Salvini | merdman | 1

Table 3: An excerpt from the dataset for Task B, Hate Speech Identification.

Figure 2: Two examples from the dataset for Hate Speech Identification: the meme at the top is classified as hate speech content, whereas the meme at the bottom is not.

Dataset for Event Clustering (Task C). The whole dataset counts 1,000 memes (see Figure 3 for an example). We split the dataset into training and test sets, in a proportion of 80-20% of items. Table 4 shows the format of the training set. The test set has been provided without gold labels (i.e. without the "Event" attribute) for testing purposes.

File | Engagement | Date | Macro | Manip. | Visual | Text | Event
43.jpg | 21,053 | 22/08/19 | 1 | 1 | Conte | aiuto | 1
23.jpg | 114 | 22/08/19 | 1 | 0 | Salvini | alle solite | 0
114.jpg | 12,572 | 25/08/19 | 0 | 1 | Salvini | merdman | 2

Table 4: An excerpt from the dataset for Task C, Event Clustering.

Figure 3: Examples of memes from the dataset for the Event Clustering task. Each meme refers to an event: (a) Beginning of the government crisis; (b) Conte's speech and beginning of consultations; (c) Conte is called to form a new government; (d) 5SM holds a vote on the platform Rousseau.

3.4 Data release

Both the training and the test sets were released on our website and protected with a password. As described in Section 3.3, the development data consisted of three distinct datasets, one for each task. The participants could download a distinct folder for each task, which contained:

• A UTF-8 encoded comma-separated ".csv" file with 800 items (1,600 for Task A), containing the metadata described in Section 3.3;

• A folder containing the images in .jpg format;

• A .csv file containing the relative image embeddings.

As for the test data, we released three folders whose structure is similar to that of the training sets. Each folder for the test sets contains:

• A UTF-8 encoded comma-separated ".csv" file with 200 items (400 for Task A), which features the same metadata as the corresponding training set minus the gold label (i.e. "Meme" for Task A, "Hate Speech" for Task B and "Event" for Task C);

• A folder containing the images in .jpg format;

• A .csv file containing the relative image embeddings.

All material was released for non-commercial research purposes only, under a Creative Commons license (BY-NC-ND 4.0). Any use for statistical, propagandistic or advertising purposes of any kind is prohibited. It is not possible to modify, alter or enrich the data provided for the purposes of redistribution.
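Given the release format just described, a participant's first step might look like the following sketch; the file names inside each folder are assumptions, since the paper only describes the folders' content:

```python
import pandas as pd

# File names are assumptions: the paper specifies a metadata CSV and an
# embeddings CSV inside each task folder, but not what they are called.
metadata = pd.read_csv("taskA/train.csv", encoding="utf-8")
embeddings = pd.read_csv("taskA/embeddings.csv")

# Join metadata and ResNet embeddings on the image file name
# (assuming both files share the "File" column of Tables 2-4).
train = metadata.merge(embeddings, on="File")

y = train["Meme"]                         # Task A gold label
X = train.drop(columns=["File", "Meme"])  # metadata + embedding features
print(X.shape, y.value_counts().to_dict())
```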
4 Evaluation Measures

For all tasks, the models have been evaluated with Precision, Recall, and F1 scores, defined as follows:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1 = 2 × (Precision × Recall) / (Precision + Recall)

where TP are true positives, and FN and FP are false negatives and false positives, respectively. We computed Precision, Recall, and F1 for Task A and Task B considering only the positive class. As for Task C, which is a multiclass classification task, we computed the performance for each class and then calculated the macro-average over all classes.

Different baselines were used for the different tasks:

Task A: Meme Detection. The baseline is given by the performance of a random classifier, which labels 50% of images as memes.

Task B: Hate Speech Identification. The baseline is given by the performance of a classifier labeling a meme as offensive when the meme text contains at least a swear word.¹

¹ The list of swear words was downloaded from https://www.freewebheaders.com/italian-bad-words-list-and-swear-words/ (last access: 2nd November 2020).

Task C: Event Clustering. The baseline is given by the performance of a classifier labeling every meme as belonging to the most numerous class (i.e. the residual one).
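The three baselines and the scoring protocol described above can be sketched as follows; the data below are toy stand-ins, and the actual swear-word lexicon from footnote 1 is not reproduced here:

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

rng = np.random.default_rng(42)

# Toy stand-ins: the real lexicon (footnote 1) and the gold labels
# are not reproduced here; these values are purely illustrative.
texts = ["conte aiuto", "salvini merdman", "alle solite"]
lexicon = {"merdman"}              # placeholder, NOT the actual lexicon
y_true_b = np.array([0, 1, 0])     # invented Task B gold labels
y_true_c = np.array([1, 2, 0])     # invented Task C gold labels

def baseline_task_a(n_images):
    """Task A: random classifier labelling 50% of images as memes."""
    return rng.integers(0, 2, size=n_images)

def baseline_task_b(texts, lexicon):
    """Task B: offensive iff the meme text contains a swear word."""
    return np.array([int(any(w in t.lower() for w in lexicon))
                     for t in texts])

def baseline_task_c(n_memes):
    """Task C: always predict the most numerous (residual) class 0."""
    return np.zeros(n_memes, dtype=int)

pred_a = baseline_task_a(400)  # e.g. the size of the Task A test set

# Tasks A and B are scored on the positive class only.
p, r, f1, _ = precision_recall_fscore_support(
    y_true_b, baseline_task_b(texts, lexicon),
    average="binary", pos_label=1)
print(f"Task B baseline: P={p:.3f} R={r:.3f} F1={f1:.3f}")

# Task C is scored per class and then macro-averaged.
p_c, r_c, f1_c, _ = precision_recall_fscore_support(
    y_true_c, baseline_task_c(len(y_true_c)),
    average="macro", zero_division=0)
print(f"Task C baseline: P={p_c:.3f} R={r_c:.3f} F1={f1_c:.3f}")
```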
5 Participants and Results

In total, 16 teams registered for DANKMEMES, and five of them participated in at least one of the tasks: DankMemesTeam (DMT) (Setpal and Sarti, 2020), Keila, UPB (Vlad et al., 2020), SNK (Fiorucci, 2020), and UniTor (Breazzano et al., 2020). All of the 5 teams participated in Task A, while 2 teams participated in Task B and 1 in Task C. Participants could submit up to two runs per task: all of the teams did so consistently across tasks, with the exception of one team submitting a single run in Task A. This amounts to 9 runs for Task A, 4 for Task B and 2 for Task C, as detailed in Table 5.

Team Name | Affiliation | Task
DMT | RN Podar School | A
Keila | Dipartimento di Matematica e Informatica di Perugia | A
UniTor | Università degli Studi di Roma "Tor Vergata" | A, B, C
UPB | University Politehnica of Bucharest | A, B
SNK | ETI3 | A

Table 5: Participants along with their affiliations and the tasks they participated in.

Task A: Meme Detection. Task A consisted in differentiating between a meme and a non-meme. Five teams presented a total of 9 runs, as detailed in Table 6. The best scores have been achieved by the UniTor team, with an F1 measure of 0.8501 (with a Precision score of 0.8522 and a Recall measure of 0.848). The SNK and UPB teams followed closely, but all teams consistently showed a drastic improvement over the baseline.

Team | Run | Precision | Recall | F1
UniTor | 2 | 0.8522 | 0.848 | 0.8501
SNK | 1 | 0.8515 | 0.8431 | 0.8473
UPB | 2 | 0.8543 | 0.8333 | 0.8437
UniTor | 1 | 0.839 | 0.8431 | 0.8411
SNK | 2 | 0.8317 | 0.848 | 0.8398
UPB | 1 | 0.861 | 0.7892 | 0.8235
DMT | 1 | 0.8249 | 0.7157 | 0.7664
Keila | 1 | 0.8121 | 0.6569 | 0.7263
Keila | 2 | 0.7389 | 0.652 | 0.6927
baseline | 1 | 0.525 | 0.5147 | 0.5198

Table 6: Results of Task A.

Task B: Hate Speech Identification. Task B consisted in identifying whether a meme is offensive or not. As detailed in Table 7, 2 teams participated in this task, for a total of 4 runs (2 each). The best scores were achieved by the UniTor team for the F1 measure, at 0.8235, and the Recall score, at 0.8667, while the UPB team scored the best Precision measure among participants, at 0.8056. The scores improve over the baseline consistently across teams as far as Recall and F1 are concerned, while the baseline's Precision was not reached by any participant.

Team | Run | Precision | Recall | F1
UniTor | 2 | 0.7845 | 0.8667 | 0.8235
UniTor | 1 | 0.7686 | 0.8857 | 0.823
UPB | 1 | 0.8056 | 0.8286 | 0.8169
UPB | 2 | 0.8333 | 0.7143 | 0.7692
baseline | 1 | 0.8958 | 0.4095 | 0.5621

Table 7: Results of Task B.

Task C: Event Clustering. Task C consisted in clustering memes into 5 events using supervised classification. As seen in Table 8, a single team participated, with 2 runs: the best score is therefore that of the UniTor team, with an F1 score of 0.2657.

Team | Run | Precision | Recall | F1
UniTor | 1 | 0.2683 | 0.2851 | 0.2657
UniTor | 2 | 0.2096 | 0.2548 | 0.2183
baseline | 1 | 0.096 | 0.2 | 0.1297

Table 8: Results of Task C.

6 Discussion

We compare the participating systems according to the following main dimensions: classification framework, exploitation of available features, multimodality of the adopted approaches, exploitation of further annotated data, and use of external resources. Since this is the first task about memes within the EVALITA campaign, we could not compare the obtained results with those achieved in any previous edition. A task about memes, Memotion, has been organized under SemEval 2020 (Sharma et al., 2020). However, the Memotion subtasks (Sentiment Classification, Humor Classification, and Scales of Semantic Classes) are quite different from those presented in DANKMEMES, and the results are hardly comparable.

System architecture. All the runs submitted to DANKMEMES leverage neural networks, including very simple but equally efficient architectures. Multi-Layer Perceptrons (MLP) were adopted by UniTor and SNK, ranked first and second in the Meme Detection task, respectively. UPB adopted a Vocabulary Graph Convolutional Network (VGCN) combined with BERT contextual embeddings for text analysis. This team employed this architectural design within a Multi-Task Learning (MTL) framework, based on two main neural network components, one for text and the other for image analysis: the outputs of these two elements were concatenated and used to feed a dense layer. The system of DMT is composed of three 8-layer feed-forward networks, each taking as input a different image vector representation. Finally, Keila exploited Convolutional Neural Networks (CNN) in each of the submitted runs.
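No participant code is reproduced in this report; the sketch below only illustrates the recurring late-fusion pattern just described, i.e. concatenating a precomputed image embedding with a text embedding and feeding the result to a small dense classifier. All dimensions are arbitrary choices of ours, not any team's settings:

```python
import torch
import torch.nn as nn

class LateFusionMLP(nn.Module):
    """Illustrative late-fusion classifier: concatenate a precomputed
    image embedding with a sentence embedding, then classify with a
    small MLP. Dimensions are arbitrary, not any team's settings."""

    def __init__(self, img_dim=2048, txt_dim=768, hidden=256):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden, 1),   # one logit: meme vs. non-meme
        )

    def forward(self, img_emb, txt_emb):
        fused = torch.cat([img_emb, txt_emb], dim=-1)
        return self.classifier(fused).squeeze(-1)

model = LateFusionMLP()
img = torch.randn(4, 2048)   # e.g. ResNet-50 features
txt = torch.randn(4, 768)    # e.g. BERT sentence embeddings
probs = torch.sigmoid(model(img, txt))
```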
External resources. All the presented models employed external resources to feed their neural architectures with image and text representations. The text contained in the images was encoded using different flavours of word embeddings. Most of the participants exploited one of the available BERT contextual embedding models for the Italian language (AlBERTo, UmBERTo, or GilBERTo). However, with its first run, SNK achieved the second position in the Meme Detection task using the pre-trained FastText embeddings for the Italian language. Similarly, Keila adopted a pre-trained Word2Vec model for the Italian language, though achieving lower results. As for the visual channel, the DANKMEMES datasets provided a state-of-the-art representation of images, obtained with the ResNet50 architecture. Most of the participants also experimented with other image vector representations: DMT used three different image vectors (AlexNet, ResNet, and DenseNet), while UniTor and UPB examined several models, among which EfficientNet, VGG-16, YOLOv4, ResNet50, and ResNet152. UniTor chose EfficientNet for their final models, while UPB based their systems on ResNet50 and ResNet152.

Multimodality. The exploitation of both images and text turned out to be fundamental for the task of Meme Detection. Since memes adhere to specific visual conventions, participants tried to exploit visual data at their best. The first run of UniTor relied only on an image classifier, whereas DMT exploited the information resulting from three different image classification models, then combined it with word embeddings. Nevertheless, the best results were obtained by combining text and image information. In its second run, UniTor concatenated the image representation returned by their first model with pre-trained contextual word embeddings fine-tuned on DANKMEMES data. Similarly, SNK and UPB leveraged both textual and image data. Keila was the only participant who did not combine text and image information in any of the submitted runs. As for the second task, the first UniTor run relied only on textual data and was only slightly outperformed by their second run. As observed by the team, in the Hate Speech Identification task textual data heavily impact the classification results. Finally, UPB combined both image and textual data for this task.

Data Augmentation. Several participants chose to adopt a data augmentation technique. UniTor successfully manipulated the provided images by horizontally mirroring them. On the contrary, DMT at first created nine versions of each image, editing brightness, rotation, and zoom, but then dropped them due to the overfitting caused by the unmodified metadata associated with each image. Keila augmented textual data by first translating the image texts into English and then back into Italian. Regarding the second task, on Hate Speech Identification, UniTor trained the UmBERTo embeddings for a few epochs on a dataset made available within the Hate Speech Detection (HaSpeeDe) task (Bosco et al., 2018) before training them on the DANKMEMES dataset.

Exploited features. SNK encoded and concatenated in a single vector the picture manipulation, visual, and engagement features, along with the sentence and image representations of each meme. Keila employed the engagement and manipulation features as well. DMT normalized engagement and represented dates as the count of days from a selected reference date. Along with the other provided data, temporal features were also exploited by UPB, through the computation of complementary sine and cosine distances, in order to preserve the cyclic characteristics of days and months (see the sketch below). Finally, UniTor relied only on visual and textual information.
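UPB's cyclical treatment of dates corresponds to the standard sine/cosine encoding, sketched below; this is an illustration of the general technique, not the team's actual code:

```python
import numpy as np
import pandas as pd

# Sketch of a standard sine/cosine encoding of cyclic time features,
# in the spirit of UPB's temporal features; not the team's code.
dates = pd.to_datetime(["22/08/19", "25/08/19"], format="%d/%m/%y")
day = dates.day.to_numpy()
month = dates.month.to_numpy()

# Map each value onto a circle so that, e.g., day 31 and day 1
# end up close together in feature space.
day_sin, day_cos = np.sin(2 * np.pi * day / 31), np.cos(2 * np.pi * day / 31)
month_sin = np.sin(2 * np.pi * month / 12)
month_cos = np.cos(2 * np.pi * month / 12)

features = np.column_stack([day_sin, day_cos, month_sin, month_cos])
print(features.round(3))
```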
Event Clustering. The goal of this task was to assign each meme to the event it refers to. Only UniTor participated in this task, modeling it as a classification problem in two distinct runs. The first model fed the MLP classifier with only the textual data representation provided by the Transformer architecture. In the second run, the team mapped the original classification problem, which counted five different labels (each corresponding to an event), onto a binary classification one: after pairing each meme with each event, a pair was labeled as positive if the association was correct, and negative otherwise. However, this run did not surpass the first one, whose outcome doubled the provided baseline.

7 Final Remarks

This paper describes a task for the detection and analysis of memes in the Italian language. DANKMEMES is the first task of this kind in the EVALITA campaign. Although memes are widespread on the Web, it is still hard to define them precisely. However, DANKMEMES highlighted the fundamental role of multimodality in meme detection, namely the combined use of texts and images for their classification. Therefore, we could say that memes share peculiar linguistic features, besides conventional layouts. Future work will focus on the extension of the dataset, which showed some limitations, especially for its reduced size and for the unbalanced representation of some events. This is due to the difficulty of meme collection, especially when filtered in relation to a specific event (e.g., the 2019 Italian government crisis).

References

Eisa Al Nashmi. 2018. From selfies to media events: How Instagram users interrupted their routines after the Charlie Hebdo shootings. Digital Journalism, 6(1):98–117.

Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro. 2020. EVALITA 2020: Overview of the 7th evaluation campaign of natural language processing and speech tools for Italian. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Cristina Bosco, Felice Dell'Orletta, Fabio Poletto, Manuela Sanguinetti, and Maurizio Tesconi. 2018. Overview of the EVALITA 2018 hate speech detection task. In EVALITA 2018 – Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, pages 1–9.

Claudia Breazzano, Edoardo Rubino, Danilo Croce, and Roberto Basili. 2020. UniTor @ DANKMEMES: Combining convolutional models and transformer-based architectures for accurate meme management. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Patrick Davison. 2012. The language of internet memes. The Social Media Reader, pages 120–134.

Richard Dawkins. 2016. The Selfish Gene. Oxford University Press.

Stefano Fiorucci. 2020. SNK @ DANKMEMES: Leveraging pretrained embeddings for multimodal meme detection. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Jean French. 2017. Image-based memes as sentiment predictors. In 2017 International Conference on Information Society (i-Society).

Noam Gal, Limor Shifman, and Zohar Kampf. 2016. "It gets better": Internet memes and the construction of collective identity. New Media & Society, 18(8):1698–1714.

Tarleton Gillespie. 2018. Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. Yale University Press.

Giulia Giorgi and Ilir Rama. 2019. "One does not simply meme". Framing the 2019 Italian government crisis through memes. In La comunicazione politica nell'ecosistema dei media digitali. Convegno dell'Associazione Italiana di Comunicazione Politica (ASSOCOMPOL).

Hugo Gonçalo Oliveira, Diogo Costa, and Alexandre Pinto. 2016. One does not simply produce funny memes! – Explorations on the automatic generation of internet humor. In Proceedings of the Seventh International Conference on Computational Creativity (ICCC 2016).

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778.

Douwe Kiela and Léon Bottou. 2014. Learning image embeddings using convolutional neural networks for improved multi-modal semantics. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 36–45.

Michele Knobel and Colin Lankshear. 2007. Online memes, affinities, and cultural production. A New Literacies Sampler, 29:199–227.

Klaus Krippendorff. 2018. Content Analysis: An Introduction to Its Methodology. Sage Publications.

Ryan M. Milner. 2016. The World Made Meme: Public Conversations and Participatory Media. MIT Press.

Abel L. Peirson V and E. Meltem Tolunay. 2018. Dank learning: Generating memes using deep neural networks. CoRR, abs/1806.04510.

Vasiliki Plevriti. 2014. Satirical user-generated memes as an effective source of political criticism, extending debate and enhancing civic engagement.

Andrew S. Ross and Damian J. Rivers. 2017. Digital cultures of political participation: Internet memes and the discursive delegitimization of the 2016 US presidential candidates. Discourse, Context & Media, 16:1–11.

Benet Oriol Sabat, Cristian Canton Ferrer, and Xavier Giro-i-Nieto. 2019. Hate speech in pixels: Detection of offensive memes towards automatic moderation. arXiv preprint arXiv:1910.02334.

Jinen Setpal and Gabriele Sarti. 2020. DankMemesTeam @ DANKMEMES: Archimede: A new model architecture for meme detection. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Chhavi Sharma, Deepesh Bhageria, William Scott, Srinivas PYKL, Amitava Das, Tanmoy Chakraborty, Viswanath Pulabaigari, and Bjorn Gamback. 2020. SemEval-2020 Task 8: Memotion analysis – the visuo-lingual metaphor! arXiv preprint arXiv:2008.03781.

Limor Shifman. 2013. Memes in a digital world: Reconciling with a conceptual troublemaker. Journal of Computer-Mediated Communication, 18(3):362–377.

E. S. Smitha, Selvaraju Sendhilkumar, and G. S. Mahalaksmi. 2018. Meme classification using textual and visual features. In Computational Vision and Bio Inspired Computing, pages 1015–1031.

Emi Tanaka, Timothy Bailey, and Uri Keich. 2014. Improving MEME via a two-tiered significance analysis. Bioinformatics, 30:1965–1973.

George-Alexandru Vlad, George-Eduard Zaharia, Dumitru-Clementin Cercel, and Mihai Dascalu. 2020. UPB @ DANKMEMES: Italian memes analysis – Employing visual models and graph convolutional networks for meme identification and hate speech detection. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019a. Predicting the type and target of offensive posts in social media. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1415–1420.

Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019b. SemEval-2019 Task 6: Identifying and categorizing offensive language in social media (OffensEval). In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 75–86.