=Paper=
{{Paper
|id=Vol-2130/paper3
|storemode=property
|title=Learning Emoji Embeddings using Emoji Co-occurrence Network Graph
|pdfUrl=https://ceur-ws.org/Vol-2130/paper3.pdf
|volume=Vol-2130
|authors=Anurag Illendula,Manish Reddy Yedulla
}}
==Learning Emoji Embeddings using Emoji Co-occurrence Network Graph==
Learning Emoji Embeddings using Emoji Co-occurrence Network Graph Anurag Illendula Manish Reddy Yedulla Department Of Mathematics Department of Engineering Science IIT Kharagpur IIT Hyderabad aianurag09@iitkgp.ac.in es15btech11012@iith.ac.in language and facial expressions during text conversa- tions. They are two-dimensional visual embodiments Abstract of everyday aspects of life which were standardized by the Unicode Consortium in 2010 as part of Unicode Usage of emoji in social media platforms has 6.0. Emoji proliferated throughout the globe and has seen a rapid increase over the last few years. particularly become a part of the popular culture in Majority of the social media posts are laden the west. It has been adopted by almost all social me- with emoji and users often use more than dia platforms and messaging services. Emojis serve one emoji in a single social media post to ex- many purposes during online communication, among press their emotions and to emphasize certain which conveying emotion is one of the primary uses. words in a message. Utilizing the emoji co- According to the latest statistics released by Emo- occurrence can be helpful to understand how jipedia in June 2017, the number of emojis has in- emoji are used in social media posts and their creased to 2,666, posing challenges to applications that meanings in the context of social media posts. list them in small hand-held devices such as mobile In this paper, we investigate whether emoji co- phones. To overcome this challenge, emoji keyboards occurrences can be used as a feature to learn in most of the smartphones contains categorizes emoji emoji embeddings which can be used in many into several categories listed in Table 1. downstream applications such sentiment anal- ysis and emotion identification in social me- Many recent Natural Language Processing (NLP) dia text. We utilize 147 million tweets which systems rely on word representations in finite- have emojis in them and build an emoji co- dimensional vector space. These NLP systems occurrence network. Then, we train a net- mainly use pre-trained word embeddings obtained work embedding model to embed emojis into from word2vec [MSC+ 13] or GloVe [PSM14] or fast- a low dimensional vector space. We evalu- Text [BGJM16]. Earlier GloVe embeddings were used ate our embeddings using sentiment analysis for training most NLP systems, but fastText trained and emoji similarity experiments, and experi- word embeddings could achieve much higher accura- mental results show that our embeddings out- cies of NLP systems involving social media data be- perform the current state-of-the-art results for cause the fastText model could learn sub-word in- sentiment analysis tasks. formation. Emoji embeddings have been of funda- mental importance to improve the accuracies of many emoji understanding tasks. Recent research proved 1 Introduction that emoji embeddings could enhance the performance Emojis are the 21st century's successor to the emoti- of emoji prediction [FMS+ 17, BBS17], emoji simi- con. They arose from the need to communicate body larity [WBSD17b], and emoji sense disambiguation tasks [WBSD17a, SPWT17]. These emoji represen- Copyright c 2018 held by the author(s). Copying permitted for tations have also been efficient in understanding the private and academic purposes. behavior of emojis in different contexts. The need to In: S. Wijeratne, E. Kiciman, H. Saggion, A. Sheth (eds.): Pro- ceedings of the 1st International Workshop on Emoji Under- learn emoji representations for improving the perfor- standing and Applications in Social Media (Emoji2018), Stan- mance of social NLP systems has been recognized by ford, CA, USA, 25-JUN-2018, published at http://ceur-ws.org Eisner et al. [ERA+ 16] and Francesco et al. [BRS16] among others, where they used traditional approaches which include skip-gram and CBOW model to learn emoji embeddings. Information networks such as publication networks, World Wide Web are characterized by the interplay between various content and a sophisticated under- lying knowledge structure. Graph embedding mod- els are helpful to scale information from large-scale information networks and embed them into a finite- dimensional vector space, and these embeddings have shown great success in various NLP tasks such as node Figure 1: Distribution of Tweets across various emojis classification [BCM11], link prediction [LNK07] and in lakhs classification [YRS+ 14] tasks. These graph embedding . models have been of crucial importance and have en- hanced the performance of word similarity and word learn emoji representations from emoji co-occurrence analogical reasoning tasks using language networks network graph. Section 5 reports the accuracies ob- [TQW+ 15]. The analysis of emoji co-occurrence net- tained by our emoji embeddings on the gold-standard work graphs can help us understand emojis from dif- dataset for sentiment analysis task, emoji similarity ferent perspectives. We hypothesize that emojis which tasks. We discuss the reason behind high accuracies co-occur in a tweet contains the same sentiment as the obtained for sentiment analysis task in Section 6 fol- overall sentiment of the tweet. Consider a tweet, “I got lowed by plans for future work in Section 7. betrayed by , I want to kill you ”, here both the emojis , contain negative sentiment, the overall 2 Related Work sentiment of the tweet is also negative. Hence we in- vestigate whether emoji co-occurrence could be a bet- One of the exciting work by Wijeratne ter feature to learn emoji representations to improve et al. [WBSD16, WBSD17a] in the the accuracy of classification tasks. In this paper, we field of emoji understanding is EmojiNet introduce an approach to learn emoji representations (http://emojinet.knoesis.org/home.php), the largest using emoji co-occurrence network graph and large- machine readable emoji sense inventory, this inventory scale information network embedding model and eval- helps computers understand emojis. In this work uate our embeddings using the gold-standard dataset Wijeratne et al. tried to connect emojis and their for sentiment analysis task. senses to corresponding words in babelnet ([NP12]) using their respective babelnetId. EmojiNet opened Table 1: Emoji Categories doors to many of the emoji understanding tasks like emoji similarity, emoji prediction, emoji sense Category Emoji Examples disambiguation. Smiley and The other interesting work done by Wijeratne et , , People al. [WBSD17b] addressed the challenge of measuring Animals and emoji similarity using the semantics of emoji. They , , Nature defined two types of semantics embeddings using the Food and Drink , , textual senses and the textual descriptions of emojis. Activity , , Prior work by Francesco et al. ([BRS16]) and Eisner Travel and et al. ([ERA+ 16]) used traditional approaches to learn , , emoji embeddings. The semantic embeddings have Places Objects , , achieved accuracies which outperformed the previous state-of-the-art results in sentiment analysis task; this Symbols , , high accuracy is due to the fact that semantic embed- Flags , , dings can learn syntactic, semantic, sentiment features of emojis. This paper is organized as follows. Section 2 dis- Seyednezhad et al. ([SM17]) created a network us- cusses the related work done by other researchers in ing the emoji co-occurrences in the same tweet; they the field of emoji understanding and learning network claim that each edge weight can help us understand representations. Section 3 discusses the process of cre- the user’s context to use multiple emojis. This emoji ating an emoji co-occurrence network using our twitter network also enabled them to justify the use of co- corpus. Section 4 explains our model architecture to occurred emojis in different perceptions. This also en- Table 2: Most frequently co-occurring emoji pairs I got betrayed by a I would die for No of Emoji Pair Co-occurrences ( , ) 230957 I can’t trust so its Hi & Bye flow, I still them though ( , ) 196970 ( , ) 135595 ( , ) 102612 Baby text back , waiting for your reply ( , ) 102408 Without respect there is no Table 3: Least frequently co-occurring emoji pairs No of Emoji Pair Figure 2: Construction of Emoji polygons Co-occurrences . ( , ) 1 abled them to understand emoji usage by understand- ( , ) 1 ing possible relations between these special characters ( , ) 1 in common text. Fede et al. ([FHSM17]) studied dif- ( , ) 1 ferent characteristics of this emoji co-occurrence net- ( , ) 1 work graph which include studying user’s behavior to use a sequence of emojis in different contexts. Information networks have been of primary use to ing to these two emojis is considered as 2. Similarly, store large amounts of information. Many researchers the weight of all the edges in the emoji network is cal- have proposed different graph embedding models in culated. machine learning literature which allow us to embed The emoji co-occurrence network created using the nodes of large information networks into a low dimen- tweets in Figure 2 is represented in Figure 3. We input sional vector space ([PARS14] [GL16] [CLX15]). These the emoji co-occurrence network graph to our graph embeddings helped address many tasks such as node embedding model to learn 300-dimensional emoji em- classification, visualization, and link prediction tasks. beddings, and we evaluate our embeddings using the gold-standard dataset for sentiment analysis. We use the gold-standard dataset ([NSSM15]) to evaluate our 3 Data and Network embeddings because the current state-of-the-art re- The emoji network is constructed using a twitter cor- sults [ERA+ 16] for sentiment analysis were obtained pus of 147 million tweets crawled through a period of 2 on this dataset. months (from 6th August 2016 to 8th September 2016) by Wijeratne et al. [WBSD17a]. We filter the tweets 4 Model and only consider the tweets which have multiple emo- 4.1 Description jis embedded in a tweet. This reduces the number of distinct tweets in the dataset to 14.3 million. Fig- Here we discuss two different types of measures which ure 1 shows the distribution of the number of tweets signify the proximity between two nodes of the co- of the most frequently occurring emojis. Each tweet occurrence network graph, and the model developed generates a polygon of n sides where n is the number by Jian et al. [TQW+ 15] to learn the node represen- of emojis embedded in the tweet. The construction tations of a network graph. of emoji network is straightforward and Figure 2 ex- plains the construction of emoji polygons with the help First Order Proximity : The first order proxim- of different examples. ity is defined as the local pairwise proximity which can be related to the weight of the edge formed by joining The weight of an edge signifies the number of co- the two vertices. The first order proximity between occurrences of the emojis sharing the edge considering an edge (u,v) is the weight Wuv of the edge formed the complete twitter corpus. For example in the case by vertices u, v. It can also be inferred from the defi- of tweets shown in Figure 2 the emoji pair ( , ) ap- nition that the first-order proximity between any two peared twice hence the weight of the edge correspond- non-connected vertices is zero. where d(pe1 (i, j), p1 (i, j)) is defined as the distance between the two probability distributions. Replacing Tweet 1 d(·, ·) by KL-divergence, the objective function reduces to Tweet 3 X O1 = − O1 (i, j) (4) Tweet 4 (i,j)∈E Tweet 2 X O1 = − wij log p1 (vi , vj ) (5) Negative sentiment tweet Positive sentiment tweet (i,j)∈E 4.1.2 Network embedding using second order Figure 3: Construction of Emoji Network proximity: . The second order proximity of two nodes (vi , vj ) mea- sures the similarity of the neighbourhood network Second Order Proximity : The second or- structures of nodes (vi , vj ). This measure is applica- der proximity is defined as the similarity between ble for both directed and undirected graphs. Hence neighbourhood network structures. For exam- our objective, in this case, is to look at the vertex and ple, consider u to be an emoji node, let pu = the “context” of the vertex which can also be related (w(u,1) , w(u,2) , ......., w(u,|V |) ) denote the first order to the distribution of neighbours of the given vertex. proximity of the emoji node “u” with all the vertices Hence for each edge (vi , vj ) the probability of “con- then the second order proximity is defined as the sim- text” is defined by ilarity between pu and pv . If there exists no common vertex between u and v, then second-order proximity T exp(u~0 j · ~ui ) is zero. p2 (vj |vi ) = P (6) |V | ~0 T ui ) k=1 exp(u k · ~ 4.1.1 Network embedding using first order Where |V | is the number of vertices. As mentioned proximity: before, the second order proximity assumes that ver- Let ui and uj represent the network embedding in d tices with similar distribution over the contexts as sim- dimensional vector space, where (i,j) is an undirected ilar vertices. To maintain the second order proximity, edge in the network graph. The joint probability which the similarity distance between the contexts p2 (·|vi ) signifies the proximity between vertices vi , vj is defined represented in the low dimensional vector space and as the empirical distribution pe2 (·|j) must be optimized. Hence our objective function (O2 ) in this case is 1 p1 (vi , vj ) = (1) X 1 + exp(~uTi · ~uj ) O2 = λi d(pe2 (·|vi ), p2 (·|vi )) (7) d vi ∈V where u~i ∈ R is a low dimensional representation also called as embedding for emoji node vi , wij repre- where d(·, ·) is the distance between two probability sents the weight of the edge between the nodes vi and distributions, here the variable λi is used to consider vj The probability distribution between different pair the importance of the vertex vi during the process of of vertices is defined as p(.,.) over the vector space V optimization. As defined in the previous case the em- x V and the empirical probability is defined as pe1 (i, j) pirical distribution is defined as wij X wij X pe1 (i, j) = and W = wij (2) pe2 (i, j) = and di = wik (8) W di (i,j)∈E k∈N (i) To maintain the first Order proximity between the wij is the weight of edge (vi , vj ) and di is defined as vertices of the network graph, the objective function the out-degree of vertex and N(i) is the set of neigh- (O1 ) which is the distance between the empirical prob- bours of vi . Considering λi = di for the purpose of ability function and the proximity function is to be simplicity, and replacing d(·, ·) with KL-divergence optimized. X O2 = − wij log p2 (vj |vi ) (9) O1 (i, j) = d(pe1 (i, j), p1 (i, j)) (3) (i,j)∈E 4.2 Model Optimization which contain emoji. In both training and testing sets, 29% are labelled as positive, 25% are labelled as neg- The approach of negative sampling proposed by ative, and 46% are labelled as neural. We use the Mikolov et al. [MSC+ 13] is used to optimize the objec- pre-trained FastText word embeddings3 [MGB+ 18] to tive function which helps us to represent every vertex embed words into a low dimensional vector space. We of the network graph in the low dimensional vector calculate the bag of words vector for each tweet and space. Hence the objective function simplifies to: then use this vector as a feature to train a support vec- tor machine and a random forest model on the training T K X T set, and evaluate the accuracies obtained for classifica- log σ(u~0 j · ~ui )) + Evn Pn (v) [log σ(u~0 j · ~ui )] (10) tion task on whole testing dataset consisting of 12920 i=1 tweets. The accuracies obtained for classification task using the first order embeddings surpass the current where σ(x) = 1/(1 + exp(−x)) is the sigmoid func- state-of-the-art [WBSD17b] results. tion. We use the stochastic gradient descent algorithm [RRWN11] for optimizing the objective function and Table 4: Accuracy of Sentiment Analysis task we update the model parameters on a batch of edges. Thus after completion of the training process, we get Classification Classification the embeddings corresponding to each vertex. The Word accuracy accuracy gradient with respect to an embedding u~i of vertex vi Embeddings using RF using SVM will be: State-of-the- ∂O1 ∂ log(p1 )(vi , vj ) art 60.7 63.6 = wij . (11) results ∂ u~i ∂ u~i First Order 62.1 65.2 ∂O2 ∂ log(p2 )(vj |vi ) Embedding = wij . (12) ∂ u~i ∂ u~i Second Order 58.7 61.9 We learn the node embeddings (u~i ) by optimizing Embedding the objective function in both cases and call the em- beddings as first order embeddings and second order embeddings respectively.The model is trained using the Tensorflow ([ABC+ 16]) library on a cuda GPU. 5.2 Emoji Similarity Model is trained using RMS Propagation gradient de- Emoji similarity4 is one of the important challenges scent algorithm with learning rate as 0.025, and we which should be addressed for the development of used a batch size as 128, the number of batches = emoji keyboards since the current emoji keyboard con- 300000 and 300-dimensional embeddings. The code sists of 2666 emojis, and the complete list cannot be is made available on Github1 , 300-dimensional emoji accommodated in a small screen. These emoji embed- embeddings learned using the emoji co-occurrence net- dings learned using the emoji co-occurrence network work can also be accessed at this link. graph could be helpful to calculate the similarity be- tween emojis using cosine distance as the similarity 5 Experiments measure and group emojis which have high similarity 5.1 Sentiment Analysis values. This grouping of emojis can decrease the num- ber of distinct emojis and helps us accommodate this In this section, we report our accuracies obtained grouped emojis on a small screen. In this section, we for the sentiment analysis task on the gold-standard report the emoji similarity values found considering dataset developed by Novak et al. [NSSM15]. Our ex- the first order embeddings and second order embed- periments have achieved accuracies which outperform dings. the current state-of-the-art results for sentiment anal- We consider the cosine distance to be the similarity ysis on the gold-standard dataset. The gold-standard measure between two embeddings. Let ~a and ~b be two dataset2 consists of 64599 manually labelled tweets vectors which represent embeddings of emojis e1 and classified into positive, negative, neutral sentiment. e2 respectively, the similarity measure between these The dataset is divided into training set that consists two emojis (e1 and e2 ) is calculated as 51679 tweets, 9405 out of which contain emoji and 3 https://bit.ly/2FMTB4N testing set that consists of 12920 tweets, 2295 out of 4 Our main objective is not to address the emoji similarity 1 https://bit.ly/2I5hYNd task. Our main objective is to demonstrate the usefulness of 2 https://bit.ly/2pLaKVZ our emoji embeddings for sentiment analysis task. Table 7: Spearman's Rank Correlation Results ~a · ~b similarity(e1 , e2 ) = (13) |a| · |b| Emoji Embeddings ρ ∗ 100 Table 5 and Table 6 reports the most similar emo- First Order Embeddings 74 jis found considering the first order embeddings and Second Order Embedding 66 second order embeddings respectively.The observed re- sults are explained in Section 6. analogy, ( : ) = ( : ?), we fill the gap (repre- sented by “?”) by finding an emoji from the complete Table 5: Emoji Similarity Measured using first order list of emojis whose embedding(represented by vec(x)) embeddings is closest to vec( ) - vec( ) + vec( ). Table 8 re- ports some of the interesting analogies found using first Similarity Semantic order and second order embeddings. Emoji Pair Measure Similarity ( , ) 0.921 0.442 Table 8: Emoji to Emoji Analogical Reasoning using ( , ) 0.916 0.598 Emoji Embeddings ( , ) 0.911 0.623 ( , ) 0.909 0.546 Second Emoji First Emoji Pair ( , ) 0.856 0.723 Pair ( , ) 0.889 0.702 ( : ) ( : ) ( , ) 0.881 0.737 ( : ) ( : ) ( : ) ( : ) ( : ) ( : ) ( : ) ( : ) 5.3 Analogical Reasoning The analogical reasoning task introduced by Mikolov et al. [MSC+ 13], defines the syntactic and semantic 6 Discussion analogies. For example, consider the semantic analogy such as USA : Washington = India : ? where we The high accuracy for classification task using the first fill the gap (represented by “?”) by finding a word order embedding model is due to the fact that all co- from the vocabulary whose embedding(represented by occurring emojis in a tweet possess the same sentiment vec(x)) is closest to vec(Washington) - vec(USA) + feature, hence during classification these embeddings vec(India). Here cosine distance is considered as the would increase the accuracy of the classification model. similarity measure between the two vectors. Consider the tweet, “Who uses this emoji , I miss the one that had this mouth and these eyes ! ... Table 6: Emoji Similarity Measured using second or- where did he go?! Why did he leave?!” , in this tweet der embeddings we observe the overall sentiment to be positive, and we also observe that all the emojis embedded in the Similarity Semantic tweet possess the same sentiment. Hence co-occurring Emoji Pair emojis would be better attribute to learn emoji em- Measure Similarity beddings which can increase the accuracy of sentiment ( , ) 0.646 0.662 analysis and other related classification tasks. ( , ) 0.606 0.598 We use the Spearman's rank correlation coefficient ( , ) 0.596 0.623 to evaluate the emoji similarity ranks obtained using ( , ) 0.556 0.622 first order and second order embeddings learned using ( , ) 0.546 0.916 emoji co-occurrence network with the emoji similar- ( , ) 0.540 0.945 ity ranks of gold-standard dataset5 . Table 7 reports the Spearman's correlation coefficient obtained by our emoji embeddings. According to the correlation coef- ficients the first emoji embeddings show a strong cor- 5.3.1 Emoji to emoji analogy relation (0.6 < ρ < 0.79). We extrapolate the semantic analogy task introduced The top 6 most similar emoji pairs observed consid- by Mikolov et al. [MSC+ 13] in the context of emojis, ering the first order embeddings are reported in Table by replacing words with emojis. Consider an emoji 5 https://bit.ly/2GztSR2 5. As we see from Table 5 the most similar emoji pair References observed is ( , ) with similarity measure of 0.921. [ABC+ 16] Martı́n Abadi, Paul Barham, Jianmin The usage of the first emoji would be in the con- Chen, Zhifeng Chen, Andy Davis, Jef- text where the user wishes to express his dis-concern frey Dean, Matthieu Devin, Sanjay Ghe- over certain issue through an act of hitting, the usage mawat, Geoffrey Irving, Michael Isard, of the other emoji would be in the context where et al. Tensorflow: A system for large-scale the user wishes to express his dis-concern over certain machine learning. In OSDI, volume 16, issue through an expression of uneasiness. Hence the pages 265–283, 2016. high similarity measure has sound even if we consider the context of use of the emojis. The results show that [BBS17] Francesco Barbieri, Miguel Ballesteros, our embeddings give higher similarity measures than and Horacio Saggion. Are emo- the semantic similarity6 measure. jis predictable? arXiv preprint The top 6 most similar emoji pairs observed con- arXiv:1702.07285, 2017. sidering the second order embeddings are reported in [BCM11] Smriti Bhagat, Graham Cormode, and Table 6. As we see from Table 6 the most similar S Muthukrishnan. Node classification in emoji pair observer is ( , ) with similarity measure social networks. In Social network data of 0.586. The usage of the first emoji would be in analytics, pages 115–148. Springer, 2011. the context where the user wishes to generate a sound or ring a bell or in the context of celebration, the usage [BGJM16] Piotr Bojanowski, Edouard Grave, of the second emoji would be in the context of cel- Armand Joulin, and Tomas Mikolov. ebration. EmojiNet lists “celebration” as a sense form Enriching word vectors with sub- for both the emojis, hence the observed similarity has word information. arXiv preprint sound even if we consider the context of use of this arXiv:1607.04606, 2016. emojis. [BGL14] Jiang Bian, Bin Gao, and Tie-Yan Liu. Knowledge-powered deep learning for 7 Future Work word embedding. In Joint European Con- Usage of external knowledge has improved the ac- ference on Machine Learning and Knowl- curacies of various natural language processing tasks edge Discovery in Databases, pages 132– and outperformed many state-of-the-art results. Jian 148. Springer, 2014. et al. [BGL14] have worked on leveraging external [BRS16] Francesco Barbieri, Francesco Ronzano, knowledge in learning word embeddings which gave and Horacio Saggion. What does this better accuracies in word similarity and word anal- emoji mean? a vector space skip-gram ogy tasks. The first set of examples in EmoSim5087 model for twitter emojis. In LREC, 2016. dataset look more convincing than the results in Table 5 and Table 6; the reason being semantic knowledge [CLX15] Shaosheng Cao, Wei Lu, and Qiongkai helps to us compare the similarity between different Xu. Grarep: Learning graph representa- emojis efficiently. Using Jian et al.'s work as a refer- tions with global structural information. ence, we could work on incorporating external knowl- In Proceedings of the 24th ACM Interna- edge from EmojiNet to our network embedding model tional on Conference on Information and which might further improve the accuracies of senti- Knowledge Management, pages 891–900. ment analysis and emoji similarity tasks. ACM, 2015. [ERA+ 16] Ben Eisner, Tim Rocktäschel, Isabelle Acknowledgement Augenstein, Matko Bošnjak, and Sebas- We are grateful to Sanjaya Wijeratne and Amit Sheth tian Riedel. emoji2vec: Learning emoji for thought-provoking discussions on the topic. We ac- representations from their description. knowledge support from the Indian Institute of Tech- arXiv preprint arXiv:1609.08359, 2016. nology Kharagpur. Any opinions, findings, and con- [FHSM17] Halley Fede, Isaiah Herrera, SM Mahdi clusions/recommendations expressed in this material Seyednezhad, and Ronaldo Menezes. are those of the author(s) and do not necessarily reflect Representing emoji usage using directed the views of Indian Institute of Technology Kharagpur. networks: A twitter case study. In In- 6 The semantic similarity is the similarity measure obtained ternational Workshop on Complex Net- using semantic embeddings developed by Wijeratne et al. works and their Applications, pages 829– 7 https://bit.ly/2GztSR2 842. Springer, 2017. [FMS+ 17] Bjarke Felbo, Alan Mislove, Anders [PSM14] Jeffrey Pennington, Richard Socher, and Søgaard, Iyad Rahwan, and Sune Christopher Manning. Glove: Global vec- Lehmann. Using millions of emoji tors for word representation. In Proceed- occurrences to learn any-domain rep- ings of the 2014 conference on empiri- resentations for detecting sentiment, cal methods in natural language process- emotion and sarcasm. arXiv preprint ing (EMNLP), pages 1532–1543, 2014. arXiv:1708.00524, 2017. [RRWN11] Benjamin Recht, Christopher Re, [GL16] Aditya Grover and Jure Leskovec. Stephen Wright, and Feng Niu. Hogwild: node2vec: Scalable feature learning for A lock-free approach to parallelizing networks. In Proceedings of the 22nd stochastic gradient descent. In Advances ACM SIGKDD international confer- in neural information processing systems, ence on Knowledge discovery and data pages 693–701, 2011. mining, pages 855–864. ACM, 2016. [SM17] SM Mahdi Seyednezhad and Ronaldo [LNK07] David Liben-Nowell and Jon Kleinberg. Menezes. Understanding subject-based The link-prediction problem for social emoji usage using network science. In networks. journal of the Association Workshop on Complex Networks Com- for Information Science and Technology, pleNet, pages 151–159. Springer, 2017. 58(7):1019–1031, 2007. [SPWT17] Amit Sheth, Sujan Perera, San- jaya Wijeratne, and Krishnaprasad [MGB+ 18] Tomas Mikolov, Edouard Grave, Piotr Thirunarayan. Knowledge will propel Bojanowski, Christian Puhrsch, and Ar- machine understanding of content: mand Joulin. Advances in pre-training Extrapolating from current examples. distributed word representations. In Pro- In Proceedings of the International ceedings of the International Conference Conference on Web Intelligence, Leipzig, on Language Resources and Evaluation Germany, August 23-26, 2017, pages (LREC 2018), 2018. 1–9, 2017. [MSC+ 13] Tomas Mikolov, Ilya Sutskever, Kai [TQW+ 15] Jian Tang, Meng Qu, Mingzhe Wang, Chen, Greg S Corrado, and Jeff Dean. Ming Zhang, Jun Yan, and Qiaozhu Distributed representations of words and Mei. Line: Large-scale information net- phrases and their compositionality. In work embedding. In Proceedings of the Advances in neural information process- 24th International Conference on World ing systems, pages 3111–3119, 2013. Wide Web, pages 1067–1077. Interna- tional World Wide Web Conferences [NP12] Roberto Navigli and Simone Paolo Steering Committee, 2015. Ponzetto. BabelNet: The automatic con- struction, evaluation and application of a [WBSD16] Sanjaya Wijeratne, Lakshika Balasuriya, wide-coverage multilingual semantic net- Amit Sheth, and Derek Doran. Emojinet: work. Artificial Intelligence, 193:217–250, Building a machine readable sense inven- 2012. tory for emoji. In International Confer- ence on Social Informatics, pages 527– [NSSM15] Petra Kralj Novak, Jasmina Smailović, 541. Springer, 2016. Borut Sluban, and Igor Mozetič. Sentiment of emojis. PloS one, [WBSD17a] Sanjaya Wijeratne, Lakshika Balasuriya, 10(12):e0144296, 2015. Amit Sheth, and Derek Doran. Emo- jinet: An open service and api for emoji [PARS14] Bryan Perozzi, Rami Al-Rfou, and sense discovery. In 11th International Steven Skiena. Deepwalk: Online learn- AAAI Conference on Web and Social Me- ing of social representations. In Proceed- dia (ICWSM), pages 437–446, Montreal, ings of the 20th ACM SIGKDD interna- Canada, May 2017. tional conference on Knowledge discovery and data mining, pages 701–710. ACM, [WBSD17b] Sanjaya Wijeratne, Lakshika Balasuriya, 2014. Amit P. Sheth, and Derek Doran. A semantics-based measure of emoji sim- ilarity. In Proceedings of the Interna- tional Conference on Web Intelligence, Leipzig, Germany, August 23-26, 2017, pages 646–653, 2017. [YRS+ 14] Xiao Yu, Xiang Ren, Yizhou Sun, Quan- quan Gu, Bradley Sturt, Urvashi Khan- delwal, Brandon Norick, and Jiawei Han. Personalized entity recommendation: A heterogeneous information network ap- proach. In Proceedings of the 7th ACM international conference on Web search and data mining, pages 283–292. ACM, 2014.