<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>“Is Depression Related to Cannabis?”: A Knowledge-Infused Model for Entity and Relation Extraction With Limited Supervision</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kaushik Roy</string-name>
          <email>kaushikr@email.sc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Usha Lokala</string-name>
          <email>Nlokala@email.sc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vedant Khandelwal</string-name>
          <email>vedant@mailbox.sc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amit Sheth</string-name>
          <email>amit@sc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Institute, University of South Carolina</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>With strong marketing advocacy of the benefits of cannabis use for improved mental health, cannabis legalization is a priority among legislators. However, preliminary scientific research does not conclusively associate cannabis with improved mental health. In this study, we explore the relationship between depression and consumption of cannabis in a targeted social media corpus involving personal use of cannabis with the intent to derive its potential mental health benefit. We use tweets that contain an association among three categories annotated by domain experts - Reason, Effect, and Addiction. State-of-the-art Natural Language Processing techniques fall short in extracting these relationships between cannabis phrases and depression indicators. We seek to address this limitation by using domain knowledge; specifically, the Drug Abuse Ontology for addiction, augmented with Diagnostic and Statistical Manual of Mental Disorders lexicons for mental health. Because of the lack of annotations due to the limited availability of the domain experts' time, we use supervised contrastive learning in conjunction with GPT-3, trained on a vast corpus, to achieve improved performance even with limited supervision. Experimental results show that our method extracts cannabis-depression relationships significantly better than the state-of-the-art relation extractor. High-quality annotations can be provided using a nearest-neighbor approach over the learned representations, which the scientific community can use to better understand the association between cannabis and depression.</p>
      </abstract>
      <kwd-group>
        <kwd>Mental Health</kwd>
        <kwd>Depression</kwd>
        <kwd>Cannabis Crisis</kwd>
        <kwd>Legalization</kwd>
        <kwd>Knowledge Infusion</kwd>
        <kwd>Relation Extraction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Many states in the US have legalized the medical use of cannabis for therapeutic relief in those
afected by mental illness [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. The use of cannabis for depression, however, is not
authorized yet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Depression is ubiquitous among the US population, and some even use cannabis
to self-treat their depression [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ][
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Therefore, scientific research that can help understand the
association between depression and cannabis consumption is urgently needed, given the
fast-increasing cases of depression in the US and consequent cannabis consumption [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Twitter can provide crucial contextual knowledge in understanding the usage patterns of
cannabis consumption concerning depression [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. Conversations on social media such as
Twitter provide unique insights, as tweets are often unfiltered and honest in disclosing
consumption patterns due to the anonymity and private space afforded to the users. Even
with several platforms available to aid the analysis of depression in relation to cannabis
consumption, this understanding remains ambiguous [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ]. Still, encouragingly, there is
support in the literature showing that cannabis use can potentially be associated with
depressive patterns. Hence, we aim to extract this association as one of three relationships
annotated by domain experts: Reason, Effect, and Addiction (Table 1).
      </p>
      <p>Table 1: Example tweet for each relationship.
Reason: “Not saying im cured, but i feel less depressed lately, could be my #CBD oil supplement.”
Effect: “People will smoke weed and be on antidepressants. It’s a clash! Weed is what is making you depressed.”
Addiction: “The lack of weed in my life is depression as hell.”</p>
      <p>
        This paper studies mental health and its relationship with cannabis usage, which
is a significant research problem. The study will help address several health
challenges, such as the investigation of cannabis “for the treatment of depression,” “as
a reason for depression,” or “as an addictive phenomenon that accompanies
depression.” Extracting relationships between any concepts/slang terms/synonyms/street names
related to ‘cannabis’ and ‘depression’ from text is a tough problem. This task is challenging for
traditional Natural Language Processing (NLP) because of the immense variability with which
tweets mentioning depression and cannabis are described. Here, we make use of the Drug
Abuse Ontology (DAO) [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ], which is a domain-specific hierarchical framework containing
315 entities (814 instances) and 31 relations defining concepts about drug abuse. The ontology
has been used in prior work to analyze the effects of cannabis [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>
        ]. DAO was
augmented using the Patient Health Questionnaire 9th edition (PHQ-9), Systematized Nomenclature
of Medicine - Clinical Terms (SNOMED-CT), International Classification of Diseases 10th edition
(ICD-10), Medical Subject Headings (MeSH) Terms, and Diagnostic and Statistical Manual of
Mental Disorders (DSM-5) categories to infuse knowledge of mental health-related phrases in
association with drugs such as cannabis [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ][
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Some of the triples extracted from the DAO
are as follows: (1) SCRA → subClassOf → Cannabis; (2) Cannabis_Resin → has_slang_term
→ Kif; (3) Marijuana_Flower → type → Natural_Cannabis.
      </p>
      <p>
        For entity and relationship extraction (RE), previous approaches generally adopt deep
learning models [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. However, these models require a high volume of annotated data and are
hence unsuitable for our setting. Several pre-trained language representation models have
recently advanced the state of the art across various NLP tasks and benchmarks [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ].
GPT-3 [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], BERT [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] are such language models [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Language models benefit from the
abundant knowledge that they are trained on and, with minimal fine-tuning, can tremendously help
in downstream tasks under limited supervision. Hence, we exploit the representation from
GPT-3 and employ supervised contrastive learning to deal with limited supervision in terms of
quality annotations for the data. We propose a knowledge-infused deep learning framework
based on GPT-3 and domain-specific DAO ontology to extract entities and their relationship.
Then we further enhance the utilization of limited supervision through the use of supervised
contrastive learning. It is well known that deep learning requires many examples to
generalize. Metric learning frameworks such as Siamese networks have previously shown
how contrastive learning with a triplet loss can make the most of limited supervision [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
Combinatorially, this method increases the number of training examples from an order of
n to n^3, which helps with generalizability. The technique can also exploit the learned
metric-space representations to provide high-quality annotations over unlabeled data. Therefore,
the combination of knowledge-infusion [
        <xref ref-type="bibr" rid="ref28 ref29 ref30">28, 29, 30</xref>
        ], pre-trained GPT-3, and supervised
contrastive learning presents a very effective way to handle limited supervision. The proposed
model has two modules: (1) Phrase Extraction and Matching Module, which utilizes the
DAO ontology augmented with the PHQ-9, SNOMED-CT, ICD-10, MeSH Terms, and
Diagnostic and Statistical Manual of Mental Disorders (DSM-5) lexicons to map the input word
sequence to the entities mentioned in the ontology by computing the cosine similarity between
the entity names (obtained from the DAO) and every n-gram token of the input sentence. This
step identifies the depression and cannabis phrases in the sentence. Distributed representations
of the depression and cannabis phrases, obtained from GPT-3, are used to
learn contextualized syntactic and semantic information that complement each other. (2)
Supervised Contrastive Learning Module, which uses a triplet loss to learn a representation space
for the cannabis and depression phrases through supervised contrastive learning. Phrases with
the correct relationship are trained to be closer in the learned representation space, and phrases
with incorrect relationships to be far apart.
      </p>
      <p>Contributions:
(1) In collaboration with domain experts who provide limited supervision on real-world data
extracted from Twitter, we learn a representation space to label the relationships between
cannabis and depression entities and generate a cannabis-depression relationship dataset.
(2) We propose a knowledge-infused neural model to extract cannabis/depression entities and
predict the relationship between those entities. We exploit the domain-specific DAO ontology,
which provides better coverage in entity extraction.
(3) Further, we use GPT-3 representations in a supervised contrastive learning approach to learn
a representation space for the different cannabis and depression phrase relationships under
limited supervision.
(4) We evaluated our proposed model on a real-world Twitter dataset. The experimental
results show that our model significantly outperforms state-of-the-art relation extraction
techniques by more than 11 points on the F1 score.
1.1. Novel Contributions of the Paper
(1) Semantic filtering: We use the DAO and DSM-5 to extract contextual phrases expressed
implicitly in user tweets mentioning depression and cannabis. This is required for noise-free
domain adaptation of the model, as evident in our results.
(2) We develop a weak supervision pipeline to label the remaining 7000 tweets with three
relationships (Reason, Effect, Addiction).
(3) We learn a domain-specific distance metric between the phrases, leveraging pre-trained
GPT-3 embeddings of the extracted phrases and their relationship, in a supervised contrastive
loss training setup.
(4) 7000 tweets were annotated and evaluated using the learned metric against the expert
annotation with clustering (t-SNE).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        Based on the techniques and their application to health, we discuss recent existing works.
Standard DL approaches based on Convolutional Neural Networks (CNNs) and Long
Short-Term Memory (LSTM) networks have been proposed for RE [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] [
        <xref ref-type="bibr" rid="ref32 ref33">32, 33</xref>
        ]. Hybrid models that
combine CNN and LSTM have also been proposed [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. More recently, Graph Convolutional
Neural Networks (GCNs) have been utilized to leverage additional structural information from
dependency trees for the RE task [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ]. [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] guide the structure of the GCN by
modifying the attention mechanism. Adversarial training has also been explored to extract entities
and their relationships jointly [37]. Due to the variability in the specification of entities and
relationships in natural language, [38, 39] have exploited entity position information in their
DL framework. Models based on the popular BERT language model, such as
BioBERT [40], SciBERT [41], and XLNet [42], have demonstrated state-of-the-art RE. Task-specific
adaptations of BERT have been used to enhance RE by Shi and Lin [43] and Xue et al. [44]. Wang et al.
[45] augment the BERT model with a structured prediction layer to predict multiple relations
in one pass. In all the approaches discussed so far, knowledge has not been a component of the
architecture [46].
      </p>
      <p>Chan and Roth [47] show the importance of using knowledge to improve RE on sentences,
reporting an improvement of 3.9% in F1 score by incorporating knowledge in an Integer Linear
Programming (ILP) formulation. Wen et al. [48] use the attention weights between entities to
guide traversal paths in a knowledge graph to assist RE. Distiawan et al. [49] use knowledge
graph TransE embeddings to improve performance. Other prominent
work utilizing knowledge graphs for relation extraction includes [50, 51, 52].</p>
      <p>These methods, however, do not consider a setting in which the availability of high-quality
annotated data is scarce. We use knowledge to extract relevant parts of the sentence [53, 54]
and pre-trained GPT-3 representations trained over a massive corpus in conjunction with
supervised contrastive learning to achieve a high degree of sample efficiency with limited
supervision.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Our Approach</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>The dataset we used for our study consists of 11,000 tweets collected using the Twitris
API from Jan 2017 to Feb 2019 - determined by three substance use epidemiologists to be a
period of heightened cannabis consumption. The experts annotated 3000 tweets (due to time
constraints) with one of three relationships that they considered essential to identify: “Reason,”
“Effect,” and “Addiction.” The annotation had a Cohen’s kappa agreement of 0.8. An example
of each of these different relationships is shown in Table 1.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Phrase Extraction and Matching</title>
        <p>We exploit the domain-specific knowledge base to replace phrases in social media text
with knowledge base concepts. Phrase Extraction and Matching
is performed in several steps:
• Depression and Cannabis Lexicon: We exploit the state-of-the-art Drug Abuse
Ontology (DAO) to extract various medical entities and slang terms related to cannabis
and depression. We further expand the entities using entities extracted from PHQ-9,
SNOMED-CT, ICD-10, MeSH Terms, and the Diagnostic and Statistical Manual of Mental
Disorders (DSM-5).
• Extracting N-Grams from Tweets: N-grams are extracted from the tweets to better
capture the context of each term by taking into consideration
the words around it. For example, from the tweet “whole world emotionally depressed
everybody needs smoke blunt to relax We living nigga”, we obtain n-grams such as “whole
world”, “emotionally depressed”, “depressed everybody”, “need smoke”, “need smoke blunt”,
and “living nigga”.
• GPT-3: Generative Pre-trained Transformer 3 is an autoregressive language model. We
use GPT-3 to generate the embeddings of the extracted n-grams and of the cannabis
and depression phrases because of the vast dataset it is trained on, which provides
phrase embeddings based on a global understanding of language.
• Cosine Similarity: A measure of similarity between two non-zero vectors in the same
vector space. This metric is often used to obtain the semantic similarity between two
phrase embeddings in that space.
• Phrase Matching: We use the cosine similarity metric to measure the semantic
similarity between phrases, with a threshold of 0.75.
Once a phrase pair attains a similarity value greater than or equal to the threshold, the original
n-gram in the tweet text is replaced by the matched cannabis/depression phrase.
The above steps are repeated for all the tweets. For the example above, we would obtain
“emotionally depressed” as the depression phrase, whereas “need smoke blunt” is found to be
the cannabis phrase.</p>
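        <p>The matching steps above can be sketched in code. This is a minimal illustration: the embed function is a stand-in for the GPT-3 embedding service, and the helper names (ngrams, match_phrases) and any toy lexicon are ours, not part of the released pipeline.

```python
import numpy as np

def ngrams(tokens, n_max=3):
    """All n-grams (n = 1..n_max) from a token list, joined as strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def match_phrases(tweet, lexicon, embed, threshold=0.75):
    """Replace each tweet n-gram whose embedding reaches `threshold`
    cosine similarity with a lexicon concept by that concept name.
    `embed` maps a phrase string to a vector (GPT-3 in the paper).
    Overlap handling is deliberately naive for this sketch."""
    matched = {}
    for gram in ngrams(tweet.split()):
        for concept in lexicon:
            if cosine(embed(gram), embed(concept)) >= threshold:
                matched[gram] = concept
    text = tweet
    for gram, concept in matched.items():
        text = text.replace(gram, concept)
    return text, matched
```

With the 0.75 threshold from the text, an n-gram is rewritten only when its embedding is sufficiently close to a lexicon concept; repeating this over every tweet yields the depression and cannabis phrases used downstream.</p>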
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Supervised Contrastive Learning</title>
        <p>The proposed model architecture is divided into two sub-parts: A) Transformer Block, B) Loss
Function. The input tweet is first sent through a block of 12 transformers, and the resulting
embedding is passed to a triplet loss function. The label associated with the input tweet is
used to extract a positive sample (a tweet with the same label) and a negative sample (a
tweet associated with a different label). These positive and negative samples are sent through
the block of 12 transformers to obtain their embeddings, which are likewise passed to the triplet
loss function. Under this loss, the model tries to push the cosine similarity between the
tweet and its negative sample toward 0, while at the same time pushing the cosine
similarity between the tweet and its positive sample toward 1.</p>
        <p>CoSim(A, P) − CoSim(A, N) ≥ α, i.e. CoSim(A, N) − CoSim(A, P) + α ≤ 0 (1)
where A is the anchor (the initial data point), P is a positive data point of the same class
as the anchor, and N is a negative data point of a different class from the anchor. CoSim(X,
Y) is the cosine similarity between the two data points, and α is the margin. For the example
shown in Section 3.2, if we consider the anchor sample to be “whole world emotionally
depressed everybody needs smoke blunt relax We living nigga”, the corresponding
positive sample is “Depressionarmy weed amp sleep I awake I depressed” and the negative sample
is “This weird rant like weed makes anxiety depression worse Im soooo sick ppl like jus”.</p>
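        <p>The constraint in Eq. (1) and the sampling of positive and negative tweets can be sketched as follows. This is a simplified sketch with NumPy vectors standing in for the transformer embeddings; the hinge formulation, the sample_triplet helper, and the margin value are illustrative assumptions, not the authors’ training code.

```python
import numpy as np

def cosim(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge form of Eq. (1): zero once CoSim(A, N) - CoSim(A, P) + margin
    is non-positive, i.e. the positive is sufficiently closer than the negative."""
    return max(0.0, cosim(anchor, negative) - cosim(anchor, positive) + margin)

def sample_triplet(embeddings, labels, anchor_idx, rng):
    """Pick a positive (same relation label) and a negative (different label)
    tweet for the anchor, as described for the training setup."""
    y = labels[anchor_idx]
    pos = rng.choice([i for i, l in enumerate(labels) if l == y and i != anchor_idx])
    neg = rng.choice([i for i, l in enumerate(labels) if l != y])
    return embeddings[anchor_idx], embeddings[pos], embeddings[neg]
```

A triplet contributes zero loss exactly when Eq. (1) is satisfied, so training pushes same-relation phrases together and different-relation phrases apart in the learned space.</p>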
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup and Results</title>
      <p>In this section, we discuss the results of the cannabis-depression RE task. After
that, we also provide a technical interpretation of the results.</p>
      <sec id="sec-4-1">
        <title>4.1. Results</title>
        <p>The dataset utilized in our experiment is described in Section 3 [55]. We use Precision,
Recall, and F1 score as the metrics to compare our proposed methodology with the
state-of-the-art relation extractors. As baseline models, we used BERT, BioBERT, and their
variations, such as [56]:
• BERTPE: BERT extended with a position embedding alongside the BERT embedding,
encoding position data (the relative distance of each word from the cannabis and depression
entities) obtained via the domain-specific knowledge resource.
• BERTPE+PA: BERTPE with an additional position-aware attention mechanism.</p>
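        <p>For reference, macro-averaged versions of these metrics over the three relation labels (Reason, Effect, Addiction) can be computed as below; this is a generic sketch, not the authors’ evaluation script.

```python
def precision_recall_f1(y_true, y_pred, labels):
    """Macro-averaged precision, recall, and F1 over the relation labels."""
    ps, rs, fs = [], [], []
    for lab in labels:
        # Per-label true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        ps.append(prec); rs.append(rec); fs.append(f1)
    n = len(labels)
    return sum(ps) / n, sum(rs) / n, sum(fs) / n
```

The same numbers can be obtained with scikit-learn’s precision_recall_fscore_support using average='macro'.</p>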
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Ablation Study</title>
        <p>We performed an ablation study by removing one component at a time from the proposed
model and evaluating its performance to understand the impact of each component. Based on
the study, we found that when we remove the contrastive loss function from our learning approach, the
model performance drops significantly: by 6.46% F1 score, 6.53% recall, and 6.4% precision.
This significant decrease in the model’s performance shows that making embeddings of
two samples of the same class similar, and of different classes dissimilar, brings a great
advantage to the training of the model. The contrastive loss function allows us to learn
representations of the same classes that lie closer to each other in vector space, and hence allows
us to generate representations of the unlabelled data in the dataset (discussed later
in this section).</p>
        <p>We also observe that a domain-specific knowledge resource combined with contextualized
embeddings trained over a large corpus (GPT-3) is very important. When we further remove this second
component from our model, we see a total decrease of 9.01% F1 score, 8.92% precision, and 9.11%
recall in the proposed model’s performance. This component (knowledge infusion) was
chiefly responsible for removing ambiguity in the data using the phrases from human-curated
domain-specific knowledge bases (such as DAO, DSM-5, SNOMED-CT, and others). The
contextualized embeddings also helped us consider the global representation of the entities present
in the dataset and hence contributed to improving the model’s performance.</p>
      <p>[Ablation table: rows compare the Proposed Model with the variants without the contrastive
learning loss and without knowledge infusion; the corresponding metric drops are given in the
text above.]</p>
        <p>Thus, this shows that every component of the proposed model is necessary for the best
performing results.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Cluster Analysis</title>
        <p>After training the model, we annotate the unlabelled data in the dataset by classifying it among
the three relationships. We parse the unlabelled tweets through the first module to extract the
phrases from the knowledge bases using contextualized embeddings. The embeddings are then
pushed through the proposed model architecture to obtain representations of the tweets. These
representations are used to cluster the tweet data points, and the label of an unlabeled
tweet is determined by the majority of the labeled data points around it. The representation of
the clusters after labeling the unlabelled tweets is shown in Figure 3. Some examples from each
of the clusters are as follows:
• Reason: 1) Depressionarmy weed amp sleep I awake I depressed, 2) mood smoke blunt
except the fact I depressed, 3) weed hits ya RIGHT depression, 4) I smoked weed drank alcohol
drowning sorrows away, 5) whole world emotionally depressed everybody need smoke blunt
relax We livin nigga
• Effect: 1) marijuana bad marijuana makes feel depressed low mmk, 2) Unemployed stoners
are the most depressed on the planet, 3) guess depression took long time discover marijuana
makes VERY DEPRESSED alcohol doesnt help either, 4) This weird rant like weed makes
anxiety depression worse Im soooo sick ppl like jus, 5) waking weed naps nigga feeling
depressed hell
• Addiction: 1) I feel like weed calm someone suffer depression anxiety psychosis predisposed
either, 2) Small trigger warning Blaine suffers anxiety depression occasionally smoke pot,
3) need blunt accompany depression, 4) This bot crippling depression ok weed lol, 5) Violate
blunt distraction possibly despair bask commitment This would never happen</p>
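        <p>The majority-vote labeling of unlabelled tweets described above can be sketched as a nearest-neighbor query in the learned space. The function name and the choice of k here are illustrative assumptions.

```python
import numpy as np

def cosim(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def label_by_neighbors(unlabeled_vec, labeled_vecs, labels, k=5):
    """Assign the majority label among the k nearest labeled tweets,
    where 'nearest' means highest cosine similarity in the learned space."""
    sims = [cosim(unlabeled_vec, v) for v in labeled_vecs]
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    votes = {}
    for i in top:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)
```

Applying this to every unlabelled tweet yields the estimated labels whose clusters are visualized in Figure 3.</p>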
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Reproducibility</title>
      <p>From this study, we will deliver a high-quality annotated dataset of 3000 tweets; the full
dataset of 11,000 tweets annotated by our method will also be made publicly available to
support research on the psychological impact of cannabis use and depression during
COVID-19, as cannabis use related to depression is rising once more. The trained model
will also be shared for reproducibility of the results and annotation of tweets. Unfortunately, we
cannot release the code used for training at this time, as Microsoft recently bought the rights
to the GPT-3 model. Therefore, to use the learning method proposed in this paper, GPT-3 will
need to be substituted with an alternative language model such as BERT, GPT-2, etc.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this study, we present a method to determine the relationship between depression and
cannabis consumption. We motivate the necessity of understanding this issue due to the rapid
increase in cases of depression in the US and across the world and subsequent increase in
cannabis consumption. We utilize tweets to understand the relationship as tweets are typically
unfiltered expressions of simple usage patterns among cannabis users who use it in association
with their depressive moods or disorder. We present a knowledge-aware method that determines
the relationship significantly more effectively than the state of the art, show the quality of the
learned relationships through visualization of t-SNE-based clusters, and annotate the unlabeled
parts of the dataset. We show by training on this new dataset (human-labeled and estimated
label) that the model’s prediction quality is improved. We present this high-quality dataset
for utilization by the broader scientific community in better understanding the relationship
between depression and cannabis consumption.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Broader Impact</title>
      <p>Although we developed our method to handle relationship extraction between depression
and cannabis consumption specifically, more generally we developed a domain-knowledge-infused
relationship extraction mechanism that uses state-of-the-art language models and few-shot machine
learning techniques (contrastive loss) to achieve efficient, knowledge-guided extraction. We
see the improved quality in the results over transformer models. We believe that for
applications with real-life consequences such as these, it is crucial to infuse domain knowledge, as a
human would combine it with language understanding obtained from language models, to
identify relationships efficiently. Humans typically can learn from very few examples. Motivated
by this and the lack of availability of examples, we developed our relation extraction method.
We hope our significantly improved results will encourage scientists to further explore the use
of domain knowledge infusion in application settings that demand highly specialized domain
expertise.
[37] G. Bekoulis, J. Deleu, T. Demeester, C. Develder, Adversarial training for multi-context
joint entity and relation extraction, in: Proceedings of the 2018 Conference on
Empirical Methods in Natural Language Processing, Association for Computational
Linguistics, Brussels, Belgium, 2018, pp. 2830–2836. URL: https://www.aclweb.org/anthology/
D18-1307. doi:10.18653/v1/D18-1307.
[38] S.-P. Choi, Extraction of protein–protein interactions (ppis) from the literature by deep
convolutional neural networks with various feature embeddings, Journal of Information
Science 44 (2018) 60–73.
[39] Y. Peng, Z. Lu, Deep learning for extracting protein-protein interactions from biomedical
literature, arXiv preprint arXiv:1706.01556 (2017).
[40] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, J. Kang, Biobert: pre-trained
biomedical language representation model for biomedical text mining, arXiv preprint
arXiv:1901.08746 (2019).
[41] I. Beltagy, K. Lo, A. Cohan, Scibert: A pretrained language model for scientific text, in:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language
Processing and the 9th International Joint Conference on Natural Language Processing
(EMNLP-IJCNLP), 2019, pp. 3606–3611.
[42] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, Q. V. Le, Xlnet: Generalized
autoregressive pretraining for language understanding, arXiv preprint arXiv:1906.08237
(2019).
[43] P. Shi, J. Lin, Simple bert models for relation extraction and semantic role labeling, arXiv
preprint arXiv:1904.05255 (2019).
[44] K. Xue, Y. Zhou, Z. Ma, T. Ruan, H. Zhang, P. He, Fine-tuning bert for joint entity and
relation extraction in chinese medical text, arXiv preprint arXiv:1908.07721 (2019).
[45] H. Wang, M. Tan, M. Yu, S. Chang, D. Wang, K. Xu, X. Guo, S. Potdar, Extracting
multiple relations in one-pass with pre-trained transformers, arXiv preprint arXiv:1902.01030
(2019).
[46] M. Gaur, K. Faldu, A. Sheth, Semantics of the black-box: Can knowledge graphs
help make deep learning systems more interpretable and explainable?, arXiv preprint
arXiv:2010.08660 (2020).
[47] Y. S. Chan, D. Roth, Exploiting background knowledge for relation extraction,
Proceedings of the 23rd International Conference on (2010).
[48] D. Wen, Y. Liu, K. Yuan, S. Si, Y. Shen, Attention-Aware Path-Based relation extraction
for medical knowledge graph, in: Smart Computing and Communication, Springer
International Publishing, 2018, pp. 321–331.
[49] B. Distiawan, G. Weikum, J. Qi, R. Zhang, Neural relation extraction for knowledge base
enrichment, in: Proceedings of the 57th Annual Meeting of the Association for
Computational Linguistics, 2019, pp. 229–240.
[50] J. Li, G. Huang, J. Chen, Y. Wang, Dual cnn for relation extraction with
knowledge-based attention and word embeddings, Computational intelligence and neuroscience 2019
(2019).
[51] H. Zhou, C. Lang, Z. Liu, S. Ning, Y. Lin, L. Du, Knowledge-guided convolutional networks
for chemical-disease relation extraction, BMC bioinformatics 20 (2019) 260.
[52] P. Li, K. Mao, X. Yang, Q. Li, Improving relation extraction with knowledge-attention,
arXiv preprint arXiv:1910.02724 (2019).
[53] A. Alambo, M. Gaur, U. Lokala, U. Kursuncu, K. Thirunarayan, A. Gyrard, A. Sheth, R. S.</p>
      <p>Welton, J. Pathak, Question answering for suicide risk assessment using reddit, in: 2019
IEEE 13th International Conference on Semantic Computing (ICSC), IEEE, 2019, pp. 468–
473.
[54] U. Kursuncu, M. Gaur, C. Castillo, A. Alambo, K. Thirunarayan, V. Shalin, D. Achilov, I. B.</p>
      <p>Arpinar, A. Sheth, Modeling islamist extremist communications on social media using
contextual dimensions: religion, ideology, and hate, Proceedings of the ACM on
Human-Computer Interaction 3 (2019) 1–22.
[55] U. Lokala, R. Daniulaityte, R. Carlson, F. Lamy, S. Yadav, A. Sheth, Social media data for
exploring the association between cannabis use and depression, https://doi.org/10.6084/
m9.figshare.14067122.v2, 2021. Accessed: 2019-2-28.
[56] S. Yadav, U. Lokala, R. Daniulaityte, K. Thirunarayan, F. Lamy, A. Sheth, "When they say
weed causes depression, but it’s your fav antidepressant": Knowledge-aware attention
framework for relationship extraction, arXiv preprint arXiv:2009.10155 (2020).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Hanson</surname>
          </string-name>
          , A. Garcia,
          <article-title>State medical marijuana laws</article-title>
          ,
          <source>in: National Conference of State Legislatures</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Room</surname>
          </string-name>
          ,
          <article-title>Legalizing a market for cannabis for pleasure: Colorado, Washington, Uruguay and beyond</article-title>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Volkow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Baler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>Compton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R. B.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <article-title>Adverse health effects of marijuana use</article-title>
          ,
          <source>N. Engl. J. Med</source>
          .
          <volume>370</volume>
          (
          <year>2014</year>
          )
          <fpage>2219</fpage>
          -
          <lpage>2227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Bridgeman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. T.</given-names>
            <surname>Abazia</surname>
          </string-name>
          ,
          <article-title>Medicinal cannabis: History, pharmacology, and implications for the acute care setting</article-title>
          ,
          <source>P T</source>
          <volume>42</volume>
          (
          <year>2017</year>
          )
          <fpage>180</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Young</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Klap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shoai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. B.</given-names>
            <surname>Wells</surname>
          </string-name>
          ,
          <article-title>Persistent depression and anxiety in the United States: prevalence and quality of care</article-title>
          ,
          <source>Psychiatr. Serv.</source>
          <volume>59</volume>
          (
          <year>2008</year>
          )
          <fpage>1391</fpage>
          -
          <lpage>1398</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Lankenau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ataiants</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohanty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schrager</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Iverson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. F.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <article-title>Health conditions and motivations for marijuana use among young adult medical marijuana patients and non-patient marijuana users</article-title>
          ,
          <source>Drug Alcohol Rev</source>
          .
          <volume>37</volume>
          (
          <year>2018</year>
          )
          <fpage>237</fpage>
          -
          <lpage>246</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Keyes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cerdá</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schulenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>O'Malley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Galea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Hasin</surname>
          </string-name>
          ,
          <article-title>How does state marijuana policy affect US youth? Medical marijuana laws, marijuana use and perceived harmfulness: 1991–2014</article-title>
          ,
          <source>Addiction</source>
          <volume>111</volume>
          (
          <year>2016</year>
          )
          <fpage>2187</fpage>
          -
          <lpage>2195</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>O.</given-names>
            <surname>Corazza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Assi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Simonato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Corkery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. S.</given-names>
            <surname>Bersani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Demetrovics</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fergus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pezzolesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pasinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Deluca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Drummond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Davey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Blaszko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Moskalewicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mervo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Furia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Farre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Flesland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pisarska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Shapiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Siemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Skutle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sferrazza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Torrens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sambola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>van der Kreeft</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Scherbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schifano</surname>
          </string-name>
          ,
          <article-title>Promoting innovation and excellence to face the rapid diffusion of novel psychoactive substances in the EU: the outcomes of the ReDNet project</article-title>
          ,
          <source>Hum. Psychopharmacol</source>
          .
          <volume>28</volume>
          (
          <year>2013</year>
          )
          <fpage>317</fpage>
          -
          <lpage>323</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Burns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roxburgh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bruno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Van Buskirk</surname>
          </string-name>
          ,
          <article-title>Monitoring drug markets in the internet age and the evolution of drug monitoring systems in Australia</article-title>
          ,
          <source>Drug Test. Anal.</source>
          <volume>6</volume>
          (
          <year>2014</year>
          )
          <fpage>840</fpage>
          -
          <lpage>845</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Cavazos-Rehg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zewdie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Krauss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Sowles</surname>
          </string-name>
          ,
          <article-title>“no high like a brownie high”: A content analysis of edible marijuana tweets</article-title>
          ,
          <source>Am. J. Health Promot</source>
          .
          <volume>32</volume>
          (
          <year>2018</year>
          )
          <fpage>880</fpage>
          -
          <lpage>886</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. R.</given-names>
            <surname>Lamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Nahhas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thirunarayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. W.</given-names>
            <surname>Boyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <article-title>“Retweet to pass the blunt”: Analyzing geographic and content features of cannabis-related tweeting across the United States</article-title>
          ,
          <source>J. Stud. Alcohol Drugs</source>
          <volume>78</volume>
          (
          <year>2017</year>
          )
          <fpage>910</fpage>
          -
          <lpage>915</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F. R.</given-names>
            <surname>Lamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zatreh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Nahhas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. W.</given-names>
            <surname>Boyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <article-title>“You got to love rosin: Solventless dabs, pure, clean, natural medicine.” Exploring Twitter data on emerging trends in rosin tech marijuana concentrates</article-title>
          ,
          <source>Drug Alcohol Depend</source>
          .
          <volume>183</volume>
          (
          <year>2018</year>
          )
          <fpage>248</fpage>
          -
          <lpage>252</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Cameron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Z.</given-names>
            <surname>Watkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Falck</surname>
          </string-name>
          ,
          <article-title>PREDOSE: a semantic web platform for drug abuse epidemiology using social media</article-title>
          ,
          <source>J. Biomed. Inform</source>
          .
          <volume>46</volume>
          (
          <year>2013</year>
          )
          <fpage>985</fpage>
          -
          <lpage>997</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>U.</given-names>
            <surname>Lokala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thirunarayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <article-title>DAO: An ontology for substance use epidemiology on social media and dark web</article-title>
          ,
          <source>JMIR Public Health and Surveillance</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>U.</given-names>
            <surname>Lokala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. R.</given-names>
            <surname>Lamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Nahhas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. I.</given-names>
            <surname>Roden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yadav</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <article-title>Global trends, local harms: availability of fentanyl-type drugs on the dark web and accidental overdoses in ohio</article-title>
          ,
          <source>Comput. Math. Organ. Theory</source>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Cameron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Z.</given-names>
            <surname>Watkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Falck</surname>
          </string-name>
          ,
          <article-title>PREDOSE: a semantic web platform for drug abuse epidemiology using social media</article-title>
          ,
          <source>Journal of biomedical informatics 46</source>
          (
          <year>2013</year>
          )
          <fpage>985</fpage>
          -
          <lpage>997</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Lokala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Illendula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thirunarayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. B.</given-names>
            <surname>Arpinar</surname>
          </string-name>
          ,
          <article-title>What's ur type? contextualized classification of user types in marijuana-related communications using compositional multiview embedding</article-title>
          , in: 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI), IEEE,
          <year>2018</year>
          , pp.
          <fpage>474</fpage>
          -
          <lpage>479</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Alambo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Daniulaityte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thirunarayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pathak</surname>
          </string-name>
          ,
          <article-title>Let me tell you about your mental health!: Contextualized classification of reddit posts to DSM-5 for web-based intervention</article-title>
          ,
          <source>in: Proceedings of the 27th ACM International Conference on Information and Knowledge Management</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>753</fpage>
          -
          <lpage>762</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Alambo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Sain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Thirunarayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kavuluru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Welton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pathak</surname>
          </string-name>
          ,
          <article-title>Knowledge-aware assessment of severity of suicide risk for early intervention</article-title>
          ,
          <source>in: The World Wide Web Conference</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>514</fpage>
          -
          <lpage>525</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Luan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Neural relation extraction with selective attention over instances</article-title>
          ,
          <source>in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>2124</fpage>
          -
          <lpage>2133</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Seo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <article-title>Semantic relation classification via bidirectional LSTM networks with entity-aware attention using latent entity typing</article-title>
          ,
          <source>Symmetry</source>
          <volume>11</volume>
          (
          <year>2019</year>
          )
          <fpage>785</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Maillard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yogatama</surname>
          </string-name>
          ,
          <article-title>Jointly learning sentence embeddings and syntax with unsupervised Tree-LSTMs</article-title>
          ,
          <source>Natural Language Engineering</source>
          <volume>25</volume>
          (
          <year>2019</year>
          )
          <fpage>433</fpage>
          -
          <lpage>449</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.</given-names>
            <surname>Akbik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bergmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vollgraf</surname>
          </string-name>
          ,
          <article-title>Pooled contextualized embeddings for named entity recognition</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>724</fpage>
          -
          <lpage>728</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          , et al.,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <source>arXiv preprint arXiv:2005.14165</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <source>Association for Computational Linguistics</source>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://www.aclweb.org/anthology/N19-1423. doi:10.18653/v1/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dligach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Savova</surname>
          </string-name>
          ,
<article-title>A BERT-based universal model for both within- and cross-sentence clinical temporal relation extraction</article-title>
          ,
          <source>in: Proceedings of the 2nd Clinical Natural Language Processing Workshop</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>65</fpage>
          -
          <lpage>71</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>E.</given-names>
<surname>Hoffer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ailon</surname>
          </string-name>
          ,
          <article-title>Deep metric learning using triplet network</article-title>
          ,
          <source>in: International Workshop on Similarity-Based Pattern Recognition</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>84</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wickramarachchi</surname>
          </string-name>
          ,
          <article-title>Shades of knowledge-infused learning for enhancing deep learning</article-title>
          ,
          <source>IEEE Internet Computing</source>
          <volume>23</volume>
          (
          <year>2019</year>
          )
          <fpage>54</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <article-title>Knowledge infused learning (k-il): Towards deep incorporation of knowledge in deep learning</article-title>
          ,
          <source>arXiv preprint arXiv:1912.00512</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kursuncu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wickramarachchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yadav</surname>
          </string-name>
          ,
          <article-title>Knowledge-infused deep learning</article-title>
          ,
          <source>in: Proceedings of the 31st ACM Conference on Hypertext and Social Media</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>309</fpage>
          -
          <lpage>310</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Che</surname>
          </string-name>
          ,
          <article-title>Convolution neural network for relation extraction</article-title>
          ,
          <source>in: International Conference on Advanced Data Mining and Applications</source>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>231</fpage>
          -
          <lpage>242</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>M.</given-names>
            <surname>Miwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
<article-title>End-to-end relation extraction using LSTMs on sequences and tree structures</article-title>
          ,
          <source>arXiv preprint arXiv:1601.00770</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yadav</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ekbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharyya</surname>
          </string-name>
          ,
<article-title>Feature assisted stacked attentive shortest dependency path based bi-LSTM model for protein-protein interaction</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>166</volume>
          (
          <year>2019</year>
          )
          <fpage>18</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>D.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Combining word-level and character-level representations for relation classification of informal text</article-title>
          ,
          <source>in: Proceedings of the 2nd Workshop on Representation Learning for NLP</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>43</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Souza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fifty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          ,
          <article-title>Simplifying graph convolutional networks</article-title>
          , in: K. Chaudhuri, R. Salakhutdinov (Eds.),
          <source>Proceedings of the 36th International Conference on Machine Learning</source>
          , volume
          <volume>97</volume>
          <source>of Proceedings of Machine Learning Research</source>
          , PMLR, Long Beach, California, USA,
          <year>2019</year>
          , pp.
          <fpage>6861</fpage>
          -
          <lpage>6871</lpage>
          . URL: http://proceedings.mlr.press/v97/wu19e.html.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
<article-title>Attention guided graph convolutional networks for relation extraction</article-title>
          ,
          <source>in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics, Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>241</fpage>
          -
          <lpage>251</lpage>
          . URL: https://www.aclweb.org/anthology/P19-1024. doi:10.18653/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>