<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Similarity of Side Effect and Indication Relations of Drugs Inferred from Neural Embedding</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Keyuan Jiang</string-name>
          <email>kjiang@pnw.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tingyu Chen</string-name>
          <email>chen2694@pnw.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liyuan Huang</string-name>
          <email>huanglydd@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gelareh Karbaschi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gordon R. Bernard</string-name>
          <email>gordon.bernard@vanderbilt.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Purdue University Northwest</institution>
          ,
          <addr-line>Hammond, IN 46323</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Vanderbilt University</institution>
          ,
          <addr-line>Nashville, TN 37232</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Patient-reported information on medication-effect experiences can contribute to pharmacovigilance, and patients now share their experiences on social media, which has been investigated as an alternative data source. Extracting the relations between pairs of medication-effect terms from social media data is a challenging task, but inferring medication-effect relations from known (base) relations using the neural embedding technique appears to be a promising solution. This study aimed at understanding how similar semantics are carried over from the base relations to the inferred relations in the neural embedding of Twitter data. From a set of 99 randomly chosen inferred medication-effect relations whose associated tweets were manually annotated, we observed that the accuracy of an inferred relation having semantics similar to its base relations is 0.586 for medication-side effect relations and 0.688 for medication-indication relations. This demonstrates the utility of inference through relational similarity based upon the neural embedding technique.</p>
      </abstract>
      <kwd-group>
        <kwd>Medication-effect Relations</kwd>
        <kwd>Semantic Similarity between Relations</kwd>
        <kwd>Neural Embedding</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Pharmaceutical products are widely used in modern medical practice, and it is known
that they may have unwanted side effects on human subjects. Typically, some side
effects are identified in pre-market clinical trials, while others are observed after the
medications are put on the market. Some side effects can cause harm to
patients, while others may produce effects that benefit the therapeutic treatment of
unintended symptoms, syndromes or diseases. Reports of newly discovered medication
effects may come from physicians’ notes, which can be kept in electronic medical
records, or from the published literature of clinical research. Only those effects of an adverse
nature are reported to regulatory agencies, mandatorily by manufacturers and
voluntarily by healthcare professionals and consumers.</p>
      <p>Copyright © 2019 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Patients are the consumers of pharmaceutical products, and they have
firsthand experience of medication effects. However, their venues for reporting side effects
are very limited, notwithstanding the voluntary nature of reporting adverse events to regulatory
agencies. Learning about medication effects directly from consumers of pharmaceutical
products can help advance medical science and improve healthcare. Studies show
that information reported by patients differs from that reported by healthcare professionals,
offering better understanding of the adverse experience, better explanation, and more
detailed information [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        The emergence of online social media provides a platform where patients can
easily share experiences, including ones related to medication effects. Various studies
have been conducted on leveraging social media data for possible use in
pharmacovigilance. In 2015, Golder and colleagues collected over 3,000 published articles on
investigating social media data for pharmacovigilance [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Among the published efforts,
much has focused on identifying expressions of adverse events in social media text
data, or pairs of medication and effect, but little has been done to understand how
the identified effects are related to the medications within the same context.
Understanding such relations can help generate hypotheses that may discern the association
between the medications and effects, thus enhancing our understanding of medication
effects.
      </p>
      <p>
        However, identifying the relation between a pair of words (medicine and effect in
our case) is a challenging task in natural language processing (NLP). Posts on general-purpose
social media, Twitter in particular, do not necessarily follow spelling and
grammatical rules, making methods and tools, such as SemRep [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and dependency
parsing [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], designed for formal writing, behave unsatisfactorily.
      </p>
      <p>Inferring potential medication-effect relations through the use of relational
similarity, which reasons toward less known or unknown relations from known relations, seems
to be a promising approach. This approach does not require formal writing, and it
is based upon the similarity of the relations expressed in the text. For example, to
understand any potential medication-effect relations of Humira (adalimumab), one may
uncover the potential relations by inferring (reasoning) from similar known relations
of medicines other than Humira.</p>
      <p>
        Developments in neural embedding of word representations have demonstrated
state-of-the-art results in discovering similar relations between word pairs [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], based upon
similarities known as linguistic regularities or relational similarities, in that the
similarities hold between relations [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8">5-8</xref>
        ]. Neural embedding is a technique for generating
vector representations of text by learning from a large corpus of unlabeled data; the
vectors are the input weights of a neural network and embed semantic and
syntactic information from the context.
      </p>
      <p>
        As described in [
        <xref ref-type="bibr" rid="ref5 ref7">5, 7</xref>
        ], relational similarities can be computed by simple
vector operations: offset of two vectors and cosine similarity between the two offset
vectors. The mathematical operations on the vectors do not intuitively demonstrate
how similar semantics are carried over or inferred from the known (base) relations
to the less known or unknown relations. In this study, we seek to understand how
relational similarity of medication-effect relations preserves semantic similarity
in the neural embedding of Twitter data. The outcome will help determine the
utility of inferring potential medication-effect relations from neural embeddings of text
data.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Research on semantic similarity has mainly focused on medical concepts and terms
rather than relations. Pakhomov and colleagues [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] developed a reference standard of
medical terms annotated by 8 medical residents, and their results indicated the
existence of a measurable mental representation of semantic relatedness between medical
terms which is distinct from similarity and independent of the context. Leveraging the
neural embedding generated by Google’s word2vec (https://code.google.com/archive/p/word2vec/), Zhu and colleagues [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
investigated semantic relatedness and similarities of biomedical terms by examining the
effects of recency, size and section. Fathiamini and colleagues investigated discovery
of therapeutically relevant drug-gene relationships from unstructured text of Medline
abstracts [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and their results demonstrated better performance of the method of
relational similarity than that of attributional similarity.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Method</title>
      <p>In this research, we first infer medication-effect relations through relational similarity
from neural embedding of Twitter data, and later examine the semantic similarity
between the base relations and inferred relations.
</p>
      <sec id="sec-3-1">
        <title>Relational similarity</title>
        <p>Given knowledge of known medicine-effect relations, the task of inferring
potential medication-effect relations becomes one of finding similar relations for the
medicines of interest. For example, suppose we want to answer a question such as: what word
or phrase is to Adderall what seizure is to Gabapentin? Here, the
relation between Gabapentin and seizure is known, and we wish to find an
effect of Adderall that has a relation similar to the Gabapentin-seizure relation. If we
denote a medicine-effect relation as medicine:effect, and use :: for similarity, the
example can be expressed as</p>
        <p>Gabapentin:seizure :: Adderall:?</p>
        <p>
          Mikolov and colleagues [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] demonstrated that such a task can be accomplished by
simple algebraic operations on the vectors embedding the words: offset and cosine
similarity. The relation of a pair of words can be represented as the offset of the two word
vectors, and the most similar relation to the known one can be determined by
choosing the relation with the highest cosine similarity to the known relation.
        </p>
        <p>Therefore, we have</p>
        <p>medicine<sub>base</sub>:effect<sub>base</sub> :: medicine<sub>potential</sub>:effect<sub>potential</sub>    (1)</p>
        <p>to represent that the base relation medicine<sub>base</sub>:effect<sub>base</sub> and the target (or inferred) relation
medicine<sub>potential</sub>:effect<sub>potential</sub> are similar. Our goal is to find effect<sub>potential</sub> of
medicine<sub>potential</sub> such that their relation is most similar to the known relation medicine<sub>base</sub>:effect<sub>base</sub>.
In the vector space model of neural embedding, where each term is a vector, (1) above
becomes</p>
        <p>v(medicine<sub>base</sub>) - v(effect<sub>base</sub>) ≈ v(medicine<sub>potential</sub>) - v(effect<sub>potential</sub>)    (2)</p>
        <p>which can be rearranged as</p>
        <p>v(effect<sub>base</sub>) - v(medicine<sub>base</sub>) ≈ v(effect<sub>potential</sub>) - v(medicine<sub>potential</sub>)    (3)</p>
        <p>or</p>
        <p>v(effect<sub>potential</sub>) ≈ v(effect<sub>base</sub>) - v(medicine<sub>base</sub>) + v(medicine<sub>potential</sub>)    (4)</p>
        <p>Therefore, the task of inferring medication-effect relations becomes finding effect
vectors which are most similar to any of v(effect<sub>base</sub>) - v(medicine<sub>base</sub>) +
v(medicine<sub>potential</sub>). In implementation, we use all possible known relations
medicine<sub>base</sub>:effect<sub>base</sub> except those for medicine<sub>potential</sub> to infer potential relations for
medicine<sub>potential</sub>. Utilizing multiple base relations helps cover more linguistic
variations of expressing the same relation and increases the confidence of the inference.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Semantic similarity</title>
        <p>
          In NLP research, there exists a broad spectrum of relations such as class-inclusion,
part-whole, contrast and cause-purpose, and they have been used in shared tasks such
as SemEval [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Many of these relations are irrelevant to our interest in studying
medication-effect relations. In the medical and healthcare domain, there also exist a large
number of relations. The U.S. National Library of Medicine published a list of hierarchical
semantic relations in its Unified Medical Language System® (UMLS®, https://www.nlm.nih.gov/research/umls/META3_current_relations.html), and many of
the UMLS Semantic Relations pertain to medication-effect relations, such as treats,
causes, occurs_in, disrupts, exhibit, and produces. An ambitious repository of
semantic predicates (https://skr3.nlm.nih.gov/SemMedDB/index.html) extracted from the sentences of all Medline citations based upon the
UMLS Semantic Relations has been developed (and is regularly updated) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ],
reflecting the fact that there are various linguistic ways of expressing a single semantic
relation.
        </p>
        <p>In this study, we focus on two specific types of semantic relations: medication-side
effect (SE) relations and medication-indication (IND) relations. This treatment is based upon
the available data and facilitates our data analysis tasks. The known relations used as base
relations come from the SIDER database (discussed further below), which only
contains SE and IND relations. The SE relations may be regarded as adverse effect
relations, whereas the IND relations represent beneficial effect relations.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Measuring semantic similarity</title>
        <p>
          Given the nature of our data, the accuracy of the semantic similarity between base
relations and inferred relations was measured using a method modified from the one
described in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. If the base relations of an inferred relation are of the same type as the
inferred relation itself (for example, an inferred SE relation whose base relations are SE relations),
then the base and inferred relations are said to have a similar semantic relation. If an
inferred relation differs in type from its corresponding base relations, they are
semantically dissimilar. Some inferred relations may be yielded by both SE and IND
base relations; these are also considered to have a similar semantic relation. In the case
where a single type of base relation yields inferences of both types, they
are still considered to have similar relations. This treatment helps measure the degree
of semantic similarity between relations: if they are of the same type, they are
semantically similar; otherwise, they are dissimilar. Our accuracy is defined as
follows
        </p>
        <p>Accuracy<sub>t</sub> = I<sub>t</sub> / B<sub>t</sub>    (5)</p>
        <p>where t is the relation type, side effect (SE) or indication (IND); I<sub>t</sub> is the count of
inferred relations of type t, and B<sub>t</sub> is the count of inferred relations whose base
relations contain type t.</p>
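        <p>A minimal sketch of (5), applied to the counts reported in the Results section (58 of 99 inferred relations for SE, 11 of 16 for IND):</p>

```python
# Accuracy_t = I_t / B_t, where I_t counts inferred relations of type t and
# B_t counts inferred relations whose base relations contain type t.
def accuracy(inferred_of_type, base_with_type):
    return inferred_of_type / base_with_type

print(round(accuracy(58, 99), 3))  # SE:  0.586
print(round(accuracy(11, 16), 3))  # IND: 0.688
```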
      </sec>
      <sec id="sec-3-4">
        <title>Annotation</title>
        <p>To study the semantic similarity between the base relations and the inferred relations,
a subset of inferred relations was randomly chosen and their tweets were annotated to
determine their relation type – it was cost prohibitive and almost impractical to
annotate tweets associated with all the inferred relations. If a tweet pertains to a
medication-side effect relation, it is labeled as SE, and if it describes a medication-indication
(or beneficial effect), it is marked as IND. If a tweet is neither about SE nor IND
relation, it is labeled as “_” (underscore). A draft annotation guideline was developed,
and 100 tweets were first annotated based upon it. The guideline and
annotations were then refined to establish a sound annotation standard, with which the rest
of the tweets were annotated and reviewed.</p>
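        <p>One way the per-tweet labels could roll up into a relation-level type is sketched below; the aggregation rule (any non-underscore label counts toward the relation type) is an illustrative assumption rather than a procedure stated in the paper.</p>

```python
# Aggregate per-tweet annotation labels ("SE", "IND", "_") into a
# relation-level type; the rule here is an illustrative assumption.
def relation_type(tweet_labels):
    types = set(tweet_labels) - {"_"}
    if not types:
        return "_"          # neither SE nor IND
    if types == {"SE", "IND"}:
        return "SE+IND"     # mixture of both relation types
    return types.pop()      # solely "SE" or solely "IND"

print(relation_type(["SE", "_", "SE"]))   # SE
print(relation_type(["SE", "IND", "_"]))  # SE+IND
```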
      </sec>
      <sec id="sec-3-5">
        <title>Data</title>
        <p>Several sets of data were utilized in this study: a list of medication names, a corpus of
unannotated tweets related to medications, a collection of known medicine-effect
pairs, and the Consumer Health Vocabulary (CHV).</p>
      <p>Twitter data were chosen because, in many instances, a medication and its
effect(s) can be found in a single post, and hence a relation can be contained in an
individual post. The Twitter data were gathered by searching for tweets with
medication names as keywords. Two lists of top 100 drugs, by sales and by units, were
obtained from drugs.com, and they were combined by removing the duplicates. The
combined drug list was further expanded by including generic and brand names of
these medications to facilitate querying related tweets.</p>
        <p>A collection of unlabeled tweets related to the medication names discussed above
was retrieved with a home-made crawler of twitter.com. Twitter has its
own spam filter for its web interface, and posts gathered at twitter.com appear
to be cleaner than those collected via the Twitter APIs. In the summer of 2017, a total of
53 million tweets were collected, spanning from the inception of
twitter.com (March 2006) to the time of collection. After preprocessing, which
removed non-English tweets, duplicates, and tweets with a URL (which are considered
mostly commercial), there were 12 million “clean” tweets. Phrases in the tweets were
learned with Gensim (https://radimrehurek.com/gensim/) to treat multi-word terms as single units. This set of tweets
was further filtered by a list of effect terms to ensure that each tweet contains at least a
medication name and an effect expression. The effect term list was created by
compiling Consumer Health Vocabulary (CHV) terms related to the effects listed in the SIDER
database. The resultant corpus of 3.6 million “clean” and filtered tweets served as the
data for learning the neural embedding representation with word2vec.</p>
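        <p>The filtering step can be sketched as follows; the medication names, effect terms, and tweets are illustrative stand-ins for the study’s actual lists.</p>

```python
# Keep only tweets that mention at least one medication name and at least
# one effect term; all strings here are invented examples.
def filter_tweets(tweets, med_names, effect_terms):
    kept = []
    for tweet in tweets:
        tokens = set(tweet.lower().split())
        if tokens.intersection(med_names) and tokens.intersection(effect_terms):
            kept.append(tweet)
    return kept

meds = {"gabapentin", "adderall"}
effects = {"seizure", "insomnia"}
tweets = [
    "gabapentin stopped my seizure episodes",  # kept: medication + effect
    "took adderall again tonight",             # dropped: no effect term
    "this insomnia is killing me",             # dropped: no medication
    "adderall gave me insomnia all week",      # kept: medication + effect
]
print(filter_tweets(tweets, meds, effects))
```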
        <p>
          SIDER is an online resource of side effects, hosted at the European Molecular
Biology Laboratory (EMBL), and contains side effect information of marketed
pharmaceutical products [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Two sets of data from SIDER were compiled. The first one
contains all the terms for medication effects and their corresponding CHV
expressions. The SIDER and CHV terms were aligned through the UMLS
CUIs (Concept Unique Identifiers). This set was used to filter out tweets without any
effect expressions. The second set is a collection of lists of medication-effect pairs for
each study medicine that exists in SIDER. In SIDER, a medicine has a list of side
effects and a list of indications. This collection of medication-effect pairs supplied
the known base relations used to infer potential medicine-effect
relations.
        </p>
        <p>
          The Consumer Health Vocabulary (CHV), a collection of words and phrases that
consumers use to express health concepts, representing the mapping between
consumer expressions and the technical terms used by healthcare professionals [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], was
utilized to cover various ways of expressing concepts related to medication effects.
Each individual effect concept in SIDER was expanded by including the
corresponding CHV terms. Mapping between the SIDER terms and CHV terms was done by
linking identical CUIs. The expanded set of effect terms was then used to
identify base relations and infer potential relations.
        </p>
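        <p>The CUI-based expansion amounts to a join on shared identifiers, sketched below; the CUIs and terms are invented placeholders, not actual UMLS content.</p>

```python
# Expand each SIDER effect term with CHV consumer expressions that share
# the same UMLS CUI; the data below are invented placeholders.
sider_terms = {"C0000001": "somnolence"}        # CUI -> SIDER term
chv_terms = {"C0000001": ["sleepy", "drowsy"]}  # CUI -> CHV synonyms

def expand_effect_terms(sider, chv):
    """Map each SIDER effect term to itself plus its CHV synonyms."""
    expanded = {}
    for cui, term in sider.items():
        expanded[term] = [term] + chv.get(cui, [])
    return expanded

print(expand_effect_terms(sider_terms, chv_terms))
```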
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>Our inference using the data described above generated a total of 5,182 potential
medicine-effect relations from a collection of 3,184 unique base relations. Among the inferred
relations, 1,448 are known, meaning that they are found in the SIDER
database, and 3,734 do not exist in the SIDER database. In inferring potential
relations, the 3,184 unique base relations were utilized a total of 78,369 times,
indicating that many relations were used multiple times to draw inferences for different
medications.</p>
      <p>To verify the semantic similarity, a collection of 100 inferred relations was
randomly chosen using the random number generator at random.org; one of the 100
relations was dropped due to the ambiguity of its medication, leaving 99
inferred relations for annotation. A total of 3,492 unique tweets related to this set of
inferred relations were annotated to determine the relation type of each inference.</p>
      <p>Shown in Table 2 are the counts of inferred relations from different base relations
by inference type: SE only, IND only, both SE and IND, and neither SE nor IND.
Forty-one (41) inferred relations are SE only and their corresponding base relations
contain SE relations. Three (3) inferred relations are SE while their corresponding
base relations contain IND relations, indicating that those inferred relations are
semantically dissimilar to the base relations.</p>
      <p>Numbers in the first data row of Table 3 come from combining the boldfaced
numbers of the corresponding column of Table 2. In other words, 58 = 41 + 17,
representing the counts of inferred relations containing SE relations. Figures in the second row
are the counts of inferred relations whose base relations are of the same type. For
example, there are sixteen (16) inferred relations whose base relations contain IND
relations. That is to say that there are supposed to be 16 inferred IND relations, but
the results show that there are only 11 inferred IND relations. The accuracy of
semantic similarity for IND relations is 0.688 (=11/16). Similarly, the accuracy of semantic
similarity for SE relations is 0.586 (=58/99).
</p>
    </sec>
    <sec id="sec-5">
      <title>Discussions</title>
      <p>Of the 99 inferred relations, all are associated with base SE relations, and only
16 of them are associated with base IND relations (Table 1). This implies that in an
ideal situation, there would be 99 inferred SE relations and 16 inferred IND relations.
Please note that for the 16 inferred relations, their corresponding base relations
contain both SE and IND relations. In other words, both SE and IND base relations
were used to draw the same inferred relations.</p>
      <p>Neither type of base relation always generates the same (correct) type of
inferred relation. For SE relation inferences (Table 2), SE relations are the base
relations for all 99 inferred relations, but only 41 inferences are solely SE relations, and
21 have a mixture of both SE and IND relations. Interestingly, there are 26 inferred
IND relations which are based upon known base SE relations, and 15 are neither SE
nor IND relations. The inferences from base IND relations are similar. Sixteen
inferences are based upon known IND relations: 4 are solely IND relations, 3 are SE
relations, 7 are a mixture of SE and IND relations, and 2 are neither.</p>
      <p>If we combine inferred SE only and both SE and IND relations for SE relations
(boldfaced numbers Table 2), then 58 out of 99 relations were inferred correctly, and
for the IND relations, 11 out of 16 inferred IND relations were correct. This yields the
accuracy of semantic similarity for SE relations (0.586) and that for IND relations
(0.688), demonstrating the utility of the approach of relational similarity.</p>
      <p>There may be two possible reasons why opposite (dissimilar) relation types are
observed. First, negation information may not be embedded properly in the neural
embedding, yielding inferences of the opposite (dissimilar) relation type. Another
possible reason is that there exists no practical way to extract
tweets for any particular relation, because a relation is a vector of real values which
does not correspond to particular tweets. Instead, we extracted the tweets associated with
a particular relation by string matching of the medication-effect pair. This may cause
extraction of some unrelated tweets.</p>
      <p>The observation of inferred relations which are neither SE nor IND may be
attributed to the nature of inference based upon the vector manipulations, offset and
cosine similarity. They are purely mathematical operations whose results may not
correspond to any relations in the data.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this study, we investigated the accuracy of semantic similarity between the known
base relations and inferred relations. Accuracies for both SE and IND relations
demonstrated the utility of the approach using relational similarity to infer potential
medication-effect relations, although further work will be needed to improve
the accuracy, and human annotation will be needed to remove false inferences.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>The authors wish to thank the anonymous reviewers for their critiques and constructive
comments which helped improve the manuscript. This work was supported in part by
the U.S. National Institutes of Health Grant 1R15LM011999-01.</p>
    </sec>
    <sec id="sec-8">
      <title>Ethics Compliance</title>
      <p>The protocol of this project was reviewed and approved for compliance with the
human subject research regulation by the Institutional Review Board of Purdue
University.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Härmark</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raine</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leufkens</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edwards</surname>
            ,
            <given-names>I. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moretti</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarinic</surname>
            ,
            <given-names>V. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kant</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Patient-reported safety information: a renaissance of pharmacovigilance?</article-title>
          .
          <source>Drug safety</source>
          ,
          <volume>39</volume>
          (
          <issue>10</issue>
          ),
          <fpage>883</fpage>
          -
          <lpage>890</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Golder</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Norman</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Loke</surname>
            ,
            <given-names>Y. K.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Systematic review on the prevalence, frequency and comparative value of adverse events data in social media</article-title>
          .
          <source>British journal of clinical pharmacology</source>
          ,
          <volume>80</volume>
          (
          <issue>4</issue>
          ),
          <fpage>878</fpage>
          -
          <lpage>888</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Rindflesch</surname>
            ,
            <given-names>T. C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Fiszman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text</article-title>
          .
          <source>Journal of biomedical informatics</source>
          ,
          <volume>36</volume>
          (
          <issue>6</issue>
          ),
          <fpage>462</fpage>
          -
          <lpage>477</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>De Marneffe</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MacCartney</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C. D.</given-names>
          </string-name>
          (
          <year>2006</year>
          , May).
          <article-title>Generating typed dependency parses from phrase structure parses</article-title>
          .
          <source>In LREC (Vol. 6</source>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>454</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yih</surname>
            ,
            <given-names>W. T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zweig</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Linguistic regularities in continuous space word representations</article-title>
          .
          <source>In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          (pp.
          <fpage>746</fpage>
          -
          <lpage>751</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>International Conference on Learning Representations</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Levy</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Linguistic regularities in sparse and explicit word representations</article-title>
          .
          <source>In Proceedings of the eighteenth conference on computational natural language learning</source>
          (pp.
          <fpage>171</fpage>
          -
          <lpage>180</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Turney</surname>
            ,
            <given-names>P. D.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Similarity of semantic relations</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          ),
          <fpage>379</fpage>
          -
          <lpage>416</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Pakhomov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McInnes</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adam</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedersen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Melton</surname>
            ,
            <given-names>G.B.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Semantic similarity and relatedness between clinical terms: an experimental study</article-title>
          .
          <source>In AMIA annual symposium proceedings</source>
          (Vol.
          <volume>2010</volume>
          , p.
          <fpage>572</fpage>
          ). American Medical Informatics Association.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Semantic relatedness and similarity of biomedical terms: examining the effects of recency, size, and section of biomedical publications on the performance of word2vec</article-title>
          .
          <source>BMC medical informatics and decision making</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          <fpage>95</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fathiamini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeng</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holla</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>N.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meric-Bernstam</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bernstam</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Rapamycin-mTOR+ BRAF=? Using relational similarity to find therapeutically relevant drug-gene relationships in unstructured text</article-title>
          .
          <source>Journal of biomedical informatics</source>
          ,
          <volume>90</volume>
          , p.
          <fpage>103094</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Jurgens</surname>
            ,
            <given-names>D. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turney</surname>
            ,
            <given-names>P. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Holyoak</surname>
            ,
            <given-names>K. J.</given-names>
          </string-name>
          (
          <year>2012</year>
          , June).
          <article-title>SemEval-2012 task 2: Measuring degrees of relational similarity</article-title>
          .
          <source>In Proceedings of the First Joint Conference on Lexical and Computational Semantics-Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation</source>
          (pp.
          <fpage>356</fpage>
          -
          <lpage>364</lpage>
          ). Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kilicoglu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosemblat</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fiszman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rindflesch</surname>
            ,
            <given-names>T. C.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Constructing a semantic predication gold standard from the biomedical literature</article-title>
          .
          <source>BMC bioinformatics</source>
          ,
          <volume>12</volume>
          (
          <issue>1</issue>
          ),
          <fpage>486</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Letunic</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>L. J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bork</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The SIDER database of drugs and side effects</article-title>
          .
          <source>Nucleic acids research</source>
          ,
          <volume>44</volume>
          (
          <issue>D1</issue>
          ),
          <fpage>D1075</fpage>
          -
          <lpage>D1079</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Zeng</surname>
            ,
            <given-names>Q. T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Tse</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Exploring and developing consumer health vocabularies</article-title>
          .
          <source>Journal of the American Medical Informatics Association</source>
          ,
          <volume>13</volume>
          (
          <issue>1</issue>
          ),
          <fpage>24</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>