A Novel Approach for Adverse Events Detection

            Jia Zhu1 , Yuzhi Liang2 , Changqin Huang*1 , Min Yang2 , Jing Xiao1 , Yong Tang1
             1
               School of Computer Science, South China Normal University, Guangzhou, China
                                         *cqhuang@scnu.edu.cn
              2
                School of Computer Science, The University of Hong Kong, Hong Kong, China


                         Abstract                                      incidence of an adverse event or medication error in the U.S.
                                                                       population.
     Adverse events detection is critical in many fields,                 Since there is no sufficient medical data or authoritative ev-
     e.g., adverse drug event (ADE) detection in medi-                 idence to identify the correlations between drugs and adverse
     cal field. ADE is an unexpected and harmful con-                  events quickly, it is challenging but extremely necessary to
     sequence of drug usage. Many researchers found                    detect ADE effectively using other methods. In this paper,
     that identifying the correlation between the use of               we focus on using the data extracted from biomedical liter-
     drugs and adverse events from biomedical literature               ature according to [Winnenburg and Shah, 2016]. We pro-
     can contribute a lot to drug safety monitoring. In                pose a novel graph-based approach called G-ADE to detect
     this paper, we propose a novel approach based on                  ADE. In G-ADE, we first construct a graph using candidate
     biomedical literature to detect ADE. We first con-                ADE extracted from biomedical literature. We then propose
     struct a graph using candidate ADE extracted from                 a novel method to select important vertices from the graph
     biomedical literature, and then propose a method                  as core vertices, and design an algorithm using these core
     to select critical vertices from the graph as core                vertices for clustering in order to build a set of subgraphs.
     vertices with a clustering algorithm to group these               This step is our main contribution in this paper because find-
     core vertices to build subgraphs. Lastly, the cor-                ing core vertices in a graph has been proved extremely use-
     relations between drugs and events are calculated                 ful for later processing [Yang et al., 2016; Min et al., 2009;
     based on the subgraphs for ADE detection. Exper-                  Yang et al., 2017; Yang and Chow, 2014]. Lastly, the cor-
     iments show that our approach is highly feasible.                 relation between each drug-event pair is calculated based on
                                                                       the subgraphs we constructed in the previous step to identify
1   Introduction                                                       ADE. We have also perform a few experiments to validate our
Adverse events detection, such as adverse drug event (ADE)             work using a gold standard of drug-adverse event correlations
detection, is very important in drug safety assessment. It en-         spanning 159 drugs and four events.
compasses all scientific and data gathering activities relating           The rest of this paper is organized as follows. In Section
to the detection, assessment, and understanding of adverse             2, we describe details of G-ADE including a core vertices se-
events of medical products through the product life cycle              lection algorithm with subgraphs generation. In Section 3, we
[Saless, 2005]. ADE refers to any unfortunate medical and              describe our experiments with evaluation methods and result
health events that occur during the course of drug treatment,          analysis, and compare G-ADE with other methods. We also
and this event does not necessarily have a causal relationship         conclude and discuss this study in Section 4.
with drug therapy [Liu et al., 2016; Cai et al., 2017]. ADE
might be caused by several drugs interacting with each other           2 Proposed Approach
when administered concomitantly. Though drug standards                 The overview of G-ADE is described in Figure 1. The process
are guided for the production and usage of drugs, it will still        can be summarized as follows. Firstly, we extract all associ-
lead to the occurrence of ADE even if we obey the criteria             ated pairs between all adverse events and drugs from MED-
drug standards.                                                        LINE according to [Winnenburg and Shah, 2016], and use
   As we all know, traditionally drug safety monitoring re-            these pairs as basic knowledge to construct a fully connected
lies on data from spontaneous reporting systems (SRS), such            drug-event graph. These pairs represent a list of potential
as the US Food and Drug Administration’ (FDA) Adverse                  ADE to be validated. Secondly, we propose a core vertices
Event Reporting System (FAERS), which contains reports of              selection algorithm to select core vertices in the drug-event
suspected ADE submitted by health care providers, manu-                graph we built. Thirdly, we adopt an algorithm to generate
factures, and patients. However, FAERS data do have some               a set of subgraphs based on the core vertices obtained in the
limitations. For example, FDA does not receive reports for             previous step based on the idea from [Min et al., 2009] for
every adverse event or medication error that occurs in a prod-         the purpose of key information extraction. Finally, we con-
uct. Therefore, FAERS data is not sufficient to calculate the          nect all the subgraphs into a graph that in fact is a smaller size

                                                                  41
of the original graph, and compute the similarity of all drug-            Algorithm 1 Core Vertices Selection Algorithm
event pairs based on this new graph. If the drug-event pair’s             Require: G = (V, E), a drug-event graph, t, a threshold
similarity is higher than the threshold we set up, we will rec-               used to filter
ognize it as ADE. We will introduce the details of each step              Ensure: List, a sorted list of core vertices
in the following sections.                                                 1: learns the vectors of the V , vectors are the DeepWalk
                                                                              Vectors
                                                                           2: for each v 2 V do
                                                                           3:     for each Hi 2 H do
                                                                           4:         compute Distv,Hi , the distance between v and
                                                                              Hi using vectors
                                                                           5:     end for
                                                                           6:     compute v’score using Equation (1)
                                                                           7: end for
                                                                           8: sorted v by its score, and put v which score is higher than
                                                                              t into List


                Figure 1: Overview of G-ADE

2.1   Core Vertices Selection
As we described earlier, the key is to select critical vertices as                    Figure 2: Example of Personal Rank
core vertices in order to construct subgraphs. We propose a
method that computes each vertex’s importance based on vec-
tors generated by DeepWalk [Perozzi et al., 2014]. DeepWalk                  where P Ri represents the score of importance of vertex i,
has been proved successfully in social networks and graph                 ↵ is the probability of walking, outi represents the out de-
analysis. It learns the latent representations by modeling a              gree of vertex i and inj represents the in degree of vertex j,
stream of short random walk, and then encodes it in a con-                vcore represents the core vertex. As shown in Figure 2, we set
tinuous vector space with low dimensions. In our purposed                 P R of core vertices(A,B,C) to 1 and other vertices(a,b,c,d)
method, we apply DeepWalk to the drug-event graph to get                  to 0 initially. Therefore, we have P R(A) = P R(B) =
a 64-dimensions vectors for each vertex. Let G = (V, E) to                P R(C) = 1, and P R(a) = P R(b) = P R(c) = P R(d) = 0
be the drug-event graph, v 2 V , which represents a drug or               in the beginning, then we starts to walk from the vertex with
an event. H is the set of neighbor vertices of v, Hi 2 H, n               P R 6= 0. The probability of walking is ↵, correspondingly,
is the number of the neighbor vertices. We have Equation (1)              the probability of stopping is 1 ↵. For example, the proba-
to compute the score of importance for each vertex. The core              bility of walking from A to a or c is 0.5 as a and c share the
vertices selection algorithm to adopt the score is described in           importance of A, that is P R(a) = P R(c) = P R(A) ⇤ 0.5.
Algorithm 1.                                                              The process then continues to walk with probability of ↵ or
                                                                          stop and go back to A with probability of 1 ↵. The process
                                P
                                n
                                                                          will keep running until the P R of each vertex is stable. We
                                    Distv,Hi
                                                                          select up to top-100 highest PR vertices that attached to each
                    score = i=1                               (1)         core vertex to construct subgraphs. In other words, we will
                                     n
                                                                          get a set of clusters/subgraphs in the end of this process.
2.2   Personal Rank for Clustering
Once we have a list of core vertices, we adopt Personal                   2.3 Mining Adverse Drug Events
Rank(PR) [Haveliwala, 2003] algorithm to generate associ-
                                                                          As now we have a set of clusters(subgraphs), we next want
ated subgraphs. PR is a graph clustering algorithm based
                                                                          to transform all the clusters into a graph for further pro-
on random walk [Haveliwala, 2003], which performs well in
                                                                          cessing. We define Sg is the set of clusters we got in the
recommendation system and social network [Li et al., 2012;
                                                                          previous step, and G = (V, E) is the original graph. If
Shen et al., 2016]. The basic idea of PR is similar to PageR-
                                                                          Ga = (Va , Ea ) and Gb = (Vb , Eb ) are two clusters in Sg ,
ank [Bianchini et al., 2005]. Firstly, the algorithm computes
                                                                          we put Ga and Gb into a new graph if there is an edge be-
the score of importance of each vertex in the network, and
                                                                          tween va and vb in G, and va 2 Va , vb 2 Vb . We con-
then sorts each node according to the score of the importance.
                                                                          tinue this process until all possible clusters are checked, and
Eventually, the algorithm outputs Top-N vertices. The score
                                                                          a new graph is generated eventually. We then calculate the
of importance is defined in Equation (2):
                                                                          similarity between drug vertices and event vertices in this
                    X P Rj             ⇢                                  new graph according to their associated vectors generated by
                                         1         i = vcore
 P Ri = (1 ↵)ri +↵              , ri =                         (2)        DeepWalk. Note that the reason we run DeepWalk again on
                        |outi |          0        i 6= vcore ,            the graph to generate new vector for each vertex because the
                   j2in     i


                                                                     42
structure of the graph is changed. If two vertices’ associated                                                          Drugs
vectors are very similar, then means these two vertices has                 Aggregation Outcome
very closed information in the graph [Perozzi et al., 2014;                                                     Positive    Negative
Cunchao Tu, 2016; Yang and Liu, 2015; Zhu et al., 2016;
2012]. In other words, this drug-event pair has higher proba-               Acute kidney injury                    23           59
bility to be an ADE. Therefore, according to the similarity of
each pair, we can mine ADE from the graph if the similarity                 Acute liver injury                     79           33
of drug-event pair is higher than a certain threshold we set.
                                                                            Acute myocardial infarction            33           61

3       Experiments                                                         GI bleed                               24           62
In this section, we will introduce the evaluation we have done              Total                                 159           215
on OMOP standard data set [Ryan et al., 2013]. The exper-
imental results show evidence of significant improvement of
our proposed approach over baseline methods.                                                Table 2: OMOP data set

3.1       Datasets
                                                                         3.2 Baseline Methods
Literature Data Set
                                                                         In order to evaluate the feasibility of G-ADE and the core
The data set used in this paper can be downloaded from the               vertices selection algorithm, we use the following two meth-
website1 . It consists of candidate ADE pairs extracted from             ods(BG and CVTn) as the baseline methods.
MeSH term indexes of all 366k articles in MEDLINE that
are indexed with certain combinations of MeSH terms and                    * Basic Graph(BG): In this approach, We firstly con-
qualifiers according to [Winnenburg and Shah, 2016]. The                     struct a graph using all candidate ADE extracted from
creation of this data set is described in detail in [Winnenburg              biomedical literature as described in Section 3.1. We
et al., 2015]. We extract PMID-Drug pairs (paper id and drug)                obtain the corresponding vector of event and drug
and PMID-Event pairs (paper id and event) from data set. The                 through the DeepWalk algorithm. Finally, their corre-
detail data set is shown in Table 1.                                         lations are calculated by cosine similarity, namely, basic
                                                                             graph(BG).
                      Name                            Number               * Core Vertices from Top n Nodes(CVTn): In this
                                                                             method, we construct the graph using the same measure
                       Pmid                             366k                 as BG. We then simply select top N nodes as the core
                                                                             vertices according to the degree of the node. We rebuild
                       Drug                             3416                 new subgraphs using the core vertices with a clustering
                      Event                             1602                 algorithm. Lastly, the correlation/similarity between the
                                                                             drug and the event is calculated based on the subgraphs.
                Pmid - Drug pairs                       418k                 We name it as CVTn.

                Pmid - Event pairs                      552k             3.3 Evaluation Standard
                                                                         In the experiment, we calculate the similarity between event
                   Table 1: Experimental data set                        and drug to according to the associated vector of each ver-
                                                                         tex. The cosine similarity measure [On, 2008] between two
                                                                         vectors is used that calculates the cosine of the angle between
Gold Standard Data Set                                                   them, and the cosine similarity formula is defined as:
We evaluate the experimental performance against the drug                                                         !          !
safety reference set established by the observational medical                                                 VEvent ⇥ VDrug
                                                                               SimilarityEvent,Drug =            !            !        (3)
outcome partnership (OMOP) [Ryan et al., 2013]. This set                                                   ||VEvent || ⇥ ||VDrug ||
contains 399 drug-outcome pairs, covering 183 drugs from
several drug classes and four significant and actively moni-                The higher the similarity is, the more this event relates to
tored adverse event outcomes: acute myocardial infarction,               the drug. At the same time, we verify the accuracy of the stan-
acute renal failure, acute liver injury, and upper gastrointesti-        dard data set by different thresholds. If the similarity between
nal bleeding. Here we only select the positive part that indi-           drugs and events is higher than the threshold, we think they
cates ADE as the ground truth to validate our approach. Note             are relevant. That is, the drug has an adverse effect on the
that we have removed a few drugs that cannot be found in our             event. Therefore, we can calculate the predication accuracy
graph. The data set is shown in Table 2.                                 as: Accuracy = N/M , where N is the number of the drug-
                                                                         even pair matched the ADE in OMOP and M is the number
    1
        ftp://nlmpubs.nlm.nih.gov/online/mesh/2015/meshtrees/            of ADE in OMOP.

                                                                    43
                         CVTn                   G-ADE                   side effects from medication and hence could lead to interest-
      Thr   BG                                                          ing clinical results.
                    PR      K-means        PR      K-means
      0.1   39.6% 42.1%      42.8%       50.3%      48.4%               5 Acknowledgement
      0.2   74.8% 75.4%      76.1%       79.2%      78.0%               This work was partially supported by the National
      0.3   85.5% 86.8%      86.1%       91.1%      89.9%               Natural Science Foundation of China (61370229),
                                                                        the Natural Science Foundation of Guangdong
                                                                        Province, China(2015A030310509), the Public Re-
      Table 3: Prediction Accuracy in the OMOB data set.                search and Capacity Building in Guangdong Province,
                                                                        China(2016A030303055), the S&T Projects of Guangdong
                                                                        Province, China(2015A030401087, 2016B030305004,
3.4     Evaluation Results                                              2016B010109008), and the S&T Project of Guangzhou
The overall performance of G-ADE and two base-                          Municipality(201604010054).
lines(BG,CVTn) methods in terms of accuracy at different
thresholds using the OMOP data set as reference is summa-
rized in Table 3. Note that since the third step for clustering
of CVTn and G-ADE can both be replaced by other clustering
algorithms, e.g., K-means [Kanungo et al., 2002]. Therefore,
we have also provided the evaluation results produced by K-
means. In our experiments, we set K = 10 as this setting can
achieve the best performance on our dataset.
   From the Table 3, we can clearly see that G-ADE outper-
forms the BG and CVTn methods with different clustering
algorithms (Personal Rank and K-means) at different thresh-
olds. Obviously, the prediction accuracy is improved for all
approaches with higher threshold. On the other hand, though
CVTn method performs better than the BG method but not
as good as G-ADE with both clustering algorithms, which
demonstrates that the core vertices selection plays an impor-
tant role to improve prediction accuracy. Overall, our pro-
posed approach based on core vertices selection and PG clus-
tering for the detection of adverse drug events is feasible as
91.1% prediction accuracy is achieved. Therefore, we can
conclude that through the core vertices selection and personal
rank clustering algorithm, the accuracy of adverse drug event
detection is effectively improved because the robust perfor-
mance is shown in our experiments.

4     Conclusions
Adverse drug event (ADE), defined as adverse patient out-
comes caused by medications, is a common issue but diffi-
cult to detect. In this paper, We propose an approach called
G-ADE based on biomedical literature to detect ADE. We
first construct a graph using candidate ADE extracted from
biomedical literature. We then propose a core vertices selec-
tion algorithm to select important vertices from the graph as
core vertices, and design a PR algorithm using the core ver-
tices for clustering to build subgraphs. Last but not least, the
correlation between the drug and the event is calculated using
the vector generated by DeepWalk based on the subgraphs. If
the correlation value is sufficiently high, we take this pair as
ADE. Our experimental results show that G-ADE performs
better than two baseline methods in a few scenarios. In the
future, we will pay more attention to build subgraphs on en-
semble different clustering methods to further improve the
performance and attempt to gain data from social networking
as this is where a lot of patients may talk about uncommon

                                                                   44
References                                                            [Winnenburg and Shah, 2016] R. Winnenburg and NH Shah.
[Bianchini et al., 2005] Monica Bianchini, Marco Gori, and              Generalized enrichment analysis improves the detection of
   Franco Scarselli. Inside pagerank. Acm Transactions on               adverse drug events from the biomedical literature. BMC
   Internet Technology, 5(1):92–128, 2005.                              Bioinformatics, 17(1):1–17, 2016.
[Cai et al., 2017] Ruichu Cai, Mei Liu, Yong Hu, Brittany L.          [Winnenburg et al., 2015] Rainer Winnenburg, Alfred Sor-
   Melton, Michael E. Matheny, Hua Xu, Lian Duan, and                   bello, Anna Ripple, Rave Harpaz, Joseph Tonning, Ana
   Lemuel R. Waitman. Identification of adverse drug-drug               Szarfman, Henry Francis, and Olivier Bodenreider. Lever-
   interactions through causal association rule discovery from          aging medline indexing for pharmacovigilance - inherent
   spontaneous adverse event reports. Artificial Intelligence           limitations and mitigation strategies. Journal of Biomedi-
   in Medicine, 76:7–15, 2017.                                          cal Informatics, 57(C):425, 2015.
[Cunchao Tu, 2016] Zhiyuan Liu Maosong Sun Cunchao Tu,                [Yang and Chow, 2014] Min Yang and Kam-Pui Chow. Au-
   Weicheng Zhang. Max-margin deepwalk: Discriminative                  thorship attribution for forensic investigation with thou-
   learning of network representation. In IJCAI, pages 1–7,             sands of authors. In IFIP International Information Se-
   2016.                                                                curity Conference, pages 339–350. Springer Berlin Hei-
                                                                        delberg, 2014.
[Haveliwala, 2003] Taher H Haveliwala. Topic-sensitive
   pagerank. In International Conference on World Wide                [Yang and Liu, 2015] Cheng Yang and Zhiyuan Liu. Com-
   Web, pages 517–526, 2003.                                            prehend deepwalk as matrix factorization. Computer Sci-
[Kanungo et al., 2002] T. Kanungo, D. M. Mount, N. S. Ne-               ence, pages 1–7, 2015.
   tanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu. An              [Yang et al., 2016] Min Yang, Jincheng Mei, Fei Xu, Went-
   efficient k-means clustering algorithm: Analysis and im-             ing Tu, and Ziyu Lu. Discovering author interest evolution
   plementation. IEEE Trans. Pattern Analysis and Machine               in topic modeling. In Proceedings of the 39th International
   Intelligence., 24:881–892, 2002.                                     ACM SIGIR conference on Research and Development in
[Li et al., 2012] Xin Li, Xin Su, and Mengyue Wang. So-                 Information Retrieval, pages 801–804. ACM, 2016.
   cial network-based recommendation:a graph random walk              [Yang et al., 2017] Min Yang, Dingju Zhu, Yong Tang, and
   kernel approach. In Proceedings of the 12th ACM/IEEE-                Jingxuan Wang. Authorship attribution with topic drift
   CS joint conference on Digital Libraries, pages 409–410,             model. In The Thirty-First AAAI Conference on Artificial
   2012.                                                                Intelligence, pages 1–4. AAAI, 2017.
[Liu et al., 2016] Jing Liu, Songzheng Zhao, and Xiaodi               [Zhu et al., 2012] Jia Zhu, Qing Xie, and Eun Jung Chin. A
   Zhang. An ensemble method for extracting adverse                     hybrid time-series link prediction framework for large so-
   drug events from social media. Artificial Intelligence in            cial network. Database and Expert Systems Applications,
   Medicine, 70:62–76, 2016.                                            pages 345–359, 2012.
[Min et al., 2009] Wu Min, Xiaoli Li, Chee Keong Kwoh,                [Zhu et al., 2016] Jia Zhu, Qing Xie, Shoou-i Yu, and
   and See Kiong Ng. A core-attachment based method to                  Wai Hung Wong. Exploiting link structure for web page
   detect protein complexes in ppi networks. BMC Bioinfor-              genre identification. Data Mining and Knowledge Discov-
   matics, 10(1):169, 2009.                                             ery, 30(3):550–575, 2016.
[On, 2008] Byung Won On. Social network analysis on
   name disambiguation and more. In International Confer-
   ence on Convergence and Hybrid Information Technology,
   pages 1081–1088, 2008.
[Perozzi et al., 2014] Bryan Perozzi, Rami Al-Rfou, and
   Steven Skiena. Deepwalk: online learning of social repre-
   sentations. In ACM SIGKDD International Conference on
   Knowledge Discovery and Data Mining, pages 701–710,
   2014.
[Ryan et al., 2013] P. B. Ryan, M. J. Schuemie, E Welebob,
   J Duke, S Valentine, and A. G. Hartzema. Defining a refer-
   ence set to support methodological research in drug safety.
   Drug Safety, 36(1):33–47, 2013.
[Saless, 2005] Fatieh Saless. Review of fda guidance on
   risk management: Good pharmacovigilance practices and
   pharmacoepidemiologic assessment. Genetic Engineering
   News, 25(18):14–0, 2005.
[Shen et al., 2016] B Shen, B Hu, and H Zhang. Method for
   the analysis of the preferences of network users. Networks
   Iet, 5(1):8–12, 2016.

                                                                 45