
Open Drug Knowledge Graph

Mark Mann

Filip Ilievski

Mohammad Rostami

Aastha

aastha@usc.edu 1

Basel Shbita

shbitag@isi.edu 0

0 Information Sciences Institute, Marina del Rey, CA 90292, USA
1 University of Southern California, Los Angeles, CA 90007, USA

Automatic knowledge-based systems can assist medical professionals in making more informed recommendations and decisions. Unfortunately, as no comprehensive knowledge base (with both medical and non-medical knowledge) exists today, much manual effort is required to consolidate knowledge across sources that are heterogeneous in content and format. This paper proposes a knowledge-based method that aims to harmonize four such heterogeneous sources into a single drug-centric knowledge graph. The graph is based on the drugs found in Wikidata and extended with specialized sources through an extraction and transformation pipeline, including data acquisition, entity resolution, and semantic modeling. Our analyses show that the resulting graph and its embeddings can capture drug similarity through their associated symptoms and address common, knowledge-intensive medical search scenarios. As such, it holds the promise to be adapted for drug recommendation in the future. Given the modular setup of our method, new sources can be included to accommodate healthcare object use cases relating to diagnoses and claims. We make the resulting knowledge source available in both relational database and property graph format.

Healthcare systems rely heavily on the knowledge and experience of physicians for drug prescription based on the diagnosed symptoms of patients. Despite being dominant, this traditional process is limited to the knowledge scope of one person and faces several challenges:

1. Several different types of drugs may be appropriate to treat the same disease. Other (non-medical) factors such as price, accessibility, and insurance policy may help healthcare professionals reach optimal decisions in such situations.
2. Many healthcare professionals who are not physicians are not supposed to prescribe drugs in normal situations. Yet, in emergencies, they may need to act upon symptoms that they can diagnose in order to initiate treatment before an accurate examination can be performed by a doctor.
3. When a novel disease emerges, clinical data and standard treatment protocols are limited in the beginning, as with COVID-19. Physicians may want to search for all existing drugs which may have a positive effect given the observed symptoms of the disease and then repurpose them as potential early-stage treatment options [9].
4. Patients may desire to be more involved in the prescription process, e.g., by knowing more about particular drugs and their side effects. Patients may also need to find the right drug at a reasonable price, particularly in the case of over-the-counter drugs.

Creative Commons License Attribution 4.0 International (CC BY 4.0).

Automated knowledge-based systems could assist with such tasks, which involve intelligently searching a database to arrive at a valid conclusion. We observe that, while a set of very valuable sources is publicly available, no comprehensive database exists that can accommodate the listed challenges. Existing medicinal drug databases, e.g., DrugBank (footnote 3), are helpful, but they are more similar to specialized encyclopedias. These databases are mostly unstructured, with an abundance of thorough information about each entity, scattered across documents and not tailored to particular use cases. As a result, these sources are suboptimal for practicing medicine efficiently [11]. Consequently, the user must spend a considerable amount of time searching across disjointed databases to find the right treatment, consider non-medical constraints such as price, and avoid adverse interactions with current medications. For example, GoodRx (footnote 4) has structured drug prices and store availability for each medicine, but it can only be used for shopping after prescription, as it lacks a mapping of symptoms to drugs. WebMD (footnote 5) contains structured treatment data for each symptom but does not indicate which over-the-counter drugs could help. DrugBank is an open-source database that can help determine which drugs are safe to consume with the medications the patient is currently taking. However, the average person, or even physician, is not a computer scientist and cannot query this rich resource. Structured knowledge bases that integrate such distributed knowledge would help healthcare professionals overcome the above challenges and quickly obtain accurate answers to their queries. They can also assist patients in buying drugs at better prices and improve their shopping experience.

In this paper, we develop a structured database for drugs in the form of a knowledge graph (KG) [8]. KGs have been found helpful in AI-aided medicine, particularly for clinical decision support systems for diagnosis and treatment [3, 14]. Building KGs from unstructured medical data helps perform more complex tasks using AI, including predicting adverse drug reactions [2], drug discovery [10], drug repurposing [9], and predicting drug-drug interactions [5]. Our goal is to construct a KG that helps the user find drugs that can serve as potential treatments given a list of symptoms or a disease. Additionally, information about the availability of drugs at nearby stores is provided to the user. Building upon the existing healthcare literature [17, 16], our goal is to integrate existing sources to create a comprehensive, fast search experience for users who manage conditions, budgets, and adverse drug interactions for patients. By integrating multiple knowledge sources, we enable users to obtain more expressive search results quickly. Our knowledge graph builds on the mapping from symptoms to diseases, which helps to find possible drugs that can be used to treat a symptom. It also incorporates information on prices and drug availability, which helps the user narrow the search down to drugs that are affordable and available.

3 https://go.drugbank.com/releases/latest
4 https://www.goodrx.com/
5 https://www.webmd.com/drugs/2/conditions/index

We list the contributions of this paper as follows:

1. We present a pipeline for the extraction and consolidation of relevant knowledge about symptoms, drugs, and their interactions, as well as non-medical information, such as drug prices. We apply our pipeline to four relevant and complementary sources, resulting in an integrated knowledge base. (Section 2)
2. We make the resulting data publicly available, both in the form of a relational database and a knowledge graph (footnote 6). The two formats support complementary use cases.
3. We analyze the contents of the resulting database. We provide statistics of its constituent nodes and relations and run graph embedding-based queries to find similar products or drugs. (Section 3)
4. We assess the applicability of our integrated KG by designing a user-friendly web interface and showing its utility in two representative scenarios. (Section 4)

2 Approach

The overall architecture of our approach is shown in Figure 1. We start by describing the data acquisition from the four sources that we use in this paper: Wikidata [13], DrugBank, WebMD, and GoodRx (Section 2.1). We next describe their consolidation through entity linking and resolution between pairs of sources (Section 2.2). The resulting ontology of our data is described in Section 2.3.

2.1 Sources and data acquisition

We sought to construct a knowledge graph from several drug-centric data sources. Each source contributes a particular set of information about drugs, prices, and relations to conditions, which can be complementary. To ease the effort required in entity linkage, we chose a well-adopted, drug-centric external ID (DrugBank ID) as the primary key of our drug entity. For each data source, we identified the target features needed and devised methods for extracting the data:

1. Wikidata [13] is one of the largest publicly available knowledge graphs, describing over 90 million entities with more than a billion statements. To retrieve relevant data from Wikidata, we query it for medication (Q12140) entities with any DrugBank ID (P715) that treat any condition (P2175). We also retrieve additional, optional features: the medication's active ingredient in (P3780), significant drug interaction (P769), and ATC code (P267). This query returned a total of 1,560 rows.
2. DrugBank is a drug-centric database focused on drug-drug interactions and bioinformatics-related features. Its knowledge is provided as a data dump in XML format. We extracted all 2,166 drugs from DrugBank's XML dump, each with a maximum of 20 products and 100 interactions.
3. WebMD is a site focused on helping users search for treatments for a given condition. The site displays an index of all possible conditions, sorted alphabetically. For each condition, a list of drug treatments is provided. As the website provides no public API, we scraped its content programmatically. The crawler obtained a total of 58,921 condition-drug relations and 12,857 unique drugs. The extracted features include condition, product, user reviews, and prescription type.
4. GoodRx is a healthcare company that tracks prescription drug prices in the United States and provides free drug coupons for discounts on medications. As GoodRx does not provide a public API service, we extracted knowledge on GoodRx's drug products directly from their website, starting from a Wikidata-based seed list. The features of drug products that we extracted were: zip code, store, price type, price, and price link. A total of 20,688 prices and 23 stores were extracted for the 997 matched drug products.

6 https://www.kaggle.com/mannbrinson/open-drug-knowledge-graph
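The Wikidata retrieval in item 1 can be illustrated with a minimal sketch that assembles a SPARQL query from the identifiers named above. It is not the query used to build the graph: the use of instance of (P31) to select medications, the variable names, and the `build_query` helper are assumptions made for illustration. The `wd:` and `wdt:` prefixes are those predefined by the Wikidata Query Service.

```python
# Wikidata identifiers named in the text.
MEDICATION = "Q12140"            # medication
REQUIRED = {                      # properties every result must have
    "drugbank_id": "P715",        # DrugBank ID
    "condition": "P2175",         # medical condition treated
}
OPTIONAL = {                      # properties retrieved only when present
    "active_ingredient_in": "P3780",
    "interacts_with": "P769",     # significant drug interaction
    "atc_code": "P267",
}


def build_query() -> str:
    """Assemble a SPARQL query string (assumed shape, for illustration)."""
    variables = " ".join(f"?{name}" for name in ["drug", *REQUIRED, *OPTIONAL])
    lines = [
        f"SELECT {variables} WHERE {{",
        # Assumption: medications are selected via 'instance of' (P31).
        f"  ?drug wdt:P31 wd:{MEDICATION} .",
    ]
    for name, prop in REQUIRED.items():
        lines.append(f"  ?drug wdt:{prop} ?{name} .")
    for name, prop in OPTIONAL.items():
        lines.append(f"  OPTIONAL {{ ?drug wdt:{prop} ?{name} . }}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_query())
```

Keeping the required and optional properties in separate mappings mirrors the distinction made in the text: a medication without a DrugBank ID or a treated condition is dropped, while a missing optional feature leaves the row in place.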

Table 1: Wikidata-WebMD matching results.

Metric             Value
All pairs          178,519
Pairs matched      1,701
Pairs found        38
Recall (%)         97.36
Precision (%)      100
True positives     0.973
False positives    0
True negatives     1
False negatives    0.026

An entity resolution step follows the data extraction step. As the entities across sources are originally disjoint, linking them is essential for constructing a well-connected drug knowledge graph. To avoid introducing false positives, we first perform entity resolution across sources based on their external drug identifier (DrugBank ID). In this way, the DrugBank ID allowed us to link all data sources to Wikidata in a 'hub-and-spoke' manner. This design choice enriches the information about the entities found in Wikidata but excludes the remaining entities in the other three sources, which are not mapped to Wikidata through the DrugBank ID. For this reason, we consider further linking on these entities. Specifically:

Wikidata to DrugBank: Linkage occurred only between Wikidata drugs (containing a DrugBank ID) and the subset of DrugBank drugs with a matching DrugBank ID. In this case, we did not perform fuzzy matching, as we found it to decrease the overall quality of matching. DrugBank IDs were found on 787 Wikidata medications.

Wikidata to GoodRx: We matched Wikidata and GoodRx based on an exact matching query on the GoodRx website. URL requests to GoodRx return a result if the drug name is matched exactly and otherwise give a 404 error. Of all 1,560 Wikidata drug products, we found 997 matches in GoodRx (recall of 63.9%).

Wikidata to WebMD: Due to the absence of a shared identifier between Wikidata and WebMD, we resorted to fuzzy matching between their drug products. A scoring function created a match for a pair if the pair had a Jaccard similarity greater than 0.7. For each search term, bi-gram sets were generated before the Jaccard similarity was calculated. A development set of 50 true pairs was manually compiled to enable the evaluation of this matching approach. Hash-based blocking on the entity's first two characters was used to reduce the candidate pairs from 20M to 178k. Our scoring function obtained 97.3% recall and 100% precision on these development pairs. Detailed results are shown in Table 1. We judge this level of error to be acceptable; thus, we proceed with this linkage strategy.
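The fuzzy matching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: it assumes character bi-grams over lower-cased names, and the function names (`bigrams`, `jaccard`, `block_key`, `match`) are hypothetical. The 0.7 threshold and the two-character blocking key are taken from the description above.

```python
from collections import defaultdict


def bigrams(name: str) -> set:
    """Character bi-grams of a lower-cased, stripped name (assumed tokenization)."""
    text = name.lower().strip()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: intersection size over union size (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def block_key(name: str) -> str:
    """Blocking key: the first two characters of the name."""
    return name.lower().strip()[:2]


def match(wikidata_names, webmd_names, threshold=0.7):
    """Return (wikidata, webmd, score) triples with score above the threshold.

    Only names sharing a blocking key are compared, and one Wikidata name
    may match several WebMD names (one-to-many).
    """
    blocks = defaultdict(list)
    for name in webmd_names:
        blocks[block_key(name)].append(name)

    matches = []
    for wd in wikidata_names:
        wd_grams = bigrams(wd)
        for wm in blocks.get(block_key(wd), []):
            score = jaccard(wd_grams, bigrams(wm))
            if score > threshold:
                matches.append((wd, wm, score))
    return matches
```

Blocking trades a small amount of recall for a large reduction in comparisons: two names that differ in their first two characters are never scored, which is one way a true pair can be missed.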

An overview of the number of entities mapped between Wikidata and each of the three other sources is shown in Figure 2. We allowed only one-to-one matches for the Wiki-DrugBank and Wiki-GoodRx matching tasks. However, we allowed one-to-many matches for Wiki-WebMD drug product matching. This is because WebMD displayed many suffix variations for a product (e.g., Adriamycin vial, AdriamycinPfs Solution) that we wanted to include in our graph. This design choice allowed for more matches (1,701) than products existing in Wikidata (1,560).

The ontology was designed in a top-down manner to fit our ultimate goal of enabling queries that connect patients with treatments based on their search parameters. We preserved all binary relations (treatment, interaction, active ingredient in, and drug price) and used them to model information in all their suitable sources. To contain the scope of our proof-of-concept, we selected only these relations from Wikidata and DrugBank while using all extracted relations from WebMD and GoodRx. We decided to categorize drug-like entities into two nodes, drug and product, to represent the active ingredient and its name in the market, respectively. Our ontology map is described in Figures 2 and 3. It is a simple yet powerful ontology, which allows us to achieve the project goals, including: (a) storing the mapping from symptoms to drugs, (b) capturing drug interactions, and (c) capturing drug prices and their variation across stores and zip codes.
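To make the drug/product split and the four binary relations concrete, the following is a rough sketch of the ontology as Python data classes. The class and field names are hypothetical simplifications chosen for illustration (the price fields follow the GoodRx features listed in Section 2.1); the actual schema is the one shown in the figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Drug:
    """Active ingredient, keyed by its DrugBank ID."""
    drugbank_id: str
    name: str


@dataclass(frozen=True)
class Product:
    """Marketed name of a drug; links back to its active ingredient."""
    name: str
    drugbank_id: str  # 'active ingredient in' relation


@dataclass(frozen=True)
class Treatment:
    """Binary relation: a drug treats a condition."""
    drugbank_id: str
    condition: str


@dataclass(frozen=True)
class Interaction:
    """Binary relation: two drugs interact."""
    drugbank_id_a: str
    drugbank_id_b: str


@dataclass(frozen=True)
class Price:
    """Price of a product at a given store and zip code."""
    product: str
    store: str
    zipcode: str
    price_type: str
    price: float
    price_link: Optional[str] = None
```

Keying every relation on the DrugBank ID reflects the choice of that identifier as the primary key of the drug entity; products attach to drugs rather than to each other.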

Figure 3 is an Entity-Relation diagram of the entities in our relational data model, created after entity linkage was completed. The relational data model was stored in a MySQL instance and used as a back-end for our front-end application (Section 4.1). Figure 4 is composed of the same data but expressed as a property graph and stored in Neo4j. The property graph format opened opportunities for us to leverage Neo4j's robust graph-centric libraries for path-finding, centrality, and computation of embeddings. After the extraction and entity linkage steps, the resulting data model was loaded into a MySQL instance using a separate Python script. The relational schema is displayed in Figure 3. The resulting relational database was then exported in .csv format and loaded into Neo4j. Neo4j import commands were used to load the data and create nodes and edges corresponding to the data model in Figure 4.

3 Analysis

In this section, we analyze the contents of our knowledge base. First, we provide basic statistics (Section 3.1). Then, we compute drug embeddings and cluster them to investigate possible emerging patterns in the graph (Section 3.2).

3.1 Statistics

After we loaded the open drug knowledge graph into MySQL, we computed statistics of the coverage of each class across different sources. The results are shown in Table 2. Each source contributed a different profile of features, and some sources contributed unique classes. Specifically, DrugBank distinctly contributed the 'Manufacturer' and 'Interaction' classes, while GoodRx contributed the 'Store' and 'Price' classes. Each source was linked back to Wikidata as a centralized source, based on the linkage methods described above. This shows the benefit of integrating sources with complementary foci in a single knowledge source, which is ultimately more than the sum of its parts.

3.2 Graph Embedding Analysis

We sought to further explore the higher-level structure of the extracted knowledge graph via graph embeddings. Our goal was to embed the treatment(drug, condition) relation to confirm whether drugs that treat similar conditions are clustered together. If drugs are clustered in this fashion, the graph embeddings could in the future enable drug recommendations for providers, given a source drug. Our embedding is built from all 3,654 instances in our treatment table, sourced from Wikidata. We used the Ampligraph [6] and Tensorboard [1] libraries with the TransE [4] and ComplEx [12] models to project our data into a 150-dimensional embedding space. Training ran for 200 epochs with an Adam optimizer. A training set was generated from 90% of the data, with the remaining 10% set aside as testing data.
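For reference, the scoring functions of the two models in their original formulations [4, 12] can be written as follows. The notation is introduced here for illustration only: $\mathbf{e}_s$, $\mathbf{r}_p$, and $\mathbf{e}_o$ denote the embeddings of the subject, relation, and object of a triple, $d$ is the embedding dimension, and the choice of norm for TransE is left unspecified because it is a configurable hyperparameter.

```latex
% TransE: real-valued embeddings; a true triple should satisfy e_s + r_p \approx e_o.
f_{\mathrm{TransE}}(s, p, o) = -\lVert \mathbf{e}_s + \mathbf{r}_p - \mathbf{e}_o \rVert

% ComplEx: complex-valued embeddings; the bar denotes complex conjugation.
f_{\mathrm{ComplEx}}(s, p, o) = \operatorname{Re}\Big( \sum_{k=1}^{d} r_{p,k} \, e_{s,k} \, \overline{e_{o,k}} \Big)
```

In both cases a higher score indicates a more plausible triple, which is what the ranking-based evaluation relies on.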

In Table 3, our embedding models are evaluated using the following entity ranking metrics described by Wang et al. [15]: 1) mean reciprocal rank (MRR), and 2) Hits@K. MRR is the mean of the reciprocal ranks that the embedding model assigns to unseen test triples. A model that ranks known true triples (i.e., test triples) closer to the top is considered superior at predicting missing links. The Hits@K metric computes how many elements of a vector of rankings make it into the top K positions.

For visualization, we used the embeddings from the ComplEx model, which performed best on our entity ranking tasks. We reduced these embeddings to 3-d using t-SNE [7] as our dimensionality reduction method. We then inspected the result for nearest neighbors based on cosine similarity in the initial embedding space. Figures 5 and 6 show selected results from our embedding visualization. The visualizations are from Tensorboard and use the aforementioned model parameters and visualization settings.

We found that drugs that treat similar conditions are somewhat clustered in this embedding space, while similar conditions are grouped together. In Figure 5a, we find the 10 nearest cosine neighbors of the source drug "insulin aspart" among drugs. Two of the neighbors are also insulin variants. However, more domain expertise is required to judge whether this clustering is a meaningful representation of drugs that treat similar conditions. For conditions, in Figure 5b, the 10 nearest cosine neighbors are located for "bipolar disorder". Many of the neighbors logically represent similar conditions, such as "mood disorder", "schizophrenia", and "anxiety". Further experiments are required to confirm how meaningful these initial embeddings can be for recommending drug products. Other relations that may help to achieve a drug similarity embedding include the ICD-10 codes of the treatment's condition or the products of the treatment's drug. Nevertheless, these embeddings show early signs of progress towards drug recommendation via nearest neighbor search within a graph embedding space.
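The two ranking metrics can be illustrated with a short, library-independent sketch. The function names and the example ranks are hypothetical; the ranks are not results from our experiments, and the definitions follow the standard formulations rather than any specific library's implementation.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples ranked within the top k positions."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks assigned by a model to five held-out true triples.
example_ranks = [1, 3, 2, 10, 50]

if __name__ == "__main__":
    print("MRR:", mean_reciprocal_rank(example_ranks))
    print("Hits@3:", hits_at_k(example_ranks, 3))
    print("Hits@10:", hits_at_k(example_ranks, 10))
```

Because MRR uses reciprocal ranks, a single badly ranked triple (rank 50 in the example) lowers the score far less than it would lower an average of raw ranks.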

Fig. 5: Graph Embedding Visualization. Visualization of all entities within the reduced embedding created by the ComplEx embedding model. In Figure 5a, the source entity 'insulin aspart' is selected. We observe clustering for this entity in the embedding space. In Figure 5b, 'bipolar disorder' is selected, which also exists within an observable cluster of similar entities.

In this section, we present our web interface, which allows users to explore the relational data model. We also explore the associated property graph to gain motivation for future hypotheses and functionality. We prepared a web interface for our Intelligent Drug Shopper, shown in Figures 8 and 9. The web interface was developed using the Python Django framework. The user can input search parameters for a patient's condition, current medications, and price range. These parameters are inserted into a SQL query template that checks our data model for any matching results.

For example, in Figure 8, a patient presents with osteoarthritis and has a budget of 20 dollars to spend on medicine. These parameters are entered, and the query retrieves the matching treatments, their active ingredients, and their average prices. The user can then navigate to different views of the Active Ingredient or Product entity via hyperlinks.
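A minimal sketch of such a parameterized query template is given below. It uses Python's built-in `sqlite3` module as a stand-in for MySQL and Django, and the table names, column names, and toy rows are hypothetical simplifications rather than the schema in Figure 3 or data from the graph.

```python
import sqlite3

# Hypothetical, simplified schema (not the schema shown in Figure 3).
SCHEMA = """
CREATE TABLE treatment (drug TEXT, condition TEXT);
CREATE TABLE product (name TEXT, drug TEXT);
CREATE TABLE price (product TEXT, price REAL);
CREATE TABLE interaction (drug_a TEXT, drug_b TEXT);
"""

# Products that treat a condition, whose average price fits the budget and
# whose active ingredient does not interact with a current medication.
QUERY = """
SELECT p.name, p.drug, AVG(pr.price) AS avg_price
FROM treatment t
JOIN product p ON p.drug = t.drug
JOIN price pr ON pr.product = p.name
WHERE t.condition = :condition
  AND NOT EXISTS (
      SELECT 1
      FROM interaction i
      JOIN product cur ON cur.name = :current_medication
      WHERE (i.drug_a = p.drug AND i.drug_b = cur.drug)
         OR (i.drug_b = p.drug AND i.drug_a = cur.drug)
  )
GROUP BY p.name, p.drug
HAVING AVG(pr.price) <= :budget
ORDER BY avg_price
"""


def search(conn, condition, budget, current_medication=None):
    """Run the template with bound parameters (no string formatting)."""
    params = {"condition": condition, "budget": budget,
              "current_medication": current_medication}
    return conn.execute(QUERY, params).fetchall()


def demo_connection():
    """In-memory database with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO treatment VALUES (?, ?)", [
        ("ingredient_a", "osteoarthritis"),
        ("ingredient_b", "osteoarthritis"),
    ])
    conn.executemany("INSERT INTO product VALUES (?, ?)", [
        ("Brand A", "ingredient_a"),
        ("Brand B", "ingredient_b"),
        ("Brand C", "ingredient_c"),
    ])
    conn.executemany("INSERT INTO price VALUES (?, ?)", [
        ("Brand A", 10.0), ("Brand A", 14.0),
        ("Brand B", 18.0), ("Brand B", 30.0),
    ])
    conn.execute("INSERT INTO interaction VALUES (?, ?)",
                 ("ingredient_a", "ingredient_c"))
    return conn
```

With the toy rows, `search(demo_connection(), "osteoarthritis", 20)` keeps only the product whose average price is within budget. When no current medication is given, the bound value is NULL, the inner join yields no rows, and the interaction filter has no effect.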

To demonstrate the further search capabilities that our data model provides, consider Figure 9. Extending the search from Figure 8, a patient may also be taking a current medication such as Zyvox, an antibiotic. This parameter is added to the search, and we find that many of the treatments previously recommended in Figure 8 are removed because they interact with this antibiotic. This feature will enable users to find treatments that avoid adverse drug interactions, while still treating a condition and adhering to the patient's budget.

In addition to exploring the relational database via the web application, we also directed queries to the equivalent property graph stored in Neo4j. In Figure 9, we consider a user search for treatments of "medullary thyroid carcinoma" and all possible drug interactions with these treatments. The resulting visualization shows two possible treatments (cluster centers), with drug interactions branching outward. We can see that there is an intersection of six drugs that interact with both treatments. Therefore, if a patient is currently prescribed a drug in this intersection, they cannot safely be prescribed either of the two treatments. Neo4j was used to produce this visualization. As the data model is loaded as a property graph in Neo4j, we can leverage Neo4j's wide set of graph analytics tools to compute such paths automatically. We also plan to use Neo4j to compute centrality metrics over our graph in the future.

5 Discussion and future work

Hub graph: While we were able to link the data sources with Wikidata, the chosen design methodology has both benefits and drawbacks. In our design, we link all sources back to Wikidata in a 'hub-and-spoke' fashion. No other sources are permitted to link to each other. This design extends the Wikidata knowledge graph, enabling new drug features (e.g., drug price) to be analyzed together with all other nodes connected to medication (Q12140). The drawback of this approach is that Wikidata does not contain nearly as many drug or drug product entities as DrugBank, which creates a bottleneck in the number of entity links that can be made with other data sources. Depending on the application, the extension of Wikidata with drug-centric data may be less important. In this case, we would suggest using DrugBank as the centralized source for entity linkage to maximize the number of links on drug and drug product entities with other data sources.

Integration of more data sources: To answer even more healthcare-centric questions, we propose to extend the knowledge graph with additional healthcare datasets. These datasets could relate to healthcare objects, such as prescriptions, procedures, diagnoses, claims, providers, payers, and healthcare facilities. Many of these datasets are made publicly available by government-run healthcare agencies, such as the Food and Drug Administration (FDA, footnote 7), the National Institutes of Health (NIH, footnote 8), and the Centers for Medicare & Medicaid Services (CMS, footnote 9). Standard identifiers for each healthcare object are very common and, therefore, reduce the amount of fuzzy matching required to extend the knowledge graph. For example, the FDA gives drugs a National Drug Code (NDC), representing labeler, product, and package size. CMS gives each healthcare provider a National Provider Identifier (NPI). ICD-10 codes can be used to label medical conditions.

Drug product similarity in embedding space: A future hypothesis to check is whether our knowledge graph can enable drug product similarity search via kNN search within graph embeddings. This application would enable healthcare workers to find similar products for a source product. The embedding should be produced so that products cluster together in embedding space if they treat similar conditions (e.g., mental disorders, nervous, ocular). Additional sources, such as ICD(condition, ICD code), must be integrated to enable ground-truth checking of clusters.

7 https://www.fda.gov/home
8 https://www.nih.gov/
9 https://www.cms.gov/
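A kNN search of this kind can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, the vectors are toy 3-d values rather than learned embeddings, and a brute-force scan is assumed instead of an approximate nearest-neighbor index.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(source, embeddings, k=10):
    """Return the k entities most similar to `source` by cosine similarity."""
    query = embeddings[source]
    scored = [(name, cosine(query, vec))
              for name, vec in embeddings.items() if name != source]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-d vectors, purely illustrative; the embeddings in Section 3.2 are 150-d.
toy_embeddings = {
    "product_a": [1.0, 0.0, 0.0],
    "product_b": [0.9, 0.1, 0.0],
    "product_c": [0.0, 1.0, 0.0],
    "product_d": [-1.0, 0.0, 0.0],
}

if __name__ == "__main__":
    for name, score in nearest_neighbors("product_a", toy_embeddings, k=2):
        print(name, round(score, 3))
```

A brute-force scan is adequate for a few thousand entities; a larger graph would likely call for an indexed search, which is outside the scope of this sketch.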

Application features: Currently, our application does not support searching based on multiple conditions or multiple current medications. To support this, our query templates must be updated to allow for these additional search parameters. Another improvement would be to enable eager fuzzy n-gram searching, triggered by characters entered in real time, to find a matching indexed search term. This can be achieved by indexing keywords and running real-time searches against the index. This feature would make user searches more likely to succeed compared to the current functionality.

6 Conclusion

In this paper, we proposed the Open Drug Knowledge Graph: an integrated drug-centric data model that enables customers to make well-informed purchasing decisions by including prices, availability, and drug interactions in a single view, without having to reference the fine print about drug interactions. This data model leverages healthcare objects stored in pre-existing knowledge bases and integrates knowledge from previously disjoint systems. Our acquisition pipeline consists of three key steps: source data acquisition, entity resolution, and ontology mapping. When performing entity linkage, external drug identifiers such as the DrugBank ID were heavily used to reduce the need for fuzzy matching. We created a web application to visualize the relational data model (MySQL) and showed its potential to be used by healthcare workers and patients to inform treatment decisions. The model was also loaded into a property graph (Neo4j), which was anecdotally shown to enable visualization and network analytics on the graph. We computed graph embeddings on the treatment class using the TransE and ComplEx models. Nearest neighbor search based on cosine distance over these embeddings showed their potential to aid product and condition searches. We expect that such a single integrated source can help users make medically safe and financially smart decisions. Future work should investigate the graph's usefulness for customers, integrate additional sources, and explore novel ways to leverage the data through graph centrality and path-finding methods. To facilitate further exploration and development of drug knowledge graphs, the data model (footnote 10) is made publicly available to the research community.

10 https://www.kaggle.com/mannbrinson/open-drug-knowledge-graph

1. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: Large-scale machine learning on heterogeneous systems (2015), http://tensorflow.org/, software available from tensorflow.org

2. Bean, D.M., Wu, H., Iqbal, E., Dzahini, O., Ibrahim, Z.M., Broadbent, M., Stewart, R., Dobson, R.J.: Knowledge graph prediction of unknown adverse drug reactions and validation in electronic health records. Scientific reports 7(1), 1–11 (2017)

3. Bisson, L.J., Komm, J.T., Bernas, G.A., Fineberg, M.S., Marzo, J.M., Rauh, M.A., Smolinski, R.J., Wind, W.M.: Accuracy of a computer-based diagnostic program for ambulatory patients with knee pain. The American journal of sports medicine 42(10), 2371–2376 (2014)

4. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems. vol. 26, pp. 2787–2795. Curran Associates, Inc. (2013), https://proceedings.neurips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf

5. Celebi, R., Uyar, H., Yasar, E., Gumus, O., Dikenelli, O., Dumontier, M.: Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings. BMC bioinformatics 20(1), 1–14 (2019)

6. Costabello, L., Pai, S., Van, C.L., McGrath, R., McCarthy, N., Tabacof, P.: AmpliGraph: a Library for Representation Learning on Knowledge Graphs (Mar 2019). https://doi.org/10.5281/zenodo.2595043

7. Hinton, G.E., Roweis, S.: Stochastic neighbor embedding. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems. vol. 15, pp. 857–864. MIT Press (2003), https://proceedings.neurips.cc/paper/2002/file/6150ccc6069bea6b5716254057a194ef-Paper.pdf

8. Li, L., Wang, P., Yan, J., Wang, Y., Li, S., Jiang, J., Sun, Z., Tang, B., Chang, T.H., Wang, S., et al.: Real-world data medical knowledge graph: construction and applications. Artificial intelligence in medicine 103, 101817 (2020)

9. Mohanty, S., Rashid, M.H.A., Mridul, M., Mohanty, C., Swayamsiddha, S.: Application of artificial intelligence in covid-19 drug repurposing. Diabetes & Metabolic Syndrome: Clinical Research & Reviews (2020)

10. Sang, S., Yang, Z., Wang, L., Liu, X., Lin, H., Wang, J.: Sematyp: a knowledge graph based literature mining method for drug discovery. BMC bioinformatics 19(1), 1–11 (2018)

11. Shen, Y., Colloc, J., Jacquet-Andrieu, A., Lei, K.: Emerging medical informatics with case-based reasoning for aiding clinical decision in multi-agent system. Journal of biomedical informatics 56, 307–317 (2015)

12. Trouillon, T., Welbl, J., Riedel, S., Gaussier, E., Bouchard, G.: Complex embeddings for simple link prediction (2016)

13. Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Communications of the ACM 57(10), 78–85 (2014)

14. Wang, M., Liu, M., Liu, J., Wang, S., Long, G., Qian, B.: Safe medicine recommendation via medical knowledge graph embedding. arXiv preprint arXiv:1710.05980 (2017)

15. Wang, Y., Ruffinelli, D., Gemulla, R., Broscheit, S., Meilicke, C.: On evaluating embedding models for knowledge base completion (2019)

16. Zamborlini, V., Hoekstra, R., Silveira, M., Pruski, C., ten Teije, A., van Harmelen, F.: Generalizing the detection of clinical guideline interactions enhanced with lod. In: International Joint Conference on Biomedical Engineering Systems and Technologies. pp. 360–386. Springer (2016)

17. Zamborlini, V., Hoekstra, R., Silveira, M., Pruski, C., ten Teije, A., van Harmelen, F.: Inferring recommendation interactions in clinical guidelines. Semantic Web 7(4), 421–446 (2016)