LOD Surfer API: A Web API for LOD Surfing Using Class-Class Relationships in Life Sciences

Atsuko Yamaguchi1, Kouji Kozaki2, Yasunori Yamamoto1, Hiroshi Masuya3,4, and Norio Kobayashi4,3

1 Database Center for Life Science (DBCLS), Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba, 277-0871 Japan
{atsuko,yy}@dbcls.rois.ac.jp
2 The Institute of Scientific and Industrial Research (ISIR), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, 567-0047 Japan
kozaki@ei.sanken.osaka-u.ac.jp
3 BioResource Center (BRC), RIKEN, 3-1-1 Koyadai, Tsukuba, Ibaraki, 305-0074 Japan
hiroshi.masuya@riken.jp
4 Advanced Center for Computing and Communication (ACCC), RIKEN, 2-1 Hirosawa, Wako, Saitama, 351-0198 Japan
norio.kobayashi@riken.jp

Abstract. Linked Open Data (LOD) is being increasingly used when publishing life science databases. To facilitate the flexible use of such databases, we employ a method that uses federated query search along a path of class–class relationships. To demonstrate our strategy, we implemented a prototype system accessible via a web API as a preliminary trial. We have been collecting SPARQL Builder Metadata (SBM) that describe a data schema for each SPARQL endpoint. Employing the SBM, we constructed a graph of the class–class relationships in LOD. Using this graph, our system can provide the information required to construct a federated search query, such as the SPARQL endpoints that include a class–class relationship, and paths of the class–class relationships between two classes.

Keywords: Linked Open Data, class–class relationships, database integration, federated query search

1 Introduction

LOD Surfer is a search system that discovers data along paths of class–class relationships over life sciences Linked Open Data (LOD) provided by different SPARQL endpoints.
A user can interactively obtain desired data by inputting data or a class, and selecting an output class and a path of class–class relationships between the classes.

In a previous study [1], we developed a system called SPARQL Builder (http://www.sparqlbuilder.org/) that allows users to build a SPARQL query for a SPARQL endpoint without a thorough understanding of RDF. For this system, we have been collecting metadata that describe a data schema for each SPARQL endpoint. These metadata are called SPARQL Builder Metadata (SBM). By applying our SPARQL endpoint crawler program to the SPARQL endpoints obtained from YummyData [2], we collected SBM for 76 datasets from 43 SPARQL endpoints (as of July 2017).

In this study, we implemented a web API for LOD Surfer as a preliminary trial, using technologies developed for SPARQL Builder such as SBM and class graphs. This enables flexible data acquisition from multiple SPARQL endpoints for users who do not have detailed knowledge of the data schemas over LOD.

2 LOD Surfer API

The LOD Surfer API is designed to make it easy to develop a search system such as LOD Surfer. Using this API, a list of SPARQL endpoints, the classes reachable using links from a given class, and the paths between two given classes can be obtained in JSON format, along with a federated SPARQL query and the result obtained for a given path, by a simple HTTP GET request.

Fig. 1. Overview of the architecture of the LOD Surfer API

Figure 1 shows an overview of the architecture of the LOD Surfer API. From the SBM collected by the LOD crawler, a single graph is constructed with 14147 nodes, each corresponding to a class appearing in LOD. Using this graph, the output generator can efficiently compute the output of this API. For example, from the connected components of the graph, a list of classes reachable from a given class can be instantly obtained.
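The role of the connected components can be illustrated with a minimal sketch. It is not the code of the LOD Surfer API: the toy edge list, the function names, and the choice to treat class–class relationships as undirected links are all assumptions made for illustration. The sketch precomputes the component of every class once, so that the classes reachable from a given class become a simple lookup, and it finds one shortest path between two classes by breadth-first search.

```python
from collections import deque

# Hypothetical toy class graph; the real graph is built from SBM.
EDGES = [("Gene", "Protein"), ("Protein", "Disease"), ("Taxon", "Organism")]


def build_graph(edges):
    """Build an undirected adjacency map (assumption: links work both ways)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def connected_components(adjacency):
    """Map every class to the frozenset of classes in its component."""
    component_of = {}
    for start in adjacency:
        if start in component_of:
            continue
        members = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in members:
                    members.add(neighbor)
                    queue.append(neighbor)
        frozen = frozenset(members)
        for member in frozen:
            component_of[member] = frozen
    return component_of


def reachable_classes(component_of, cls):
    """Look up the classes reachable from cls, excluding cls itself."""
    return sorted(component_of[cls] - {cls})


def shortest_path(adjacency, source, goal):
    """Return one shortest path of class-class links, or None if none exists."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


GRAPH = build_graph(EDGES)
COMPONENTS = connected_components(GRAPH)
```

With this toy graph, `reachable_classes(COMPONENTS, "Gene")` returns `['Disease', 'Protein']`, and `shortest_path(GRAPH, "Gene", "Disease")` returns `['Gene', 'Protein', 'Disease']`. Classes in different components, such as Gene and Taxon, have no connecting path, so the search returns `None`.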
A typical use of this API is to obtain, for a set of IDs typed by one class (e.g., Gene), related data typed by other classes (e.g., Disease, Protein, etc.) from various datasets. The code for this web API can be accessed at https://github.com/LODSurfer/lodsurfer-api. This system is an early prototype and will be improved to make it more practical for users. In addition, the development of an application using this API will be the focus of our future work.

Acknowledgments

This work was supported by JSPS KAKENHI grant numbers 17K00434, 17K00424 and 17H01789 and by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST).

References

1. Yamaguchi, A., Kozaki, K., Lenz, K., Yamamoto, Y., Masuya, H., Kobayashi, N.: Semantic Data Acquisition by Traversing Class-Class Relationships Over Linked Open Data. The 6th Joint International Semantic Technology Conference (JIST 2016), LNCS 10055, 136–151 (2016)
2. Yamamoto, Y., Yamaguchi, A., Splendiani, A.: Umaka-Yummy Data: A Place to Facilitate Communication between Data Providers and Consumers. 9th International Conference Semantic Web Applications and Tools for Life Sciences, CEUR Workshop Proceedings 1795 (2016)