Scalable graph analytics with GRADOOP [Abstract]

Erhard Rahm
University of Leipzig
Augustusplatz 10
Leipzig, Germany
rahm@informatik.uni-leipzig.de

ABSTRACT

Many Big Data applications in business and science require the management and analysis of huge amounts of graph data. Previous approaches for graph analytics such as graph databases and parallel graph processing systems (e.g., Pregel) either lack sufficient scalability or flexibility and expressiveness. We are therefore developing a new end-to-end approach for graph data management and analysis at the Big Data center of excellence ScaDS Dresden/Leipzig. The system is called Gradoop (Graph analytics on Hadoop). Gradoop is designed around the so-called Extended Property Graph Data Model (EPGM) which supports semantically rich, schema-free graph data within many distinct graphs. A set of high-level operators is provided for analyzing both single graphs and sets of graphs. The operators are usable within a domain-specific language to define and run data integration workflows (for integrating heterogeneous source data into the Gradoop graph store) as well as analysis workflows. The Gradoop data store is currently utilizing HBase for distributed storage of graph data in Hadoop clusters. An initial version of Gradoop is operational and has been used for analyzing graph data for business intelligence and social network analysis.

About the Author

Erhard Rahm is full professor for databases at the computer science institute of the University of Leipzig. His current research focusses on Big Data and data integration. He has authored several books and more than 200 peer-reviewed journal and conference publications. His research on data integration and schema matching has been awarded several times, in particular with the renowned 10-year best-paper award of the conference series VLDB (Very Large Databases) and the Influential Paper Award of the conference series ICDE (Int. conf. on Data Engineering). Prof. Rahm is one of the two scientific coordinators of the new German center of excellence on Big Data ScaDS (competence center for SCAlable Data services and Solutions) Dresden/Leipzig that started its operation in Oct. 2014.

27th GI-Workshop on Foundations of Databases (Grundlagen von Datenbanken), 26.05.2015 - 29.05.2015, Magdeburg, Germany. Copyright is held by the author/owner(s).
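
The abstract describes the EPGM only in words: schema-free, labeled graph data spread over many distinct graphs, with operators for single graphs and for sets of graphs. The following Python sketch illustrates one way such a model could be represented in memory. It is not Gradoop's implementation or API. All names (Element, Vertex, Edge, PropertyGraphDB, select, combine) and the sample data are hypothetical, chosen only for this illustration, and the two operators merely stand in for the kind of graph-set and single-graph operators mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class Element:
    """Hypothetical common shape: id, type label, free-form (schema-free) properties."""
    id: int
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Vertex(Element):
    # Ids of the logical graphs this vertex belongs to (may be several).
    graphs: Set[int] = field(default_factory=set)


@dataclass
class Edge(Element):
    source: int = 0
    target: int = 0
    # Ids of the logical graphs this edge belongs to (may be several).
    graphs: Set[int] = field(default_factory=set)


@dataclass
class PropertyGraphDB:
    """Hypothetical container: one pool of vertices and edges shared by many graphs."""
    vertices: Dict[int, Vertex] = field(default_factory=dict)
    edges: Dict[int, Edge] = field(default_factory=dict)
    graphs: Dict[int, Element] = field(default_factory=dict)  # graph heads

    def vertices_of(self, graph_id: int) -> List[Vertex]:
        return [v for v in self.vertices.values() if graph_id in v.graphs]

    def edges_of(self, graph_id: int) -> List[Edge]:
        return [e for e in self.edges.values() if graph_id in e.graphs]

    def select(self, predicate: Callable[[Element], bool]) -> List[int]:
        """Illustrative graph-set operator: ids of graphs whose head satisfies the predicate."""
        return [g.id for g in self.graphs.values() if predicate(g)]

    def combine(self, new_id: int, label: str, first: int, second: int) -> Element:
        """Illustrative single-graph operator: new graph holding the union of two graphs."""
        head = Element(new_id, label)
        self.graphs[new_id] = head
        for item in list(self.vertices.values()) + list(self.edges.values()):
            if item.graphs & {first, second}:
                item.graphs.add(new_id)
        return head


# Sample data (invented): two overlapping logical graphs over three persons.
db = PropertyGraphDB()
db.graphs[1] = Element(1, "Community", {"topic": "databases"})
db.graphs[2] = Element(2, "Community", {"topic": "graphs"})
db.vertices[10] = Vertex(10, "Person", {"name": "Alice"}, {1})
db.vertices[11] = Vertex(11, "Person", {"name": "Bob", "city": "Leipzig"}, {1, 2})
db.vertices[12] = Vertex(12, "Person", {"name": "Carol"}, {2})
db.edges[100] = Edge(100, "knows", {"since": 2014}, 10, 11, {1})
db.edges[101] = Edge(101, "knows", {}, 11, 12, {2})

graph_topic_ids = db.select(lambda g: g.properties.get("topic") == "graphs")
merged = db.combine(3, "Merged", 1, 2)
```

In this sketch a vertex such as the one with id 11 belongs to two logical graphs at once, and its property set differs from that of its neighbors, which is the sense in which the data is schema-free and organized "within many distinct graphs". A distributed store such as the HBase-based one mentioned in the abstract would persist these structures differently; the sketch makes no claim about that layout.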