=Paper= {{Paper |id=Vol-1810/EuroPro_paper_07 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-1810/EuroPro_paper_07.pdf |volume=Vol-1810 |dblpUrl=https://dblp.org/rec/conf/edbt/ArdagnaCDL17 }} ==None== https://ceur-ws.org/Vol-1810/EuroPro_paper_07.pdf
Scouting Big Data Campaigns using TOREADOR Labs

Claudio A. Ardagna, Paolo Ceravolo
Università degli Studi di Milano, Computer Science Department
Crema, CR 26013, Italy
{claudio.ardagna,paolo.ceravolo}@unimi.it

Marcello Leida
Taiger
Madrid 28036, Spain
marcello.leida@taiger.com

Ernesto Damiani
Consorzio Interuniversitario Nazionale per l’Informatica
Rome 00198, Italy
ernesto.damiani@unimi.it

ABSTRACT
TOREADOR Labs¹ offer a Big Data Analytics-as-a-Service environment for testing simplified but real-life Big Data analytics vertical scenarios. Users are challenged with requirements, described from a business perspective, and are requested to compare alternative options, investigating the consequences of their choices. This “trial and error” approach exposes the interconnections and interferences of the different design stages typically addressed in preparing a Big Data campaign.

Keywords
Parallel computing methodologies; Modeling and simulation.

1. INTRODUCTION
Today, the complexity of the architectures supporting Big Data Analytics (BDA) and their lack of standardisation represent a huge barrier to the adoption of Big Data technologies, especially for organisations and SMEs that lack the necessary competences and skills. Another major hindering factor is the so-called “regulatory barrier”, that is, concerns about violating data access, sharing and custody regulations when using BDA, and the high cost of obtaining legal clearance for specific scenarios, which discourages companies, particularly SMEs, from taking up BDA.

Project TOREADOR aims to overcome some of these hurdles by providing a platform that supports customers lacking Big Data expertise in the management of BDA and the deployment of a full Big Data pipeline [2]. Users with different skills and expertise can benefit from TOREADOR. Users lacking data science expertise (e.g., modeling, analysis, problem solving) can use TOREADOR to prepare the actual analytics, reason on data to find hidden patterns and information, and solve business problems. Users lacking data engineering expertise (e.g., building a robust and fault-tolerant data pipeline, installing a Big Data system) can use TOREADOR to automatically identify and deploy the set of technologies that meets their requirements. Users lacking both types of expertise can use TOREADOR for a proper initiation into the Big Data realm.

2. BIG DATA-AS-A-SERVICE
Big Data Analytics-as-a-Service (BDAaaS) consists of a set of automatic tools and methodologies that allow customers to design BDA and deploy a full Big Data pipeline addressing their goals [1]. BDAaaS can be seen as a function that takes as input users’ Big Data goals and preferences, and returns as output a ready-to-be-executed Big Data pipeline.

While the declarative goals underlying the use of Big Data services are usually industry-dependent, we argue that identifying a core set of standard indicators is an important step towards increasing the transparency of the commitments taken by Big Data service providers, as well as the awareness of users adopting a Big Data solution. Indicators provide a way of measuring or assessing a business goal, such as analytics tasks or regulatory constraints on personal data protection, and are accompanied by Big Data objectives representing the target to be achieved to fulfil the goal.

3. TOREADOR LABS
The model-driven approach adopted by TOREADOR supports the creation of a virtual environment particularly suited for training Big Data professionals using a “trial and error” approach. This environment supports users in understanding the interrelations and interferences of the different design options available when preparing a BDA.

In this context, the TOREADOR Labs provide free, limited access to TOREADOR using a Platform-as-a-Service solution. They propose simplified versions of real-life vertical scenarios and success stories organised as a set of challenges, where trainees are requested to identify alternative options and investigate the consequences of their choices. Note that this kind of experience is usually not available in the professional Big Data platforms currently on the market, where architectural and data complexity make it difficult to compare different runs of a composite BDA.

REFERENCES
[1] E. D. Claudio Ardagna, Paolo Ceravolo. Big data analytics as-a-service: Issues and challenges. In Proceedings of the 3rd International Workshop on Privacy and Security of Big Data (PSBD). IEEE, 2016.
[2] M. Leida, C. Ruiz, and P. Ceravolo. Facing big data variety in a model driven approach. In Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI), 2016 IEEE 2nd International Forum on, pages 1–6. IEEE, 2016.

¹ This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the TOREADOR project, grant agreement No 688797; Project Coordinator: Prof. Ernesto Damiani, CINI, Italy; Project web site: http://www.toreador-project.eu/.

2017, Copyright is with the authors. Published in the Workshop Proc. of the EDBT/ICDT 2017 Joint Conference (March 21, 2017, Venice, Italy) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0
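
Illustrative sketch. Section 2 describes BDAaaS as a function from users' goals and preferences to a ready-to-be-executed pipeline, with goals assessed through indicators and objectives. The Python fragment below is a minimal sketch of that idea and not the TOREADOR implementation. Every name in it (Indicator, Goal, Pipeline, bdaaas, the indicator names, the stage names, and the 60-second threshold) is a hypothetical assumption introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Indicator:
    """Hypothetical indicator: a measurable property plus its objective."""
    name: str         # e.g. "max_latency_seconds" (assumed name)
    objective: float  # target value to be achieved


@dataclass(frozen=True)
class Goal:
    """A business goal assessed through one or more indicators."""
    description: str
    indicators: Tuple[Indicator, ...]


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical stand-in for a ready-to-be-executed pipeline."""
    stages: Tuple[str, ...]
    processing_mode: str


def bdaaas(goals: List[Goal], preferences: Dict[str, str]) -> Pipeline:
    """Map goals and preferences to a pipeline (toy rules, assumed)."""
    objectives = {
        ind.name: ind.objective for goal in goals for ind in goal.indicators
    }

    # Assumed rule: a latency objective under 60 s forces stream processing;
    # otherwise the user's preference applies, defaulting to batch.
    mode = preferences.get("processing_mode", "batch")
    latency = objectives.get("max_latency_seconds")
    if latency is not None and latency < 60:
        mode = "stream"

    # Assumed rule: a data protection indicator adds an anonymisation stage.
    stages = ["ingest"]
    if objectives.get("anonymise_personal_data", 0) > 0:
        stages.append("anonymise")
    stages += ["analyse", "report"]

    return Pipeline(stages=tuple(stages), processing_mode=mode)


# Two alternative sets of goals, compared in a "trial and error" fashion.
strict = [
    Goal("Detect churn quickly", (Indicator("max_latency_seconds", 5.0),)),
    Goal("Protect personal data", (Indicator("anonymise_personal_data", 1.0),)),
]
relaxed = [
    Goal("Detect churn daily", (Indicator("max_latency_seconds", 86400.0),)),
]

strict_pipeline = bdaaas(strict, {})
relaxed_pipeline = bdaaas(relaxed, {})
print(strict_pipeline)
print(relaxed_pipeline)
```

Running the function on two alternative goal sets and comparing the resulting pipelines mirrors, in miniature, the comparison of design options that the Labs ask trainees to carry out.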