=Paper=
{{Paper
|id=Vol-1810/EuroPro_paper_07
|storemode=property
|title=None
|pdfUrl=https://ceur-ws.org/Vol-1810/EuroPro_paper_07.pdf
|volume=Vol-1810
|dblpUrl=https://dblp.org/rec/conf/edbt/ArdagnaCDL17
}}
==None==
Scouting Big Data Campaigns using TOREADOR Labs

Claudio A. Ardagna, Paolo Ceravolo
Università degli Studi di Milano, Computer Science Department
Crema, CR 26013, Italy
{claudio.ardagna,paolo.ceravolo}@unimi.it

Marcello Leida
Taiger
Madrid 28036, Spain
marcello.leida@taiger.com

Ernesto Damiani
Consorzio Interuniversitario Nazionale per l’Informatica
Rome 00198, Italy
ernesto.damiani@unimi.it

ABSTRACT
TOREADOR¹ Labs offer a Big Data Analytics-as-a-Service environment for testing simplified but real-life Big Data analytics vertical scenarios. Users are challenged with requirements, described from a business perspective, and are requested to compare alternative options, investigating the consequences of their choices. This “trial and error” approach brings out the interconnections and interferences among the different design stages typically addressed in preparing a Big Data campaign.

Keywords
Parallel computing methodologies; Modeling and simulation.

1. INTRODUCTION
Today, the complexity of the architectures supporting Big Data Analytics (BDA) and their lack of standardisation represent a huge barrier to the adoption of Big Data technologies, especially for organisations and SMEs that lack sufficient competences and skills. Another major hindering factor is the so-called “regulatory barrier”, that is, concerns about violating data access, sharing and custody regulations when using BDA, together with the high cost of obtaining legal clearance for specific scenarios, which discourages companies, particularly SMEs, from taking up BDA.

Project TOREADOR aims to overcome some of these hurdles by providing a platform that supports customers lacking Big Data expertise in the management of BDA and the deployment of a full Big Data pipeline [2]. Users with different skills and expertise can benefit from TOREADOR. Users lacking proper data science expertise (e.g., modeling, analysis, problem solving) can use TOREADOR to prepare the actual analytics, reason on data to uncover hidden patterns and information, and solve business problems. Users lacking the expertise of data engineers (e.g., building a robust and fault-tolerant data pipeline, installing a Big Data system) can use TOREADOR to automatically identify and deploy the set of technologies that meets their requirements. Users lacking both types of expertise can use TOREADOR for a proper initiation into the Big Data realm.

2. BIG DATA-AS-A-SERVICE
Big Data Analytics-as-a-Service (BDAaaS) consists of a set of automatic tools and methodologies that allow customers to design BDA and deploy a full Big Data pipeline addressing their goals [1]. BDAaaS can be seen as a function that takes as input users’ Big Data goals and preferences, and returns as output a ready-to-execute Big Data pipeline.

While the declarative goals underlying the use of Big Data services are usually industry-dependent, we argue that identifying a core set of standard indicators is an important step towards increasing the transparency of the commitments taken by Big Data service providers, as well as the awareness of users adopting a Big Data solution. Indicators provide a way of measuring or assessing a business goal, such as analytics tasks or regulatory constraints on personal data protection, and are accompanied by Big Data objectives representing the target to be achieved to fulfil the goal.

3. TOREADOR LABS
The model-driven approach adopted by TOREADOR supports the creation of a virtual environment particularly suited for training Big Data professionals using a “trial and error” approach. This environment helps users understand the interrelations and interferences among the different design options available when preparing a BDA.

In this context, the TOREADOR Labs provide free, limited access to TOREADOR using a Platform-as-a-Service solution. They offer a simplified version of real-life vertical scenarios and success stories organised in a set of challenges, where trainees are requested to identify alternative options and investigate the consequences of their choices. Note that this kind of experience is usually not available in the professional Big Data platforms on the market today, where the architectural and data complexity makes it difficult to compare different runs of a composite BDA.

REFERENCES
[1] E. Damiani, C. A. Ardagna, and P. Ceravolo. Big data analytics as-a-service: Issues and challenges. In Proceedings of the 3rd International Workshop on Privacy and Security of Big Data (PSBD). IEEE, 2016.

[2] M. Leida, C. Ruiz, and P. Ceravolo. Facing big data variety in a model driven approach. In Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI), 2016 IEEE 2nd International Forum on, pages 1–6. IEEE, 2016.

¹ This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the TOREADOR project, grant agreement No 688797; Project Coordinator: Prof. Ernesto Damiani, CINI, Italy; Project web site: http://www.toreador-project.eu/.

2017, Copyright is with the authors. Published in the Workshop Proc. of the EDBT/ICDT 2017 Joint Conference (March 21, 2017, Venice, Italy) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0
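
The view in Section 2 of BDAaaS as a function from goals and preferences to a ready-to-execute pipeline, with indicators and objectives attached to each goal, can be illustrated with a minimal Python sketch. This is not the TOREADOR platform's API: every name here (Objective, Goal, Pipeline, bdaaas, unmet_objectives, the stage labels, the indicator names and the "processing" preference key) is a hypothetical choice made only for illustration, and the stage-selection rule is a deliberately naive assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Objective:
    """Target that an indicator must reach for a goal to count as fulfilled."""
    indicator: str                 # hypothetical indicator name, e.g. "accuracy"
    target: float
    higher_is_better: bool = True

    def is_met(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


@dataclass(frozen=True)
class Goal:
    """A declarative business goal together with its objectives."""
    name: str
    objectives: Tuple[Objective, ...]


@dataclass(frozen=True)
class Pipeline:
    """Ordered stage labels standing in for a deployable pipeline."""
    stages: Tuple[str, ...]


def bdaaas(goals: List[Goal], preferences: Dict[str, str]) -> Pipeline:
    """Map goals and preferences to a pipeline (toy rule, assumed for illustration)."""
    mode = preferences.get("processing", "batch")
    stages = ["ingest[" + mode + "]"]
    # A data protection goal adds an anonymisation stage before any analytics.
    if any(g.name == "data_protection" for g in goals):
        stages.append("anonymise")
    # Every other goal contributes one analytics stage.
    for g in goals:
        if g.name != "data_protection":
            stages.append("analytics[" + g.name + "]")
    stages.append("report")
    return Pipeline(tuple(stages))


def unmet_objectives(goals: List[Goal], measurements: Dict[str, float]) -> List[str]:
    """Return the indicators whose measured value is missing or misses its target."""
    missed = []
    for g in goals:
        for o in g.objectives:
            value = measurements.get(o.indicator)
            if value is None or not o.is_met(value):
                missed.append(o.indicator)
    return missed


goals = [
    Goal("data_protection", (Objective("k_anonymity", 5),)),
    Goal("churn_prediction", (Objective("accuracy", 0.8),)),
]
pipeline = bdaaas(goals, {"processing": "stream"})
print(pipeline.stages)
print(unmet_objectives(goals, {"k_anonymity": 5, "accuracy": 0.75}))
```

Running the same goals with different preferences, or checking different runs with unmet_objectives, loosely mirrors the “trial and error” comparison of design options that the Labs are meant to support.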