<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">D3M: Automated Data-Driven Decision Making</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Carles</forename><surname>Farré</surname></persName>
							<email>farre@essi.upc.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Universitat Politècnica de Catalunya</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<region>Catalonia</region>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Javier</forename><surname>Flores</surname></persName>
							<email>jflores@essi.upc.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Universitat Politècnica de Catalunya</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<region>Catalonia</region>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sergi</forename><surname>Nadal</surname></persName>
							<email>snadal@essi.upc.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Universitat Politècnica de Catalunya</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<region>Catalonia</region>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alejandra</forename><surname>Volkova</surname></persName>
							<email>alejandra.volkova@upc.edu</email>
							<affiliation key="aff0">
								<orgName type="institution">Universitat Politècnica de Catalunya</orgName>
								<address>
									<settlement>Barcelona</settlement>
									<region>Catalonia</region>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">D3M: Automated Data-Driven Decision Making</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">ED578F95C81CEDB5367E45428E152B2A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T10:27+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Data-driven software engineering</term>
					<term>Data integration</term>
					<term>Decision making</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Data has an undoubted impact on society. Storing, processing, and analyzing large amounts of available data is currently one of the key success factors for an organization. Nonetheless, we are recently witnessing a change represented by huge and heterogeneous amounts of data. Thus, to carry out these data exploitation tasks, organizations must first perform data integration, combining data from multiple sources to yield a unified view over them. In this paper, we report on the Automated Data-Driven Decision Making (D3M) project, whose main objective is to provide a mature software solution for automatic data integration with advanced decision making capabilities.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The importance of data in today's society is unquestionable. A large share of organizations base their business model on the collection, storage, and analysis of any data relevant to their business. This vision implies a radical change in the management of organizations' operations, where the collected data can be analyzed to generate relevant information for making informed decisions. This paper presents the D3M project, an acronym for Automated Data-Driven Decision Making. D3M<ref type="foot" target="#foot_0">1</ref> is a two-year project that started in December 2021, funded by the Spanish research agency under the National Spanish Program for Research Aimed at the Challenges of Society 2020 (PdC 2021). It aims to address the current challenge of democratizing access to independent data sources to gain deeper analytical insights via automatic data integration and domain-specific decision making.</p><p>D3M is run by the integrated Software, Services, Information and Data Engineering (inSSIDE<ref type="foot" target="#foot_1">2</ref>) research group at the Universitat Politècnica de Catalunya (UPC). inSSIDE is composed of two subgroups: (i) the Software and Service Engineering (GESSI) group<ref type="foot" target="#foot_2">3</ref> and (ii) the Database Technologies and Information Management (DTIM) group<ref type="foot" target="#foot_3">4</ref>. Together, these two subgroups cover the relevant aspects of software engineering and data engineering that lay the foundations for D3M. Altogether, the D3M research team is composed of 7 senior researchers, 4 postdocs, 1 PhD student, and 1 MSc student. Moreover, the D3M use cases include external support from end users, such as epidemiologists and junior software developers, to validate our proposal.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">Context</head><p>D3M is grounded on research carried out in the lines of automated data integration and domain-specific decision making. Here, we provide an overview of each.</p><p>Automating data integration tasks. Data integration is a well-studied area aimed at facilitating transparent access to various heterogeneous data sources <ref type="bibr" target="#b0">[1]</ref>. A prominent approach to data integration is exposing a knowledge graph conceptualizing the domain of interest to offer a uniform query interface over the sources. Queries over the knowledge graph are rewritten over the sources via schema mappings. The maintenance of such constructs (e.g., evolving the knowledge graph, adding new sources and mappings) is an arduous and manually intensive task that hinders the ability of such systems to flexibly adapt and provide right-time integration <ref type="bibr" target="#b1">[2]</ref>. This limitation has been coined the data variety challenge, which refers to the complexity of providing on-demand integration over a vast and evolving set of data sources. Dataspaces, which are data integration systems embracing a pay-as-you-go approach by gradually integrating data sources as needed, represent a significant step toward tackling the variety challenge. With the vision of reducing the usual upfront and maintenance costs, dataspaces call for the adoption of a flexible and dynamic approach where different integration tasks are automated. One of them, known as bootstrapping <ref type="bibr" target="#b2">[3]</ref>, is the process of automatically generating the knowledge graph driven by the data sources, with the goal of incrementally building the query interface and mappings to query such heterogeneous data sources in an integrated manner.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Domain-specific decision making.</head><p>Organizations need easy access to informed decision making based on Key Performance Indicators (KPIs) relevant to their business. However, creating decision making support systems is expensive, time-consuming, and error-prone. The use of domain-specific, operationalized quality models offering actionable analytics from heterogeneous sources has been successful in multiple domains (e.g., software analytics <ref type="bibr" target="#b3">[4]</ref>). It enables many analytics scenarios, from current situation assessment to prediction and what-if analysis. In a recent systematic review, data integration and final data aggregation were reported as part of the remaining challenges in Big Data analytics <ref type="bibr" target="#b4">[5]</ref>. At the same time, current approaches should analyze more than one artifact and focus on integrating data from different sources to obtain a holistic view <ref type="bibr" target="#b5">[6]</ref>. Thus, to enable domain-specific strategic indicators and data-driven decision making, it becomes necessary to facilitate the integration of data sources driven by the real information needs of end users.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">Background</head><p>This project builds upon two research assets reported in the project Generation and Evolution of Smart APIs (GENESIS), funded by the National Spanish Program for Research Aimed at the Challenges of Society 2016: a dataspace management system (hereafter referred to as ODIN), and a software analytics tool (hereafter referred to as Strategic Dashboard). Both assets have been successfully validated as prototypes in pilot projects.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ODIN.</head><p>ODIN (short for On-demand Data INtegration) is a dataspace management system grounded on knowledge graphs <ref type="bibr" target="#b6">[7]</ref>. ODIN is conceived to overcome the limitations of traditional virtual data integration in large-scale scenarios where data variety plays a key role <ref type="bibr" target="#b7">[8]</ref>. Figure <ref type="figure" target="#fig_0">1</ref> depicts how ODIN supports the complete lifecycle of dataspaces. ODIN automatically extracts the schemata from structured (e.g., relational) and semi-structured (e.g., JSON) data sources and translates them into a canonical data model. To that end, a set of production rules parses their metadata and generates source graphs. These are aligned while considering user feedback throughout this process. As a result, ODIN generates provenance graphs (PG) tracing the results of the previous stages. A PG is a target-agnostic metadata construct describing the integration of a particular set of data sources. It captures the results of bootstrapping the sources and aligning their schemata, and guarantees that target-specific metadata can be generated from it. Thus, PGs are used to generate specific constructs of a given integration tool, such as conjunctive query (CQ)-oriented graphs, which expose the sources' schemata in first normal form and are then linked via local-as-view (LAV) schema mappings that connect elements of the sources' schemata to the global graph. LAV mappings characterize the sources in terms of a query over the knowledge graph, making them inherently more suitable in data variety settings, where new sources may be added or outdated sources removed dynamically. Despite the benefits ODIN provides in terms of data integration, its query interface is limited to technical users familiar with semantic web technologies. 
Thus, there is a gap between such a low-level interface and the advanced capabilities that decision makers need in their organizations (e.g., progress indicators, what-if analysis). Additionally, the Strategic Dashboard is tightly coupled to the Distributed Data Sink, built ad hoc for a specific domain. This hinders the integration of new data sources and the calculation of new types of metrics, quality factors, and strategic indicators.</p></div>
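<div xmlns="http://www.tei-c.org/ns/1.0"><p>The schema extraction step described above for semi-structured sources can be illustrated with a minimal sketch. It is not ODIN's actual rule set: the function name bootstrap, the predicates type, hasAttribute, and hasMember, and the dotted naming scheme are all hypothetical choices made for illustration. The sketch applies one toy production rule per JSON construct and collects the resulting schema-level triples, which together form a small source graph.</p><p>
```python
def bootstrap(name, value, triples=None):
    """Apply one toy production rule per JSON construct and collect schema triples.

    Hypothetical vocabulary: "type", "hasAttribute", "hasMember".
    """
    if triples is None:
        triples = []
    if isinstance(value, dict):
        # Rule 1: a JSON object becomes an Object node with one edge per key.
        triples.append((name, "type", "Object"))
        for key, child in value.items():
            child_name = name + "." + key
            triples.append((name, "hasAttribute", child_name))
            bootstrap(child_name, child, triples)
    elif isinstance(value, list):
        # Rule 2: a JSON array becomes an Array node.
        triples.append((name, "type", "Array"))
        if value:
            # Assumption: arrays are homogeneous, so the first element is representative.
            item_name = name + "[]"
            triples.append((name, "hasMember", item_name))
            bootstrap(item_name, value[0], triples)
    else:
        # Rule 3: a primitive value is typed with its Python type name.
        triples.append((name, "type", type(value).__name__))
    return triples


# Hypothetical sample document from a semi-structured source.
sample = {"id": 7, "tags": ["a"]}
source_graph = bootstrap("src", sample)
for triple in source_graph:
    print(triple)
```
</p><p>In a real dataspace, graphs of this kind would still have to be aligned across sources with user feedback, as described above; the sketch covers only the extraction of a single source's schema.</p></div>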
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Objectives</head><p>The main objective of D3M is to adapt and integrate these two independent tools, ODIN and the Strategic Dashboard, into a unified product that brings together the benefits of both: (i) enabling the integration of disparate data sources in an incremental manner and (ii) providing advanced support on top of them for decision making processes via advanced dashboard interfaces. The project's main objective further decomposes into four specific objectives:</p><p>• O1: Data-driven semi-automatic bootstrapping. Provide means to enable an incremental semi-automatic extraction of the knowledge graph from a set of heterogeneous and independent data sources. This objective takes ODIN's core as its starting point and will endow it with new support for enriching the knowledge graph with day-to-day concept vocabulary and with domain-specific quality models. • O2: Integrated data exploration interface. Enable data wrangling tasks (navigational queries on tabular and semantic data) from heterogeneous data sources federated through a knowledge graph. This refers to a new exploitation feature required by our industrial partners (as an alternative to traditional decision making support). • O3: Customized decision making support. Enable the creation of an advanced dashboard that spans heterogeneous data sources applying domain-specific quality models to assist decision makers. This objective generalizes the available Strategic Dashboard to provide support in any domain, with improved techniques. • O4: Unified product to support the end-to-end decision making process over heterogeneous data sources. 
O4 integrates the results of O1-O3; i.e., it features incremental bootstrapping of the knowledge graph from the data sources of interest, and two kinds of exploitation: decision making support based on strategic indicators and data exploration based on data wrangling.</p><p>To achieve such objectives, D3M proposes the architecture presented in Figure <ref type="figure" target="#fig_2">3</ref>. D3M serves two types of data consumers: data wranglers (for data exploration) and decision makers (for advanced analytics and data-driven quality models). Besides, it requires interaction with other users for managing the system metadata, such as domain experts (for enriching the bootstrapped knowledge graph with day-to-day concepts and a domain-specific quality model) and data stewards (for assisting in the configuration of the alignments among heterogeneous data sources). While the integrated architecture proposed in Figure <ref type="figure" target="#fig_2">3</ref> offers the benefits of both ODIN and the Strategic Dashboard, it also innovates by boosting the automated decision making process: it links heterogeneous data sources to the defined quality model via knowledge graphs, thereby facilitating and largely automating the calculation of the strategic indicators and their visualization for the decision makers. Besides the aforementioned objectives, D3M also aims to attain the following ones:</p><p>• O5: Incremental technology transfer of the proof of concept. Execute a technology transfer plan to ensure an incremental evolution of the maturity level of the software components developed for D3M via validation and demonstration of the proposed proof of concept. • O6: Assessment of the viability of the proof of concept. Perform a market analysis to assess the technical, commercial, and social viability of the proposed product, and uncover evolutionary paths for D3M to become a product adapted to current industry needs. 
• O7: Long-term sustainability of the proof of concept. Cultivate a broad network of industry and public sector contacts to create awareness and attract prospective customers. • O8: Intellectual property right assurance. Develop a strategy for managing the intellectual and industrial property rights of the developed proof of concept. • O9: Endowing the project team with entrepreneurship skills. Define a training plan with a list of entrepreneurship courses and monitor its execution. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Use cases</head><p>Here, we describe two industrial projects that serve as use cases for D3M. Currently, each one is evolving ODIN and the Strategic Dashboard in parallel, so that the improvements achieved can be applied to the overall D3M project. For each, we describe the use case context and how adopting D3M can support the organization's decision making needs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Development of an imaging platform for Malaria and Neglected Tropical Diseases (NTDs)</head><p>A recent study by the World Health Organization shows that in 2018 an estimated 228 million cases of malaria occurred worldwide, the majority in the African region<ref type="foot" target="#foot_4">6</ref>. The SDG targets 3.3 and 3.8 call for an end to such epidemics by 2030. The main goal of this project is to develop an imaging platform that uses artificial intelligence techniques for the automated diagnosis of Malaria, Tuberculosis, and NTDs. The specific objectives are: (i) create an open source image bank and database; (ii) develop an image diagnostic system based on image analysis using artificial intelligence techniques; (iii) develop software for Android phones to move the microscopy slides and to perform image acquisition, image analysis, and diagnosis; (iv) model the laboratory management software to be able to import the microscopic images and resend them to the general microscopy image bank; (vi) establish a quality control of the slide preparation, digital microscopic images, and image diagnosis; (vii) validate the imaging platform in the field.</p><p>The role of D3M in this use case is to empower epidemiologists to cross-analyze diagnosis data predicted automatically by the imaging platform with other contextual data collected from the available data sources. For instance, the analysis of comorbidity, or coinfections, represents a paradigm change in how diseases are treated. Traditionally, individual diagnoses were performed for each analyzed disease. However, major disease outbreaks have shown that previous conditions can impact the diagnosis. Similarly, many countries (un)intentionally fail to report new infection cases, either due to limited resources or political issues. To get a complete picture of the situation, crossing country-reported data with other sources (e.g., the amount of medicine requested) may indicate the actual prevalence. However, the data integration needed to calculate these indicators is far from trivial, especially in the case of NTDs, which lack systematic data collection, and in developing countries with minimal resources. To that end, as depicted in Figure <ref type="figure" target="#fig_4">4</ref>, D3M presents the user with a knowledge graph conceptualizing all domain elements of interest, which are further linked to the different available data sources. With D3M, epidemiologists will be able to cross different data sources guided by relevant strategic indicators from the analytical dashboard, thus obtaining a more realistic and complete picture of the situation, and making a paradigm shift from a disease-centric to a patient-centric analysis. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Software Analytics</head><p>The domain of Software Analytics is broad and can be applied to various environments. This use case focuses on the higher education (i.e., universities) domain via the project "Implementation of a dashboard for monitoring the progress of software projects developed by student teams", which can be easily extrapolated to scenarios with teams of junior software developers. The project aims to allow both students and professors to receive accurate and objective feedback on the individual and team learning process, so that informed decisions can be made about prioritizing, planning, and evaluating their actions throughout the project. To this end, an onboarding step will help junior developers learn how to extract insights and make data-driven decisions from the information generated by the dashboard. D3M comes into play through the Strategic Dashboard, which was adapted to this domain by creating new Connectors, defining specific Quality Metrics, and customizing several Visualizations. As future work, it would be helpful to incorporate ODIN to provide dataspace management support for the Connectors and the Quality Model.</p><p>The first connector, which we created for GitHub, a provider of Internet hosting for software development and version control using Git, allows us to collect information about commits, modified lines, and issues. Another connector was made for Taiga, a free and open-source project management system for startups and agile developers; it supplies data about Scrum resources, such as the user stories and tasks to be addressed in each Sprint. 
Based on the information provided by these connectors, we defined different metrics: (i) the percentage of commits made by each team member and the corresponding modified lines; (ii) the percentage of tasks assigned to each developer; (iii) the percentage of closed tasks per assignee; (iv) the number of tasks without an assignee; and (v) the standard deviation of the number of commits or tasks. In addition to these metrics, we decided to focus on the quality and correctness of the information that team members enter when they use Taiga or GitHub. For instance, we check (vi) whether acceptance criteria are used when a user story is created, (vii) whether a standard user story pattern is applied, and (viii) whether commits contain a reference to an actual task. Together, these metrics help to monitor the progress of software projects, some from the point of view of project management and others from the point of view of code development.</p><p>In the team project metrics visualizations (see Figure <ref type="figure" target="#fig_5">5</ref>), the current evaluation is displayed; it is calculated according to the formula configured for each metric from the data collected by the connectors on that particular day. A half-circle gauge with different color categories shows the exact value of the metric, rounded to the nearest hundredth. The categories are customizable: the number of colors and the limits of each color can be defined in the way that best suits each metric. Apart from the current evaluation, it is possible to visualize the historical data of the metrics through a line graph, that is, their evolution over time, to monitor progress as the course progresses. </p></div>
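<div xmlns="http://www.tei-c.org/ns/1.0"><p>Some of the team metrics listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the formulas configured in the Strategic Dashboard: the record layout (author, assignee, closed), the member names, and the function names are hypothetical, percentages are computed over assigned records only, and the population standard deviation is assumed for the spread of commits.</p><p>
```python
from collections import Counter
from statistics import pstdev

# Hypothetical records, shaped like what a GitHub or Taiga connector might deliver.
commits = [
    {"author": "ana"}, {"author": "ana"}, {"author": "ana"},
    {"author": "ben"},
]
tasks = [
    {"assignee": "ana", "closed": True},
    {"assignee": "ana", "closed": False},
    {"assignee": "ben", "closed": True},
    {"assignee": None, "closed": False},
]


def share_per_member(records, key):
    """Percentage of records attributed to each member (unassigned records are ignored)."""
    counts = Counter(r[key] for r in records if r[key] is not None)
    total = sum(counts.values())
    return {member: round(100 * n / total, 2) for member, n in counts.items()}


def closed_share_per_assignee(task_records):
    """Percentage of each assignee's tasks that are already closed."""
    assigned = Counter(t["assignee"] for t in task_records if t["assignee"] is not None)
    closed = Counter(
        t["assignee"] for t in task_records if t["assignee"] is not None and t["closed"]
    )
    return {member: round(100 * closed[member] / n, 2) for member, n in assigned.items()}


def unassigned_count(task_records):
    """Number of tasks without an assignee."""
    return sum(1 for t in task_records if t["assignee"] is None)


def commit_spread(commit_records):
    """Population standard deviation of the per-member commit counts."""
    counts = Counter(c["author"] for c in commit_records)
    return pstdev(list(counts.values()))


print(share_per_member(commits, "author"))
print(closed_share_per_assignee(tasks))
print(unassigned_count(tasks))
print(commit_spread(commits))
```
</p><p>A large spread in commits, or a member with a much lower share than the others, is the kind of signal these metrics are meant to surface; how such values map to gauge colors remains a per-metric configuration choice.</p></div>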
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Relevance to information science</head><p>The underlying research carried out in the context of the D3M project addresses a broad spectrum of challenges related to the information science field. Indeed, since D3M is a project oriented to the development of a software prototype, it falls within the area of Information Systems and their Engineering. Additionally, given that data integration is at the heart of D3M, it naturally fits the Data and Information Management area, and Data Science. Furthermore, considering the applicability of the project results to industry via use cases, D3M is also relevant for the Domain-specific IS Engineering area (e.g., for the health or educational domains).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Open lines of research</head><p>Numerous open lines of research arise from D3M. A key question to be addressed is how far we can automate the process of data integration. In other words, where is the sweet spot that allows automating manual and cumbersome tasks without compromising the quality of the results obtained when the user is involved? It is already known that a fully automated approach to data integration is not feasible, given that there will always exist some level of uncertainty and ambiguity. Nevertheless, we strive to minimize the effort required from users to address these cases.</p><p>Another scientific challenge that D3M should face is how to create and assess domain-specific strategic indicators for any domain. In this regard, we have already identified some of the issues that must be addressed in the future: (i) enabling the on-demand and incremental definition of metrics, factors, and strategic indicators; (ii) defining and implementing a comprehensive catalog of visualizations for such metrics/factors/indicators; and (iii) simplifying and automating as much as possible the configuration and deployment of strategic dashboards in new domains.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusions</head><p>In this paper, we have presented the D3M project, an ongoing two-year project that will combine and extend the efforts accomplished in ODIN and the Strategic Dashboard into a unified tool. This solution will provide data wranglers with the mechanisms to easily integrate heterogeneous data sources and the means to extract analytical insights for data-driven decisions. The features of D3M will be used in two industrial projects related to the domains of healthcare and software development. We believe the results of D3M will provide the following achievements: (i) a scalable and automated data integration life cycle, (ii) effectively democratized data access, (iii) advanced analytic models for predicting and optimizing outcomes, and (iv) a set of user-friendly dashboards to assist non-technical end users with exploratory and analytical tasks. Therefore, D3M can have a significant impact on the industry.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: High-level overview of ODIN</figDesc><graphic coords="3,109.46,110.00,392.51,156.40" type="bitmap" /></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>The Strategic Dashboard.</head><p>The Strategic Dashboard is a modular, configurable, and extensible software analytics tool used in Agile Software Development projects to improve the software development process and the quality of the software produced. The Strategic Dashboard (Figure 2) enables decision makers to define their own Quality Model, which is composed of quality-related Strategic Indicators (e.g., customer satisfaction, process performance, risk level) [4], decomposed in turn into Quality Factors related to system development and usage (e.g., development speed, software performance). Quality Factors are defined over different Quality Metrics (e.g., commits per day, duplicated lines of code, software response time). The Strategic Dashboard automatically performs a quality assessment of the whole quality model defined. Raw data is collected from multiple sources of information, such as development tools used by the software development team (e.g., JIRA, GitHub) and software usage from end users (e.g., software logs). All the information is collected through Connectors that feed a Distributed Data Sink from which the quality metrics, quality factors, and strategic indicators are computed bottom-up. The quality assessment enables the Strategic Dashboard to perform several analyses that are provided to the Decision Maker:</p><p>• Visualization of the current (and historical) status of software products and development processes through an easy-to-use interface with advanced navigational capabilities. • What-if analysis techniques enable decision makers to evaluate different scenarios based on the impact of metrics on quality factors and, further on, on the strategic indicators. • Forecasting techniques estimate the values of the strategic indicators and quality factors in a time frame, to predict and anticipate future issues in the software development process. • Semi-automatic generation of new requirements in response to alerts when a quality model element (typically, a strategic indicator) drops to unsatisfactory levels of quality [9].</p></div>
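<div xmlns="http://www.tei-c.org/ns/1.0"><p>The bottom-up computation of a quality model, from metrics to quality factors to strategic indicators, can be sketched as follows. The text does not state which aggregation function the Strategic Dashboard uses, so the weighted average below is an assumption, as are the weights, the metric values (taken to be normalized to the range 0 to 1), and the metric name closed_tasks_ratio.</p><p>
```python
def weighted_average(values_and_weights):
    """Aggregate (value, weight) pairs into a single score."""
    total_weight = sum(w for _, w in values_and_weights)
    return sum(v * w for v, w in values_and_weights) / total_weight


def assess(model, inputs):
    """Compute each element of one model level from the scores of the level below."""
    return {
        name: weighted_average([(inputs[child], weight) for child, weight in children])
        for name, children in model.items()
    }


# Hypothetical normalized metric values, e.g. as computed from connector data.
metrics = {"commits_per_day": 0.75, "closed_tasks_ratio": 0.25, "response_time": 1.0}

# Each element maps to the (child, weight) pairs it is defined over (assumed weights).
factors = {
    "development_speed": [("commits_per_day", 1.0), ("closed_tasks_ratio", 1.0)],
    "software_performance": [("response_time", 1.0)],
}
indicators = {
    "process_performance": [("development_speed", 1.0), ("software_performance", 3.0)],
}

factor_scores = assess(factors, metrics)
indicator_scores = assess(indicators, factor_scores)
print(factor_scores)
print(indicator_scores)
```
</p><p>Because each level depends only on the scores of the level below, a what-if analysis amounts to changing a metric value and re-running the same two calls to assess.</p></div>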
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Strategic Dashboard architecture</figDesc><graphic coords="3,75.00,582.60,438.00,91.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Overview of the D3M concept</figDesc><graphic coords="5,131.00,72.00,332.80,408.64" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>3 and 3.8 call for an end to such kind of epidemics by 2030. The main goal of this project is to develop an imaging platform by using artificial intelligence techniques for automated diagnosis of Malaria, Tuberculosis, and NTDs. The specific objectives are: (i) create an open source image bank and database; (ii) develop an image diagnostic system by image analysis using artificial intelligence techniques; (iii) develop software for Android-phones to move the microscopy slides, images acquisition, image analysis, and diagnosis; (iv) model the laboratory management software to be able to import the microscopic images and resend them to the general microscopy image bank; (vi) establish a quality control of the slides preparation, digital microscopic images and image diagnosis; (vii) validate the imaging platform in the field.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Visualization of the health use case knowledge graph</figDesc><graphic coords="6,119.40,325.00,369.80,199.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Visualization of the current state (on the left) and historical evolution (on the right) of the metrics.</figDesc><graphic coords="7,72.00,325.00,451.60,122.90" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://d3m.upc.edu/en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://insside.upc.edu</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://gessi.upc.edu/en</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.essi.upc.edu/dtim</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">https://www.who.int/publications/i/item/9789241565721</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>This paper has been funded by the Spanish Agencia Estatal de Investigación (AEI) under project / funding scheme PDC2021-121195-I00.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Data Integration: A Theoretical Perspective</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lenzerini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems</title>
				<meeting>the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="233" to="246" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Graph-driven Federated Data Management</title>
		<author>
			<persName><forename type="first">S</forename><surname>Nadal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abelló</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vansummeren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Vassiliadis</surname></persName>
		</author>
		<ptr target="https://ieeexplore.ieee.org/document/9422168" />
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Survey of Directly Mapping SQL Databases to the Semantic Web</title>
		<author>
			<persName><forename type="first">J</forename><surname>Sequeda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">H</forename><surname>Tirmizi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Corcho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge Eng. Review</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="issue">4</biblScope>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Continuously assessing and improving software quality with software analytics tools: a case study</title>
		<author>
			<persName><forename type="first">S</forename><surname>Martínez-Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Vollmer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jedlitschka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="68219" to="68239" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Critical analysis of Big Data challenges and analytical methods</title>
		<author>
			<persName><forename type="first">U</forename><surname>Sivarajah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">M</forename><surname>Kamal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Irani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Weerakkody</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Business Research</title>
		<imprint>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="page" from="263" to="286" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<ptr target="https://www.forrester.com/report/The+Forrester+Wave+Value+Stream+Management+Solutions+Q3+2020/-/E-RES159825" />
		<title level="m">The Forrester Wave™: Value Stream Management Solutions</title>
		<imprint>
			<date type="published" when="2020">Q3 2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">ODIN: A Dataspace Management System</title>
		<author>
			<persName><forename type="first">S</forename><surname>Nadal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Rabbani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tadesse</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference (ISWC)</title>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="185" to="188" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">An integration-oriented ontology to govern evolution in Big Data ecosystems</title>
		<author>
			<persName><forename type="first">S</forename><surname>Nadal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abelló</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Vassiliadis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vansummeren</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Syst.</title>
		<imprint>
			<biblScope unit="volume">79</biblScope>
			<biblScope unit="page" from="3" to="19" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Data-driven and Tool-supported Elicitation of Quality Requirements in Agile Companies</title>
		<author>
			<persName><forename type="first">M</forename><surname>Oriol</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Software Quality Journal</title>
		<imprint>
			<biblScope unit="volume">28</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="931" to="963" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
