Graz University of Technology
Institute for Software Technology
Inffeldgasse 16b/2
A-8010 Graz
Austria

Alexander Felfernig, Juha Tiihonen, and Paul Blazek, Editors

Proceedings of the 1st International Workshop on Personalization & Recommender Systems in Financial Services

April 16, 2015, Graz, Austria

Chairs

Alexander Felfernig, Graz University of Technology, Austria
Juha Tiihonen, University of Helsinki, Finland
Paul Blazek, cyLEDGE, Austria

Program Committee

Zoran Anišić, University of Novi Sad, Serbia
Mathias Bauer, mineway GmbH, Germany
Shlomo Berkovsky, NICTA, Australia
Paul Blazek, cyLEDGE, Austria
Robin Burke, DePaul University, IL, USA
Kuan-Ta Chen, Academia Sinica, Taiwan
Li Chen, Hong Kong Baptist University, China
Marco De Gemmis, University of Bari, Italy
John O’Donovan, University of California Santa Barbara, CA, USA
Alexander Felfernig, Graz University of Technology, Austria
Gerhard Friedrich, Alpen-Adria-Universitaet Klagenfurt, Austria
Hagen Habicht, CLIC, HHL Leipzig Graduate School of Management, Germany
Dietmar Jannach, TU Dortmund, Germany
Gerhard Leitner, Alpen-Adria-Universitaet Klagenfurt, Austria
Pasquale Lops, University of Bari, Italy
Hans Lundberg, Linnaeus University, Sweden
Eetu Mäkelä, Aalto University, Finland
Birgit Penzenstadler, California State University Long Beach, CA, USA
Giovanni Semeraro, University of Bari, Italy
Ian Sutherland, IEDC-Bled School of Management, Slovenia
Juha Tiihonen, Aalto University, Finland
Nava Tintarev, University of Aberdeen, UK
Shuang-Hong Yang, Twitter Inc., CA, US
Markus Zanker, Alpen-Adria-Universitaet Klagenfurt, Austria

Organizational Support

Martin Stettinger, Graz University of Technology, Austria

Preface

Personalization and recommendation technologies provide the basis for applications that are tailored to the needs of individual users. These technologies play an increasingly important role for financial service providers.
The selection of papers of this year's workshop demonstrates the wide range of techniques, including contributions on knowledge-based recommender systems, case-based reasoning, knowledge interchange, psychological aspects of recommender systems in financial services, MediaWiki-based recommendation technologies, smart data analysis and big data, and campaign customization. The workshop is of interest both to researchers working in the various fields of personalization and recommender systems and to industry representatives. It provides a forum for the exchange of ideas, evaluations, and experiences. As such, this year's workshop on "Personalization & Recommender Systems in Financial Services" aims at providing a stimulating environment for knowledge exchange among academia and industry and thus at building a solid basis for further developments in the field.

Alexander Felfernig, Juha Tiihonen, and Paul Blazek

Contents

Smart Data Analysis for Financial Services (invited talk)
    Mathias Bauer
Conflict Management in Interactive Financial Service Selection
    Alexander Felfernig and Martin Stettinger
An Integrated Knowledge Engineering Environment for Constraint-based Recommender Systems
    Stefan Reiterer
A Personal Data Framework for Exchanging Knowledge about Users in New Financial Services
    Beatriz San Miguel, Jose M. del Alamo and Juan C. Yelmo
Human Computation Based Acquisition Of Financial Service Advisory Practices
    Alexander Felfernig, Michael Jeran, Martin Stettinger, Thomas Absenger, Thomas Gruber, Sarah Haas, Emanuel Kirchengast, Michael Schwarz, Lukas Skofitsch, and Thomas Ulz
Case-based Recommender Systems for Personalized Finance Advisory (invited talk)
    Cataldo Musto and Giovanni Semeraro
PSYREC: Psychological Concepts to enhance the Interaction with Recommender Systems
    Gerhard Leitner

Smart Data Analysis for Financial Services

Mathias Bauer¹

Abstract. This talk addresses opportunities for the application of intelligent data analysis techniques at various stages of the value added chain for financial services. After introducing some basic notions and explaining the fundamental steps of data mining, we will have a closer look at various recent and ongoing projects and discuss issues of practical relevance such as data quality and expert knowledge. The talk concludes with some remarks on the potential impact of new developments, e.g. in the context of Big Data.

Figure 1: The CRISP-DM process model for data mining.

1 DATA MINING

Data mining – this notion will be used as a synonym for all kinds of smart data analysis – is a complex process that aims at turning raw data into actionable knowledge (see Figure 1 which depicts a standard process model). We will introduce the basic notions, discuss the various steps and in particular have a closer look at the choices to be made and a few pitfalls to be avoided.

In particular, we will address the crucial aspects of how to choose an appropriate modeling approach and how to assess the quality of a solution found by a data analyst.

We show that in many cases it is not a good idea to simply apply the data analyst's favorite modeling technique. Instead we describe the various dimensions of such a choice and encourage the end users of a data analysis to clearly state their requirements.

2 SAMPLE APPLICATIONS

Data Analysis can (and should) play a central role at various stages of the value added chain in the financial industry. In the following we will have a closer look at some relevant activities in this context.

2.1 Appraisal of real economic goods

Scoring and rating processes are at the heart of the financial industry. Here we will demonstrate an approach to appraise vessels as typical representatives of real economic goods which form an important class of investments.

2.2 Fraud detection

In B2B scenarios a company's annual accounts form the basis for their credit rating and all further negotiations. Usually, the numbers reported are accepted as a correct representation of last year's business activities. But what if they are manipulated? We describe an approach that identifies abnormalities in annual accounts, thus facilitating the detection of intentional manipulations.

2.3 Identifying interesting customers

There are numerous aspects that can make a customer particularly interesting to a company – his/her interest in certain products, credit-worthiness and default risk, churn rate etc. We describe an integrated approach to identify these individuals that reduces the marketing effort required while simultaneously improving the company's insight into their customer base and the quality of customer contact.

In particular, we will see how the modeling technique applied affects the usefulness of the analytical findings.

2.4 Stock selection

From an abstract point of view, selecting a relevant set of stocks is similar to the previous task as it mainly involves segmentation and classification efforts. However, we will see that data preprocessing in this case is significantly more complex and requires some advanced expert knowledge.

3 Perspective

Big data is more than a buzzword – even if it's not the silver bullet for all problems ahead.
We will discuss various techniques and attempts to commercially make use of huge, largely unstructured data sets and briefly discuss potential future applications.

¹ mineway GmbH, Saarbrücken, Germany, email: mbauer@mineway.de

Conflict Management in Interactive Financial Service Selection

Alexander Felfernig¹ and Martin Stettinger¹

¹ Applied Software Engineering, Institute for Software Technology, Graz University of Technology, Austria, email: {felfernig, stettinger}@ist.tugraz.at.

Abstract. Knowledge-based systems are often used to support search and navigation in a set of financial services. In a typical process users define their requirements and the system selects and ranks alternatives that seem to be appropriate. In such scenarios situations can occur in which requirements cannot be fulfilled and alternatives (repairs) must be proposed to the user. In this paper we provide an overview of model-based diagnosis techniques that can be applied to indicate ways out from such a "no solution could be found" dilemma. In this context we focus on scenarios from the domain of financial services.

1 Introduction

Knowledge-based systems such as recommenders [2, 18] and configurators [6, 9, 28] are often used to support users (customers) who are searching for solutions fitting their wishes and needs. These systems select and also rank alternatives of relevance for the user. Examples of such applications are knowledge-based recommenders that support users in the identification of relevant financial services [10, 11] and configurators that actively support service configuration [12, 20].

The mentioned systems have the potential to improve the underlying business processes, for example, by reducing error rates in the context of order recording and by reducing time efforts related to customer advisory. Furthermore, customer domain knowledge can be improved by recommendation and configuration technologies; through the interaction with these systems customers gain a deeper understanding of the product domain and, as a direct consequence, less effort is needed for the explanation of basic domain aspects. For a detailed overview of the advantages of applying such technologies we refer the reader to [9].

When interacting with knowledge-based systems, situations can occur where no recommendation or configuration can be identified. In order to avoid inefficient manual adaptations of requirements, techniques can be applied which automatically determine repair actions that make it possible to recover from an inconsistency. For example, if a customer is interested in financial services with high return rates but at the same time does not accept risks related to investments, no corresponding solution will be identified.

There are quite different approaches to deal with the so-called no solution could be found dilemma – see Table 1. In the context of this paper we will focus on the application of the concepts of model-based diagnosis [27, 5]. A first application of model-based diagnosis to the automated identification of erroneous constraints in knowledge bases is reported in Bakker et al. [1]. In their work the authors show how to model the task of identifying faulty constraints in a knowledge base as a diagnosis task. Felfernig et al. [8] extend the approach of Bakker et al. [1] by introducing concepts that allow the automated debugging of (configuration) knowledge bases on the basis of test cases. If one or more test cases fail within the scope of regression testing, a diagnosis process is activated that determines a minimal set of constraints in such a way that the deletion of these constraints guarantees that each test case is consistent with the knowledge base. Model-based diagnosis [27] relies on the existence of conflict sets which represent minimal sets of inconsistent constraints. Conflict sets can be determined by conflict detection algorithms such as QuickXPlain [19].

Besides the automated testing and debugging of inconsistent knowledge bases, model-based diagnosis is also applied in situations where the knowledge base per se is consistent but a set of customer requirements induces an inconsistency. Felfernig et al. [8] also sketch an approach to the application of model-based diagnosis to the identification of minimal sets of faulty requirements. Their approach is based on breadth-first search that uses diagnosis cardinality as the only ranking criterion.

A couple of different approaches to the determination of personalized diagnoses for inconsistent requirements have been proposed. DeKleer [4] introduces concepts for the probability-based identification of leading diagnoses. O’Sullivan et al. [25] introduce the concept of representative explanations (diagnosis sets) where each existing diagnosis element is contained in at least one diagnosis of a representative set of diagnoses. Felfernig et al. [13] show how to integrate basic recommendation algorithms into diagnosis search and with this to increase the prediction quality (in terms of precision) of diagnostic approaches. Felfernig et al. [14] extend this work and compare different personalization approaches with regard to their prediction quality on the basis of real-world datasets. Based on the concepts of QuickXPlain, Felfernig et al. [15] introduced FastDiag which improves the efficiency of diagnosis search by omitting the calculation of conflicts as a basis for diagnosis calculation. This diagnostic approach is also denoted as direct diagnosis [17]. The applicability of FastDiag has also been shown in SAT solving scenarios [23].

Different types of knowledge-based systems have already been
applied to support the interactive selection and configuration of financial services. Fano and Kurth [7] introduce an approach to the visualization and planning of financial service portfolios. The simulation is based on an integrated model of a human's household and interdependencies between different financial decisions. Felfernig et al. [10, 11] show how to apply knowledge-based recommender applications for supporting sales representatives in their dialogs with customers. Major improvements that can be expected from such an approach are fewer errors in the offer phase and more time for additional customer meetings. An approach to apply the concepts of case-based reasoning [21] for the purpose of recommending financial services is introduced by Musto et al. [24].

Topic | Reference
Foundations of model-based diagnosis | Reiter 1987 [27], DeKleer et al. 1992 [5]
Conflict detection and model-based diagnosis of inconsistent constraint satisfaction problems (CSPs) | Bakker et al. 1993 [1]
Regression testing and automated debugging of configuration knowledge bases using model-based diagnosis (breadth-first search) | Felfernig et al. 2004 [8]
Identification of minimal diagnoses for user requirements for the purpose of consistency preservation (breadth-first search) | Felfernig et al. 2004 [8]
Identification of preferred minimal conflict sets on the basis of a divide-and-conquer based algorithm (QuickXPlain) | Junker 2004 [19]
Identification of representative explanations (each existing diagnosis element is contained in at least one diagnosis of the result set) | O’Sullivan et al. 2007 [25]
Identification of personalized diagnoses on the basis of recommendation algorithms | Felfernig et al. 2009, 2013 [13, 14]
Probability based identification of leading diagnoses | DeKleer 1990 [4]
Identification of preferred minimal diagnoses on the basis of a divide-and-conquer based algorithm (FastDiag) | Felfernig et al. 2012 [15]
Preferred minimal diagnoses for SAT based knowledge representations | Marques-Silva et al. 2013 [23]

Table 1. Overview of research related to conflict management in knowledge-based systems.

The major focus of this paper is to provide an overview of techniques that help to recover from inconsistent situations in an automated fashion. In this context we show how inconsistencies can be identified and resolved. The major contributions of this paper are the following: (1) We provide an overview of error identification and repair techniques in the context of financial services recommendation and configuration. (2) We show how diagnosis and repair techniques can be applied on the basis of different knowledge representations (CSPs as well as table-based representations). (3) We provide an outlook of major issues for future work.

The remainder of this paper is organized as follows. In Section 2 we introduce basic definitions of a constraint satisfaction problem (CSP) and a corresponding solution. On the basis of these definitions we introduce a first working example from the financial services domain. Thereafter (in Section 3) we introduce a basic definition of a diagnosis task and show how diagnoses and repairs for inconsistent user requirements can be determined. In Section 4 we switch from constraint-based to table-based knowledge representations where (personalized) solutions are determined on the basis of conjunctive queries [13]. In Section 5 we provide one further example of consistency management in the loan domain. In Section 6 we discuss issues for future work. With Section 7 we conclude the paper.

2 Constraint-based Representations

Constraint Satisfaction Problems (CSPs) [16, 22] are successfully applied in many industrial scenarios such as scheduling [26], configuration [9], and recommender systems [18]. The popularity of this type of knowledge representation can be explained by the small set of representation concepts (only variables, related domains, and constraints have to be defined) and the still high degree of expressivity.

Definition 1 (Constraint Satisfaction Problem (CSP) and Solution). A constraint satisfaction problem (CSP) can be defined as a triple (V, D, C) where V = {v1, v2, ..., vn} represents a set of variables, D = {dom(v1), dom(v2), ..., dom(vn)} represents the corresponding variable domains, and C = {c1, c2, ..., cm} represents a set of constraints that refer to corresponding variables and reduce the number of potential solutions. A solution for a CSP is defined by an assignment A of all variables in V where A is consistent with the constraints in C.

Usually, user requirements are interpreted as constraints CREQ = {r1, r2, ..., rq} where the ri represent individual user requirements. In this paper we assume that the constraints in C are consistent and inconsistencies are always induced by the constraints in CREQ. If such a situation occurs, we are interested in the elements of CREQ which are responsible for the given inconsistency. On the basis of a first example we will now provide an overview of diagnosis techniques that can be used to recover from such inconsistent situations. An example of a CSP in the domain of financial services is the following. For simplicity we assume that each variable has the domain {low, medium, high}.

• V = {av, wr, rr}
• dom(av) = dom(wr) = dom(rr) = {low, medium, high}
• C = {c1 : ¬(av = high ∧ wr = high), c2 : ¬(wr = low ∧ rr = high), c3 : ¬(rr = high ∧ av = high)}

An overview of the variables of this CSP is given in Table 2.

variable | description               | ri ∈ CREQ
av       | availability              | r1 : av = high
wr       | willingness to take risks | r2 : wr = low
rr       | expected return rate      | r3 : rr = high

Table 2. Overview of variables used in the example CSP definition.

In addition to this basic CSP definition we introduce an example set of customer requirements CREQ = {r1 : av = high, r2 : wr = low, r3 : rr = high} which is inconsistent with the constraints defined in C. On the basis of this simplified financial service knowledge base defined as a CSP we will now show how inconsistencies induced by customer requirements can be identified and resolved.

3 Diagnosis & Repair of Inconsistent Constraints

In our working example, the requirements CREQ and the set of constraints C are inconsistent, i.e., inconsistent(CREQ ∪ C). In such situations we are interested in a minimal set of requirements that have to be deleted or adapted such that consistency is restored. Consistency resolution is in many cases based on the resolution of conflicts. In our case, a minimal conflict is represented by a minimal set of requirements in CREQ that is inconsistent with C, i.e., at least one of these requirements has to be deleted or adapted such that consistency can be restored.

Definition 2 (Conflict Set). A conflict set CS is a subset of CREQ s.t. inconsistent(CS ∪ C). A conflict set is minimal if there does not exist another conflict set CS′ with CS′ ⊂ CS. A minimal cardinality conflict set CS is a minimal conflict set with the additional property that there does not exist another minimal conflict set CS′ with |CS′| < |CS|.

Minimal conflict sets can be determined on the basis of conflict detection algorithms such as QuickXPlain [19]. They can be used to derive diagnoses. In our case, a diagnosis ∆ represents a set of requirements that have to be deleted from CREQ such that C ∪ (CREQ − ∆) is consistent, i.e., diagnoses help to restore the consistency between CREQ and C.

Definition 3 (Diagnosis Task and Diagnosis). A diagnosis task can be defined as a tuple (C, CREQ) where C represents a set of constraints in the knowledge base and CREQ represents a set of customer requirements. ∆ is a diagnosis if (CREQ − ∆) ∪ C is consistent. A diagnosis ∆ is minimal if there does not exist a diagnosis ∆′ with ∆′ ⊂ ∆. Furthermore, ∆ is a minimal cardinality diagnosis if there does not exist a diagnosis ∆′ with |∆′| < |∆|.

A standard approach to the determination of diagnoses is based on the construction of a hitting set directed acyclic graph (HSDAG) [27] where minimal conflict sets are successively resolved in the process of HSDAG construction (an example is depicted in Figure 1). In the context of our example of C and CREQ, a first minimal conflict set that could be returned by an algorithm such as QuickXPlain [19] is CS1 : {r1, r3}.

Figure 1. Hitting Set Directed Acyclic Graph (HSDAG) for requirements CREQ = {r1 : av = high, r2 : wr = low, r3 : rr = high}.

There are two possibilities of resolving CS1, either by deleting requirement r1 or by deleting requirement r3. If we delete r3 (see Figure 1), we have identified the first minimal diagnosis ∆1 = {r3} which is also a minimal cardinality diagnosis. The second option to resolve CS1 is to delete r1. In this situation, another conflict exists in CREQ, i.e., a conflict detection algorithm would return CS2 : {r2, r3}. Again, there are two possibilities to resolve the conflict (either by deleting r2 or by deleting r3). Deleting r3 leads to a diagnosis which is not minimal since {r3} itself is already a diagnosis. Deleting r2 leads to the second minimal diagnosis which is ∆2 = {r1, r2}.

The diagnoses ∆1 and ∆2 are indicators of minimal changes that need to be performed on the existing set of requirements such that consistency between CREQ and C can be restored. The issue of finding concrete repair actions for the requirements contained in a diagnosis will be discussed later in this paper.

There can be many alternative diagnoses. In this context it is not always clear which diagnosis should be selected or in which order alternative diagnoses should be shown to the user. In the following we present one approach to rank diagnoses. The approach we sketch is based on multi-attribute utility theory [29] where we assume that customers provide weights for each individual requirement. In the example depicted in Table 3, two customers specified their preferences in terms of weights for each requirement. For example, customer 1 specified a weight of 0.7 for the requirement r3 : rr = high, i.e., the attribute rr is of highest importance for the customer. These weights can be exploited for ranking a set of diagnoses.

Formula 1 can be used for determining the overall importance (imp) of a set of requirements (RS). The higher the importance, the lower the probability that these requirements are elements of a diagnosis shown to the customer. Requirement r3 has a high importance for customer 1; consequently, the probability that r3 is contained in a diagnosis shown to customer 1 is low.

$$\mathit{imp}(RS) = \mathit{importance}(RS) = \sum_{r \in RS} \mathit{weight}(r) \qquad (1)$$

Formula 2 can be used to determine the relevance of a partial or complete (minimal) diagnosis, i.e., this formula can be used to rank diagnoses with regard to their relevance for the customer. The higher the relevance of a diagnosis, the higher the ranking of the diagnosis in a list of diagnoses shown to the customer.

$$\mathit{rel}(\Delta) = \mathit{relevance}(\Delta) = \frac{1}{\mathit{importance}(\Delta)} \qquad (2)$$

customer | weight(r1 : av = high) | weight(r2 : wr = low) | weight(r3 : rr = high)
1        | 0.1                    | 0.2                   | 0.7
2        | 0.3                    | 0.5                   | 0.2

Table 3. Individual weights regarding the importance of the requirements CREQ = {r1, r2, r3}.

Tables 4 and 5 show the results of applying Formulae 1 and 2 to the customer preferences (weights) shown in Table 3. For customer 1 (see Table 4), diagnosis ∆2 = {r1, r2} has the highest relevance. For customer 2 (see Table 5), diagnosis ∆1 = {r3} has the highest relevance. Consequently, diagnosis ∆2 is the first one that will be shown to customer 1 and diagnosis ∆1 is the first one that will be shown to customer 2.

Figure 3. FastDiag approach to diagnosis determination. CREQ
represents a set of customer requirements and C represents a set of constraints. The algorithm is based on a divide-and-conquer approach: if {r1, r2, ..., rk/2} is consistent with C then diagnosis search can be continued in {rk/2+1, ..., rk}. ∆ is a diagnosis if (CREQ − ∆) ∪ C is consistent.

diagnosis ∆j  | importance(∆j) | relevance(∆j)
∆1 : {r3}     | 0.7            | 1.43
∆2 : {r1, r2} | 0.3            | 3.33

Table 4. Diagnosis with highest relevance (rel) determined for customer 1: ∆2 = {r1, r2}.

diagnosis ∆j  | importance(∆j) | relevance(∆j)
∆1 : {r3}     | 0.2            | 5.0
∆2 : {r1, r2} | 0.8            | 1.25

Table 5. Diagnosis with highest relevance (rel) determined for customer 2: ∆1 = {r3}.

Figure 2. Personalized diagnosis determined for CREQ and the individual importance weights defined in Table 3 (for customer 1). In this example, ∆2 is the preferred diagnosis since relevance(∆2) > relevance(∆1).

On the basis of the relevance values depicted in Table 4, Figure 2 depicts an HSDAG [27] with additional annotations regarding diagnosis relevance (rel). The higher the relevance of a (partial) diagnosis, the higher the ranking of the corresponding diagnosis.

The approaches to diagnosis determination discussed so far are based on the construction of an HSDAG [27]. Due to the fact that conflicts have to be determined explicitly when following this approach, diagnosis determination does not scale well [13, 14]. The FastDiag algorithm [15] tackles this challenge by determining minimal and preferred diagnoses without the need of conflict detection. This algorithm has been shown to have the same predictive quality as HSDAG-based algorithms that determine diagnoses in a breadth-first search regime. The major advantage of FastDiag is a high-performance diagnosis search for the leading diagnoses (first-n diagnoses). FastDiag is based on the principle of divide and conquer – see Figure 3: if a set of requirements CREQ is inconsistent with a corresponding set of constraints C and the first part {r1, r2, ..., rk/2} of CREQ is consistent with C then diagnosis search can focus on {rk/2+1, ..., rk}, i.e., can omit the requirements in {r1, r2, ..., rk/2}. A detailed discussion of FastDiag can be found in [15].

Determination of Repair Actions. Repair actions for diagnosis elements can be interpreted as changes to the original set of requirements in CREQ in such a way that at least one solution can be identified. If we assume that CREQ is a set of unary constraints that are inconsistent with C and ∆ is a corresponding diagnosis, then a set of repair actions R = {a1, a2, ..., al} can be identified by the consistency check (CREQ − ∆) ∪ C where aj (a variable assignment) is a repair for the constraint rj if rj is in ∆.

In this section we took a look at different approaches that support the determination of diagnoses in situations where a given set of requirements becomes inconsistent with the constraints in C. In the following we will take a look at an alternative knowledge representation where tables (instead of CSPs) are used to represent knowledge about financial services. Again, we will show how to deal with inconsistent situations.

4 Table-based Representations

In Section 3 we analyzed different ways of diagnosing inconsistent CSPs [16, 22]. We now show how diagnosis can be performed on a predefined set of solutions, i.e., a table-based representation. Table 6 includes an example set of investment products. The set of financial services {1, 2, ..., 8} is stored in an item table T [13] – T can be interpreted as an explicit enumeration of the possible solutions (defined by the set C in Section 2). Furthermore, we assume that the customer has specified a set of requirements CREQ = {r1 : rr ≥ 5.5, r2 : rt = 3.0, r3 : acc = yes, r4 : bc = yes}. The existence of a financial service in T that is able to fulfill all requirements can be checked by a relational query σ[CREQ] T where CREQ represents a set of selection criteria and T represents the corresponding product table.

id | return rate p.a. (rr) | runtime in yrs. (rt) | risk level (wtr) | shares percentage (sp) | accessibility (acc) | bluechip (bc)
1  | 4.2                   | 3.0                  | A                | 0.0                    | no                  | yes
2  | 4.7                   | 3.7                  | B                | 10.0                   | yes                 | yes
3  | 4.8                   | 3.5                  | A                | 10.0                   | yes                 | yes
4  | 5.2                   | 4.0                  | B                | 20.0                   | yes                 | no
5  | 4.3                   | 3.5                  | A                | 0.0                    | yes                 | yes
6  | 5.6                   | 5.0                  | C                | 30.0                   | no                  | no
7  | 6.7                   | 6.0                  | C                | 50.0                   | yes                 | no
8  | 7.9                   | 7.0                  | C                | 50.0                   | no                  | no

Table 6. Investment products: return rate p.a. (rr), runtime in years (rt), risk level (wtr), shares percentage (sp), accessibility (acc), and bluechip (bc).

An example query on the product table T could be σ[rr≥5.5] T which would return the financial services {6, 7, 8}. For the query σ[r1, r2, r3, r4] T there does not exist a solution. In such situations we are interested in finding diagnoses that indicate minimal sets of requirements in CREQ that have to be deleted or adapted in order to be able to identify a solution.

Definition 4 (Conflict Sets in Table-based Representations). A conflict set CS is a subset of CREQ s.t. σ[CS] T returns an empty result set. Minimality properties of conflict sets are the same as introduced in Definition 2.

A diagnosis task and a corresponding diagnosis in the context of table-based representations can be defined as follows.

Definition 5 (Diagnosis in Table-based Representations). A diagnosis task can be defined as a tuple (T, CREQ) where T represents a product table and CREQ represents a set of customer requirements. ∆ is a diagnosis if σ[CREQ−∆] T returns at least one solution. Minimality properties of diagnoses are the same as in Definition 3.

The requirements rj ∈ CREQ are inconsistent with the items included in T (see Table 6), i.e., there does not exist a financial service in T that completely fulfills the user requirements in CREQ. Minimal conflict sets that can be derived for CREQ = {r1 : rr ≥ 5.5, r2 : rt = 3.0, r3 : acc = yes, r4 : bc = yes} are CS1 : {r1, r2}, CS2 : {r2, r3}, and CS3 : {r1, r4}. The determination of the corresponding diagnoses is depicted in Figure 4.

Figure 4. Hitting Set Directed Acyclic Graph (HSDAG) for requirements CREQ = {r1 : rr ≥ 5.5, r2 : rt = 3.0, r3 : acc = yes, r4 : bc = yes}.

Diagnoses are determined in the same fashion as discussed in Section 3. Minimal diagnoses that can be derived from the conflict sets CS1, CS2, and CS3 are ∆1 : {r1, r2}, ∆2 : {r1, r3} and ∆3 : {r2, r4} (see Figure 4).

Again, the question arises which of the diagnoses has the highest relevance for the user (customer). Table 7 depicts the importance distributions for the requirements of our example. Based on the importance distributions depicted in Table 7 we can derive a preferred diagnosis (see Figure 5). Diagnosis ∆3 will be first shown to customer 1 since ∆3 has the highest evaluation in terms of relevance (see Formula 2). The first diagnosis shown to customer 2 is ∆2.

customer | weight(r1 : rr ≥ 5.5) | weight(r2 : rt = 3.0) | weight(r3 : acc = yes) | weight(r4 : bc = yes)
1        | 0.7                   | 0.1                   | 0.1                    | 0.1
2        | 0.1                   | 0.7                   | 0.1                    | 0.1

Table 7. Individual weights regarding the importance of the requirements CREQ = {r1, r2, r3, r4}.

Figure 5. Personalized diagnoses determined for CREQ and the individual importance weights defined in Table 7 (for customer 1). In this example, ∆3 is the preferred diagnosis.

diagnosis ∆j  | importance(∆j) | relevance(∆j)
∆1 : {r1, r2} | 0.8            | 1.25
∆2 : {r1, r3} | 0.8            | 1.25
∆3 : {r2, r4} | 0.2            | 5.0

Table 8. Diagnosis with highest relevance (rel) determined for customer 1: ∆3 = {r2, r4}.

diagnosis ∆j  | importance(∆j) | relevance(∆j)
∆1 : {r1, r2} | 0.8            | 1.25
∆2 : {r1, r3} | 0.2            | 5.0
∆3 : {r2, r4} | 0.8            | 1.25

Table 9. Diagnosis with highest relevance (rel) determined for customer 2: ∆2 = {r1, r3}.
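The table-based example of this section can be replayed in a few lines of Python. The following is a minimal sketch under stated assumptions, not the implementation used by the authors: it encodes Table 6 (restricted to the attributes referenced by CREQ) and the weights of Table 7, and it enumerates minimal conflict sets and minimal diagnoses by brute force in order of increasing cardinality instead of constructing an HSDAG. All function names (select, minimal_sets, importance, relevance, rank) are hypothetical and introduced for illustration only.

```python
from itertools import combinations

# Item table T (Table 6), restricted to the attributes used by CREQ.
T = [
    {"id": 1, "rr": 4.2, "rt": 3.0, "acc": "no",  "bc": "yes"},
    {"id": 2, "rr": 4.7, "rt": 3.7, "acc": "yes", "bc": "yes"},
    {"id": 3, "rr": 4.8, "rt": 3.5, "acc": "yes", "bc": "yes"},
    {"id": 4, "rr": 5.2, "rt": 4.0, "acc": "yes", "bc": "no"},
    {"id": 5, "rr": 4.3, "rt": 3.5, "acc": "yes", "bc": "yes"},
    {"id": 6, "rr": 5.6, "rt": 5.0, "acc": "no",  "bc": "no"},
    {"id": 7, "rr": 6.7, "rt": 6.0, "acc": "yes", "bc": "no"},
    {"id": 8, "rr": 7.9, "rt": 7.0, "acc": "no",  "bc": "no"},
]

# Customer requirements CREQ as named selection predicates.
CREQ = {
    "r1": lambda p: p["rr"] >= 5.5,
    "r2": lambda p: p["rt"] == 3.0,
    "r3": lambda p: p["acc"] == "yes",
    "r4": lambda p: p["bc"] == "yes",
}

# Importance weights per customer (Table 7).
WEIGHTS = {
    1: {"r1": 0.7, "r2": 0.1, "r3": 0.1, "r4": 0.1},
    2: {"r1": 0.1, "r2": 0.7, "r3": 0.1, "r4": 0.1},
}


def select(reqs):
    """Relational selection sigma[reqs] T: ids of items fulfilling all reqs."""
    return [p["id"] for p in T if all(CREQ[r](p) for r in reqs)]


def minimal_sets(holds):
    """Subset-minimal requirement sets for which the monotone test holds.

    Candidates are visited by increasing cardinality (breadth-first), so a
    candidate is minimal iff no previously found set is contained in it.
    """
    found = []
    names = sorted(CREQ)
    for size in range(len(names) + 1):
        for cand in combinations(names, size):
            if holds(cand) and not any(set(f).issubset(cand) for f in found):
                found.append(cand)
    return found


def minimal_conflicts():
    # Definition 4: sigma[CS] T returns an empty result set.
    return minimal_sets(lambda cs: not select(cs))


def minimal_diagnoses():
    # Definition 5: sigma[CREQ - delta] T returns at least one item.
    return minimal_sets(lambda d: bool(select([r for r in CREQ if r not in d])))


def importance(reqs, weights):
    """Formula 1: sum of the weights of the given requirements."""
    return sum(weights[r] for r in reqs)


def relevance(diagnosis, weights):
    """Formula 2: reciprocal of the importance of the diagnosis."""
    return 1.0 / importance(diagnosis, weights)


def rank(diagnoses, weights):
    """Diagnoses sorted by decreasing relevance."""
    return sorted(diagnoses, key=lambda d: relevance(d, weights), reverse=True)
```

Under these assumptions, minimal_conflicts() yields the three conflict sets CS1, CS2, and CS3, minimal_diagnoses() yields ∆1, ∆2, and ∆3, and rank(minimal_diagnoses(), WEIGHTS[1])[0] is ∆3 = {r2, r4}, in line with Table 8; for WEIGHTS[2] the first entry is ∆2 = {r1, r3}, in line with Table 9. The exhaustive enumeration is only practical for very small requirement sets.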
id   creditworthiness (cw)   loan limit (ll)   runtime in yrs. (rt)   interest rate (ir)
1    1                       30.000            5.0                    3%
2    2                       25.000            5.0                    4%
3    3                       20.000            5.0                    5%
4    1                       40.000            6.0                    4%
5    2                       35.000            6.0                    5%
6    3                       30.000            7.0                    5.2%
7    1                       40.000            5.0                    3%
8    2                       35.000            5.0                    3.5%
9    3                       30.000            5.0                    5%

Table 10. Loans: creditworthiness (cw), loan limit (ll), runtime in years (rt), and interest rate (ir).

5 An Additional Example: Selection of Loans

As a third example we introduce the domain of loans. The entries in Table 10 represent different loan variants that can be chosen by customers. Customers can specify their requirements on the basis of the variables depicted in Table 11. Furthermore, the different loan variants are characterized by their expected creditworthiness (cw), loan limit (ll), runtime in yrs. (rt), and interest rate (ir). These variables are basic elements of the definition of the following Constraint Satisfaction Problem (CSP).

variable   description                   ri ∈ CREQ
ccw        current creditworthiness      r1 : ccw = 3
ils        intended loan sum             r2 : ils = 30.000
mpp        maximum periodical payment    –
irt        intended runtime              r3 : irt = 6yrs.
pir        preferred interest rate       r4 : pir = 4.5%

Table 11. Overview of variables used in the example CSP definition (loans).

• V = {ccw, ils, mpp, irt, pir, cw, ll, rt, ir}
• dom(ccw) = dom(cw) = {1,2,3}; dom(ils) = dom(ll) = float; dom(mpp) = float; dom(irt) = dom(rt) = integer; dom(pir) = dom(ir) = integer.
• C = {c1 : ccw ≤ cw, c2 : ils ≤ ll, c3 : irt = rt, c4 : pir ≥ ir, c5 : see below, c6,7 : see below}

Constraint c5 represents the entries of Table 10 in disjunctive normal form; for example, the first table row can be represented as the basic constraint {cw = 1 ∧ ll = 30.000 ∧ rt = 5.0 ∧ ir = 3%}. The disjunction of all basic constraints is the disjunctive normal form. Constraints c6,7 can be used to avoid situations where the periodical payments for a loan exceed the financial resources of the customer.

\[ c_6: mpp \geq \frac{costs(id) + ils}{rt} \quad (3) \]

\[ c_7: costs(id) = ils \times ir(id) \times \frac{rt(id) + 1}{2} \quad (4) \]

For the purpose of our example let us assume that the customer has the following requirements: CREQ = {r1 : ccw = 3, r2 : ils = 30.000, r3 : irt = 6yrs., r4 : pir = 4.5%}. Since the customer creditworthiness has been evaluated with 3, only three alternative loan variants are available (the ids 3, 6, 9). These variants are depicted in Table 12.

id   cw   ll       rt         ir
3    3    20.000   5.0 yrs.   5%
6    3    30.000   7.0 yrs.   5.2%
9    3    30.000   5.0 yrs.   5%

Table 12. Loans accessible for the customer with creditworthiness level 3.

Since CREQ is inconsistent with the constraints in C we could determine minimal diagnoses as indicators for possible adaptations in the requirements. A possible criterion for personalizing diagnosis ranking could be the costs related to a loan (see Formula 4).

The requirements CREQ include one minimal conflict set which is CS1 : {r3, r4}. Consequently, there exist two different possibilities to resolve the conflict: one possibility is to change the value for the intended runtime (irt) from 6.0 years to 5.0 years and to keep the preferred interest rate (pir) as is. The other possibility is to change the preferred interest rate from 4.5% to 6% and to keep the intended runtime as is. The overall loan costs related to these two alternatives are depicted in Table 13. If the overall loan costs are a major criterion then repair alternative 1 would be chosen by the customer, otherwise – if the upper limit for periodical payments is strict – repair alternative 2 will be chosen.

repair alternative   irt        pir    costs   costs per year
1                    5.0 yrs.   5.0%   4.500   900.00
2                    7.0 yrs.   5.2%   6.240   891.43

Table 13. Loan costs for different repair alternatives.

6 Future Work

A major issue for interactive applications is to guarantee reasonable response times, which should be below one second [3]. This goal cannot be achieved with standard diagnosis approaches since they typically rely on the (pre-)determination of conflict sets. Although existing divide-and-conquer based diagnosis approaches are significantly faster when determining only leading (preferred) diagnoses, i.e., not all diagnoses have to be determined, there is still a need for improving diagnosis efficiency in more complex settings. In this context, one research issue is the development of so-called anytime diagnosis algorithms that help to determine nearly optimal (e.g., in terms of prediction quality) diagnoses with less computational effort.

Although the prediction quality of diagnoses significantly increases and numerous recommendation algorithms have already been evaluated, there is still a need for further advancing the state-of-the-art in diagnosis prediction. One research direction is to focus on learning-based approaches that help to figure out which combination of a set of basic diagnosis prediction methods performs best in the considered domain. Such approaches are also denoted as ensemble-based methods which focus on figuring out optimal configurations of basic diagnosis prediction methods.

Efficient calculation and high predictive quality are central issues of future research. Beyond efficiency and prediction quality, intelligent visualization concepts for diagnoses are extremely important. For example, in the context of group decision scenarios where groups of users are in charge of resolving existing inconsistencies in the preferences between group members, visualizations have to be identified that help to restore consistency (consensus) in the group as soon as possible. Such visualizations could focus on visualizing the mental state of individual group members as well as visualizing the individual decision behavior (e.g., egoism vs. altruism).

7 Conclusions

In this paper we give an overview of existing approaches to determine diagnoses in situations where no solution can be found.
We first provide an overview of existing related work and then focus on basic approaches to determine diagnoses in the context of two knowledge representation formalisms (constraint satisfaction and conjunctive query based approaches). For explanation purposes we introduce three different types of financial services as working examples (basic investment decisions, selection of investment products, and loan selection). On the basis of these examples we sketch the determination of (preferred) diagnoses. Thereafter, we provide a short discussion of open research issues which includes diagnosis efficiency, prediction quality, and intelligent visualization.

REFERENCES

[1] R. Bakker, F. Dikker, F. Tempelman, and P. Wogmim, ‘Diagnosing and solving over-determined constraint satisfaction problems’, in IJCAI 1993, pp. 276–281, Chambery, France, (1993).
[2] R. Burke, A. Felfernig, and M. Goeker, ‘Recommender systems: An overview’, AI Magazine, 32(3), 13–18, (2011).
[3] S. Card, G. Robertson, and J. Mackinlay, ‘The information visualizer, an information workspace’, in Conference on Human Factors in Computing Systems: Reaching Through Technology, pp. 181–186, New York, NY, USA, (1991).
[4] J. de Kleer, ‘Using crude probability estimates to guide diagnosis’, AI Journal, 45(3), 381–391, (1990).
[5] J. de Kleer, A. Mackworth, and R. Reiter, ‘Characterizing diagnoses and systems’, AI Journal, 56(197–222), 57–95, (1992).
[6] A. Falkner, A. Felfernig, and A. Haag, ‘Recommendation Technologies for Configurable Products’, AI Magazine, 32(3), 99–108, (2011).
[7] A. Fano and S. Kurth, ‘Personal Choice Point: Helping Users Visualize What it Means to Buy a BMW’, in International Conference on Intelligent User Interfaces IUI’03, pp. 46–52, Miami, FL, USA, (2003). ACM, New York, USA.
[8] A. Felfernig, G. Friedrich, D. Jannach, and M. Stumptner, ‘Consistency-based diagnosis of configuration knowledge bases’, Artificial Intelligence, 152(2), 213–234, (2004).
[9] A. Felfernig, L. Hotz, C. Bagley, and J. Tiihonen, Knowledge-based Configuration – From Research to Business Cases, Elsevier, Morgan Kaufmann, 2014.
[10] A. Felfernig, K. Isak, K. Szabo, and P. Zachar, ‘The VITA Financial Services Sales Support Environment’, pp. 1692–1699, Vancouver, Canada, (2007).
[11] A. Felfernig and A. Kiener, ‘Knowledge-based Interactive Selling of Financial Services with FSAdvisor’, in 17th Innovative Applications of Artificial Intelligence Conference (IAAI05), pp. 1475–1482, Pittsburgh, Pennsylvania, (2005).
[12] A. Felfernig, J. Mehlau, A. Wimmer, C. Russ, and M. Zanker, ‘Konzepte zur flexiblen Konfiguration von Finanzdienstleistungen’, Banking and Information Technology, 5(1), 5–19, (2004).
[13] A. Felfernig, M. Schubert, G. Friedrich, M. Mandl, M. Mairitsch, and E. Teppan, ‘Plausible repairs for inconsistent requirements’, in 21st International Joint Conference on Artificial Intelligence (IJCAI’09), pp. 791–796, Pasadena, CA, (2009).
[14] A. Felfernig, M. Schubert, and S. Reiterer, ‘Personalized diagnosis for over-constrained problems’, in 23rd International Conference on Artificial Intelligence (IJCAI 2013), pp. 1990–1996, Peking, China.
[15] A. Felfernig, M. Schubert, and C. Zehentner, ‘An Efficient Diagnosis Algorithm for Inconsistent Constraint Sets’, Artificial Intelligence for Engineering Design, Analysis, and Manufacturing (AIEDAM), 25(2), 175–184, (2012).
[16] E. Freuder, ‘In pursuit of the holy grail’, Constraints, 2(1), 57–61, (1997).
[17] G. Friedrich, ‘Interactive Debugging of Knowledge Bases’, in DX’2014, pp. 1–4, Graz, Austria, (2014).
[18] D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich, Recommender Systems, Cambridge University Press, 2010.
[19] U. Junker, ‘Quickxplain: Preferred explanations and relaxations for over-constrained problems’, in 19th National Conference on AI (AAAI04), pp. 167–172, San Jose, CA, (2004).
[20] S. Leist and R. Winter, ‘Konfiguration von Versicherungsdienstleistungen’, Wirtschaftsinformatik, 36(1), 45–56, (1994).
[21] F. Lorenzi and F. Ricci, ‘Case-based recommender systems: a unifying view’, Intelligent Techniques for Web Personalization, 89–113, (2005).
[22] A. Mackworth, ‘Consistency in Networks of Relations’, Artificial Intelligence, 8(1), 99–118, (1977).
[23] J. Marques-Silva, F. Heras, M. Janota, A. Previti, and A. Belov, ‘On computing minimal correction subsets’, in IJCAI 2013, pp. 615–622, Peking, China, (2013).
[24] C. Musto, G. Semeraro, P. Lops, M. DeGemmis, and G. Lekkas, ‘Financial Product Recommendation through Case-based Reasoning and Diversification Techniques’, pp. 1–2, Foster City, CA, USA, (2014).
[25] B. O’Sullivan, A. Papadopoulos, B. Faltings, and P. Pu, ‘Representative explanations for over-constrained problems’, in AAAI’07, pp. 323–328, Vancouver, Canada, (2007).
[26] M. Pinedo, Scheduling: Theory, Algorithms, and Systems, Springer, 4 edn., 2012.
[27] R. Reiter, ‘A theory of diagnosis from first principles’, AI Journal, 23(1), 57–95, (1987).
[28] J. Tiihonen and A. Felfernig, ‘Towards Recommending Configurable Offerings’, International Journal of Mass Customization, 3(4), 389–406, (2010).
[29] D. Winterfeldt and W. Edwards, ‘Decision Analysis and Behavioral Research’, Cambridge University Press, (1986).

An Integrated Knowledge Engineering Environment for Constraint-based Recommender Systems

Stefan Reiterer1

1 SelectionArts Intelligent Decision Technologies GmbH, Austria, email: stefan.reiterer@selectionarts.com

Abstract. Constraint-based recommenders support customers in identifying relevant items from complex item assortments. In this paper we present a constraint-based environment already deployed in real-world scenarios that supports knowledge acquisition for recommender applications in a MediaWiki-based context. This technology provides the opportunity to directly integrate informal Wiki content with complementary formalized recommendation knowledge, which makes information retrieval for users (readers) easier and less time-consuming. The user interface supports recommender development on the basis of intelligent debugging and redundancy detection. The results of a user study show the need for automated debugging and redundancy detection even for small-sized knowledge bases.

1 Introduction

Constraint-based recommenders support the identification of relevant items from large and often complex assortments on the basis of an explicitly defined set of recommendation rules [3]. Example item domains are digital cameras and financial services [5, 8, 9]. For a long period of time the engineering of recommender knowledge bases (for constraint-based recommenders) required that knowledge engineers are technical experts (in the majority of the cases computer scientists) with the needed technical capabilities [14]. Developments in the field moved one step further and provided graphical engineering environments [5], which improve the accessibility and maintainability of recommender knowledge bases. However, users still have to deal with additional tools and technologies, which is in many cases a reason for not applying constraint-based environments.

Similar to the idea of Wikipedia to allow user communities to develop and maintain Wiki pages in a cooperative fashion, we introduce the WeeVis2 environment, which supports the community-based development of constraint-based recommender applications within a Wiki environment. WeeVis has been implemented on the basis of MediaWiki3, which is an established standard Wiki platform. Compared to other types of recommender systems such as collaborative filtering [19] and content-based filtering [25], constraint-based recommender systems are based on an underlying recommendation knowledge base, i.e., recommendation knowledge is defined explicitly. WeeVis is already applied by four Austrian universities (within the scope of recommender systems courses) and two companies for the purpose of prototyping recommender applications in the financial services domain.

2 www.weevis.org.
3 www.mediawiki.org.

The user interface of the WeeVis environment provides intelligent mechanisms that help to make development and maintenance operations easier. Based on model-based diagnosis techniques [12, 17, 26], the environment supports users in the following situations: (1) if no solution could be found for a set of user requirements, the system proposes repair actions that help to find a way out from the "no solution could be found" dilemma; (2) if the constraints in the recommender knowledge base are inconsistent with a set of test cases (a situation detected within the scope of regression testing of the knowledge base), the constraints responsible for the faulty behavior of the knowledge base are shown to the users (knowledge engineers); (3) if the recommender knowledge base includes redundant constraints, i.e., constraints that – if removed from the knowledge base – logically follow from the remaining constraints, these constraints are also determined in an automated fashion and shown to knowledge engineers.

The major contributions of this paper are the following. (1) On the basis of a working example from the domain of financial services, we provide an overview of the diagnosis and redundancy detection techniques integrated in the WeeVis environment. (2) We report the results of an empirical study which analyzed the usability of WeeVis functionalities.

The remainder of this paper is organized as follows. In Section 2 we discuss related work. In Section 3 we present an overview of the recommendation environment WeeVis and discuss the included knowledge engineering support mechanisms. In Section 4 we present results of an empirical study that show the need for intelligent diagnosis and redundancy detection support. In Section 5 we discuss issues for future work, and with Section 6 we conclude the paper.

2 Related Work

Based on original static Constraint Satisfaction Problem (CSP) representations [15, 20, 29], many different types of constraint-based knowledge representations have been developed. Mittal and Falkenhainer [22] introduced dynamic constraint satisfaction problems where variables have an activity status and only active variables are taken into account by the search process. Stumptner et al. [28] introduced the concept of generative constraint satisfaction where variables can be generated on demand within the scope of solution search. Compared to existing work, WeeVis supports the solving of static CSPs on the basis of conjunctive queries where each solution corresponds to a result of querying a relational database. Additionally, WeeVis includes diagnosis functionalities that help to automatically determine repair proposals in situations where no solution could be found [12].

A graphical recommender development environment for single users is introduced in [5]. This Java-based environment supports the development of constraint-based recommender applications for online selling platforms. Compared to Felfernig et al. [5], WeeVis provides a wiki-based user interface that allows user communities to develop recommender applications. Furthermore, WeeVis includes efficient diagnosis [12] and redundancy detection [13] mechanisms that allow the support of interactive knowledge base development.

A Semantic Wiki-based approach to knowledge acquisition for collaborative ontology development is introduced in [2]. Compared to Baumeister et al. [2], WeeVis is based on a recommendation domain specific knowledge representation (in contrast to ontology representation languages), which makes the definition of domain knowledge more accessible also for domain experts. Furthermore, WeeVis includes intelligent debugging and redundancy detection mechanisms which make development and maintenance operations more efficient. We want to emphasize that intended redundancies can exist, for example, for the purpose of better understandability of the knowledge base. If such constraints are part of a knowledge base, these should be left out from the redundancy detection process.

A first approach to a conflict-directed search for hitting sets in inconsistent CSP definitions was introduced by Bakker et al. [1]. In this work, minimal sets of faulty constraints in inconsistent CSP definitions were identified on the basis of the concepts of model-based diagnosis [26]. In the line of Bakker et al. [1], Felfernig et al. [4] introduced concepts that allow the exploitation of the concepts of model-based diagnosis in the context of knowledge base testing and debugging. Compared to earlier work [4, 24], WeeVis provides an environment for development, testing, debugging, and application of recommender systems. With regard to diagnosis techniques, WeeVis is based on more efficient debugging and redundancy detection techniques that make the environment applicable in interactive settings [12, 16, 21].

3 The WeeVis Environment

In its current version, WeeVis supports scenarios where user requirements can be defined in terms of functional requirements [23]. The corresponding recommendations (solutions) are retrieved from a predefined set of alternatives (also denoted as item set or product catalog). Requirements are checked with regard to their consistency with the underlying item set (consistency is given if at least one solution could be identified). If no solution could be found, WeeVis repair alternatives are determined on the basis of direct diagnosis algorithms [12]. This way, WeeVis does not only support item selection but also consistency maintenance processes on the basis of intelligent repair mechanisms [6].

WeeVis is based on the idea that a community of users cooperatively contributes to the development of a recommender knowledge base. The environment supports knowledge acquisition processes on the basis of tags that can be used for defining and testing recommendation knowledge bases. Using WeeVis, standard Wikipedia pages can be extended with recommendation knowledge that helps to represent domain knowledge in a more accessible and understandable fashion. The same principles used for developing Wikipedia pages can also be used for the development and maintenance of recommender knowledge bases, i.e., in the read mode recommenders can be executed and in the view source mode recommendation knowledge can be defined and adapted. This way, rapid prototyping processes can be supported in an intuitive fashion (changes to the knowledge can be immediately experienced by switching from the view source to the read mode). In the read mode, knowledge bases can as well be tested and in the case of inconsistencies (some test cases were not fulfilled within the scope of regression testing) corresponding diagnoses are shown to the user.

3.1 Overview

The website www.weevis.org provides a selection of different recommender applications (full list, list of most popular recommenders, and recommenders that have been defined previously) that can be tested and extended. Most of these applications have been developed within the scope of university courses on recommender systems (conducted at four Austrian universities). WeeVis recommenders can be integrated seamlessly into standard Wiki pages, i.e., informally defined knowledge can be complemented or even substituted with formal definitions.

In the following we will present the concepts integrated in the WeeVis environment on the basis of a working example from the domain of financial services. In such a recommendation scenario, a user has to specify his/her requirements regarding, for example, the expected capital guarantee level of the financial product or the amount of money he or she wants to invest. A corresponding WeeVis user interface is depicted in Figure 1 where requirements are specified on the left hand side and the corresponding recommendations are displayed on the right hand side.

Figure 1. A simple financial service recommender (WeeVis read mode).

Each recommendation (item) has a corresponding support value that indicates the share of requirements that are currently supported by the item. A support value of 100% indicates that each requirement is satisfied by the corresponding item. If the support value is below 100%, corresponding repair alternatives are shown to the user, i.e., alternative answers to questions that guarantee the recommendation of at least one item (with 100% support).

Since WeeVis is a MediaWiki-based environment, the definition of a recommender knowledge base is supported in a textual fashion on the basis of a syntax similar to MediaWiki. An example of the definition of a (simplified) financial services recommender knowledge base is depicted in Figure 2. Basic syntactical elements provided in WeeVis will be introduced in the next subsection.

Figure 2. Financial services knowledge base (view source (edit) mode).

3.2 WeeVis Syntax

Constraint-based recommendation requires the explicit definition of questions and possible answers, items and their properties, and constraints (see Figure 2).

In WeeVis the tag &QUESTIONS enumerates the set of user requirements where, for example, pension specifies whether the user wants a financial product to support his private pension plan [yes, no] and maxinvestment specifies the amount of money the user wants to invest. Furthermore, payment represents the frequency in which the payment should be done [once, periodical], payout specifies the frequency with which the customer gets a payout from the financial product (out of [once, monthly]), and guarantee the expected capital guarantee [low, high].

An item assortment can be specified in WeeVis using the &PRODUCTS tag (see Figure 2). In our example, the item (product) assortment is specified by values related to the attributes name; guaranteep, the capital guarantee the product provides; payoutp, the payout frequency of the product; mininvestp, the minimal amount of money for the financial service. Three items are specified: SecureFin, BonusFin, and DynamicFin.

Incompatibility constraints describe incompatible combinations of requirements. Using the &INCOMPATIBLE keyword, we are able to describe an incompatibility between the variables pension and guarantee. For example, financial services with low guarantee must not be recommended to users interested in a product that supports their private pension plan. Filter constraints describe relationships between requirements and items, for example, maxinvest ≥ mininvestp, i.e., the amount of money the user is willing to invest must be at least as high as the minimal payment necessary for the financial product.

In addition to the recommendation knowledge base itself, WeeVis supports the specification of test cases that can be used for the purposes of regression testing (see also Section 3.5). After changes to the knowledge base, regression tests can be triggered by setting the |show| tag, which specifies whether the recommender system user interface should show the status of the test case (satisfied or not).

3.3 Recommender Knowledge Base

Recommendation knowledge can be represented as a CSP [20] with the variables V (V = U ∪ P) and the constraints C = COMP ∪ PROD ∪ FILT where ui ∈ U are variables describing possible user requirements (e.g., pension) and pi ∈ P are variables describing item properties (e.g., payoutp). Furthermore, COMP represents incompatibility constraints of the form ¬X ∨ ¬Y, PROD the products with their attributes in disjunctive normal form (each product is described as a conjunction of individual product properties), and FILT the given filter constraints of the form X → Y.

The knowledge base specified in Figure 2 can be translated into a corresponding CSP where &QUESTIONS represents U, &PRODUCTS represents P and PROD, and &CONSTRAINTS represents COMP and FILT. On the basis of such a definition, WeeVis is able to calculate recommendations that take into account a specified set of requirements. Such requirements are represented as unary constraints (in our case R = {r1, r2, ..., rk}).

If requirements ri ∈ R are inconsistent with the constraints in C, we are interested in a subset of these requirements that should be adapted in order to be able to restore consistency. On a formal level we define a requirements diagnosis task and a corresponding diagnosis (see Definition 1).

Definition 1 (Requirements Diagnosis Task). Given a set of requirements R and a set of constraints C (the recommendation knowledge base), the requirements diagnosis task is to identify a minimal set ∆ of constraints (the diagnosis) that has to be removed from R such that (R − ∆) ∪ C is consistent.

An example of a set of requirements inconsistent with the defined recommendation knowledge is R = {r1 : pension = yes, r2 : maxinvest = 13500, r3 : payment = periodical, r4 : payout = once, r5 : guarantee = high}. The recommendation knowledge base induces two minimal conflict sets (CS) [18] in R which are CS1 : {r1, r5} and CS2 : {r1, r4}. For these conflict sets we have two diagnoses: ∆1 : {r4, r5} and ∆2 : {r1}. The pragmatics, for example, of ∆1 is that at least r4 and r5 have to be adapted in order to be able to find a solution. How to determine such diagnoses on the basis of a HSDAG (hitting set directed acyclic graph) is shown, for example, in [4].

In interactive settings, where diagnoses should be determined in an efficient fashion [12], hitting set based approaches tend to become too inefficient. The reason for this is that conflict sets [18] have to be determined as an input for the diagnosis process. This was the major motivation for developing and integrating FastDiag [12] into the WeeVis environment. Analogous to QuickXPlain [18], this algorithm is based on a divide-and-conquer approach that enables the determination of minimal diagnoses without the determination of conflict sets. A minimal diagnosis ∆ can be used as basis for determining repair actions, i.e., concrete measures to change user requirements in R such that the resulting R′ is consistent with C.

3.4 Diagnosis and Repair of Requirements

Definition 2 (Repair Task). Given a set of requirements R = {r1, r2, ..., rk} inconsistent with the constraints in C and a corresponding diagnosis ∆ ⊆ R (∆ = {rl, ..., ro}), the corresponding repair task is to determine an adaption A = {rl′, ..., ro′} such that (R − ∆) ∪ A is consistent with C.

In WeeVis, repair actions are determined in conformance with Definition 2. For each diagnosis ∆ determined by FastDiag (currently, the first n=3 leading diagnoses are determined), the corresponding solution search for (R − ∆) ∪ C returns a set of alternative repair actions (represented as adaptation A). In the following, all products that satisfy (R − ∆) ∪ A are shown to the user (see the right hand side of Figure 1).

Diagnosis determination in FastDiag is based on a total lexicographical ordering of the customer requirements [12]. This ordering is derived from the order in which a user has entered his/her requirements. For example, if r1 : pension = yes has been entered before r4 : payout = once and r5 : guarantee = high then the underlying assumption is that r4 and r5 are of lower importance for the user and thus have a higher probability of being part of a diagnosis. In our working example ∆1 = {r4, r5}. The corresponding repair actions (solutions for (R − ∆1) ∪ C) are A = {r4′ : payout = monthly, r5′ : guarantee = low}, i.e., ({r1, r2, r3, r4, r5} − {r4, r5}) ∪ {r4′, r5′} is consistent. The item that satisfies (R − ∆1) ∪ A is {DynamicFin} (see Figure 2). The identified items (p) are ranked according to their support value (see Formula 1).

\[ support(p) = \frac{\#\text{adaptions in } A}{\#\text{requirements in } R} \quad (1) \]

3.5 Regression Testing

WeeVis supports regression testing processes by the definition and execution of (positive) test cases which specify the intended behavior of the knowledge base. If some of the test cases are not accepted by the knowledge base (are inconsistent with the knowledge base), the causes of this unintended behavior have to be identified. On a formal level a recommender knowledge base (RKB) diagnosis task can be defined as follows (see Definition 3).

Definition 3 (RKB Diagnosis Task). Given a set C (recommender knowledge base) and a set T = {t1, t2, ..., tq} of test cases ti, the diagnosis task is to identify a minimal set ∆ of constraints (the diagnosis) that have to be removed from C such that ∀ti ∈ T : (C − ∆) ∪ {ti} is consistent.

Figure 3. WeeVis maintenance support: diagnosis and redundancy detection.
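To make the divide-and-conquer idea behind the diagnosis tasks more concrete, the following is a minimal sketch in the style of FastDiag, applied to the requirements example of Section 3.3. It is not the WeeVis implementation. The consistency check is a hypothetical stand-in that only knows the two conflict sets stated in the text ({r1, r5} and {r1, r4}) instead of calling a constraint solver, and the candidate list is assumed to be ordered from least to most important requirement.

```python
from typing import Callable, FrozenSet, List, Sequence

Constraint = str
Checker = Callable[[FrozenSet[Constraint]], bool]


def fast_diag(candidates: Sequence[Constraint],
              all_constraints: Sequence[Constraint],
              is_consistent: Checker) -> List[Constraint]:
    """Return one minimal diagnosis taken from `candidates`.

    Assumption: `candidates` is ordered from least to most important, so
    less important constraints are preferred as diagnosis elements.
    """
    rest = frozenset(c for c in all_constraints if c not in candidates)
    if not candidates or not is_consistent(rest):
        return []  # nothing to diagnose, or no diagnosis exists in candidates
    return _fd([], list(candidates), list(all_constraints), is_consistent)


def _fd(removed: List[Constraint], cands: List[Constraint],
        ac: List[Constraint], is_consistent: Checker) -> List[Constraint]:
    # If the constraints removed in the previous step already restored
    # consistency, no element of `cands` is needed in the diagnosis.
    if removed and is_consistent(frozenset(ac)):
        return []
    if len(cands) == 1:
        return list(cands)
    k = len(cands) // 2
    c1, c2 = cands[:k], cands[k:]
    d1 = _fd(c1, c2, [c for c in ac if c not in c1], is_consistent)
    d2 = _fd(d1, c1, [c for c in ac if c not in d1], is_consistent)
    return d1 + d2


# Hypothetical consistency check: inconsistent as soon as one of the
# conflict sets named in Section 3.3 is fully contained.
CONFLICTS = [{"r1", "r5"}, {"r1", "r4"}]


def example_is_consistent(constraints: FrozenSet[Constraint]) -> bool:
    return not any(cs <= constraints for cs in CONFLICTS)


if __name__ == "__main__":
    entered = ["r1", "r2", "r3", "r4", "r5"]   # order of user input
    least_important_first = entered[::-1]      # r5 and r4 come first
    print(fast_diag(least_important_first, entered, example_is_consistent))
```

With the requirements passed least important first, the sketch returns r4 and r5, which corresponds to ∆1 in the working example. The same function could in principle be used for the task of Definition 3 by passing the knowledge base constraints as candidates and a checker that tests every test case, but that variant is not shown here.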
An example test case inducing an inconsistency with C is t : pension = yes and guarantee = high and payout = once (see Figure 2). In this context, t induces two conflicts in C which are CS1 : ¬(pension = yes ∧ guarantee = high) and CS2 : ¬(pension = yes ∧ payout = once). In order to make C consistent with t, both incompatibility constraints have to be deleted from C, i.e., are part of the diagnosis ∆ (see Figure 3).

In contrast to the hitting set based approach [4], WeeVis includes a FastDiag based approach for knowledge base debugging which is more efficient and can therefore be applied in interactive settings [12]. In this context, diagnoses are searched in C (the test cases used for regression testing are assumed to be correct). In the case of requirements diagnosis, the total ordering of the requirements is related to user preferences. In the case of knowledge base diagnosis [4, 16], the ordering is currently derived from the ordering of the constraints in the knowledge base.

3.6 Identifying Redundancies

To support users in identifying redundant constraints in recommender knowledge bases, the CoreDiag [13] algorithm has been integrated into the WeeVis environment. CoreDiag relies on QuickXPlain [18] and is used for the determination of minimal cores (minimal non-redundant constraint sets). On a formal level a recommendation knowledge base (RKB) redundancy detection task can be defined as follows (see Definition 4).

Definition 4 (RKB Redundancy Detection Task). Let ca be a constraint of C (the recommendation knowledge base) and C̄ the logical negation (the complement or inversion) of C. Redundancy can be analyzed by checking (C − {ca}) ∪ C̄ for consistency: if consistency is given, ca is non-redundant. If this condition is not fulfilled, ca is said to be redundant. Iterating over each constraint of C, executing the non-redundancy check (C − {ca}) ∪ C̄, and deleting redundant constraints from C results in a set of non-redundant constraints (the minimal core).

As an example, the knowledge base shown in Figure 2 contains redundancies. Consequently, the corresponding set of constraints C does not represent a minimal core. Taking a closer look at the knowledge base it appears that two individual filter constraints are redundant with each other. More precisely, either the constraint &IF guarantee? = high &THEN guaranteep = high or the constraint &IF guarantee? = high &THEN guaranteep <> low can be removed from the knowledge base (in our example, the latter is proposed as redundant by CoreDiag – see Figure 3). In the general case, higher cardinality constraint sets can be removed, not only cardinality-1 sets as in our example [13].

Similar to the diagnosis of inconsistent requirements, the CoreDiag algorithm is based on the principle of divide-and-conquer: whenever a set S which is a subset of C is inconsistent with C̄, it is or contains a minimal core, i.e., a set of constraints which preserve the semantics of C. CoreDiag is based on the principle of QuickXPlain [18]. As a consequence a minimal core (minimal set of constraints that preserve the semantics of C) can be interpreted as a minimal conflict, i.e., a minimal set of constraints that are inconsistent with C̄. Based on the assumption of a strict lexicographical ordering [12] of the constraints in C, CoreDiag determines preferred minimal cores.

4 Empirical Study

4.1 Study Design

We conducted an experiment to highlight potential reductions of development and maintenance efforts facilitated by the WeeVis debugging and redundancy detection support. For this study we defined four knowledge bases that differed with regard to the number of constraints, variables, faulty constraints, and redundancies (see Table 1). Based on these example knowledge bases, the participants had to find solutions for the following two types of tasks:

1. Diagnosis task: The participants had to answer the question which minimal set ∆ of faulty constraints has to be removed from C (C = COMP ∪ FILT) such that there exists at least one solution for ((C − ∆) ∪ PROD).
2. Redundancy detection task: The participants had to answer the question which constraints in C = COMP ∪ FILT are redundant (if (C − {ca}) ∪ C̄ is inconsistent then the constraint ca is redundant).

knowledge base       number of constraints / variables / faulty constraints / test cases / redundancies
kb1 (redundant)      5/5/0/0/2
kb2 (inconsistent)   5/5/1/2/0
kb3 (redundant)      10/10/0/0/4
kb4 (inconsistent)   10/10/2/4/0

Table 1. Knowledge bases used in the empirical study.

                      group B (kb2)   group A (kb4)
average time (sec.)   281.3           497.5
correct (%)           50.0            10.0
incorrect (%)         50.0            90.0

Table 3. Time efforts and error rates related to the completion of diagnosis tasks.

The second goal of our experiment was to analyze time efforts and error rates related to the identification of redundant constraints in recommender knowledge bases. The second hypothesis tested in our experiment was the following:

Hypothesis 2: Even low-complexity knowledge bases trigger the faulty identification of redundant constraints.

The average time for identifying redundant constraints in knowledge base kb1 was 189.2 seconds; for kb3, 337.4 seconds were needed. The results show a significantly higher error rate when the participants had to identify redundant constraints in the more complex knowledge base (see Table 4). Hypothesis 2 can be confirmed since even for low complexity knowledge bases error rates related to redundancy detection tasks are high.

The participants (subjects N=20) of our experiment were separated into two groups (groups A and B). All subjects were students of Computer Science (20% female, 80% male) who successfully completed a course on constraint technologies and recommender systems. Each subject had to complete the assigned tasks on his/her own on a sheet
With the automated redundancy of paper and they had to track the time for each task. In our exper- detection mechanisms integrated in W EE V IS, reductions of related iment we randomly assigned the participants to one of the two test error rates and time efforts can be expected. groups shown in Table 2. This way we were able to compare the time efforts of identifying faulty constraints and redundancies in knowl- groupA groupB edge bases as well as to estimate error rates related to the given tasks. (kb1 ) (kb3 ) average time (sec.) 189.2 337.4 testgroup 1st knowledge 2nd knowledge correct (%) 40.0 0.0 base base incorrect (%) 60.0 100.0 A (n = 10) kb1 (redundancy detection) kb4 (diagnosis) B (n = 10) kb2 (diagnosis) kb3 (redundancy detection) Table 4. Time efforts and error rates related to the completion of redundancy detection tasks. Table 2. Each subject had to complete one diagnosis and one redundancy detection task. Members of group A had a redundancy detection task of lower complexity and a higher complexity diagnosis detection task (randomized order). Vice-versa members of group B had to solve a higher complexity redundancy detection and a lower complexity diagnosis task. 5 Future Work There are a couple of issues for future work. The current W EE - V IS version does not include functionalities that allow the learn- ing/prediction of user preferences. The importance of individual user 4.2 Study Results requirements is based on the assumption that the earlier a require- The first goal of our experiment was to analyze time efforts and er- ment has been specified the more important it is. In future versions ror rates related to the identification of faulty constraints in recom- we want to make the modeling of preferences more intelligent by in- mender knowledge bases. 
The first hypothesis tested in our experi- tegrating, for example, learning mechanisms that derive requirements ment was the following: importance distributions on the basis of analyzing already completed Hypothesis 1: Even low-complexity knowledge bases recommendation sessions. trigger the identification of faulty diagnoses (note that all Diagnoses and redundancies are currently implemented on the knowledge bases used in the experiment can be interpreted level of constraints, i.e., intra-constraint diagnoses and redundancies as low-complexity knowledge bases [13]). are not supported. In future W EE V IS versions we want to integrate The average time effort for identifying minimal diagnoses in fine-granular analysis methods that will help to make analysis and knowledge base kb2 was 281.3 seconds, the average time needed to repair of constraints even more efficient. A major research challenge identify diagnoses in kb4 was 497.5 seconds. The results show a sig- in this context is to integrate intelligent mechanisms for diagnosis nificantly higher error rate when the participants had to identify the discrimination [27] since in many scenarios quite a huge number faulty constraints in the more complex knowledge base (see Table 3). of alternative diagnoses exists. In such scenarios it is important for Hypothesis 1 can be confirmed by the results in Table 3 that show that knowledge engineers to receive recommendations of diagnoses that even simple knowledge bases trigger high error rates and increasing are reasonable. This challenge has already been tackled in the context time efforts. With the automated diagnosis detection mechanisms in- of diagnosing inconsistent user requirements (see, e.g., [6]), however, tegrated in W EE V IS, reductions of related error rates and time efforts heuristics with high prediction quality for knowledge bases have not can be expected. been developed up to now [10, 11]. 
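To make the constraint-level notions of diagnosis (Definition 3) and redundancy (Definition 4) concrete, the following Python sketch checks consistency by brute-force enumeration over small finite domains. It is only an illustration under stated assumptions and not the FASTDIAG or COREDIAG algorithms used in WEEVIS: the variable domains, the constraint names CS1, CS2, F1 and F2, and all function names are hypothetical, and the constraints are hand-translated from the examples in Sections 3.5 and 3.6.

```python
from itertools import combinations, product


def assignments(domains):
    """Enumerate every complete variable assignment over finite domains."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))


def consistent(constraints, domains):
    """True if at least one assignment satisfies all constraints."""
    return any(all(c(a) for c in constraints) for a in assignments(domains))


def minimal_diagnosis(kb, tests, domains):
    """Definition 3 by exhaustive search: a smallest set of constraint names
    whose removal makes every test case consistent with the rest of kb."""
    names = list(kb)
    for size in range(len(names) + 1):
        for delta in combinations(names, size):
            rest = [kb[n] for n in names if n not in delta]
            if all(consistent(rest + [t], domains) for t in tests):
                return set(delta)
    return None  # some test case is inconsistent on its own


def negation_of(constraints):
    """Complement of C: satisfied iff at least one constraint is violated."""
    cs = tuple(constraints)
    return lambda a: not all(c(a) for c in cs)


def non_redundant_core(kb, domains):
    """Definition 4, iterated naively: c_a is redundant if (C - {c_a}) together
    with the negation of C is inconsistent. Later constraints are checked first."""
    core = dict(kb)
    for name in reversed(list(kb)):
        rest = [c for n, c in core.items() if n != name]
        if not consistent(rest + [negation_of(core.values())], domains):
            del core[name]
    return core


# Assumed domains; the paper does not list them.
DOMAINS = {"pension": ["yes", "no"], "guarantee": ["low", "high"],
           "payout": ["once", "monthly"]}
# Incompatibility constraints CS1 and CS2 from Section 3.5.
KB = {
    "CS1": lambda a: not (a["pension"] == "yes" and a["guarantee"] == "high"),
    "CS2": lambda a: not (a["pension"] == "yes" and a["payout"] == "once"),
}
# Test case t from Section 3.5.
TEST_CASE = lambda a: (a["pension"] == "yes" and a["guarantee"] == "high"
                       and a["payout"] == "once")

# The two filter constraints from Section 3.6 over assumed two-valued domains.
FILTER_DOMAINS = {"guarantee_req": ["low", "high"], "guarantee_p": ["low", "high"]}
FILTERS = {
    "F1": lambda a: a["guarantee_req"] != "high" or a["guarantee_p"] == "high",
    "F2": lambda a: a["guarantee_req"] != "high" or a["guarantee_p"] != "low",
}

print(sorted(minimal_diagnosis(KB, [TEST_CASE], DOMAINS)))  # ['CS1', 'CS2']
print(list(non_redundant_core(FILTERS, FILTER_DOMAINS)))    # ['F1']
```

For the example test case the sketch reports both incompatibility constraints as the diagnosis, and for the two filter constraints it keeps F1 and drops F2, the latter constraint. Exhaustive enumeration grows exponentially with the number of variables and constraints, so it is suitable only for toy examples of this size.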
A major issue for future work is to integrate alternative mechanisms for knowledge base development and maintenance. The knowledge engineer centered approach to knowledge base construction leads to scalability problems in the long run, i.e., knowledge engineers are not able to keep up with the speed of knowledge base related change and extension requests. An alternative approach to knowledge base development and maintenance is the inclusion of concepts of Human Computation [7, 30] which allow a deeper integration of domain experts into knowledge engineering processes on the basis of simple micro tasks. Resulting micro contributions can be automatically integrated into constraints that are part of the recommendation knowledge base.

Finally, we are interested in a better understanding of the key factors that make knowledge bases understandable. More insights and answers related to this question will help us to better identify problematic areas in a knowledge base which could cause maintenance efforts above average. A first step in this context will be to analyze existing practices in knowledge base development and maintenance with the goal to figure out major reasons for the knowledge acquisition bottleneck and how this can be avoided in the future.

6 Conclusion

In this paper we presented WEEVIS which is an open constraint-based recommendation environment. By exploiting the advantages of Mediawiki, WEEVIS provides an intuitive basis for the development and maintenance of constraint-based recommender applications. WEEVIS is already applied by four Austrian universities within the scope of recommender systems courses and also applied by companies for the purpose of prototyping recommender applications. The results of our empirical study indicate the potential of reductions of error rates and time efforts related to diagnosis and redundancy detection. In industrial scenarios, WEEVIS can improve the quality of knowledge representations, for example, documentations can at least partially be formalized which makes knowledge more accessible: instead of reading a complete documentation, the required knowledge chunks can be identified more easily.

REFERENCES

[1] R. Bakker, F. Dikker, F. Tempelman, and P. Wogmim, ‘Diagnosing and Solving Over-determined Constraint Satisfaction Problems’, in 13th International Joint Conference on Artificial Intelligence, pp. 276–281, Chambery, France, (1993).
[2] J. Baumeister, J. Reutelshoefer, and F. Puppe, ‘KnowWE: a Semantic Wiki for Knowledge Engineering’, Applied Intelligence, 35(3), 323–344, (2011).
[3] A. Felfernig and R. Burke, ‘Constraint-based Recommender Systems: Technologies and Research Issues’, in 10th International Conference on Electronic Commerce, p. 3. ACM, (2008).
[4] A. Felfernig, G. Friedrich, D. Jannach, and M. Stumptner, ‘Consistency-based Diagnosis of Configuration Knowledge Bases’, Artificial Intelligence, 152(2), 213–234, (2004).
[5] A. Felfernig, G. Friedrich, D. Jannach, and M. Zanker, ‘An Integrated Environment for the Development of Knowledge-based Recommender Applications’, International Journal of Electronic Commerce, 11(2), 11–34, (2006).
[6] A. Felfernig, G. Friedrich, M. Schubert, M. Mandl, M. Mairitsch, and E. Teppan, ‘Plausible Repairs for Inconsistent Requirements’, in IJCAI, volume 9, pp. 791–796, (2009).
[7] A. Felfernig, S. Haas, G. Ninaus, M. Schwarz, T. Ulz, M. Stettinger, K. Isak, M. Jeran, and S. Reiterer, ‘RecTurk: Constraint-based Recommendation based on Human Computation’, in RecSys 2014 CrowdRec Workshop, pp. 1–6, Foster City, CA, USA, (2014).
[8] A. Felfernig, K. Isak, K. Szabo, and P. Zachar, ‘The VITA Financial Services Sales Support Environment’, pp. 1692–1699, Vancouver, Canada, (2007).
[9] A. Felfernig and A. Kiener, ‘Knowledge-based Interactive Selling of Financial Services with FSAdvisor’, in 17th Innovative Applications of Artificial Intelligence Conference (IAAI05), pp. 1475–1482, Pittsburgh, Pennsylvania, (2005).
[10] A. Felfernig, S. Reiterer, M. Stettinger, and J. Tiihonen, ‘Intelligent Techniques for Configuration Knowledge Evolution’, in VAMOS Workshop 2015, pp. 51–60, Hildesheim, Germany, (2015).
[11] A. Felfernig, S. Reiterer, M. Stettinger, and J. Tiihonen, ‘Towards Understanding Cognitive Aspects of Configuration Knowledge Formalization’, in VAMOS Workshop 2015, pp. 117–124, Hildesheim, Germany, (2015).
[12] A. Felfernig, M. Schubert, and C. Zehentner, ‘An Efficient Diagnosis Algorithm for Inconsistent Constraint Sets’, Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 26, 53–62, (2012).
[13] A. Felfernig, C. Zehentner, and P. Blazek, ‘COREDIAG: Eliminating Redundancy in Constraint Sets’, International Workshop on Principles of Diagnosis (DX’11), 219–224, (2011).
[14] G. Fleischanderl, G. Friedrich, A. Haselböck, H. Schreiner, and M. Stumptner, ‘Configuring Large Systems Using Generative Constraint Satisfaction’, IEEE Intelligent Systems, 13(4), 59–68, (1998).
[15] E. Freuder, ‘In Pursuit of the Holy Grail’, Constraints, 2(1), 57–61, (1997).
[16] G. Friedrich, ‘Interactive Debugging of Knowledge Bases’, in International Workshop on Principles of Diagnosis (DX’14), pp. 1–4, Graz, Austria, (2014).
[17] Russell Greiner, Barbara A. Smith, and Ralph W. Wilkerson, ‘A Correction to the Algorithm in Reiter’s Theory of Diagnosis’, Artificial Intelligence, 41(1), 79–88, (1989).
[18] U. Junker, ‘QUICKXPLAIN: Preferred Explanations and Relaxations for Over-constrained Problems’, in AAAI, volume 4, pp. 167–172, (2004).
[19] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl, ‘GroupLens: Applying Collaborative Filtering to Usenet News’, Communications of the ACM, 40(3), 77–87, (1997).
[20] A. Mackworth, ‘Consistency in Networks of Relations’, Artificial Intelligence, 8(1), 99–118, (1977).
[21] J. Marques-Silva, F. Heras, M. Janota, A. Previti, and A. Belov, ‘On Computing Minimal Correction Subsets’, in IJCAI 2013, pp. 615–622, Peking, China, (2013).
[22] S. Mittal and B. Falkenhainer, ‘Dynamic Constraint Satisfaction’, in National Conference on Artificial Intelligence, pp. 25–32, (1990).
[23] S. Mittal and F. Frayman, ‘Towards a Generic Model of Configuration Tasks’, in IJCAI, volume 89, pp. 1395–1401, (1989).
[24] B. O’Sullivan, A. Papadopoulos, B. Faltings, and P. Pu, ‘Representative Explanations for Over-constrained Problems’, in Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07), eds., R. Holte and A. Howe, pp. 323–328, Vancouver, Canada, (2007). AAAI Press.
[25] M. Pazzani and D. Billsus, ‘Learning and Revising User Profiles: The Identification of Interesting Web Sites’, Machine learning, 27(3), 313–331, (1997).
[26] R. Reiter, ‘A Theory of Diagnosis From First Principles’, Artificial intelligence, 32(1), 57–95, (1987).
[27] K. Shchekotykhin and G. Friedrich, ‘Diagnosis discrimination for ontology debugging’, in ECAI 2010, pp. 991–992, (2010).
[28] M. Stumptner, G. Friedrich, and A. Haselböck, ‘Generative Constraint-based Configuration of Large Technical Systems’, AI EDAM, 12(04), 307–320, (1998).
[29] E. Tsang, Foundations of Constraint Satisfaction, volume 289, Academic press London, 1993.
[30] L. VonAhn, ‘Human Computation’, in Technical Report CM-CS-05-193, (2005).

A Personal Data Framework for Exchanging Knowledge about Users in New Financial Services

Beatriz San Miguel, Jose M. del Alamo and Juan C. Yelmo¹

¹ Center for Open Middleware, Universidad Politécnica de Madrid, Spain, email: {beatriz.sanmiguel, jose.delalamo, juancarlos.yelmo}@centeropenmiddleware.com

Abstract. Personal data is a key asset for many companies, since this is the essence in providing personalized services. Not all companies, and specifically new entrants to the markets, have the opportunity to access the data they need to run their business. In this paper, we describe a comprehensive personal data framework that allows service providers to share and exchange personal data and knowledge about users, while enabling users to decide who can access which data and why. We analyze the challenges related to personal data collection, integration, retrieval, and identity and privacy management, and present the framework architecture that addresses them. We also include the validation of the framework in a banking scenario, where social and financial data is collected and properly combined to generate new socio-economic knowledge about users that is then used by a personal lending service.

1 INTRODUCTION

Tailored and customized features are becoming increasingly popular in IT services. These adjust offers and functionalities of services to the user preferences, interests and personal needs, generally going beyond the functionality of the service itself and thus improving it. The banking sector is no exception, and for some time now new players have appeared to offer financial services based on personalization and recommendations.

Traditionally, banks have been early adopters of new technology solutions, but mainly following a bank-centric approach that users are rarely able to notice [1]. IT companies and new service providers have leveraged this gap to offer user-centric financial services. For example, on-line payment is one of the most competitive areas into which IT companies such as PayPal, Google or Apple have entered. Moreover, many financial services related to crowdfunding, lending clubs, investment recommendations, financial aggregators that allow the management of personal finances, the comparison or recommendation of banking products, etc. have transformed the traditional ways of financial organizations, or have even created entirely new ones.

These innovative financial services create new opportunities, but also potential threats in the industry. It is vital for banks to understand the new directions and develop threats into new opportunities and returns. In this sense, most of these new financial services require personal data and financial information about users in order to know them better and then offer and improve services. Here banks possess inherent competitive advantages, since they have a large amount of customer data, transaction information, and the capabilities to enable financing and secure services [2] and [3].

Well aware of this situation, in 2014 the Center for Open Middleware (COM), a joint technology center created by Santander Bank and Universidad Politécnica de Madrid, launched a pilot project intended to research, analyze and evaluate new potential opportunities and applications around personal data. Specifically, the project aims to establish a framework that allows the sharing and use of personal data among companies, and the creation of knowledge about users, while allowing users to manage and control their flow of personal information, defining who accesses which data and why.

In this paper we introduce the aforementioned framework which has been called the Personal Data Framework (PeDF). The PeDF includes mechanisms for gaining access to personal data from several heterogeneous data sources, and integrating them to facilitate their analysis and processing to produce and infer new knowledge about users. This information can be provided to new financial service providers that, as new players, do not have sufficient personal data to offer their services. On the other hand, there are currently tensions related to the use of personal data, causing privacy and trust concerns in users. In this context, the European public sector is attempting to regulate and evolve the existing legislation to strengthen individual rights in relation to the uses of their personal data and their privacy, while boosting the digital and personal data economy [4]. Therefore, the framework includes the necessary tools to involve users in the management and control of their personal information.

The remainder of the paper is organized as follows. First, Section 2 includes the technological background for each issue related to personal data that the PeDF covers: collection, integration, retrieval, and identity and privacy management. Then, Section 3 describes the PeDF architecture, and Section 4 includes the PeDF validation that we have conducted in the financial context. Finally, we present related work in Section 5, and conclude the paper by highlighting conclusions and future directions in Section 6.

2 TECHNOLOGICAL APPROACHES

The PeDF acts as an intermediate entity between service providers and individuals to allow the former to share and exchange existing personal data and new knowledge obtained from them which cannot be done unilaterally, while enabling users to retrieve a global view of their personal information and decide who can access which data and why. To make it possible, the PeDF has to include mechanisms for gaining access to personal data that are scattered across different service providers (data sources). When the data sources supply personal data to the PeDF, it has to be able to integrate them. This integration must allow the PeDF to provide personal data and knowledge obtained from these data to service providers (referred to as data consumers). All of the above has to be controlled by the user and thus, it requires the PeDF to include identity and privacy management solutions.

In summary, the PeDF covers four main technological issues: personal data collection, integration, retrieval, and identity management and privacy. Next, we will present the background associated with each issue, detailing its technological solutions.

2.1 Personal data collection

Data sources can be classified into two main categories in relation to personal data access: public or private, but one source can be categorized as both, depending on the personal data concerned. The public data sources contain personal data that are accessible in an equitable way for any entity in the public network. On the other hand, in the private data sources, the personal data can only be accessed by authorized entities. We can think of numerous examples of personal data sources, such as social networks, instant messaging services, mobile applications, and many other service providers specialized in a specific user domain such as education, banking, or e-commerce. As an illustrative example, a social network can act as a public or private data source depending on the user configuration.

There are different technologies that allow third parties to collect the personal data from data sources. For the public ones, the so-called Internet bots, spiders, or web crawlers are the most representative. These are software solutions that automatically search, access and retrieve public information on the Internet.

As regards private data sources, there are several mechanisms based on user consent that allow third parties to access the protected personal data. One of the easiest ways is the method based on data files. Files of this kind contain personal data created by a user in a specific data source and can be exported by users. For example, Google allows its users to access their personal data by downloading different files². The main problem associated with this solution is that it requires extra work for the users, since they have to be actively involved to download their files, carrying out manual tasks. Moreover, files can be easily manipulated to change their content, and therefore, the security mechanisms are weak. In order to solve this problem, a set of programming functions, protocols, and standards has appeared to automate the process: data sharing Application Programming Interfaces (APIs).

APIs have become the de facto mechanism for sharing and exchanging personal data, since they allow different software applications to communicate and interact directly [3]. They offer code-based access to different functionalities and services to third parties by abstracting their implementation details. On the Internet, the Representational State Transfer (REST) [5] architectural style has recently emerged as the favorite for implementing APIs. It is based on the Hypertext Transfer Protocol (HTTP) to allow connectivity, but it does not specify the syntax of messages. The individual messages and interfaces are designed according to the suppliers' semantics. For example, Facebook and Twitter include different APIs (Graph API³ and REST APIs⁴, respectively) to read and write their user personal data, which are based on HTTP for communication, and JavaScript Object Notation (JSON) [6] for data interchange. Although the same protocol and language still apply, there are differences, since the suppliers' APIs use different syntax and semantics to refer to the same data.

In a nutshell, there is no unified API specification, each API contains its own description, which can be poorly documented, and therefore, understanding each one is challenging. There are some initiatives to solve the associated API problems, such as the OpenSocial standards [7] that include a set of open APIs that developers can use to gain access to user personal resources hosted by different providers who have implemented them. We can find a few related solutions in the social network services, such as [8], which proposes a framework to integrate the interaction with different social APIs.

² https://support.google.com/accounts/answer/3024190?hl=en
³ https://developers.facebook.com/docs/graph-api
⁴ https://dev.twitter.com/rest/public

2.2 Personal data integration

Data integration is an old field of research that aims at combining data from different sources and providing them in a unified view [9]. Over time, many solutions have been proposed [10], but two main approaches regarding storage can be followed:

• Centralized way. The personal data is retrieved from external data sources, saved, and stored in a central repository. This is a replication of the personal data stored by data sources and thus, maintaining and updating the replicated data is a key issue. It must incorporate techniques to carry out a periodical refreshing of personal data, or even better, mechanisms that allow the detection of data changes in real time. Despite the aforementioned, it has clear benefits related to availability and timeliness. Furthermore, it facilitates data analysis and processing.
• Decentralized way. Here, there is a central directory or registry and a distributed data storage. It entails little or no storage since personal data is maintained and stored by each external data source. However, personal data access is more complex and generally less efficient than the previous way because recovering data is carried out on the fly and there can be source access limitations.

The two mechanisms are complementary since the central repository of the first way can be considered as an extra storage point for the decentralized solution. Furthermore, both solutions face the challenges of corresponding personal data at different data sources, and giving them a common definition. The former entails the development of algorithms and mapping techniques that (semi)automate the correspondence process to eliminate manual tasks. On the other hand, the common definition of personal data involves establishing a standard to represent the personal data.

There is no standard or generally adopted representation for personal data, neither for the structure (format of the representation), nor even for the semantics (meaning of the content). We can find many proposals for standards and proprietary solutions to define each personal data category, almost as many as there are service providers. One of the most promising solutions for integrating all these discrepancies is the use of ontologies.

An ontology is an engineering artifact made up of a vocabulary that describes a certain reality, and a set of explicit assumptions regarding the intended meaning of the vocabulary terms [11]. It enables a common understanding of a specific domain to be shared across a wide range of service providers, adding interoperability, consistency, reusability, and many other advantages [12].

Over time, many ontologies have been proposed for diverse domains including healthcare, molecular biology, or web searching. There are general ontologies describing concepts (e.g., object, process and event) that are the same across different domains, such as the Suggested Upper Merged Ontology (SUMO) [13]. Additionally, there are more specific ontologies (namely domain ontologies) that represent the particular concepts of a domain. In the social network field, the Friend of a Friend (FOAF) ontology [14] includes the main terms to describe people, the links between them and the things they create and do on the Internet. In the financial industry, the Financial Industry Business Ontology (FIBO) [15] is an ongoing definition of financial industry terms such as contracts, product/service specifications and governance compliance documents. SUMO also includes domain ontologies for finance and economy.

Finally, there are different methodologies and languages for defining your own ontologies, such as those described in [16]. One of the most popular languages is the Web Ontology Language (OWL) [18] that is part of the W3C technology stack. OWL allows the definition of concepts and the complex and rich relationships between them.

2.3 Personal data and knowledge retrieval

Personal data can be offered to third entities, and even more interestingly, these data can be analyzed and processed to obtain knowledge that cannot be achieved unilaterally by service providers. The process for producing this knowledge is referred to as user modelling in the literature [19].

Traditionally, user modelling is a one-sided process in which service providers autonomously collect personal data and then generate user models that satisfy their business needs in a specific domain. A user model is understood as the interpretation of a person in a specific context for an organization. It includes what the organization thinks the user is, prefers, wants, or is going to do, and comprises mainly derived and inferred data. The user model can be used to recommend new contents or services, personalize user interaction, or predict user behavior, among others.

There are different techniques to create user models; choosing one or another depends on what information is being stored and the final application of the model. Next, we point out some of the approaches that can be taken.

2.3.1 Vector-based models

Here, a user is represented by a set of feature-value pairs. The features can be items or concepts of a domain, such as products of a shop, or links on a web site. Each of them has an associated value (usually, a boolean or real number) that indicates the attitude of a user to this feature. For example, the value can indicate whether a user has searched for a product or the number of visits to a link. There are other approaches similar to this one such as keyword-based, bag of words, or user-items rating matrix [20], which consider only words or terms interesting to users with or without an associated value, or historical user ratings on items, respectively.

This approach is one of the simplest since its implementation and retrieval is quite easy. It has been used by nearly every information retrieval system [21]. However, it is difficult to share with other data consumers because the features and values can be misinterpreted. Moreover, there is a lack of connection between concepts and it does not help in modelling users for other contexts.

2.3.2 Stereotypes

Stereotype modelling [21] attempts to cluster all possible users of a system into different groups, namely stereotypes. Each user that belongs to the same stereotype is treated like the rest of the members of the group so his or her individual features are not considered. Typically, the data used in the classification is demographic data that users have to provide, for example in a registration form.

The main goals of this modelling approach are to define the stereotypes of a system and to implement the trigger techniques that provide mapping from a specific user to one stereotype. These include different clustering analyses, machine-learning techniques and reasoning among others [22]. An obvious disadvantage of this approach lies in the limited personalization and individualization of users, besides the difficulty in recovering new user models from the existing ones.

2.3.3 Classifier based models

Classifier systems [23] use information about items or the domain together with user data as an input to generate a custom response to the user. These can be implemented using different machine learning methods and the user model is represented as the particular model structure of the used classifier. For example, there can be user models based on decision trees, association rules, or Bayesian Networks. This approach, like the previous ones, has difficulties in retrieving and sharing user models since it is very limited and is based on solving specific tasks.

2.3.4 Semantic user modelling

Semantic technologies have appeared as a way to solve communication problems and interoperability issues among systems, and to provide and facilitate reusability, reliability, and a common specification [12]. Semantic user modelling [20] is based on using ontologies that model a user or a specific domain using a rich network where terms are connected by different kinds of links that indicate their relations [24].

Using ontologies solves the polysemy problem and facilitates retrieving and sharing user models between entities. There are different languages and techniques that allow the extraction of data from ontologies. For example, the SPARQL Protocol and Resource Description Framework (RDF) Query Language (SPARQL) and the accompanying protocols [25] make it possible to send queries and receive results from semantic data (expressed as RDF information), e.g., through HTTP. Moreover, new relations between concepts and thus, about user features, can be inferred from the ontology representation. Particularly, reasoner engines [16] are software components that allow the autonomous discovery of new knowledge from ontologies. Generally, they employ their own rules, axioms and appropriate chaining methods. We can find stand-alone reasoners, such as Pellet⁵, or reasoners included in different semantic frameworks, for example Protégé⁶ and Jena⁷.

⁵ http://clarkparsia.com/pellet
⁶ http://protege.stanford.edu/
⁷ https://jena.apache.org/

2.4 Identity Management and Privacy

Identity management commonly refers to the processes involved in the management and selective disclosure of personal data, either within an institution or between several entities, while preserving and enforcing both privacy and security requirements. There are different approaches to implementing identity management, mainly: network-centric and user-centric approaches [26].

Network-centric approaches are based on agreements between service providers that establish trust relationships. Each service provider maintains its own personal data but users can link (federate) isolated accounts that they own across different providers to be recognized within the federated domain. Technological standards for identity federation include the OASIS Security Assertion Markup Language (SAML) [27] and the Kantara Initiative⁸.

On the other hand, user-centric approaches highlight user empowerment in the governing of their personal information. Generally, there is a third entity that is in charge of providing user identity to service providers and the user is in the center of the transactions, managing the sharing of personal data. Examples of this approach are [28]: OpenID, OAuth 2.0, and OpenID Connect.

Most of the social-based APIs for personal information sharing rely on OAuth 2.0, as for example the Facebook Login API⁹. It introduces a third role to the traditional client-server authentication/authorization model: the resource owner. Following this model, the client (who is not the resource owner, but is acting on his behalf) requests access to resources controlled by the resource owner, but hosted by a container i.e. the online social network. OAuth 2.0 allows the service provider to verify the identity of the client making the request, as well as ensuring that

3 FRAMEWORK ARCHITECTURE

As described in the previous section, there are many solutions and specific technologies to handle the design and implementation of the PeDF. We have proposed a comprehensive architecture for the PeDF that considers different approaches for personal data collection, integration, retrieval, and identity and privacy management, regardless of the specific technologies and implementations. Figure 1 represents this PeDF architecture where we can distinguish its modules, and its relationships with different external data sources, data consumers, and the user.

Figure 1.
Personal Data Framework architecture the resource owner has authorized the transaction without revealing their credentials. Firstly, we have considered that there are diverse existing data Identity management technologies also contribute to privacy sources (private or public), and crawlers on the Internet that can be management by allowing users to decide on the sharing process. linked with the PeDF to gain access to user personal data. This data However, this is not enough, as any system managing personal source-user association can be carried out by the user through the information must abide by the privacy and data protection legal User Manager module, or by data consumers via the Registrar framework in place, and thus fulfill a set of requirements derived module but the latter requires user consent. from the legal principles. For example, in Europe the main Once the data sources are linked, the Collector module is in principles include lawfulness collection and processing; gathering charge of obtaining personal data from them and these data have to specific, informed and explicit consent from data subjects; purpose be integrated. We have proposed two complementary approaches binding; necessity and data minimization; transparency and to carry out this integration. One is based on collecting and storing openness; rights of the individual; and, security safeguards [29]. personal data, which requires a User Data Store module. The other The state of the art includes a plethora of technological method is based on indexing personal data, which entails a solutions, each addressing a specific privacy concern, and globally Registry module that identifies which personal data can be referred to as Privacy Enhancing Technologies (PETs) [29]. accessed and where they are stored. 
However, adding PETs on top of an existing system does not solve Moreover, we have provided the PeDF with the ability to supply all privacy requirements, and thus there is a general consensus on personal data and user models to data consumers through a the need to introduce Privacy by Design (PbD) approaches when Retriever module. The creation of user models entails the developing systems i.e. considering privacy issues from the onset incorporation of different components that extract knowledge from of a project and through its entire lifecycle [30]. personal data. These components have been grouped together in a All the aforementioned technologies facilitate the access and main component namely Generator. management of personal data. However, user-centric solutions Summarizing, the PeDF incorporates seven modules: allow users to control and manage their personal data directly, 1. User Manager. It is a vertical module that allows users to bringing a better user-experience. interact with PeDF to sign in, activate the incorporation of new data sources, and check and manage authorizations for access to their personal data and user accounts. It implements an identity management infrastructure and privacy solutions. 2. Registrar. This module allows data consumers to ask for the incorporation of new data sources in order to include new 8 https://kantarainitiative.org/ 9 https://developers.facebook.com/products/login/ Page 22 personal data in the PeDF. It interacts with the User Manager 4.1 External data sources module to obtain the user consent. We have considered two private data sources for PeDF validation: 3. Collector. This module is in charge of obtaining personal data PosdataP2P service, and the social network Facebook. from external data sources, checking user authorization. It can PosdataP2P service [17] is an innovative financial service also include crawlers’ components that get personal data from developed within the context of a COM project. 
It allows public data sources. Santander University Smart Card (USC) holders to make payments 4. Registry. It allows the PeDF to store pointers to external to or request money from friends, using alternative social channels personal data that the PeDF is able to recover from data such as texting systems e.g. Telegram, or online social networks sources. e.g. Facebook or Twitter. 5. Generator. It comprises a set of components that allow PeDF The USC is a smart card issued by over 300 universities in to obtain user models from personal data. These implement collaboration with Santander Bank. It is used by 7.8 million people different techniques of user modelling to uncover user needs, worldwide to access university services, such as libraries, control preferences, interests, etc. access (for example, to computers, campus, sports pavilions, etc.), 6. User Data Store. It is a central repository that stores the electronic signature, discounts at retailers, etc. It can be also used personal data that is obtained from external data sources or by to gain access to Santander Bank financial services, working as a the Generator module. It contains different interfaces that credit/debit card linked to the holder’s saving account. allow the updating and refreshing of personal data. To use PosdataP2P service, USC holders have to activate the 7. Retriever. This module is in charge of communicating with service first, providing their USC information. Then, they choose data consumers who are interested in obtaining personal data the social channels that they want to use to carry out financial and user models of a specific user. It interacts with the User transactions. Having done that, students can start making financial Manager module to check user consent and with the Registry transactions by simply posting messages to their friends within or User Data Store to retrieve the personal data requested. their enabled social channels (Figure 3). 
4 FRAMEWORK VALIDATION We have validated the PeDF in a banking scenario which considers a person-to-person payment service namely PosdataP2P, and the social network Facebook as data sources. Moreover, it includes a financial service called FriendLoans that uses user models from the PeDF to offer its users recommendations about microloans. It is an integration effort to provide user models that fulfill individual business needs of third entities. We have focused our work on a centralized integration based on semantic technologies, which improve the user modelling process. Moreover, we have validated the PeDF with five beta testers from our research group. Figure 2 represents our validation to the PeDF. Here, we can observe the two private data sources (PosdataP2P and Facebook), the data consumer (MicroLoans), the user and the main PeDF modules that we have validated: User Manager, Collector, User Data Store, Generators, and Retriever. Figure 3. PosdataP2P screenshot using Facebook as a channel The PosdataP2P service generates financial data on USC holders, which is properly recovered by the PeDF in real time. Specifically, the PosdataP2P has an interface to notify financial transaction to PeDF. The PeDF also obtains demographic and social data from Facebook with user consent. It is based on the Facebook Login and the Facebook Graph API as mentioned in Section 2. 4.2 A Personal Socio-Economic Network The PeDF validation applies a centralized approach where personal data obtained from external data sources are stored in a central repository. Specifically, it is based on a semantic modelling and storing, and an ontology, namely the Personal Socio-Economic Network (PSEN). Figure 2. Personal Data Framework validation architecture The PSEN represents the exchange of money between people and user social data. We have considered the reusing of existing ontologies, which is a must to allow semantic and syntactic Page 23 interoperability. 
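To make the interoperability argument concrete, the following minimal sketch stores facts from a social source and from a payment source as subject-predicate-object triples in one set. It is only an illustration under stated assumptions, not the PeDF implementation: the `ex:` identifiers, the `ex:payer`, `ex:payee` and `ex:amount` predicates, and both helper functions are hypothetical; only `foaf:knows` is taken from the FOAF vocabulary. Because both sources name people with the same identifiers, the two kinds of facts can be joined without any translation step.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(store: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that agrees with the pattern (None is a wildcard)."""
    for triple in store:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# Facts as a social source might provide them (hypothetical data).
social = {
    ("ex:maria", "foaf:knows", "ex:juan"),
    ("ex:maria", "foaf:knows", "ex:ana"),
}

# Facts as a payment source might provide them (hypothetical predicates).
payments = {
    ("ex:pay1", "ex:payer", "ex:juan"),
    ("ex:pay1", "ex:payee", "ex:maria"),
    ("ex:pay1", "ex:amount", "20"),
}

# A central repository is simply the union of both fact sets.
store = social | payments


def friends_who_paid(store: Set[Triple], user: str) -> List[str]:
    """Friends of `user` who appear as payer in a payment made to `user`."""
    friends = {o for _, _, o in match(store, s=user, p="foaf:knows")}
    payers = set()
    for payment, _, _ in match(store, p="ex:payee", o=user):
        payers |= {o for _, _, o in match(store, s=payment, p="ex:payer")}
    return sorted(friends & payers)


print(friends_who_paid(store, "ex:maria"))
```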
Thus, we have identified the FOAF ontology as the best alternative for representing people in a social network context, and SUMO's financial ontology (using the OWL version) for representing the financial concepts. We have also extended them and linked the different socio-economic concepts. The nomenclature that we have used to represent the PSEN concepts is based on SUMO terms, so it can be easily related to the upper ontology.

Briefly, the PSEN includes the main terms to describe people, the relationships between them, and the financial data and activities carried out between them (Figure 4). We represent people as the Person class from FOAF and we use the corresponding FOAF properties to describe the user's demographic information: firstName, lastName, gender, age, birthday, and mbox (omitted in Figure 4 for the sake of simplicity). We also made use of the Online Account class from FOAF, which allows the modelling of different web identities or online accounts of a person. We have extended it to include online payment and banking accounts. The former is devoted to service providers that allow users to carry out payment operations through the Internet, such as the PosdataP2P service. It has an associated BankCard or FinancialAccount class from the SUMO financial ontology that denotes where the payment will become effective. These classes have a relationship (namely, cardAccount) since a BankCard is always associated with a FinancialAccount. On the other hand, the Online Banking Account class represents online banking services including financial institutions, such as Santander Bank.

To model user economic activities, we have defined a SocialInteraction class within the PSEN ontology. It includes three main properties: timestamp, channel and patient. The timestamp and channel properties indicate when and where the social interaction happens respectively, and patient designates an Entity that participates in the social interaction, i.e. the money exchange. The SocialInteraction class also has two subclasses, Transaction and Communication, which have Payment and Request subclasses respectively. These are related by a hasPayment link that indicates whether a request for money has been paid.

Figure 4. Personal Socio-Economic Network definition

In Figure 4, the rounded rectangles characterize the main concepts and the edges indicate the relationships between two classes. We have distinguished the terms of the different ontologies with darker rectangles, as indicated in the legend of the figure.

4.3 Knowledge retrieval

We have validated the retrieval of user knowledge through the FriendLoans service, which is based on friendsourcing [31]. It is a form of crowdsourcing where the user's social network is mobilized to achieve a specific objective. Specifically, FriendLoans relies on the PSEN data to offer financial recommendations on microloans to raise money from friends. It has been implemented as a web application in which authenticated users can ask for money from their friends. Basically, a user accesses the service and indicates the money needed (Figure 5 at the top), and the service provides a list of prospective lenders who are trustworthy, available, and solvent enough to lend (Figure 5 at the bottom). Figure 5 shows an example of the FriendLoans service for a user called Maria who needs 200€ from her friends.

Figure 5. Screenshot of FriendLoans for a user called Maria

Generating a list of friends for a user requires user models that are unknown to FriendLoans, but can be retrieved from the PeDF. The PeDF has incorporated two mechanisms that allow data consumers to ask for user financial relationships and other banking information, all with the consent of the user. Specifically, the PeDF abstracts a set of SPARQL queries and calls the reasoners, which obtain and derive additional knowledge from the PSEN.

The SPARQL queries obtain personal data and user models directly from the PSEN, which can be used by FriendLoans. This information does not include derived facts or inferences over the PSEN data, only data contained in it. Examples are the list of friends for a specific user, whether a person has carried out payments or requests for money, whether a person has received money, whether a person has requests for money and no associated payments, etc.

As regards the reasoners, they include the mechanisms that allow the extraction of derived data. For this, we have implemented four custom rules that detect: 1) whether a user knows another user A; 2) whether a user owes money to a user A; 3) whether a user has received a payment greater than X euros; and 4) whether a user has requests for money for an amount greater than Y euros. In the rules, the user A and the amounts of money X and Y can be indicated by FriendLoans to give recommendations to its users. In this way, for the example shown in Figure 5, A will be the authenticated user Maria who needs money from her friends, and X and Y could be at least 200€ or the amount wanted by FriendLoans.

The results obtained from executing these rules are a set of users that fulfill all conditions. This set is not ordered, since the order of execution of the rules is not predictable in the reasoner. However, the PeDF has implemented an algorithm that orders the results, including tags that indicate the prioritization.

The next program listing shows an example of a rule that tags the results as the most important ones (indicated by the tag isFirstFor) for the user Maria (specified by the second line of the rule). The conditions of the rule are: 1) the user has debts with Maria (defined in a function called hasDebtWith), and 2) the user has not requested an amount of money greater than 5€ from other people (defined in a function called possibleProblem).

    [isFirst:
      (?Maria psen:isTarget "true"^^xs:Boolean)
      (?person psen:hasDebtWith psen:Maria)
      noValue(?ecAct psen:possibleProblem "true"^^xs:Boolean)
      -> (?person psen:isFirstFor ?Maria)]

4.4 Identity management and privacy

We have based our identity management infrastructure on OAuth 2.0, as it has become the de facto standard to gain access to personal data on the Web. The User Manager includes the component that manages the interaction with external sites.

Users can currently link their accounts on the PosdataP2P service and Facebook to the PeDF. The process works as follows: when a user activates a data source (e.g. Facebook), they are redirected to the service provider site to grant the PeDF the required level of authorization. If successful, the data source delivers a token that allows access to the user profile.

As regards privacy, the PeDF has been designed to observe European privacy and data protection principles following a privacy-by-design approach. The User Manager is also the key component here, since it provides users with an identity and privacy dashboard allowing them to 1) grant/revoke consent to the collection, processing and disclosure of their personal data, 2) check the PeDF privacy policies, and 3) manage the personal data known and stored by the PeDF, their sources, and the details on the disclosures to third parties, as well as exercise their right to access, rectify, erase or block personal data. At the same time, the User Data Store implements security safeguards to avoid and mitigate privacy threats derived from malicious attackers or unwitting users. Finally, as regards the data minimization principle, the use of reasoners limits what third parties can obtain: authorized data consumers can query and retrieve only what has been specified and agreed to by the data subject.

5 RELATED WORK

The PeDF is an ambitious solution that covers four main technological challenges related to personal data: collection, integration, retrieval, and identity and privacy management. These have been widely analyzed separately over time in different contexts, and we can find many researchers addressing each of them in depth. For example, the previously cited literature [10] includes a study into data integration in business environments, and [32] presents user modelling techniques, their challenges and the state-of-the-art research, focusing on ubiquitous environments.

We can find aligned systems that attempt to solve the same issues as the PeDF in the personal data context. For example, the so-called data brokers [33] are companies that collect personal data on individuals (generally, from public data sources), and resell them to or share them with third parties. These systems are focused on data collection and integration, but individuals are generally unaware of their activities. In contrast, there are a number of companies and projects within the initiative called Personal Cloud (footnote 10). It advocates the creation of safe places where users have complete control of their data. The associated solutions address the definition of a new interaction model between users, service providers, and devices, where clouds connect voluntarily to services which use stored personal data. They focus on identity management, encryption, data storage, cloud computing, as well as other user modelling works related to reputation. Closely related to these, there are different identity management systems [34] that implement end-user solutions with the goal of making personal data available only to the right parties, establishing trust between the parties involved, avoiding the abuse of personal data, and making these provisions possible in a scalable, usable, and cost-effective manner. These latter solutions do not generally include user modelling techniques.

On the other hand, there are also specialized systems, namely Generic User Modelling Systems [35], that can serve as a separate user modelling component to different service providers. They address issues related to data representation, inferential capabilities, management of distributed information, or privacy. However, they focus on the reuse of technological user modelling components rather than on the reuse of the personal data and user models themselves. Finally, there are solutions referred to as Personal Data Store, Personal Data Locker, or Personal Data Vault that roughly describe the same concept. Generally, these solutions are based on a central place where users can save and manage all their personal data, including data such as text, passwords, images, video or music [36]. These solutions have an end-user approach.

To summarize, the aforementioned solutions are rather diverse, and each of them focuses on one main objective (e.g., personal data collection, identity management, or data storage). Our work is an integration effort to provide an end-to-end solution that aims at incorporating the best solutions for each issue. Our first approach is based on integrating social and financial data. To the best of our knowledge, this is the first effort in this context.

10 http://personal-clouds.org

6 CONCLUSIONS AND FUTURE WORK

In this paper we have presented a comprehensive framework intermediating between users and organizations to support the seamless integration of personal data from several distributed sources and the generation of advanced knowledge on users, to be shared with interested third parties, all supervised by the users, who control and manage the flow of their personal data. The framework includes components for personal data collection, integration, and retrieval, as well as users' identity and privacy management.

The framework has been validated in a financial context, integrating social information from Facebook and a person-to-person payment service, to generate knowledge useful for a personal lending application.

Our future work includes advancing on the design of the privacy-preserving elements required to minimize the personal information retrieved by the data consumers while keeping it useful enough to fit their business needs. These developments will comprise advanced privacy enhancing technologies for attribute-based credentials and database privacy.

ACKNOWLEDGEMENTS

This work is part of the Center for Open Middleware (COM), a joint technology center created by Universidad Politécnica de Madrid, Banco Santander and its technological divisions ISBAN and PRODUBAN.

REFERENCES

[1] I. Barri, T. Loilier, M. van Rijn, A. Stolk, and H. Vasiliadis, ‘Open innovation in the financial services sector - Why and how to take action’, Technical report, GFT Technologies AG, (2014).
[2] J. P. Moreno, Harvard Business Publishing, Banks’ New Competitors: Starbucks, Google, and Alibaba. https://hbr.org/2014/02/banks-new-competitors-starbucks-google-and-alibaba/
[3] Open Data Institute and Fingleton Associates, ‘Data Sharing and Open Data for Banks’, Technical report, (2014).
[4] European Commission, Protection of personal data. http://ec.europa.eu/justice/data-protection/
[5] R. T. Fielding, Architectural Styles and the Design of Network-based Software Architectures, Ph.D. dissertation, University of California, 2000.
[6] Internet Engineering Task Force (IETF), ‘The JavaScript Object Notation (JSON) Data Interchange Format’, Proposed Standard RFC 7159, (2014).
[7] W3C, OpenSocial Foundation Moves Standards Work to W3C Social Web Activity, http://www.w3.org/blog/2014/12/opensocial-foundation-moves-standards-work-to-w3c-social-web-activity/
[8] G. Gouriten and P. Senellart, ‘API Blender: A Uniform Interface to Social Platform APIs’, CoRR, abs/1301.2086, (2013).
[9] M. Lenzerini, ‘Data integration: A theoretical perspective’, in Proceedings of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS ’02, pp. 233–246, New York, NY, USA, (2002). ACM.
[10] P. Ziegler and K. R. Dittrich, ‘Three decades of data integration - all problems solved?’, in 18th IFIP World Computer Congress (WCC 2004), Volume 12, Building the Information Society, pp. 3–12, (2004).
[11] N. Guarino, ‘Formal ontology and information systems’, in FOIS98, pp. 3–15, Trento, Italy, (1998). IOS Press.
[12] M. Uschold and M. Gruninger, ‘Ontologies: Principles, methods and applications’, Knowledge Engineering Review, 11(2), 93–136, (1996).
[13] I. Niles and A. Pease, ‘Towards a standard upper ontology’, in Proceedings of the International Conference on Formal Ontology in Information Systems, FOIS01, pp. 2–9, New York, NY, USA, (2001). ACM.
[14] D. Brickley and L. Miller, ‘Foaf vocabulary specification 0.99’, Namespace Document - Paddington Edition, (2014).
[15] Object Management Group, Financial Services Standards. http://www.omg.org/hot-topics/finance.htm
[16] L. Yu, A Developer’s Guide to the Semantic Web, Springer, 2011.
[17] B. San Miguel, J. M. del Alamo, J. C. Yelmo, ‘Creating and Modelling Personal Socio-Economic Networks in On-Line Banking’, in 7th International Workshop on Personalization and Context-Awareness in Cloud and Service Computing, PCS 2014, pp. 177–190, (2015) Springer [In press].
[18] World Wide Web Consortium (W3C), ‘OWL Web Ontology Language’, W3C Recommendation, (2004).
[19] N. P. de Koch, Software engineering for adaptive hypermedia systems: reference model, modeling techniques and development process, Ph.D. dissertation, Ludwig Maximilians University Munich, 2001.
[20] S. Gauch, M. Speretta, A. Chandramouli and A. Micarelli, ‘User profiles for personalized information access’, in The Adaptive Web, eds., P. Brusilovsky, A. Kobsa, and W. Nejdl, Springer-Verlag, (2007).
[21] P. Brusilovsky, and E. Millán, ‘User Models for Adaptive Hypermedia and Adaptive Educational Systems’, in The Adaptive Web, eds., P. Brusilovsky, A. Kobsa, and W. Nejdl, Springer-Verlag, (2007).
[22] J. Kay, ‘Lies, damned lies and stereotypes: Pragmatic approximations of users’, in Proceedings of the Fourth International Conference on User Modeling, pp. 175–184, Hyannis, MA, (1994). ACM.
[23] M. Montaner, B. López, and J. L. de la Rosa, ‘A Taxonomy of Recommender Agents on the Internet’, Artificial Intelligence Review, 19(4), 285–330, (2003).
[24] S. Sosnovsky, and D. Dicheva, ‘Ontological technologies for user modelling’, International Journal of Metadata, Semantics and Ontologies, 5(1), 32–71, (2010).
[25] World Wide Web Consortium (W3C), SPARQL Current Status. http://www.w3.org/standards/techs/sparql#w3c_all
[26] J. M. del Alamo, M. A. Monjas, J. C. Yelmo, B. San Miguel, R. Trapero, and A. M. Fernandez, ‘Self-service privacy: User-centric privacy for network-centric identity.’, in Trust Management IV. 4th IFIP WG 11.11 International Conference on Trust Management, IFIPTM 2010, pp. 17–31, Morioka, Japan, (2010). Springer Berlin Heidelberg.
[27] OASIS, OASIS Security Services (SAML) TC. https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security
[28] O. Manso, M. Christiansen, and G. Mikkelsen, ‘Comparative analysis - Web-based identity management systems’, Technical report, The Alexandra Institute, (2014).
[29] G. Danezis, J. Domingo-Ferrer, M. Hansen, J. H. Hoepman, D. L. Metayer, R. Tirtea, and S. Schiffner, ‘Privacy and Data Protection by Design – from policy to engineering’, Technical report, European Union Agency for Network and Information Security (ENISA), (2014).
[30] A. Crespo García, N. Notario McDonnell, C. Troncoso, D. Le Métayer, I. Kroener, D. Wright, J. M. del Álamo and Y. S. Martín, ‘D1.2: Privacy and Security-by-design Methodology’, Technical report, PRIPARE (2014).
[31] M. S. Bernstein, D. Tan, G. Smith, M. Czerwinski, and E. Horvitz, ‘Personalization via friendsourcing’, ACM Transactions on Computer-Human Interaction (TOCHI), 17(2), 6:1–6:28, (2008).
[32] J. Kay, T. Kuflik, and B. Kummerfeld, ‘Challenges and solutions of ubiquitous user modeling’, in Ubiquitous Display Environments, eds., A. Krüger and T. Kuflik, Springer Berlin Heidelberg, (2012).
[33] E. Ramirez, J. Brill, M. K. Ohlhausen, J. D. Wright, T. McSweeny, ‘Data Brokers – A Call for Transparency and Accountability’, Technical report, Federal Trade Commission, (2014).
[34] E. Bertino and K. Takahashi, Identity Management: Concepts, Technologies, and Systems, Artech House, Inc., 2010.
[35] A. Kobsa, ‘Generic user modeling systems’, User Modeling and User-Adapted Interaction, 11(1-2), 49–63, (2001).
[36] M. Sabadello, ‘Startup Technology Report – Phase One: Acquiring, Storing, Accessing and Managing Personal Data’, Technical report, Personal Data Ecosystem Consortium, (2014).

Human Computation Based Acquisition Of Financial Service Advisory Practices

Alexander Felfernig and Michael Jeran and Martin Stettinger and Thomas Absenger and Thomas Gruber and Sarah Haas and Emanuel Kirchengast and Michael Schwarz and Lukas Skofitsch and Thomas Ulz (all authors: footnote 1)

Abstract. Knowledge-based recommenders support an easier comprehension of complex item assortments (e.g., financial services and electronic equipment). In this paper we show (1) how such recommenders can be developed in a Human Computation based knowledge acquisition environment (PeopleViews) and (2) how the resulting recommendation knowledge can be exploited in a competition-based e-Learning environment (StudyBattle).

1 Introduction

Knowledge-based recommenders [2] support users on the basis of semantic knowledge about the item (product) domain (footnote 2). One variant of knowledge-based recommenders are constraint-based recommenders [8], which exploit explicit constraints (rules) that encode the recommendation knowledge. Another variant are critiquing-based recommenders [4]: new items are presented to the user as long as the user is unsatisfied and articulates critiques (e.g., an item should be cheaper). In critiquing-based recommendation, new items are determined by similarity functions. For a detailed overview of recommendation approaches we refer to [3, 20].

In this paper we focus on constraint-based recommenders, i.e., recommenders that are based on explicit recommendation rules (constraints). The development of such recommenders is often a time-consuming and error-prone process, which can be primarily explained by the knowledge acquisition bottleneck: in the formalization of product domain and recommendation knowledge, misunderstandings can occur and as a result knowledge engineers encode this knowledge in an unintended fashion. The more recommenders have to be developed and maintained, the higher the risk that the organization runs into a scalability problem where additional resources are needed to be able to perform knowledge engineering and maintenance.

An alternative to the hiring of additional staff for development and maintenance of recommendation knowledge bases is to change the underlying knowledge engineering paradigm. The idea of PeopleViews is to engage domain experts more deeply in knowledge engineering tasks. We do not want to "convert" them into technical experts but to define basic tasks (micro tasks) that are easy to understand and complete even for domain experts without the corresponding technical expertise. Micro tasks completed by users provide knowledge chunks that can be aggregated into a PeopleViews recommender knowledge base.

The resulting PeopleViews recommenders support customers (and, especially in the financial services domain, also sales representatives) in finding products that fit their wishes and needs. Using such a recommender, items are retrieved within the scope of a dialog (these systems are often also denoted as conversational) where users articulate their requirements and the system tries to identify corresponding solutions. Major advantages of such systems are reduced error rates in the phase of order acquisition, more time that can be invested in contacting new customers due to fewer errors, more satisfied customers, and also pre-informed customers due to the fact that recommender applications can be made publicly available.

Knowledge-based recommender systems have been applied in various item domains; due to the diversity of applications, we can only give some examples of applications of these systems. In the financial services domain, for example, the following applications of knowledge-based recommendation technologies are reported in the literature. Felfernig et al. [11, 12] show an application in the context of investment decisions where recommenders are provided to sales representatives who exploit the recommenders in sales dialogs. Time savings are reported as one of the major improvements directly related to the application of recommendation technologies. Another application of knowledge-based technologies in financial services is presented by Fano and Kurth [7], who introduce a simulation environment that can directly visualize the effects of financial decisions on the financial situation of a family.

Felfernig et al. [9] present a digital camera recommender deployed on a large Austrian product comparison platform. Peischl et al. [22] show the application of constraint-based recommendation technologies in the domain of software effort estimation. WeeVis [25] (footnote 3) is a MediaWiki (footnote 4) based environment for the development and maintenance of constraint-based recommender applications; a couple of freely available recommenders have already been deployed. Knowledge-based technologies for the recommendation of business plans are introduced by Jannach and Bundgaard-Joergensen [19]. The recommendation of equipment configuration in the context of smarthomes is introduced by Leitner et al. [21]. Technologies that recommend changes in software development practices are introduced by Pribik and Felfernig [23]. Finally, Burke and Ramezani [5] show how to select recommendation algorithms by introducing rules for recommending recommenders.

1 Applied Software Engineering, Institute for Software Technology, Graz University of Technology, Austria, email: {felfernig, mjeran, stettinger}@ist.tugraz.at, {thomas.absenger, th.gruber, sarah.haas, emanuel.kirchengast, michael.schwarz, lukas.skofitsch, thomas.ulz}@student.tugraz.at.
2 The terms item and product are used synonymously throughout the paper.
3 www.weevis.org.
4 www.mediawiki.org.

In PeopleViews, principles of Human Computation [26] are included into the development of knowledge-based recommenders. The idea of Human Computation is to let persons perform tasks in which they are better than computers, for example, the identification of product properties from a website. In the context of knowledge base development and maintenance, the idea is to let domain experts perform tasks they are much better at than knowledge engineers, who typically have less knowledge about the product domain, and thus relieve the work of knowledge engineers.

id    item name
Φ1    Investment Fund A
Φ2    Investment Fund B
Φ3    Building Loan
Φ4    Bond
Φ5    Savings Book

Table 2. Example set of items used in working example.
MATCHIN [18] is based on the idea of preference elicitation by asking users what a person would typically prefer when having to choose between alternatives. Compared to this work, PEOPLEVIEWS allows the derivation of constraint-based recommenders which are the basis for intelligent user interfaces that support, for example, deep explanations [17] and the diagnosis and repair of inconsistent requirements [13, 14].

The major contributions of this paper are the following. First, we show how financial service recommender knowledge bases can be developed by a community of domain experts. Second, we sketch how such knowledge bases can also be exploited for teaching advisory practices on the basis of games (STUDYBATTLE environment). Third, we provide a discussion of major issues for future research.

The remainder of this paper is organized as follows. In Section 2 we introduce basic concepts of Human Computation based knowledge construction. To give an impression of the PEOPLEVIEWS and the STUDYBATTLE user interface, we present example screenshots in Section 3. Preliminary results of empirical evaluations are briefly discussed in Section 4. In Section 5 we provide an overview of issues for future work. We conclude the paper with Section 6.

2 Developing PEOPLEVIEWS Recommenders

The PEOPLEVIEWS environment supports two basic modes of interaction. First, recommender applications can be created in the modeling mode and second, the applications can be executed in the recommendation mode. In this section we discuss different tasks to be performed in order to create a PEOPLEVIEWS recommender. Table 1 provides an overview of the users of our working example. These users will jointly develop a PEOPLEVIEWS recommender.

| user | email | pwd |
|---|---|---|
| Andrea | andrea@... | **** |
| Mary | mary@... | ***** |
| Luc | luc@... | ****** |
| Torsten | torsten@... | **** |

Table 1. Example users of PEOPLEVIEWS environment.

Table 2 contains an overview of items (financial services) that are used in our working example. The Investment Funds (A and B) have a higher risk of loss and require that customers have a high willingness to take risks, otherwise these services will not be recommended. Building Loan, Bond, and Savings Book are lower-risk items. In the current version of PEOPLEVIEWS, items can be characterized by additional item attributes; however, these attributes are not used by recommendation rules constructed from micro contributions.

In PEOPLEVIEWS, user requirements req_i ∈ REQ are specified as assignments of user attributes. For our financial services recommender we define a set of user attributes which are enumerated in Table 3. In the current version of the system, user attributes are defined by the creators of a recommender application, i.e., attribute definitions cannot be extended by other users who contribute to the further development of the application on the basis of micro tasks.

| user attribute | question to user | attribute domain |
|---|---|---|
| goal (gl) | What are your personal goals? | {Studies, Pension, Speculation, Car, House, World trip, noval} |
| runtime (rt) | When is the money needed? | {in 1 year, in 2 years, in 3-5 years, in 5-10 years, in 10-20 years, in more than 20 years, noval} |
| risk (ri) | Preparedness to take risks? | {low, medium, high, noval} |

Table 3. User attributes u ∈ U of example financial services recommender.

In the PEOPLEVIEWS recommendation mode, user attributes can be used to specify user (customer) requirements req_i ∈ REQ. In the modeling mode, user attributes represent a central element of a micro task: given a certain item, users are asked to estimate which values of user attributes are compatible with the item, i.e., are a criterion for selecting and recommending the item. The evaluation of items with regard to user attributes is the central micro task implemented in the current PEOPLEVIEWS prototype. A detailed evaluation of the example items (Table 2) regarding the user attributes goal, runtime, and risk is provided in Table 4.

Each row of Table 4 specifies a so-called user-specific filter constraint [10], i.e., a filter constraint (specified by a user) regarding a specific item. For example, user Luc specified Pension and Speculation as possible goals that lead to an inclusion of the item Investment Fund B into a recommendation. Furthermore, Luc believes that a user should have a high preparedness to take risks (attribute risk) and should need the payment in 3-5 years, 5-10 years or 10-20 years from now. Semantically, an item X is selected by a user-specific filter constraint if all the preconditions are fulfilled.

| user | item name (id) | goal | runtime | risk |
|---|---|---|---|---|
| Andrea | Investment Fund A (Φ1) | Studies, Pension, Speculation | in 5-10 years, in 10-20 years | high |
| Luc | Investment Fund A (Φ1) | Pension, Speculation | in 5-10 years, in 10-20 years | high |
| Mary | Investment Fund A (Φ1) | Pension, Speculation | in 5-10 years, in 10-20 years | medium, high |
| Torsten | Investment Fund B (Φ2) | Pension, Speculation | in 3-5 years, in 5-10 years, in 10-20 years | high |
| Luc | Investment Fund B (Φ2) | Pension, Speculation | in 3-5 years, in 5-10 years, in 10-20 years | high |
| Mary | Building Loan (Φ3) | Studies, Pension, Car, House | in 5-10 years, in 10-20 years | low, medium, high |
| Andrea | Building Loan (Φ3) | Studies, Pension, Car, House | in 5-10 years | low, medium |
| Luc | Building Loan (Φ3) | Studies, Pension, Car, House | in 5-10 years | low, medium |
| Mary | Bond (Φ4) | Studies, Car, House | in 2 years, in 3-5 years, in 5-10 years | low, medium |
| Andrea | Savings Book (Φ5) | Studies, Car, House, World trip | in 1 year, in 2 years, in 3-5 years, in 5-10 years | low |
| Torsten | Savings Book (Φ5) | Studies, House, World trip | in 1 year, in 2 years, in 3-5 years, in 5-10 years | low |

Table 4. Example of user-specific filter constraints (= micro contributions).

In order to derive recommendation-relevant filter constraints (recommendation rules) [10], user-specific filter constraints have to be aggregated. An example of this aggregation step is depicted in Table 5. For each item all related user-specific filter constraints are integrated into one constraint. Each row in this table has to be interpreted as a filter constraint for a specific item. For example, the constraint in the first row of Table 5 is the following. The item Φ1 (Investment Fund A) is included (recommended) if the user requirements regarding goal (gl), runtime (rt), and risk (ri) are consistent with the condition of the recommendation-relevant filter constraint gl ∈ {Studies, Pension, Speculation, noval} ∧ rt ∈ {in 5-10 years, in 10-20 years, noval} ∧ ri ∈ {medium, high, noval} → include(Φ1).

Table 5 includes the complete set of recommendation-relevant filter constraints (recommendation rules). Exactly these conditions are applied by PEOPLEVIEWS to determine recommendations for a user. In PEOPLEVIEWS, each item has exactly one related recommendation-relevant filter constraint; each such filter constraint is represented by one row in Table 5. The general logical representation of a recommendation-relevant filter constraint f for an item Φ is shown in Formula 1. In this context, values(Φ, u) is the set of supported domain values of user attribute u ∈ U (see Table 4). The constant noval denotes the fact that no value has been selected for the corresponding user attribute.
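The aggregation step and the resulting filter constraints described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PEOPLEVIEWS implementation: it encodes the micro contributions of Table 4, unions the supported values per item and attribute, and then pre-selects items in the sense of the filter constraint for Φ1 given above. The names `aggregate`, `is_included`, and `preselect`, the shortened runtime labels, and the modeling of requirements as (attribute, value) pairs are all assumptions made for this example; an attribute without any requirement is treated as noval.

```python
from collections import defaultdict

# Micro contributions from Table 4: (user, item, goals, runtimes, risks).
# Runtime labels are shortened for the example: "5-10" stands for "in 5-10 years".
CONTRIBUTIONS = [
    ("Andrea", "Investment Fund A", {"Studies", "Pension", "Speculation"}, {"5-10", "10-20"}, {"high"}),
    ("Luc", "Investment Fund A", {"Pension", "Speculation"}, {"5-10", "10-20"}, {"high"}),
    ("Mary", "Investment Fund A", {"Pension", "Speculation"}, {"5-10", "10-20"}, {"medium", "high"}),
    ("Torsten", "Investment Fund B", {"Pension", "Speculation"}, {"3-5", "5-10", "10-20"}, {"high"}),
    ("Luc", "Investment Fund B", {"Pension", "Speculation"}, {"3-5", "5-10", "10-20"}, {"high"}),
    ("Mary", "Building Loan", {"Studies", "Pension", "Car", "House"}, {"5-10", "10-20"}, {"low", "medium", "high"}),
    ("Andrea", "Building Loan", {"Studies", "Pension", "Car", "House"}, {"5-10"}, {"low", "medium"}),
    ("Luc", "Building Loan", {"Studies", "Pension", "Car", "House"}, {"5-10"}, {"low", "medium"}),
    ("Mary", "Bond", {"Studies", "Car", "House"}, {"2", "3-5", "5-10"}, {"low", "medium"}),
    ("Andrea", "Savings Book", {"Studies", "Car", "House", "World trip"}, {"1", "2", "3-5", "5-10"}, {"low"}),
    ("Torsten", "Savings Book", {"Studies", "House", "World trip"}, {"1", "2", "3-5", "5-10"}, {"low"}),
]

ATTRIBUTES = ("goal", "runtime", "risk")


def aggregate(contributions):
    """Integrate all user-specific filter constraints of an item into one constraint."""
    rules = defaultdict(lambda: {attr: set() for attr in ATTRIBUTES})
    for _user, item, goals, runtimes, risks in contributions:
        rules[item]["goal"] |= goals
        rules[item]["runtime"] |= runtimes
        rules[item]["risk"] |= risks
    return dict(rules)


def is_included(rule, requirements):
    """True if every stated requirement value is supported by the item's constraint.

    Attributes without a requirement count as noval and never exclude an item.
    """
    return all(value in rule[attr] for attr, value in requirements)


def preselect(rules, requirements):
    """Return the (sorted) names of all items whose constraint is satisfied."""
    return sorted(item for item, rule in rules.items() if is_included(rule, requirements))


RULES = aggregate(CONTRIBUTIONS)

# Example requirements: goal = Studies, goal = Pension, runtime = in 5-10 years, risk = medium.
REQ = [("goal", "Studies"), ("goal", "Pension"), ("runtime", "5-10"), ("risk", "medium")]
print(preselect(RULES, REQ))
```

For these example requirements the sketch selects Building Loan and Investment Fund A, which is the set the paper reports for its working example. Plain set union is the simplest possible aggregation; it keeps a value as soon as a single user has supported it.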
$$f(\Phi): \bigwedge_{u \in U} u \in values(\Phi, u) \cup \{noval\} \rightarrow include(\Phi) \qquad (1)$$

For each pair (Φ, val ∈ values(Φ, u)), PEOPLEVIEWS determines a corresponding support value (see Formula 2). In this context, occurrence(Φ, val) denotes the number of times value val occurs in a user-specific filter constraint for item Φ and occurrence(Φ) denotes the number of times an item Φ is referred to in a user-specific filter constraint. For example, support(Φ1, Studies) = 1/3.

$$support(\Phi, val) = \frac{occurrence(\Phi, val)}{occurrence(\Phi)} \qquad (2)$$

The complete set of support values is depicted in Table 6. In PEOPLEVIEWS, an item Φ can have an associated rating (rating(Φ)) which represents an item evaluation with regard to quality and related services. Such a rating can be determined, for example, by calculating the average of the individual user item ratings (similar to ratings provided by platforms such as amazon.com). For simplicity, we do not take into account user ratings in the utility function discussed below (see Formula 3).

| item name (id) | attribute: value | support value |
|---|---|---|
| Investment Fund A (Φ1) | goal: Studies | 0.33 |
| | goal: Pension, Speculation | 1.0 |
| | runtime: in 5-10 years, in 10-20 years | 1.0 |
| | risk: medium | 0.33 |
| | risk: high | 1.0 |
| Investment Fund B (Φ2) | goal: Pension, Speculation | 1.0 |
| | runtime: in 3-5 years, in 5-10 years, in 10-20 years | 1.0 |
| | risk: high | 1.0 |
| Building Loan (Φ3) | goal: Studies, Pension, Car, House | 1.0 |
| | runtime: in 5-10 years | 1.0 |
| | runtime: in 10-20 years | 0.33 |
| | risk: low, medium | 1.0 |
| | risk: high | 0.33 |
| Bond (Φ4) | goal: Studies, Car, House | 1.0 |
| | runtime: in 2 years, in 3-5 years, in 5-10 years | 1.0 |
| | risk: low, medium | 1.0 |
| Savings Book (Φ5) | goal: Studies, House, World trip | 1.0 |
| | goal: Car | 0.5 |
| | runtime: in 1 year, in 2 years, in 3-5 years, in 5-10 years | 1.0 |
| | risk: low | 1.0 |

Table 6. Support values (see Formula 2) derived from user-specific filter constraints (see Table 4).

Depending on the requirements articulated by the current user (see, e.g., Table 7), PEOPLEVIEWS determines and ranks a set of relevant items as follows. First, recommendation-relevant filter constraints are applied to pre-select items that fulfill the user requirements REQ = {req1, req2, ..., reqk}. In our example, the set {Investment Fund A, Building Loan} would be selected by the recommendation-relevant filter constraints (see Table 5).

| item name (id) | goal | runtime | risk |
|---|---|---|---|
| Investment Fund A (Φ1) | Studies, Pension, Speculation | in 5-10 years, in 10-20 years | medium, high |
| Investment Fund B (Φ2) | Pension, Speculation | in 3-5 years, in 5-10 years, in 10-20 years | high |
| Building Loan (Φ3) | Studies, Pension, Car, House | in 5-10 years, in 10-20 years | low, medium, high |
| Bond (Φ4) | Studies, Car, House | in 2 years, in 3-5 years, in 5-10 years | low, medium |
| Savings Book (Φ5) | Studies, Car, House, World trip | in 1 year, in 2 years, in 3-5 years, in 5-10 years | low |

Table 5. Example of recommendation-relevant filter constraints which are the result of integrating user-specific filter constraints (see Table 4).

| id | requirement |
|---|---|
| req1 | goal = Studies |
| req2 | goal = Pension |
| req3 | runtime = in 5-10 years |
| req4 | risk = medium |

Table 7. Example set of user requirements (req_i ∈ REQ).

The determined recommendation set must be ranked before being presented to the user. In PEOPLEVIEWS, item ranking is based on the following utility function (see Formula 3). The utility of each item is derived from the support values of individual requirements (see Formula 2).

$$utility(\Phi, REQ) = \sum_{req \in REQ} support(\Phi, req) \qquad (3)$$

The item ranking of our working example as a result of applying Formula 3 is depicted in Table 8. For example, utility(Φ3, REQ = {goal = Studies, goal = Pension, runtime = in 5-10 years, risk = medium}) = support(Φ3, goal = Studies) + support(Φ3, goal = Pension) + support(Φ3, runtime = in 5-10 years) + support(Φ3, risk = medium) = 1.0 + 1.0 + 1.0 + 1.0 = 4.0. Analogously, with the support values of Table 6, utility(Φ1, REQ) = 0.33 + 1.0 + 1.0 + 0.33 = 2.66.

| item name (id) | utility | rank |
|---|---|---|
| Building Loan (Φ3) | 4.0 | 1 |
| Investment Fund A (Φ1) | 2.66 | 2 |

Table 8. Utility-based ranking of items in the recommendation set.

3 User Interface

3.1 PEOPLEVIEWS

In this section we discuss the PEOPLEVIEWS user interface (currently only available in German) and also show how PEOPLEVIEWS recommendation knowledge can be exploited by the STUDYBATTLE learning environment. The PEOPLEVIEWS homescreen is depicted in Figure 1. For applying PEOPLEVIEWS recommenders, there is no explicit need for being logged in. Recommenders can be selected and activated directly from the homescreen (see the tag cloud in Figure 1).

Figure 1. PEOPLEVIEWS homescreen – the current version of the user interface is provided in German. The homescreen explains the basic functionalities of the system (development, maintenance, and execution of recommender applications).

If users are logged in, they are allowed to contribute to the development of PEOPLEVIEWS recommender applications. Only the creators of a recommender application are allowed to define user attributes. Other users can complete micro tasks in terms of evaluating items with regard to a defined set of user attributes. The list of user attributes used in our working example is depicted in Figure 2 (corresponds to the entries of Table 3).

Figure 2. PEOPLEVIEWS: example user attributes.

Logged-in users are also allowed to enter new items to the recommender product catalog. The PEOPLEVIEWS representation of product catalogs is exemplified in Figure 3 (corresponds to the list of items shown in Table 2).

Figure 3. PEOPLEVIEWS: example of an item list.

The interface for evaluating an item with regard to a set of user attributes is depicted in Figure 4. The screenshot depicts the evaluation of Building Loan with regard to the user attribute goal. After having completed the definition of a PEOPLEVIEWS recommender, the recommender can directly be executed. The user interface of our financial services recommender is depicted in Figure 5.

Figure 4. PEOPLEVIEWS: example of an item evaluation user interface (evaluation of item Building Loan with regard to the user attribute goal).

Figure 5. PEOPLEVIEWS: example of a recommender application (Financial Services).

3.2 STUDYBATTLE

Recommendation-relevant filter constraints can be further exploited for generating different learning applications that are part of the STUDYBATTLE environment. STUDYBATTLE is a game-based learning environment which can be utilized as an environment for learning product knowledge and sales practices. Examples of STUDYBATTLE games are the following.

Assign Properties. Figure 6 depicts an example user interface of a STUDYBATTLE application that implements a quiz related to knowledge about the relationship between user attributes and items. In the example, users have the task to assign items on the left hand side to user attribute values on the right hand side where each product has to be assigned to at least one attribute value and vice-versa.

Find Items. A different version of the game depicted in Figure 6 is to ask for products that fulfill certain criteria (represented by a combination of user attribute settings).

Find Incompatibilities. This game focuses on combinations of user attribute values that do not lead to a solution, i.e., users have to specify combinations of user attribute values for which they think that no corresponding solution could be found.

Maximize Requirements. The task is to identify minimal sets of requirements (from a given set of requirements REQ) that have to be deleted from REQ such that the remaining requirements lead to at least one solution. This game type reflects the principles of model-based diagnosis [6, 24], i.e., it supports users in learning and improving repair behavior in situations where no solution can be identified.

Maximize Items. A similar task is focused on the repair of item sets; in this context the task of users is to identify a maximal set of items from a given set of items such that there exists at least one combination of user attribute values that leads to these items (not necessarily exclusively). An additional criterion could be that at least n items from the original item list must remain in the result set.

Figure 6. STUDYBATTLE "Assign Properties" learning application. The task of the user is to relate items with corresponding attribute values.

4 Preliminary Evaluation Results

Human Computation based Knowledge Acquisition. Applying Human Computation concepts [26] in the context of recommender application development and maintenance has the potential to lift the burden of enormous engineering and maintenance efforts from the shoulders of knowledge engineers. Micro tasks as sketched in this paper can be structured in a way that they are understandable for domain experts without a computer science background. Knowledge gained from completed micro tasks can be easily integrated into a corresponding recommender knowledge base. Due to the increasing size and complexity of knowledge bases, the development of such technologies is crucial since they help to tackle scalability issues which otherwise could cause a complete failure with regard to a company-wide recommender deployment. As such, PEOPLEVIEWS technologies can be considered as a first step towards more scalable development methods that will also help to further increase the popularity of knowledge-based (recommendation) technologies.

Usability. An initial user study has been conducted with an early version of PEOPLEVIEWS at the Graz University of Technology [10]. N=161 (15% female and 85% male) students interacted with the system with the goal to develop different recommender applications. After having completed the development, the study participants had to complete a questionnaire which was based on the system usability scale (SUS) [1]. Evaluation results regarding the SUS aspects are summarized in Figure 7. Besides usability questions, further feedback has been provided by the study participants, for example, the majority of the participants (69% of all study participants) would like to further contribute to PEOPLEVIEWS recommenders. 56% of those participants who wanted to contribute agreed to contribute within a time frame of less than 30 minutes per week.

Figure 7. Results of a SUS-based usability study [1] of the PEOPLEVIEWS environment.

5 Future Work

The major goal of this paper was to provide an overview of the PEOPLEVIEWS recommendation environment. There are many issues for future work that we want to tackle, integrating corresponding solutions in upcoming PEOPLEVIEWS versions.

Weighting of Item Evaluations. In the current PEOPLEVIEWS version it is possible to assign user attribute values to items, i.e., to specify which criteria are relevant for the selection of a certain item. In future versions of PEOPLEVIEWS it will be possible to integrate weights into item evaluations. This may not play a major role in financial service related recommender applications but can be important in other domains where nuances and personal tastes play a more important role. For example, in the context of recommending digital cameras, it can be important to specify degrees regarding certain camera properties, for example, the degree to which a camera is able to support sports photography.

Further Micro Tasks. In the current system version, the only micro task to be completed is to define the relationship (compatibility properties) between items and corresponding user attribute values. In future versions of PEOPLEVIEWS we will extend this list of micro tasks (see Table 9).

| name | description |
|---|---|
| item quality check | check whether a certain item belongs to a specific recommender (is an existing recommender-related item) |
| attribute quality check | check whether a certain attribute belongs to a specific recommender (user attribute or item attribute exists in the item domain) |
| attribute value quality check | check whether a certain value belongs to the domain of an attribute (user attribute or item attribute) |
| graphic check | check whether a certain figure belongs to a certain item |
| evaluate item | assign user attribute values to items |
| attribute value utility check | derive a ranking that shows which items best support a user attribute value |

Table 9. Example list of micro tasks to be integrated in PEOPLEVIEWS.

User Selection for Micro Tasks. An important enhancement will be the inclusion of methods that automatically select users for a given set of micro tasks and also take into account fairness in the distribution of micro tasks. As detected in our initial studies, users are willing to contribute to the further development of PEOPLEVIEWS recommenders. An important issue in this context is to find the users with the right expertise for certain tasks and also to not overload users. Our approach in this context will be to maintain user profiles which are derived from observing the activities of a user within PEOPLEVIEWS. For example, if a user selects a certain item when interacting with the financial services recommender, the keywords extracted from the corresponding item description are stored in the user profile. If (in the future) micro tasks related to similar items (items with a similar description) have to be completed, users with expertise regarding such items will be the preferred contact persons.

Games. Games will be another mechanism for data collection in the PEOPLEVIEWS modeling mode. A single user game will be included that is quiz-based. The overall goal is to guess user attribute settings correctly that best describe a certain item. In a second game two users will jointly try to figure out user attribute values that best describe shown items. The more matching item evaluations exist the better the team performs.

Dependencies between User Attributes and Item Attributes. An extension of the current PEOPLEVIEWS version will be the possibility to identify direct relationships between user attribute values and technical product properties. This is not the case in the current PEOPLEVIEWS version since dependencies are only defined between user attribute values and items.

Recommendation Algorithms. The current version of PEOPLEVIEWS relies on the discussed recommendation-relevant filter constraints – item ranking is based on a utility-based evaluation (see Formula 3). In future versions of PEOPLEVIEWS we will extend the quality of recommendation algorithms by, for example, adapting the determination of support values. If, for example, additional information about the performance of a certain user is available (e.g., performance with regard to correctly completed micro tasks in the past), this information can be used to increase/decrease the weight of a user when determining support values. Finally, when users are specifying their requirements, future versions of PEOPLEVIEWS will allow the specification of preferences (weights) which indicate user preferences regarding certain requirements. This will also include approaches to the learning of weights (users should not have to specify all weights explicitly).

Inconsistency Management. Given a set of customer requirements it could be the case that no solution can be presented to the user.
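The situation described here, like the Maximize Requirements game of Section 3.2, comes down to finding minimal sets of requirements whose removal makes at least one item selectable again. The following Python sketch enumerates such sets by brute force over the constraints of Table 5. It is only an illustration under stated assumptions and not the diagnosis algorithm of [15]: the function names, the shortened runtime labels, and the example requirements are hypothetical, and user weights are ignored.

```python
from itertools import combinations

# Recommendation-relevant filter constraints from Table 5.
# Runtime labels are shortened for the example: "5-10" stands for "in 5-10 years".
RULES = {
    "Investment Fund A": {"goal": {"Studies", "Pension", "Speculation"},
                          "runtime": {"5-10", "10-20"},
                          "risk": {"medium", "high"}},
    "Investment Fund B": {"goal": {"Pension", "Speculation"},
                          "runtime": {"3-5", "5-10", "10-20"},
                          "risk": {"high"}},
    "Building Loan": {"goal": {"Studies", "Pension", "Car", "House"},
                      "runtime": {"5-10", "10-20"},
                      "risk": {"low", "medium", "high"}},
    "Bond": {"goal": {"Studies", "Car", "House"},
             "runtime": {"2", "3-5", "5-10"},
             "risk": {"low", "medium"}},
    "Savings Book": {"goal": {"Studies", "Car", "House", "World trip"},
                     "runtime": {"1", "2", "3-5", "5-10"},
                     "risk": {"low"}},
}


def matching_items(requirements):
    """Items whose constraint supports every (attribute, value) requirement."""
    return sorted(
        item for item, rule in RULES.items()
        if all(value in rule[attr] for attr, value in requirements)
    )


def minimal_diagnoses(requirements):
    """Enumerate minimal sets of requirements to drop, smallest sets first."""
    diagnoses = []
    for size in range(len(requirements) + 1):
        for dropped in combinations(requirements, size):
            # A superset of an already found diagnosis cannot be minimal.
            if any(set(found) <= set(dropped) for found in diagnoses):
                continue
            kept = [req for req in requirements if req not in dropped]
            if matching_items(kept):
                diagnoses.append(dropped)
    return diagnoses


# Hypothetical inconsistent requirements: no item in Table 5 supports all three.
REQ = [("goal", "World trip"), ("runtime", "10-20"), ("risk", "low")]
for dropped in minimal_diagnoses(REQ):
    kept = [req for req in REQ if req not in dropped]
    print("drop", list(dropped), "->", matching_items(kept))
```

For these requirements, dropping either the goal or the runtime requirement restores a solution (Building Loan or Savings Book, respectively), whereas dropping only the risk requirement does not. The enumeration is exponential in the number of requirements, which is acceptable for a handful of user attributes but is exactly why dedicated diagnosis algorithms are of interest for larger requirement sets.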
In upcoming versions of PEOPLEVIEWS we will focus on integrating state-of-the-art diagnosis algorithms that help to automatically determine repair actions in such inconsistent situations [15]. These repairs will take into account user weights (preferences) and thus minimize the number of interaction cycles needed to find a reasonable solution. In addition to this more intelligent management of inconsistent requirements, we will integrate mechanisms that help to consolidate the set of user-specific filter constraints in order to make the resulting recommendation-relevant filter constraints more compact. Consolidation will be achieved, for example, on the basis of redundancy detection algorithms [16].

Quality Management. The major task of quality management is to assure the quality of the dataset collected on the basis of different micro tasks. Quality assurance must be capable of detecting and preventing manipulations of the dataset (also under the assumption that anonymous users are allowed to complete micro tasks). It must also identify changes to the given set of user-specific filter constraints that help to improve the prediction quality of recommendation algorithms. Quality assurance is also responsible for the generation of micro tasks that need to be completed in order to improve the overall quality of the PEOPLEVIEWS datasets. The micro tasks generated by quality assurance are summarized as an agenda – this agenda is forwarded to micro task scheduling that is responsible for distributing micro tasks to the PEOPLEVIEWS user community.

6 Conclusions

In this paper we gave an overview of the PEOPLEVIEWS recommendation environment which exploits concepts of Human Computation to integrate domain experts more deeply into knowledge base development and maintenance processes. PEOPLEVIEWS knowledge bases can be exploited to generate learning applications which can be used in the STUDYBATTLE environment. A major focus of this paper was to show how PEOPLEVIEWS can be applied in the context of financial service recommendation. The concepts presented in this paper have the potential to avoid scalability issues which already exist in many knowledge-based environments due to the increasing size and complexity of knowledge bases.

REFERENCES

[1] A. Bangor, P. Kortum, and J. Miller, ‘An Empirical Evaluation of the System Usability Scale (SUS)’, International Journal of Human-Computer Interaction, 24(6), 574–594, (2008).
[2] R. Burke, ‘Knowledge-based recommender systems’, Encyclopedia of Library and Information Systems, 69(32), 180–200, (2000).
[3] R. Burke, A. Felfernig, and M. Goeker, ‘Recommender systems: An overview’, AI Magazine, 32(3), 13–18, (2011).
[4] R. Burke and K. Hammond, ‘The FindMe Approach to Assisted Browsing’, IEEE Expert, 32–40, (1997).
[5] R. Burke and M. Ramezani, ‘Matching recommendation technologies and domains’, in Recommender Systems Handbook, 367–386, Springer, (2010).
[6] J. de Kleer, A. Mackworth, and R. Reiter, ‘Characterizing diagnoses and systems’, AI Journal, 56(2–3), 197–222, (1992).
[7] A. Fano and S. Kurth, ‘Personal Choice Point: Helping Users Visualize What it Means to Buy a BMW’, in International Conference on Intelligent User Interfaces IUI’03, pp. 46–52, Miami, FL, USA, (2003). ACM, New York, USA.
[8] A. Felfernig and R. Burke, ‘Constraint-based recommender systems: Technologies and research issues’, in IEEE ICEC’08, pp. 17–26, Innsbruck, Austria, (2008).
[9] A. Felfernig, G. Friedrich, D. Jannach, and M. Zanker, ‘An environment for the development of knowledge-based recommender applications’, International Journal of Electronic Commerce (IJEC), 11(2), 11–34, (2006).
[10] A. Felfernig, S. Haas, G. Ninaus, M. Schwarz, T. Ulz, M. Stettinger, K. Isak, M. Jeran, and S. Reiterer, ‘RecTurk: Constraint-based Recommendation based on Human Computation’, in RecSys 2014 CrowdRec Workshop, pp. 1–6, Foster City, CA, USA, (2014).
[11] A. Felfernig, K. Isak, K. Szabo, and P. Zachar, ‘The VITA Financial Services Sales Support Environment’, pp. 1692–1699, Vancouver, Canada, (2007).
[12] A. Felfernig and A. Kiener, ‘Knowledge-based Interactive Selling of Financial Services with FSAdvisor’, in 17th Innovative Applications of Artificial Intelligence Conference (IAAI05), pp. 1475–1482, Pittsburgh, Pennsylvania, (2005).
[13] A. Felfernig, M. Schubert, G. Friedrich, M. Mandl, M. Mairitsch, and E. Teppan, ‘Plausible repairs for inconsistent requirements’, in 21st International Joint Conference on Artificial Intelligence (IJCAI’09), pp. 791–796, Pasadena, CA, (2009).
[14] A. Felfernig, M. Schubert, and S. Reiterer, ‘Personalized diagnosis for over-constrained problems’, in 23rd International Joint Conference on Artificial Intelligence (IJCAI 2013), pp. 1990–1996, Peking, China, (2013).
[15] A. Felfernig, M. Schubert, and C. Zehentner, ‘An Efficient Diagnosis Algorithm for Inconsistent Constraint Sets’, Artificial Intelligence for Engineering Design, Analysis, and Manufacturing (AIEDAM), 25(2), 175–184, (2012).
[16] A. Felfernig, C. Zehentner, and P. Blazek, ‘Corediag: Eliminating redundancy in constraint sets’.
[17] G. Friedrich, ‘Elimination of spurious explanations’, in European Conference on Artificial Intelligence (ECAI 2004), pp. 813–817, Valencia, Spain, (2004).
[18] S. Hacker and L. VonAhn, ‘Matchin: Eliciting User Preferences with an Online Game’, in CHI’09, pp. 1207–1216, (2009).
[19] D. Jannach and U. Bundgaard-Joergensen, ‘SAT: A Web-Based Interactive Advisor for Investor-Ready Business Plans’, in Intl. Conference on e-Business (ICE-B 2007), pp. 99–106, (2007).
[20] D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich, Recommender Systems, Cambridge University Press, 2010.
[21] G. Leitner, A. Fercher, A. Felfernig, and M. Hitz, ‘Reducing the Entry Threshold of AAL Systems: Preliminary Results from Casa Vecchia’, in 13th Intl. Conference on Computers Helping People with Special Needs, pp. 709–715, (2012).
[22] B. Peischl, M. Zanker, M. Nica, and W. Schmid, ‘Constraint-based Recommendation for Software Project Effort Estimation’, Journal of Emerging Technologies in Web Intelligence, 2(4), 282–290, (2010).
[23] I. Pribik and A. Felfernig, ‘Towards Persuasive Technology for Software Development Environments: An Empirical Study’, in Persuasive Technology Conference (Persuasive 2012), pp. 227–238, (2012).
[24] R. Reiter, ‘A theory of diagnosis from first principles’, AI Journal, 32(1), 57–95, (1987).
[25] S. Reiterer, A. Felfernig, P. Blazek, G. Leitner, F. Reinfrank, and G. Ninaus, ‘WeeVis’, in Knowledge-based Configuration – From Research to Business Cases, eds., A. Felfernig, L. Hotz, C. Bagley, and J. Tiihonen, chapter 25, 365–376, Morgan Kaufmann Publishers, (2013).
[26] L. VonAhn, ‘Human Computation’, in Technical Report CMU-CS-05-193, (2005).

Case-based Recommender Systems for Personalized Finance Advisory

Cataldo Musto¹ and Giovanni Semeraro¹

1 Abstract

Wealth Management is a business model operated by banks and brokers that offers a broad range of investment services to individual clients to help them reach their investment objectives. Wealth management services include investment advisory, subscription of mandates, sales of financial products, and collection of investment orders by clients. Due to the complexity of the tasks, which largely require a deep knowledge of the financial domain, a trend in the area is the exploitation of recommendation technologies to support financial advisors and to improve the effectiveness of the process.

The talk presents a framework to support financial advisors in the task of providing clients with personalized investment strategies. The methodology is based on the exploitation of case-based reasoning and the introduction of a diversification technique. A prototype of the framework has been used to generate personalized portfolios, and its performance, evaluated against 1,172 real users, shows that the yield obtained by recommended portfolios exceeds that of portfolios proposed by human advisors in most experimental settings.

2 Introduction

Wealth management services have become a priority for most financial services companies. As investors are pressing wealth managers to justify their value proposition, turbulences in financial markets reinforce the need to improve the advisory offering with more customized and sophisticated services. As a consequence, a recent trend in wealth management is to improve the advisory process by exploiting recommendation technologies. However, some peculiarities of the financial domain make it hard to put into practice the most common recommendation approaches, such as Content-Based (CB) or Collaborative Filtering (CF).

and diversify the investments over time. Similarly, CF algorithms can hardly be adopted because of the well-known sparsity problem, which makes it very difficult to identify the neighbors of the target user.

These dynamics suggest focusing on different recommendation paradigms. Given that financial advisors have to analyze and sift through several investment portfolios⁴ before providing the user with a solution able to meet her investment goals, the insight behind our recommendation framework is to exploit Case-Based Reasoning (CBR) to tailor investment proposals on the ground of a case base of previously proposed investments.

3 Methodology

Our recommendation process is based on the typical CBR workflow described in [2] and sketched in Figure 3. Our pipeline is structured in three different steps:
As regards CB recommenders, the avail- able content, which is necessary to feed a CB recommendation algo- rithm, is very inadequate and not meaningful, since each user can be just modeled through her risk profile2 along with some demographi- Figure 1. Case-Based Reasoning for Personalized Wealth Management cal features. Similarly, financial products are described through a rat- ing3 provided by credit rating agencies, an average yield on different time intervals and the category it belongs to. In this recommenda- (1) Retrieve and Reuse: retrieval of similar portfolios is performed tion setting a pure CB strategy is likely to fail, since the overlap be- by representing each user through a feature vector: risk profile, in- tween features is very poor. Moreover, the over-specialization prob- ferred through the standard MiFiD questionnaire5 , investment goals, lem [1], typical of CB recommenders, may collide with the fact that temporal goals, financial experience, and financial situation have turbulence and fluctuations in financial markets suggest to change been chosen as features. Each feature is represented on a five-point 1 Dipartimento di Informatica, Universita degli Studi di Bari ”Aldo Moro”, ordinal scale, from very low to very high. Next, cosine similarity is Bari, Italy, email:{cataldo.musto, giovanni.semeraro}@uniba.it adopted to retrieve the most similar users (along with the portfolios 2 The Risk Profile is defined as ”an evaluation of an individual or organiza- they agreed) from the case base. tion’s willingness to take risks”. Typically, this value is obtained by con- ducting the above mentioned standard MiFiD questionnaire. 4 http://en.m.wikipedia.org/wiki/Portfolio (finance) 3 http://en.wikipedia.org/wiki/Credit rating 5 http://en.wikipedia.org/wiki/Markets in Financial Instruments Directive Page 35 (2) Revise: candidate solutions retrieved at step 1 are typically too many to be consulted by a human advisor. 
Thus, the Revise step fur- ther filters this set to obtain the final solutions. To revise the candidate solutions, four techniques are compared: (a) Basic Ranking: portfolios are ranked in descending cosine similarity order, according to the scores returned by the R ETRIEVE step. The first k portfolios are returned to the advisor as final solu- tions. (b) Greedy Diversification: this strategy implements the diver- sification algorithm described in [3]. The algorithm tries to diver- sify the final solutions by iteratively picking from the original set of Figure 3. Ex-post evaluation candidate solutions the ones with the best compromise between co- sine similarity and intra-list diversity with respect to the previously picked solutions. At each step of the strategy, the solution with the best compromise is removed from the set of candidate solutions and The performance of the framework has been evaluated in an ex- is stored in the set of final solutions. perimental session against 1,172 real users. Results show that the (c) FCV: Financial Confidence Value (FCV) calculates how close yield obtained by recommended portfolios overcomes that of port- to the optimal one is the distribution of the asset classes in a portofo- folios proposed by human advisors in many experimental settings. lio, according to the average historical yield obtained by each class. As shown in Figure 2, FCV significantly outperforms human recom- Given a set of asset classes A, for each portfolio p the set P , of the mendations (the average monthly yield increases from 0.18 to almost asset classes in it, and its complement P are computed. Next, FCV 0.30) for all the neighboorhood (put on the X axis) taken into account. is formally defined as: The experimental results were further confirmed by an ex-post eval- uation performed on real financial data from January to April 2014. 
As shown in Figure 3, this experiment provided very interesting re- F CV (p) = Y (p)log(λ)+1 (1) sults: beyond confirming the goodness of FCV-based ranking and the statistically significance of the gap with respect to both collab- |P | P|P | orative and human baselines, the most interesting outcome was that X yai Y (p) = pai ∗ yai λ = P i=1 (2) the combination of the diversification technique and FCV can further |P | i=1 y k=1 ak improve the performance of the proposed portfolios. This result sug- where pai and yai are the percentage and the average yield of the gests that the integration of the approaches can make the framework i-th asset class in the portfolio, respectively. Y (p) is the total yield even more effective. This is due to the fact that a combined strategy obtained by the portfolio, and λ is a drift factor which calculates can merge the advantages of a ranking based on past performance, the ratio in terms of average yield between the asset classes in the as FCV, with an algorithm that may lead to more diverse recommen- portfolio and those which are not in. For values of λ ≥ 1, it acts as dations. This makes the investment strategy better, since the human a boosting factor (for λ  1, it acts as a dumping factor). Through advisor does not base her investment proposal on a set of very similar this strategy, all the candidate solutions are ranked according to the portfolios, but rather on a set of diversified solutions which is more FCV score and thetop-k solutions are returned to the advisor. stable and effective, especially when market fluctuations have to be (d) FCV + Greedy: this combined strategy first uses the greedy tackled. algorithm to diversify the solutions, then exploits the FCV to rank the portfolios and obtain the final solutions. 
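The Retrieve step and the Basic Ranking and Greedy Diversification variants of the Revise step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the case structure (`features`, `allocation`), the function names, the weight `alpha`, the weighted-sum form of the "compromise", and the use of asset-allocation vectors to measure diversity between portfolios are all hypothetical choices made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(target, case_base, n):
    """Retrieve: the n cases most similar to the target user, best first."""
    scored = [(cosine(target, case["features"]), case) for case in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def basic_ranking(candidates, k):
    """Revise (a): keep the first k candidates in similarity order."""
    return [case for _, case in candidates[:k]]


def greedy_diversify(candidates, k, alpha=0.5):
    """Revise (b): greedily pick k cases trading similarity for diversity.

    The score alpha * similarity + (1 - alpha) * diversity is an assumed
    form of the compromise; diversity is the mean dissimilarity between a
    candidate's allocation and the allocations already selected.
    """
    remaining = list(candidates)
    selected = []

    def quality(pair):
        sim, case = pair
        if not selected:
            diversity = 1.0
        else:
            diversity = sum(
                1.0 - cosine(case["allocation"], s["allocation"])
                for s in selected
            ) / len(selected)
        return alpha * sim + (1 - alpha) * diversity

    while remaining and len(selected) < k:
        best = max(remaining, key=quality)
        remaining.remove(best)
        selected.append(best[1])
    return selected


# Hypothetical case base: five ordinal features (1 = very low, 5 = very high)
# and the share of each of three asset classes in the agreed portfolio.
case_base = [
    {"id": "u1", "features": [1, 2, 1, 2, 1], "allocation": [0.8, 0.2, 0.0]},
    {"id": "u2", "features": [4, 4, 3, 4, 5], "allocation": [0.1, 0.3, 0.6]},
    {"id": "u3", "features": [4, 4, 3, 4, 4], "allocation": [0.1, 0.3, 0.6]},
    {"id": "u4", "features": [3, 5, 2, 3, 5], "allocation": [0.5, 0.5, 0.0]},
]
target_user = [4, 4, 3, 4, 5]

candidates = retrieve(target_user, case_base, n=4)
basic = basic_ranking(candidates, k=2)
diverse = greedy_diversify(candidates, k=2)
print([c["id"] for c in basic])
print([c["id"] for c in diverse])
```

In this toy case base the two most similar users hold identical portfolios, so Basic Ranking returns two equivalent proposals, while the greedy variant replaces the second one with a differently allocated portfolio. Raising `alpha` shifts the balance back toward pure similarity.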
(3) Review and Retain: in the Review step the user and the human advisor can further discuss and modify the portfolio before generating the final solution for the user. If the monthly yield obtained by the newly recommended portfolio is acceptable, the solution is stored in the case base and can be used in the future as input to resolve similar cases.

Figure 2. In vitro evaluation

4 Deployment of the framework

A demo version of the platform is available online⁶. Given that the platform is meant to aid financial advisors, it lets the advisor select the current user as well as the recommendation technique to be adopted. Next, the "Recommendation" button shows the most promising portfolios for the target user along with the distribution of the asset classes. The distribution can be further discussed by user and advisor before coming to the final proposal, which is stored in the case base.

⁶ http://193.204.187.192:8080/OBWFinance/ - Login: 2 - Password: 12345

REFERENCES

[1] P. Lops, M. de Gemmis, and G. Semeraro, ‘Content-based recommender systems: State of the art and trends’, in Recommender Systems Handbook, pp. 73–105, Springer, (2011).
[2] F. Lorenzi and F. Ricci, ‘Case-based recommender systems: a unifying view’, in Intelligent Techniques for Web Personalization, 89–113, Springer, (2005).
[3] B. Smyth and P. McClave, ‘Similarity vs. diversity’, in Case-Based Reasoning Research and Development, 347–361, Springer, (2001).

PSYREC: Psychological Concepts to enhance the Interaction with Recommender Systems

Gerhard Leitner¹

¹ Alpen-Adria-Universität Klagenfurt, Institute for Informatics-Systems, Universitätsstrasse 65-67, 9020 Klagenfurt, Austria, email: gerhard.leitner@aau.at

Abstract. Although recommender systems are already a successful part of many online systems, there are still areas of research which are unexploited. One of them is the appropriate consideration of psychological theories which could be beneficial for the interaction between a computerized system and an online consumer, particularly in the financial services sector. This paper emphasizes the potential of integrating psychological knowledge into the further development of recommender systems on the basis of psychological theories and basic decision processes. The enumerated concepts have been demonstrated to be influential in consumer buying behaviour in numerous studies and are therefore used as a theoretical basis of the presented work. A conceptual framework is built upon the technology acceptance model (TAM), which offers the possibility of integrating psychological knowledge in the further development of online financial services. Possible applications and implementations are shown on the basis of empirical work that has been carried out in the past years.

1 Introduction

The utility of recommender systems to enhance the quality of decision processes and their outcome has been confirmed many times; according to [1] they are among the most successful applications in Artificial Intelligence. Although recommenders have such a successful history, there is still unexploited potential for advancement [2, 3]. Specifically promising in this regard is knowledge from psychology and research aiming to integrate it into recommender systems. This area of research is, taking the words of [4], still in its infancy. This paper opens new perspectives on the potential of psychological concepts and theories to enhance the interaction with recommender systems in general and in the context of financial services in particular.

The emphasis is put on interface and interaction aspects, because recommender systems are typically characterized by a highly sophisticated algorithmic and technical basis. However, also investing effort in the enhancement of the interface is important, or, as Louis [5] formulated it: "No matter how good your back-end systems are, the users will only remember your front end. Fail there and you will fail, period."

The rest of the paper is structured as follows. In the first sections an introduction into the theoretical background with an emphasis on psychological concepts is given. This part is followed by a detailed discussion on decision phenomena and how these are related to recommender systems. Afterwards a framework based on the TAM, the technology acceptance model [6], is presented, serving as a basis for future research activities. In Section 6, studies which were carried out and which show concrete possibilities for combining psychological knowledge and recommender technologies are presented as examples. The paper concludes with a discussion and an outlook on future work.

2 Theoretical Background

In the history of online sales many examples of online platforms exist which were characterized by high technical quality and innovativeness but lost market share or even disappeared because they did not appropriately consider user needs. For example, the first company offering books online was superseded by competitors who provided a better user experience. Another example showing the importance of considering user needs is Boo.com, which was based on cutting edge technology but showed bad usability, see, for example, [5]. Recommender systems can be considered as state of the art technologies supporting online interaction and purchase and have demonstrated their benefits and capabilities in numerous studies. However, as [7] pointed out, decision support tools such as recommender systems consist of three parts: "...database management capabilities, modelling functions, and a powerful yet simple user interface..". Specifically the latter offers high potential for enhancement, by considering in its design human characteristics such as attitudes, emotions, and other factors influencing users' behaviour. The goal to achieve is an enhanced quality of interaction between the human user and the computerized part of a system, resulting in a better outcome for both the user and the provider.

Recommender systems can be seen as the technical counterpart of real shopping environments. For about a century research in consumer psychology has been influential in advertising, marketing, and sales. Speaking of the offline world, it does not surprise any more that the design of supermarkets in regard to shopping paths, lighting conditions or sound exposure is not left to chance and that consumer psychology is omnipresent [8]. In comparison, psychological knowledge applied in the online sector is limited, although an increased consideration could be beneficial on different levels [9]. Specifically phenomena addressed in consumer and decision psychology are of interest in this regard [10, 11]. The challenge addressed in this paper is to take this knowledge to optimize recommender operated platforms in a way that consumers can, on the one hand, benefit from the advantages of information and communication technologies (ICT). This is possible because recommender systems are able to dynamically adapt to the individual user. This can constitute a meaningful alternative to offline purchase situations where an average sales assistant can be assumed to base his recommendations only on a limited set of alternatives. On the other hand it is important to make the user forget about the disadvantages online systems could have compared to real shopping experiences. These include, for example, the missing possibility to touch and investigate a product physically and to communicate with a human counterpart, negotiate a price or ask questions. The challenge for the service provider is the increased difficulty of convincing an online user of the benefits of a product or even persuading him or her to buy it, because there are limited possibilities to establish a pleasant atmosphere. In the following a spotlight is put on a selection of psychological concepts and theories which have a direct relation to buying behaviour and therefore build a promising basis for further research and for enhancing recommender systems in a way that they are capable of supporting all facets and phases of human consumer behaviour. This is neither easy nor possible in just one iteration.

3 Basic Psychological Theories

The following list of theories is not intended to be exhaustive; it should just point out the potential of psychological concepts which have, as demonstrated in numerous studies, a direct relation to human behaviour and insofar could also be useful for the enhancement of online behaviour in general and in regard to financial services in particular. Some of the elements of the theories have been either analysed for applicability or actually used within our own studies [12, 13, 1], others are planned to be integrated in our future work.

• Prospect Theory, PT
PT is of interest in regard to the behaviour of consumers in situations characterized by uncertainty and risk. Considering the work of [10], which demonstrates that the assumptions of economic theory do not hold, these are almost all situations. Because of limitations in human information processing, systematic biases in rating situations and decision making are observable. For example, humans act risk seeking when a loss is probable, or risk averse when a profit can be expected [11, 14]. This asymmetry is, for example, one explanation why people invest additional money into loss-making investments.

• Locus of Control Theory, LoC
LoC implies that behaviour depends on the interpretation of a person whether she has control over a situation or interaction and the outcome of an interaction (internal locus of control). When a situation or outcome is beyond influence (e.g. the user has the feeling that the system or external forces have the control), then external locus of control is the case [15].

• Attribution Theories, AT
Attribution theories, like LoC, assume internal/external control as one important dimension, but also include other dimensions, for example stability vs. flexibility. It is not only of relevance whether control is perceived as internal or external but also whether it is stable or depends on the domain or a particular situation [16, 17]. An example for the influence of LoC and AT in the context of financial services is that a person may assume that it makes sense to actively control her financial portfolio (internal control) to increase prosperity. A person who observes herself as externally controlled may think that only governments with taxation policies and financial service providers are responsible for the financial status of the individual anyway. This attitude can be stable or flexible, the latter, for example, by observing the own financial situation as depending on the global economy and the possibility to change when the financial crisis is overcome.

• Expectancy-Value Theories, EVT
This group of theories is based on the two dimensions expectancy and value. Expectancy refers to the degree to which a person is capable of reaching a goal. Value refers to the importance the goal has for the person. Example theories of this group are the theory of planned behaviour (TPB) and the theory of reasoned action (TRA), and they are important in the context of online buying. Besides personal aspects (i.e., attitude to a behaviour), social aspects play an important role and influence the value, for example, how people from relevant groups such as peer groups, family and friends would judge a certain behaviour (e.g., the purchase of a certain product) [18, 19].

• Need for Cognition / Elaboration Likelihood Model, NfC
NfC implies that depending on the importance of the domain ("personal involvement") a person tends to process information on different elaboration routes. In domains which are of high importance for the person, information is processed on the central route, characterized by a high level of elaboration (extensive collection of information, comparison, weighing of pros and cons, etc.). The alternative way of processing, the peripheral route, is characterized by low involvement of the person and, as an effect, an intentionally low investment of effort in processing information. The type of elaboration is, for example, of interest when an online platform is intending to include persuasive technologies [20, 21].

• Cognitive Dissonance, CD
CD assumes a mental model that a person establishes about a certain area of life, a behaviour or other relevant issues. The model only includes "consonant" information, which means that information present in the model should not be contradictory. For example, if a person thinks about financing a holiday trip with a loan, this may contradict a negative attitude towards taking out a loan for things that, unlike cars or real estate, do not have a material value. In this case dissonance occurs and, according to the model, mental effort is invested to restore consistency [22]. For the concrete example an argument could be that the exchange rate of the currency of the country where the journey is heading is favourable and insofar money is saved.

• Reactance Theory, RT
RT implies that humans are driven by the assumption that they can behave and act unrestrictedly. If a behaviour or an "object of desire" is not available or difficult to reach, its subjective value is increased and the reactant user tries to overcome this shortage by increased efforts [23]. Online platforms try to induce reactance by indicating limitations in product or service availability. In regard to financial services, for example, special offers for loans or financing models are made available for limited time periods.

• Flow, F
The central concept of the theory is the state of flow, which is characterized by an immersion of the user with the system. Flow is, for example, observable in computer game players, musicians or craftsmen who smoothly interact with their tools without observable disruptions [24]. A platform offering financial services should aim at supporting flow by enabling a smooth interaction dialogue between user and system and giving the possibility to "play" with alternatives.

How elements of the enumerated theories and concepts could affect the interaction with a financial services platform is illustrated in the following example.

Example. Imagine a potential consumer is using an online system to inform herself about loan opportunities. Based on her attributional patterns (AT, LoC) she has a certain understanding of whether she is able to use an online platform and can control the outcome of the product search. We assume that she is self-confident in the usage of the system (EVT, expectancy) and the system is appropriately designed so that she can "play around" and easily evaluate alternatives (and eventually reaches a kind of "flow", F). Depending on the personal importance (EVT, value) of the product she is searching for (loan for a holiday trip, a car or a house) she will put low or high effort into the evaluation, comparison, and selection of the product (NfC). When she knows what she wants and has good experiences with a certain brand or provider (PT, CD) she will not care that much what others say about her decision (EVT, peers). If she is uncertain, does not want to make a mistake or wants a product with a high status, she will orient herself on information of other users (EVT, peers) and in what percentage they purchased what product (for example based on online ratings or discussions with her peer groups). If the product or service she has finally chosen is not available immediately, she will try to solve the problem by finding other sources from where to get the product (PT, RT) or she will resign and decide not to buy any product (AT).

4 Decisions as the Connecting Element

The direct application of the theories and concepts enumerated above is difficult because many of them are too abstract. It is therefore necessary to investigate the "atomic" element of consumer behaviour, which is the decision. Each purchase or even browsing for information to prepare a purchase is characterized by a singular decision or a sequence of decisions. They are made on the basis of gathered information, the consultation of different information sources, the weighing of alternatives, etc. Economic theory has assumed that humans can be considered as omniscient and make decisions on the basis of optimal rationality. Since the work of Simon [10] it is commonly agreed that this assumption does not hold for most decision situations. The majority of human decision processes is characterized by limited information use, biased mental models and routines, either because of missing capabilities or a low level of motivation to invest cognitive effort. Depending on the kind of limitation, technological means supporting the basic decision processes have to be designed in different ways.

Felser [25], based on the work of [26], categorizes decisions in consumer behaviour into four types, namely extensive, limited, habitual and impulsive decisions. What type of decision is actually applied depends on the type of product or service, the degree of personal involvement, the emotional contribution (activation) to the domain, and other personality traits. For example, searching for an appropriate loan for an apartment can have very different characteristics and motives.

Extensive Decision. If a person is planning to buy the apartment, this is a long term investment that influences the financial life of the person for decades. Therefore the person is probably highly involved and activated, and will invest high effort to find the best financing alternative. She therefore applies an extensive decision procedure until she gets the best financial plan, the one with the smallest influence on the current financial situation. The strategy followed has characteristics of the central route processing of need for cognition theory [20, 21]. Although this type of decision making is highly sophisticated, it has some weaknesses. For example, the amount of information considered in the decision is not directly proportional to the amount of information available, which means that even if higher amounts of information would be available, people prefer short cuts [25]. Empirical support for this hypothesis could be shown in our own work [1]. Another insight is that higher effort invested into a decision does not mean that the outcome of the decision is better. One of the reasons is that the dimensions consulted for a decision are often unconscious. An a posteriori justification is done on dimensions which can be rationalized, but those may not be the ones which were responsible for the decision.

Limited Decision. Another person who has in mind to rent an apartment and just needs money for new furniture may be less passionate and would apply other criteria to the decision process. She applies the second type of decision, which is limited decision. Decisions following this strategy are based on experiences (positive and negative ones) and heuristics which were derived from these experiences, such as "Brand A is better than brand B" or "The more expensive, the better a product". The person may choose the company for financing furniture based on an advertisement she recently saw. In this case the availability heuristic, described by [11, 14], is applied (e.g., brands and companies that are commonly known are better). Following this heuristic could lead to choosing a financing plan that the furniture shop offers to its customers (an alternative the first person probably would not think about). The social environment could also have an influence (subjective norm, [18, 19]). Recommendations of relatives or friends who have good experiences with a bank can be taken into account.

Habitual Decision. The third type of decision, habitual decision, can be seen as a combination of extensive and limited decision. Based on previous experiences a mental model has been established, on the basis of which consumer behaviour follows a routine sequence and may not involve explicit decisions. This strategy is mainly applied in routine behaviour when no extraordinary investment is planned (such as in the previous examples). For example, if a person has to transfer money to a country where the receiver still requires conventional paper based transfer, she typically goes to her familiar bank branch and transfers the money there although there might be another company which offers cheaper transfers to the target country. In the past the selection of the best bank might have involved extensive decision strategies. When these efforts were successful and resulted in selecting an appropriate bank, a mental model is built which drives future behaviour. If the combination of services, price and reputation has been working satisfactorily in the past, it would not have a serious impact in terms of financial loss or well-being if it did not work any more (e.g., prices for services are slightly increased).

Impulsive Decisions. The last form, impulsive buying, is characterized as a "reaction" to environmental stimuli rather than active behaviour and may not include decisions at all. This form of decision occurs in the context of financial services, for example, when a credit card is used for buying things. This also involves investing money, but the investment is hidden and partly unconscious.

The previous paragraphs described decisions on a general level. Beckett et al. [27] have focused their work on financial products and present their findings in the form of a four-field decision matrix which has parallels to the four types of decisions described by [25]. In addition to involvement, which is part of the systematics of [26, 25] and NfC [21], the authors point out confidence as another relevant dimension, which is also a relevant dimension in LoC and AT [17] as well as the EVT [18]. The first decision type included in the matrix is repeat-passive decisions, which correspond to habitual decisions in the nomenclature of [25]. Based on positive experiences the consumer has developed loyalty to an enterprise (a bank or insurance) and does not explicitly search for alternatives. The rational-active decision type corresponds to the extensive decision strategy. The third type identified by [27], relational-dependent decisions, corresponds to the limited decision type of [25, 26] and is based on heuristics regarding experience and brand. If this strategy has been successful, trust is developed which reduces search and information processing activities. Finally, the impulsive type of [25] does not occur very often in the context of financial decisions. Therefore the matrix of [27] includes a fourth field labelled "no purchase". Figure 1 shows the decision types of [25] and their counterparts described in the work of [27].

Figure 1. Comparison of decision types of [25] and [27]

The matrix has been evaluated in a series of focus groups, and three product types correspond to the different decision types shown in Figure 1: basic transaction services (existing accounts), basic insurance products (car, house), and investment services (stocks, shares, pensions, etc.). Repeat-passive decisions mainly take place in the context of basic transaction services, when brand loyalty to a banking institution and confidence in the decision are high. Rational-active decisions are made when price is one of the most important criteria. This strategy is characterized by the necessity to search for products, to deal with a large amount of information and to thoroughly analyse the outcome. This could be necessary because, for example, insurance companies offer more or less the same services and products and deliberately make comparison to competitive products difficult. Relational-dependent decisions are, according to the results achieved by [27], still strongly dependent on personal communication and advice, because of the inherent complexity of the products and services.

The previous paragraphs were devoted to the content of decision processes involved in consumer behaviour.

... and mobile first [33]. Not only the technology in the back-end (the recommender system) has to be adaptive, but also the interface itself should adapt to the needs of users. Burke [34] proposes a hybrid solution for recommender system technology; a similar approach could also be imagined for the user interface part. A one-fits-all approach does not seem to be contemporary; different interface alternatives seem to be a proper way to provide adaptive access to a recommender system for different groups of users in different contexts of use. One and the same user could be interacting with different views of the system, on different devices, depending on the task at hand, contextual aspects, and psychological factors such as involvement in the domain. This means that interfaces do not only have to be adaptive, but also personalized, platform independent and customizable [35, 36]. The application of conventional usability engineering methods to accompany the development is crucial [37, 38], integrated in a user centred design process and combined with frequent evaluations involving representatives of the intended user groups.

5 An Integrated Model as Basis of Research

The aspects addressed in the previous sections characterizing consumer behaviour in general and online consumer behaviour in particular are difficult to capture. Their comprehension would be easier if a way could be found to operationalize them based on an integrated framework. The technology acceptance model (TAM) originally proposed by Davis [39] could build a basis for this attempt. TAM and its derivatives have been empirically validated in numerous studies, and it optimally combines the two dimensions emphasized in the previous section: Content, meaning the psychological aspects related to decision making, and Presentation, meaning the aspects related to human computer interaction. The TAM has relations to many of the theories and concepts enumerated in the previous sections. Figure 2 shows an adapted version of the latest version of TAM, TAM 3, introduced by [6]. The dimensions of TAM and their relation to the concepts and theories enumerated above are described in this section. The descriptions are partly taken from [6, 40].

• Experience
Already having used a system or similar ones can have an influence on many factors, such as the perceived usefulness and the
The second, similarly im- subjective norm. In relation to psychological theories, experience portant dimension in regard to online platforms based on recom- can increase, for example, the confidence and the assumption of mender systems is the presentation of information. We take the dif- internal control (LoC, AT). ferentiation of [9] who proposes to differentiate two roles an online • Voluntariness consumer has to assume, one as a shopper and the second as a com- The extent to which users perceive the usage of a system to be puter user. What characterizes and drives the shopper has been em- non-mandatory. This aspect relates to reactance theory (RT) - if a phasized above, in the next part the focus is put on the role of a person has the freedom to choose an online system for financial computer user. Supporting a user in decision making requires the services additionally to offline services this makes a difference provision of interfaces that is appropriate, an issue the research areas to being forced to use online services (because the nearby bank of human computer interaction (HCI), usability engineering and user branch has been closed). experience [28, 29, 30, 31] are dealing with. In regard to online con- • Subjective Norm sumer behaviour one of the major goals has to be to design interfaces A person’s perception that most people who are important think he in a way that they compensate the limitations an online system has in or she should or should not perform a behaviour or use a system. comparison to a to real world shopping situation and emphasize the There could, for example, be a conflict between the personal pref- advantages online systems have over real world shopping. The flex- erences and the attitude of the relevant others, which could lead to ibility, adaptiveness, and adaptability of recommender systems en- cognitive dissonance (CD) (”I would issue a credit for a holiday abling an individual support of each consumer is probably not avail- trip”.) 
able in typical shopping environments and insofar bear high poten- • Image tials but are also challenging in regard to user interface design. This The degree to which the use of an innovation is perceived to en- means, for example, that the development has to be based on state of hance one’s status in the social system. In regard to the provision the art interface design technologies, such as responsive design [32] of different platforms (desktop or mobile platforms) this aspect, Page 40 Figure 2. Technology Acceptance Model Version 3, adapted from [40] and complemented with example relations to psychological theories for example, influences the usage of a mobile app. It is depend- • Computer Self-Efficacy ing on whether or not the platform is accepted by the peer group The degree to which a person beliefs that he or she has the abil- (Apple, Android, Windows mobile) and illustrates that the attitude ity to perform the intended task. This depends on the experience towards a system is not always based on functional requirements with computer systems in general, and on the experiences within (EVT). a specific domain (e.g. financial services) in particular (LoC, AT). • Task Relevance • Perceptions of External Control A person’s perception regarding the degree to which the target The degree to which a person believes that an organizational and system is relevant to his or her life. If a system offers enhanced technical infrastructure exists to support use of the system. This efficiency (e.g., not having to visit a bank branch for basic tasks) could also be influential in a negative way (according to LoC and without loosing quality (NfC) it will be used. AT) when a person feels that the organization behind a system • Output Quality limits his or her performance or degrees of freedom. 
The degree to which a person believes that the system offers the • Computer Anxiety same services and enables to achieve the same results as other The degree of a person’s fear, when she/he is faced with the need alternatives, for example, services offered in a bank branch (PT, of using computers to access services. Specifically in the context NfC). of financial services (or even online transactions with credit cards) • Result Demonstrability people are anxious because of the danger to lose money (PT). Tangibility of the results of using the system. This aspect has re- • Computer Playfulness lations to subjective norm and image, for example showing in- The degree of cognitive spontaneity in computer interactions. If a creased prosperity as a result of intelligent investments (EVT). system supports this kind of interaction, such as simulating differ- Page 41 ent variants of financing, this supports persons engaging in exten- digital cameras (pixels, storage, zoom). Only the order of items was sive decision making processes (NfC). manipulated but this significantly increased their recall. • Perceived Enjoyment The extent to which using a specific system is perceived to be en- joyable, whereas enjoyment can have different dimensions. Feel- ing safe in the sense of nothing unexpected can happen when transferring money could be one form of enjoyment. Another one is developing trust towards an institution or a platform when the latter is characterized by transparency and comprehensibility (NfC). • Objective Usability A comparison of systems based on the actual level of effort re- quired to complete specific tasks. If it is faster to go to the bank branch to transfer money than using the computer interface, then the objective usability of an online system would be low (EVT). • Perceived Usefulness The degree to which a person believes that using the system will help him or her to attain gains in life quality. 
Saving money by us- ing an online system instead of personal services convinces people to adapt to new technologies (EVT). • Perceived Ease of Use Figure 3. Recall frequency in a manipulated item sequence (continuous line) and a familiar item sequence (dashed line) [1] The degree of ease associated with the use of the system. Besides the utility aspects of a system, the subjective usability is relevant. If people do not trust a system or are doubtful in their usage, they A more recent work which builds upon the work on serial position would not use it (LoC, AT). effects was carried out in the domain of group decision making [52]. • Behavioural Intention Making decisions in groups, for example choosing a dinner with a The degree to which a person has conscious plans to perform or business partner or deciding what movie to watch with friends in a not perform some specified behaviour. Only if the enumerated di- cinema always involves psychological phenomena on the individual mensions are fulfilled in a certain degree, a person will have the as well as on the group level. Decisions derived in group situations intention to use a system. The correlation between the intention are influenced by rhetoric skills of the participants, negotiation tech- and the actual use still is low (EVT). niques applied, leadership competency and other personality factors. • Use Behaviour When every aspect is, depending on the individ- In contrast to this real-time and synchronous approach, an online tool ual preferences, optimally fulfilled, then a flow experience could supports asynchronous and sequential decision procedures. Psycho- occur (F). logical concepts that could have an impact in this kind of decision process are, for example, originating from research groups who de- As emphasized in the enumeration of elements, the TAM has con- veloped the prospect theory [11, 14]. 
One group of effects are an- nections to the concepts and theories addressed in this paper [9] and choring or framing effects, or more general, context effects [53, 51]. would also allow the integration of additional aspects, for example A following small example illustrates their influence. To be able to trust, cf. e.g. [41, 42, 43, 44]. The TAM has also served as basis for sketch a financial plan it is necessary to have a starting point, the an- research in the financial services domain, cf. e.g. [45, 46, 47]. chor stimulus. This starting point is typically the amount of money that has to be financed. A strategy that is frequently used in adver- 6 Empirical Work tising is not to use the whole amount for evaluation (for example, 100.000 are needed + overhead costs) but the monthly rate (for exam- The theoretical concepts presented in this paper have been evaluated ple 500). Within the study we investigated alternatives of presenting in several empirical works. In this section a selection of these works information and were interest in the possibilities of manipulating se- and their relation to the theoretical parts of the paper is presented and rial position effects and other form of presentation, concretely based relations to the enumerated models and concepts are emphasized. on the multi attribute utility model (MAUT). The results showed that The first work in this regard is a paper on serial position effects. MAUT concepts can counteract serial position effects and insofar The effect, being one of the oldest phenomena in psychological ba- represent an appropriate means to steer decision processes. Figure 4 sic research [48, 49, 50], is characterized by the fact that items pre- is showing an example screen of the C HOICLA group decision sup- sented in a list or sequence are better memorized when presented at port tool on which preferences can be declared based on multiple the beginning or the end of the list. In our work [1] we could show attributes. 
that changing the sequence of items significantly influences the recall The last empirical work presented was focused on persuasion [54] of the items and this offers a possibility to influence the interaction and the potentials of the asymmetric dominance effect, better known between a consumer and a computer system on the level of presen- as decoy effect [55]. This concept has also a relation to anchoring and tation. Depending on the motives and needs that drive the consumer framing effects which can be manipulated. In contrast to the example (e.g. involvement, confidence, type of decision, willingness to invest above where information is hidden or presented in another form, the efforts) important information can be put in the sequence where it has decoy effect uses the influence of adding additional information to the highest probability to be perceived and memorized for further us- a decision situation. Adding a decoy element is intended to divert age. Figure 3 is shows the effect on the recall of items by simply or even disturb the attentive processes of a potential consumer and changing their order. The list used in the study contained features of open a new perspective to him or her to lead a decision in a certain Page 42 computerised systems based on recommender technology. The the- oretical basis builds a selection of psychological concepts and the- ories which have been empirically investigated in numerous studies and proved themselves as being relevant in the context of consumer behaviour. An increased consideration of knowledge from psychol- ogy could enhance the quality of recommender systems, specifically on the level of the user interface. The different types of decisions related to consumer behaviour were discussed and possibilities of recommender systems to support such decisions were exemplified. 
The technology acceptance model serves as a basis for further re- search in this area because it already integrates many of the relevant psychological concepts and theories that have been demonstrated to be influential in the context of consumer behaviour. With an appro- priate consideration of this knowledge, recommender systems could overcome the disadvantages online system have in comparison to of- Figure 4. Choicla Screen to enter preferences for restaurants based on fline interaction between consumers and, for example, shop assis- MAUT [52] tants. The advantages of recommender systems such as their capabil- ities of processing huge amounts of data, selecting the correct prod- direction, to persuade a user to purchase a product or to initiate a ucts from millions of alternatives, and calculating the best product preference construction which would not have been started without for are consumer within a few seconds could be exploited in a better the distractive element. In our paper we investigated the asymmetric way if not only the back-end functionalities but also the front-end, dominance effect and could show possibilities how to integrate them the interface to the customer is enhanced in an appropriate way. into recommender systems. Figure 5 is showing a decoy situation. Although our work is addressing different domains, the concep- Before introducing the decoy element (D) two products are available tual work sketched and the empirical studies performed are also to the customer, C (competitor product) and T (target product). C is applicable to the financial sector. Specifically of interest in this re- characterized by a lower price, but also by lower quality than T. As gard are the different types of decisions driving potential customers price is one of the most important dimensions in purchase decisions and motivating them to use an online system, choosing a product or [26] consumers tend to buy C. 
With introducing the decoy D which service, changing parts of his or her financial portfolio. In the con- has a lower quality than T, but a higher price, the focus of attention is text of recent developments in the financial sector (e.g., merging of directed to quality. This new perspective is not only of advantage for banks and insurance companies, closing of branches) the importance the provider (because of higher revenue) but also for the consumer of online services will increase. Appropriate systems supporting the (because of higher quality and satisfaction with the product). different needs, motives of end consumers, and also respecting the different levels of efforts people are willing to invest into financial decisions will be more important than ever before. Recommender systems integrating psychological aspect and simulating a ”human image” [36] could fill the arising gaps. With the system M YLIFE, an award winning platform, we could demonstrate respective possi- bilities. M YLIFE is an online platform enabling insurance agents to- gether with end consumers to manage the consumer’s financial port- folio in a cooperative partnership instead of putting the consumer in the role of a ”supplicant” towards financial service providers. The system consists of an intelligent algorithmic basis FAST D IAG [56] and an appropriate user interface visualizing in an integrated fashion the finance portfolio of a customer. The empirical work presented can only be seen as the starting point in the endeavour of enhancing human recommender interac- tion in the emphasized way. An unresolved problem in this regard is, Figure 5. Showing the example for the asymmetric dominance (”decoy”) for example, how a recommender system could find out what strat- effect. Product C (competitor) is of lower quality than product T (the target egy a consumer is currently applying (e.g. 
extensive or limited de- product), but C is cheaper and price is typically the feature with the highest influence in purchase situations. People would therefore, in general, choose cision) and to change the presentation of information accordingly. product C. By introducing a product D (decoy) which is of higher quality There are of course domains where one strategy is the most proba- than C, but of lower quality than T and more expensive than both of them, ble one (e.g. financing a real estate are probably based on extensive the viewpoint (anchor, reference frame) changes, and product T is preferred and central route elaboration) but further research is necessary to ad- by the majority of consumers [54] dress this problem. Of course transferring services form offline to online does not only have advantages. In the context of current de- . velopments in regard to privacy and business ethics this opens new challenges which are influencing the orientation of future research activities. Our major goal is to complete the ”puzzle” of which we 7 Discussion and Conclusions have already identified elements in our past research work. In this paper we have tried to emphasise the potentials of psycholog- ical theories to enhance the quality of interaction between users and Page 43 REFERENCES [30] Lohse, G. J. L. (2000). Usability and profits in the digital economy. In People and Computers XIV Usability or Else!, 3-15. [31] Lohse, G. L., & Spiller, P. (1999). Internet retail store design: How the [1] Felfernig, A., Friedrich, G., Gula, B., Hitz, M., Kruggel, T., Leitner, G., user interface influences traffic and sales. Journal of Computer Mediated Melcher, R., Riepan, D., Strauss, S., Teppan, E., & Vitouch, O. (2007). Communication, 5(2), 0-0. Persuasive recommendation: serial position effects in knowledge-based [32] Marcotte, E. (2011). Responsive web design. Editions Eyrolles. recommender systems. In: Persuasive Technology, 283-294. [33] Wroblewski, L. (2012). Mobile first. 
Editions Eyrolles. [2] Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation [34] Burke, R. 2002. Hybrid recommender systems: Survey and Experiments. of recommender systems: A survey of the state-of-the-art and possible User Model. User-Adapt. Interact. 12, 4, 180-200. extensions. Knowledge and Data Engineering, IEEE Transactions on, [35] Karat, C. M., & Blom, J. O. (Eds.). (2004). Designing personalized user 17(6), 734-749. experiences in eCommerce (Vol. 5). Springer Science & Business Media. [3] Konstan, J. A., Miller, B. N., Maltz, D., Herlocker, J. L., Gordon, L. R., [36] Blom, J. (2002). A theory of personalized recommendations. In CHI’02 & Riedl, J. (1997). GroupLens: applying collaborative filtering to Usenet Extended Abstracts on Human Factors in Computing Systems, 540-541. news. Communications of the ACM, 40(3), 77-87. [37] Weibelzahl, S. (2001). Evaluation of adaptive systems, In: User Model- [4] Chen, L., de Gemmis, M., Felfernig, A., Lops, P., Ricci, F., & Semeraro, ing 2001. LNAI 2109, 292-294. G. (2013). Human decision making and recommender systems. ACM [38] Swearingen, K., & Sinha, R. (2001, September). Beyond algorithms: An Transactions on Interactive Intelligent Systems (TiiS), 3(3), 17. HCI perspective on recommender systems. In ACM SIGIR 2001 Work- [5] Louis, T. Boo.com goes bust. The Boo.com post-mortem, from shop on Recommender Systems, Vol. 13, No. 5-6, 1-11. an insider. http://www.tnl.net/blog/2000/05/19/ [39] Davis Jr, F. D. (1986). A technology acceptance model for empirically boocom-goes-bust/ testing new end-user information systems: Theory and results. Doctoral [6] Venkatesh, V., & Bala, H. (2008). Technology acceptance model 3 and a dissertation, Massachusetts Institute of Technology. research agenda on interventions. Decision sciences, 39(2), 273-315. [40] Venkatesh, V. Theoretical models of Acceptance http: [7] Shim, J. P., Warkentin, M., Courtney, J. F., Power, D. 
J., Sharda, R., & //www.vvenkatesh.com/it/organizations/ Carlsson, C. (2002). Past, present, and future of decision support tech- theoretical-models.asp nology. Decision support systems, 33(2), 111-126. [41] Gefen, D., Karahanna, E., & Straub, D. W. (2003). Trust and TAM in [8] Bell, S.J.(1999) Image and consumer attraction to intra-urban retail ar- online shopping: An integrated model. MIS quarterly, 27(1), 51-90. eas: An environmental psychology approach, Journal of Retailing and [42] Benbasat, I., & Wang, W. (2005). Trust in and adoption of online recom- Consumer Services, Volume 6, Issue 2, 67-78. mendation agents. Journal of the Association for Information Systems, [9] Koufaris, M. (2002). Applying the technology acceptance model and 6(3), 4. flow theory to online consumer behavior. Information systems research, [43] Pavlou, P. A. (2003). Consumer acceptance of electronic commerce: In- 13(2), 205-223. tegrating trust and risk with the technology acceptance model. Interna- [10] Simon, H. A. (1955). A behavioral model of rational choice, Quarterly tional journal of electronic commerce, 7(3), 101-134. Journal of Economics 69, 99-118. [44] Suh, B., & Han, I. (2003). Effect of trust on customer acceptance of [11] Tversky, A., Kahneman, D. (1986) Rational Choice and the Framing of Internet banking. Electronic Commerce research and applications, 1(3), Decisions. Journal of Business, 59/4,Part 2, 251-278. 247-263. [12] Leitner, G. (1998) Stressfaktoren, Stresserleben und subjektive Attribu- [45] McKechnie, S., Winklhofer, H., & Ennew, C. (2006). Applying the tech- tionsmuster. Diploma Thesis, University of Vienna. nology acceptance model to the online retailing of financial services. In- [13] Leitner, G., Mitrea, O., & Fercher, A. J. (2013). Towards an Acceptance ternational Journal of Retail & Distribution Management, 34(4/5), 388- Model for AAL. In Human Factors in Computing and Informatics 672- 410. 679. 
[46] Pikkarainen, T., Pikkarainen, K., Karjaluoto, H., & Pahnila, S. (2004). [14] Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of Consumer acceptance of online banking: an extension of the technology decision under risk. Econometrica: Journal of the Econometric Society, acceptance model. Internet research, 14(3), 224-235. 263-291. [47] Featherman, M. S., & Pavlou, P. A. (2003). Predicting e-services adop- [15] Rotter, J. B. (1990). Internal versus external control of reinforcement: A tion: a perceived risk facets perspective. International journal of human- case history of a variable. American psychologist, 45(4), 489. computer studies, 59(4), 451-474. [16] Kelley, H. H. (1973). The processes of causal attribution. American psy- [48] Kirkpatrick, E. A. (1894). An experimental study of memory. Psycho- chologist, 28(2), 107. logical Review, 1(6), 602. [17] Weiner, B. (2012). An attribution theory of motivation. Handbook of the- [49] Ebbinghaus, H. (2013). Memory: A contribution to experimental psy- ories of social psychology, 1, 135-155. chology. Annals of neurosciences, 20(4), 155. [18] Ajzen, I. (1991). The theory of planned behavior. Organizational behav- [50] Gershberg, F., Shimamura, A. (1994) Serial position effects in im- ior and human decision processes, 50(2), 179-211. plicit and explicit tests of memory, Journal of Experimental Psychology: [19] Fishbein, M. (1979). A theory of reasoned action: some applications and Learning, Memory, and Cognition, 20, 1370-1378. implications. In: Howe et al.. 65-116. [51] Simonson, I. Tversky, A. (1992) Choice in Context: Tradeoff Contrast [20] Smith, S., Levin, I. (1998) Need for Cognition and Choice Framing Ef- and Extremeness Aversion, Journal of Marketing Research, 29. 281-295. fects, Journal of Behavioral Decision Making, 9(4): 283-290. [52] Stettinger, M., Felfernig, A., Leitner, G., Reiterer, S., & Jeran, M. (2015). [21] Haugtvedt, C. P., Petty, R. E., & Cacioppo, J. T. (1992). 
Need for cogni- Counteracting Serial Position Effects in the CHOICLA Group Decision tion and advertising: Understanding the role of personality variables in Support Environment. In Proceedings of the 20th International Confer- consumer behavior. Journal of Consumer Psychology, 1(3), 239-260. ence on Intelligent User Interfaces, 148-157. [22] Festinger, L. (1962). A theory of cognitive dissonance (Vol. 2). Stanford [53] Hamilton, R. (2003) Why Do People Suggest What They Do Not Want? university press. Using Context Effects to Influence Others Choices, Journal of Consumer [23] Brehm, J. W. (1966). A theory of psychological reactance. Academic Research, 29, 492-506. Press. [54] Felfernig, A., Gula, B., Leitner, G., Maier, M., Melcher, R., & Teppan, E. [24] Csikszentmihalyi, M. (1990). Flow: The Psychology of Optimal Perfor- (2008). Persuasion in knowledge-based recommendation. In Persuasive mance. Cambridge University Press. Technology, 71-82. [25] Felser, G. (2007). Werbe-und Konsumentenpsychologie. Spektrum. [55] Huber, J., Payne, W., Puto, C. (1982) Adding Asymmetrically Domi- [26] Weinberg, P., Gröppel-Klein, A., & Kroeber-Riel, W. (2003). Kon- nated Alternatives: Violations of Regularity and the Similarity Hypothe- sumentenverhalten. Vahlen. sis, Journal of Consumer Research, 9, 90-98. [27] Beckett, A., Hewer, P., & Howcroft, B. (2000). An exposition of con- [56] Felfernig, A., Schubert, M. & Zehentner, C. (2012) An Efficient Diag- sumer behaviour in the financial services industry. International Journal nosis Algorithm for Inconsistent Constraint Sets, Artificial Intelligence of Bank Marketing, 18(1), 15-26. for Engineering Design, Analysis, and Manufacturing (AIEDAM), Cam- [28] Donahue, G.M., Weinschenk, S., Nowicki, J. (1999). Us- bridge University Press, 26(1), 53-62. ability Is Good Business. http://half-tide.net/ UsabilityCost-BenefitPaper.pdf. [29] Van Pelt, A., & Hey, J. (2011). Using TRIZ and human-centered design for consumer product development. 
Procedia Engineering, 9, 688-693. Page 44