1 Introduction

A Signaling Game Approach to Databases Querying

Ben McCamish

Arash Termehchy

Behrouz Touri

Eduardo Cotilla-Sanchez

1 0 Department of Computer Science, University of Colorado , Boulder , USA 1 School of Electrical Engineering and Computer Science , Oregon State University , USA

Most users do not know the structure or content of databases and cannot precisely express their queries. Hence, it is challenging for database query interfaces to understand and satisfy users' information needs, i.e., intents. Ideally, we would like the user and query interface to establish a mutual understanding: the query interface understands how the user expresses her intents and/or the user learns to formulate her queries precisely. Researchers have proposed methods to achieve such a mutual understanding between users and databases [1, 3]. Current methods, however, mainly improve the mutual understanding of a user and a database for a single information need. Nevertheless, many users explore a database to find answers for various information needs potentially over a long period of time. Moreover, current approaches assume that the way a user expresses her intents remains generally intact over her course of interaction with the database. However, users may leverage their experience from previous interactions with the database to express their future intents more precisely.

1 Introduction

We propose a novel framework that models database querying as a game between two active and potentially rational agents: the user and query interface. The common goal of the players is to reach a mutual understanding on expressing intents in the form of queries. The players may reach this goal through communication: the user informs database of her intents by submitting queries, the database returns some results for the queries, and user provides some feedback on how much the returned results match her intents, e.g., by clicking on some desired answer(s). Moreover, the user may modify her query to better reflect her intent after exploring some answers. Both players receive some reward based on the degree by which the returned answers satisfy the intents behind queries. This interaction may reach some stable states, i.e., equilibria, where neither the database receives any reward by changing the way it answers queries nor the user has any incentive to change the way she expresses her intents. One obvious and desirable equilibrium is the state in which database precisely pinpoints the intent behind each query. Nevertheless, we show that the game has other equilibria, which are less desirable. We also propose a query answering algorithm for the database side, which lead to a non-decreasing reward for the user and database over the course of their interaction. We believe that our framework provides the basic tools to identify the equilibria of the interactions between users and databases in different settings and design methods to lead the interactions to more desirable equilibria.

Framework

Intent: An intent represents an information need sought after by a user. We assume that each intent is a query in a fixed query language, e.g., conjunctive queries. The set of possible intents is infinite. In practice a user has only a finite number of information needs in a finite period of time. Hence, we assume the number of intents for a particular user is finite. We index each intent over database instance I by 1 i m. Query: Because the user is not able to formulate her information need e, she submits query s 6= e to the database instead. Of course, the user still expects that the query interface returns the answers of intent e for s. A user in practice submits a finite number of queries in a finite time period. Hence, we assume that the set of all queries submitted by a user is finite. We index each query over the database instance I by 1 j n. Result: Given a query s over a database instance I, the query interface generally returns a set of tuples as the response to s. An obvious choice is to return s(I). Because the query interface knows that the input query may not precisely specify the user’s intent, it also considers alternative answers to satisfy the information need behind the query [1]. Strategies: A user strategy, P , is a m n row-stochastic matrix from the set of intents to queries. Each strategy of the database Qis a n o row-stochastic matrix from queries to results. Each pair (P; Q) is called a strategy profile.

Rewards: This interaction between the database and user can be modeled as a signaling game, with identical interests, played between the user and the database. Given that the query interface returns result l for intent e, we need some standard metrics to measure how effectively l answers e. We use standard effectiveness metric precision to measure the user satisfaction given a returned set of tuples. li represents the result tuple set returned after querying the database with intent ei. The precision of a result l for intent e over database I, denoted as p(e; l), is the fraction of its tuples that are in e(I). Our framework can be extended for other effectiveness metrics. We define the payoff as m u(P; Q) = X i=1

n o i X Pij X Qj` p(ei; l`): j=1 `=1 ( 1 ) This function is an expected payoff of the interaction between the user and query interface when the user maps a (random) intent ei to a query sj with probability Pij and the database maps the query sj to result l` with probability Qj`. is the prior probability over intents and p is precision.

Example 1: Consider a database about universities with schema U(name, abbreviation, state, ranking). Table 1(a) and (b) show a user’s intents and the queries she submits to the database to express these intents, respectively. Table 1 (c) and (d) illustrate two possible strategy profiles for these sets of intents and queries. For instance, given the strategy profile shown in Table 1 (c), if the user wanted to find the ranking for Michigan State University, i.e., intent e2, she submits query s1 that precisely represent this information need. If the database receives query s1, it returns the tuples in the database that satisfy s1, i.e., l2. However, if the user wants to find the rankings of the Mississippi State University and Missouri State University, she submits query s2 due to her lack of knowledge about the database content. Given query s2, the query interface returns the rankings of Mississippi State University and Missouri State University with equal probability. If the intents in Table 1 have equal prior probabilities, the strategy profile in Table 1 (c) is a Nash equilibrium. None of the players achieve a better payoff by unilaterally changing their strategies.

The strategy profile in Table 1 (c) provides the highest payoff for the user and database system given the intents and queries in Table 1 (a) and Table 1 (b). However, some Nash equilibria may not provide high payoffs. For instance, Table 1 (d) depicts another strategy profile for the set of intents and queries in Table 1 (a) and Table 1 (b). In this strategy profile, the user has a little knowledge about the database content and expresses al ofl her intents using a single query s2, which asks for the ranking of universities whose abbreviations are MSU. Given query s2, the query interface always returns the ranking of Michigan State University. Obviously, the query interface always returns the non-relevant answers for the intents of finding the rankings of Mississippi State University and Missouri State University. If all intents have equal prior probabilities, this strategy profile is a Nash equilibrium. For example, the user will not get a higher payoff by increasing her knowledge about the database and using query s1 to express intent e2. Clearly, the payoff of this strategy profile is less than the one of the strategy profile in Table 1 (c). Nevertheless, the user and the query interface do not have any incentive to leave this undesirable stable state once reached and will very likely stay in this state.

A Strict Nash Equilibrium is a strategy profile where neither party can unilaterally change their strategies and receive the same or better payoff.

Definition 2. Strategy profile (P; Q) is a Strict Nash Equilibrium iff u(P; Q) > u(P 0; Q) for all P 0 6= P and u(P; Q) > u(P; Q0) for all Q0 6= Q.

If we remove row e3 and column l3 from the user and database strategy in Table 1 (c) respectively and change s2,l1 for the database strategy to 1, we are left with a strict Nash. A strict Nash equilibrium is more stable than a Nash equilibrium. 4

An Adaptation Mechanism

After figuring out the desirable states of the game, one may want to design algorithms for the database side which lead the game to such states. In many relevant applications, the user’s learning is happening in a much slower time-scale compared to the learning of the database. So, one can assume that the user’s strategy is fixed compared to the timescale of the database adaptation. As in [2], we use Roth-Erev reinforcement learning mechanism for the database adaption as follows. a. Let R(0) > 0 be an n o initial reward matrix whose entries are strictly positive. Rj`(0) b. Let Q(0) be the initial database strategy with Qj`(0) = P`o=1 Rj`(0) > 0 for all j 2 [n] and ` 2 [o]. c. For iterations t = 1; 2; : : :, do i. If the user’s query at time t is s(t), return a result L(t):

P (L(t) = i0 j s(t)) = Qs(t)i0 (t): ii. User gives a reward rii0 given that i is the intent of the user at time t. Then, set Rj`(t + 1) =

Rj`(t) + ri` if j = s(t) and ` = i0 Rj`(t) otherwise : ( 2 ) ( 3 ) iii. Update the database strategy by for all j 2 [n] and i 2 [o].

Qji(t + 1) =

Rji(t + 1) Po `=1 Rj`(t + 1) ; Let u(P; Q(t)) denote the payoff in interaction t according to in Equation ( 1 ). A random process fX(t)g is a submartingale if it is absolutely integrable (i.e. E(jX(t)j) < 1 for all t) and E(X(t + 1) j Ft) X(t), where Ft is the -algebra generated by X1; : : : ; Xt. In other words, the expected value of X(t + 1) given the past, is not strictly less than the value of X(t).

Theorem 1. The sequence of u(P; Q(t)) is a submartingale.

Hence, the proposed reinforcement learning rule stochastically improves the efficiency of communication between the database and user. It is an interesting future work to explore whether this algorithm will converge to a globally optimal payoff.

1. Fagin , R. , Kimelfeld , B. , Li , Y. , Raghavan , S. , Vaithyanathan , S. : Understanding Queries in a Search Database System . In: PODS ( 2010 )

2. Hu , Y. , Skyrms , B. , Tarrès , P. : Reinforcement learning in signaling game . arXiv preprint arXiv:1103.5818 ( 2011 )

3. Khoussainova , N. , Kwon , Y. , Balazinska , M. , Suciu , D. : Snipsuggest: Context-aware autocompletion for sql . PVLDB 4 ( 1 ) ( 2010 )