-

Eficient Ranked Access over Joins

Nikolaos Tziavelis

0 0 Supervised by Prof. Mirek Riedewald and Prof. Wolfgang Gatterbauer Northeastern University , Boston, MA , USA

Join queries over multiple tables with many-to-many relationships (e.g., in graph analytics) can produce a huge output that is infeasible to compute. However, users may have particular preferences over the answers in the output, and may be interested in accessing only a small subset according to that ranking; either to retrieve the most important answers or the quantiles for a statistics summary. The thesis of this proposal is that for many such queries, access patterns, and ranking functions over the answers, ranked access can be performed eficiently, without first computing the output of the join. This is captured theoretically by non-trivial complexity guarantees and shown in practice with significant performance improvements over typical implementations in stateof-the-art database management systems (DBMSs) that sort the join output. We have so far given algorithms and complexity guarantees for the problems of ranked enumeration, direct access, and quantiles over equi-joins, as well as joins with complex inequality predicates. To complete the picture, we plan to extend our results to support more expressive queries, distributed computation, and stronger complexity guarantees toward instance-optimality. Besides addressing fundamental questions regarding the limits of query processing, this work opens up unexplored possibilities for the design of DBMSs. We clearly demonstrate how existing systems fall short in handling ranking over joins eficiently, with our algorithms outperforming them by orders of magnitude.

1. Problem Focus tuples and ℓ relations, the join output size is (ℓ).1 The question we ask is whether we can retrieve a few Join Queries. Joins are fundamental in data-processing select answers (according to the ranking) without matesystems, as they enable the composition of data from mul- rializing and sorting the entire output, or in other words, tiple sources. They are also notoriously critical for per- whether we can “push” the ranking deeper into joins. formance, as they can produce huge intermediate or final This turns out to be not as simple as reordering the two results. This is especially true when dealing with graph operations. For example, the top-10 tuples from each data [1], but can also occur with traditional relational joining relation do not necessarily produce the top-10 data if many-to-many relationships exist across tables join answers, and may not even join at all. Achieving our or non-equi-join conditions are used. A breakthrough goal requires novel algorithms where joining and ranking result in our understanding of join processing was that are interleaved. the worst-case complexity of traditional (binary) join implementations is suboptimal [2], which led to a num- Example 1. To study bird interactions, an ornitholober of Worst-Case Optimal Join (WCOJ) algorithms that gist uses a bird observation dataset B(Species, Family, ifll this gap [ 3, 4]. Besides, work on enumeration [5, 6] Cnt, Latitude, Longitude) and wants to extract pairs of has focused on returning a stream of answers as fast as observations for birds of diferent species that have been possible even if the full output is too large to compute. spotted in the same region. Pairs with the greatest comThese eforts toward strong worst-case guarantees have bined number of birds (Cnt) should also appear first: not only created excitement about their theoretical value, but are also aligned with the pervasive goal of database SELECT *, B1.Cnt + B2.Cnt as Weight systems to ensure predictable performance for complex WFHREORME BB1.BS1p,ecBiBe2s <> B2.Species queries or skewed data. AND ABS(B1.Latitude - B2.Latitude) < 1

Ranking Join Answers. Users may prefer some join AND ABS(B1.Longitude - B2.Longitude) < 1 answers over others based on their importance or rele- ORDER BY Weight DESC vance. For instance, higher importance may be assigned to newer (time) or more reliable (quality) data. We formal- This query contains inequality (<) and non-equality ize this by a ranking function that establishes a total or- (<>) predicates and its answers are ranked by SUM. der over the answers. A common example is SUM where The output size is Ω( 2) in the worst case, thus, regarddatabase tuples are assigned weights and the weight of a less of the join implementation, sorting will take time query answer is the sum of weights of the contributing Ω( 2 log ). Yet, our new algorithms can retrieve the tuples. Sorting the answers is clearly expensive; even top-10 answers or even the median answer asymptotiif we were to compute the join eficiently, all join an- cally faster, in time ( polylog ). swers must be produced to be sorted. For a database of

We consider a few diferent ways that a user may want

to access the ranked result of a join query.

Top-. The top- paradigm asks for the the first ranked answers for a given , which is typically considered to be a relatively small constant. Since the value of is known, it can be exploited for pruning.

Ranked Enumeration. Ranked enumeration asks for the answers to be returned in order, one-by-one until all of them are returned or the user stops the procedure. It generalizes top- because the value of is flexible and not fixed in advance. For this reason, we introduced the term “any-k” for this paradigm. This functionality can be useful for exploratory analysis tasks, where setting without first seeing some answers is dificult. The goal is to provide complexity guarantees for every possible value of . We denote by TT() the time required up to the th answer, which should be as small as possible no matter if is small or equal to the join output size.

Direct Access. A more general access pattern is direct access, where the indexes requested from the sorted array of answers do not need to be consecutive integers that start from 0 as in ranked enumeration, but can be arbitrary (e.g., all -quantiles). The goal is to build a data structure that can support all these accesses eficiently.

Quantiles. Quantile queries (or a variant called “selection”2 ask for only one access, such as the median answer. 2.2. Inequality Predicates Compared to direct access, we are not required to handle multiple accesses and no data structure for future accesses needs to be constructed. we designed any- algorithms [11] for SUM with strong guarantees: If the query size is constant, we were able to achieve TT() = ( + log ) for an input database of tuples. This is optimal since it takes Ω( ) to read the database and Ω( log ) to return answers sorted.

Remarkably, the user starts seeing the top answers after only linear time, even if the join output is much larger.

For the case where query size is non-constant, we proposed an algorithm [12] that achieves the best-known complexity. 3 Surprisingly, it is asymptotically faster than (generic, comparison-based) sorting for returning the entire output. This is possible because the query answers are not independent, but have a shared structure that the algorithm exploits. Our algorithm does not only apply to joins, but also to ranked enumeration of source-target paths in a DAG, and even more generally, to any problem solvable by Dynamic Programming. Beyond SUM, other ranking functions are supported if they satisfy appropriate monotonicity properties [12].

Figure 1a illustrates the advantage of any- over the approach of current DBMSs which follow the “JoinFirst” approach of applying ranking after the join. The query in this experiment is a join of 4 relations in a chain over uniform synthetic data. Each relation contains 104 tuples and each join value appears 10 times on expectation.

While our initial work focused on equality conditions, we

then extended it to more general join conditions [14]. For acyclic joins with any conjunctions or disjunctions of inequality predicates between adjacent relations in the join 1.2. Contrast to Prior Top- Joins tree, any- is possible with TT() = ( polylog + log ), i.e., within a polylogarithmic factor of the equiA well-known algorithm for top- joins is the instance- join guarantee. The key insight is a “factorized” repreoptimal Threshold Algorithm by Fagin et al. [7]. However, sentation of the output which essentially reduces the it only works for a restricted class of 1-to-1 joins. Follow- inequality-join to an equi-join over (polylogarithmically) up work extended the algorithm to more general joins [8, larger relations. 9] but under a “middleware” cost model where cost is Our result is quite surprising, given that existing measured in terms of distinct tuples accessed, while join DBMSs struggle to handle complex join predicates like cost is not taken into account [10]. This is in contrast those in Example 1 even without the ranking. Our alto the line of work on WCOJ or enumeration, which gorithms handle both simultaneously and outperform focuses on the RAM model of computation where the DBMSs by orders of magnitude. Figure 1b illustrates this primary concern is to avoid unnecessary joins and large in terms of the time to find the top-1000 answers for a intermediate results. Our work aims to bridge this gap path query on the RedditTitles [15] temporal graph. by adopting the RAM model. The 572 edges in the dataset represent posts from a source community to a target community identified by 2. Highlights of Current Results a hyperlink in the post title. The query returns paths of length ℓ with increasing timestamps, decreasing senti2.1. Any- Algorithms for Equi-joins ment (i.e., a sign of negative emotion propagation), and ordered by the SUM of readability scores.

For acyclic equi-joins, where join predicates are restricted to equalities and relations can be organized in a join tree, 1.1. Ranked Access Patterns 2The two problems difer in whether a percentage or an absolute index is given as the input.

PSQL

System X JoinFirst (OOM)

Anyk 0 2 4 6 8 10

Time (sec) (a) The “JoinFirst” approach of sorting all answers needs 5.9 sec for the top answer and 6.6 (b) For a path query with 2 inequality predicates on the RedditTitles dataset, sec for the last. In contrast, our any- takes 0.02 sec for the top answer (hence more than our approach performs significantly better than PostgreSQL (PSQL) and a 200 times faster) and 4.9 sec for the last with a variant called “Anyk-Rec”. So we get not commercial system optimized for in-memory computation (System X). Our only the first answer faster, but all answers faster. All existing DBMSs we tried, such as own in-memory “JoinFirst” approach runs out of memory (OOM) when the PostgreSQL (PSQL) here, follow an approach similar to “JoinFirst” and are outperformed. path length ℓ is more than 2. 2 4 6 8 10

2.3. Dichotomy Theorems

queries are also considered in our study of direct access and selection [17].

3.1. Supported Queries

3. Future Directions

For both direct access and quantile queries under

SUM, we precisely characterized the (self-join-free4) join queries that can be handled eficiently, and gave algorithms for those cases [17]. We consider ( polylog ) time as eficient, since it is close to the database size and independent of the join output size. We showed that, under certain hardness assumptions in fine-grained complexity, constructing a data structure for direct access is possible only for trivial cases. By relaxing the task to that of quantiles where only one access is required, then any binary join can be handled eficiently. For all other acyclic queries, which are provably hard, we showed that eficient (deterministic) approximation of quantiles is possible [18].

We have demonstrated how existing systems are lacking

in terms of support for join queries with ranking, by focusing mainly on conjunctive queries. A major next step is to extend our study to more expressive queries, such as those with recursion [21] or other types of operators.

For the supported ranking functions, while we have algorithms for some of the most common cases, we are lacking in terms of hardness results for others. Can the functions that are outside of our tractable classes (e.g., using MEDIAN as a weight aggregation function) be proven 2.4. Projections to be intractable? Or are there still cases potentially useOur results also cover join queries with projections, for- ful in practice that remain unexplored? malized by conjunctive queries. For ranked enumeration, we extended [12] a dichotomy of Bagan et al. [5] that was 3.2. Distributed Computation developed for unranked enumeration 5, showing that the class of queries that can be handled eficiently (i.e., with Our algorithms assume an in-memory computation TT() = ( + ) ignoring logarithmic factors) is pre- model. To make them more useful for big-data processing, cisely the same; it is those queries that are “free-connex”, we aim to explore techniques to distribute computation a certain restriction on the allowed projections. Thus, to many machines. Popular practical frameworks for the additional requirement of an ordered output does this task include Hadoop MapReduce or Spark which not change the tractability landscape and only costs a partition computationally intensive tasks into smaller, logarithmic factor. We note that follow-up work by Deep independent tasks to be executed in parallel. On the theet al. [19] has considered more expensive time guaran- oretical side, the MPC model has been used to analyze tees for queries beyond the free-connex class, in terms of distributed joins [22]. An intriguing question is whether preprocessing and delay between answers. 6 Conjunctive the the distributed computation setting requires new algorithms and techniques that are fundamentally diferent than the sequential case. 4This restriction is common in hardness results in this area [16]. 5Again, this is under certain hardness hypotheses and for queries without self-joins. 6We have argued that preprocessing and delay are less practically relevant than TT() as measures of success for enumeration. [20] 3.3. Toward Instance Optimality ranked inputs, in: VLDB, 2001, pp. 281–290. URL: http://www.vldb.org/conf/2001/P281.pdf.

The guarantees we have explored so far are concerned [10] N. Tziavelis, W. Gatterbauer, M. Riedewald, Optimal with worst-case time and space complexity, and it is open join algorithms meet top-k, in: SIGMOD, 2020, pp. whether stronger guarantees are possible. As aforemen- 2659–2665. doi:10.1145/3318464.3383132. tioned, the Threshold Algorithm [7] is instance-optimal [11] N. Tziavelis, D. Ajwani, W. Gatterbauer, M. Riedein the middleware model, a very strong guarantee which wald, X. Yang, Optimal algorithms for ranked enupartly led to the 2014 Gödel Prize for Fagin, Lotem, and meration of answers to full conjunctive queries, Naor. In addition to the simpler join model, instance- PVLDB 13 (2020) 1582–1597. doi:10.14778/ optimality crucially relies on the assumption that input 3397230.3397250. relations are given sorted. This assumption makes it pos- [12] N. Tziavelis, W. Gatterbauer, M. Riedewald, Anysible to retrieve the top- tuples without even reading k algorithms for enumerating ranked answers to the entire input. In contrast, our algorithms (and other conjunctive queries, CoRR abs/2205.05649 (2022). WCOJ) unavoidably have Ω( ) as a lower bound since doi:10.48550/ARXIV.2205.05649. the winning tuples could be anywhere in the input. Thus, [13] S. Deep, P. Koutris, Ranked enumeration of conjuncan open question is whether stronger guarantees are tive query results, in: ICDT, volume 186, 2021, pp. possible under this assumption in the RAM model. 5:1–5:19. doi:10.4230/LIPIcs.ICDT.2021.5. [14] N. Tziavelis, W. Gatterbauer, M. Riedewald, Beyond References equi-joins: Ranking, enumeration and factorization, PVLDB 14 (2021) 2599–2612. doi:10.14778/ [1] S. Sahu, A. Mhedhbi, S. Salihoglu, J. Lin, M. T. 3476249.3476306.

Özsu, The ubiquity of large graphs and surpris- [15] S. Kumar, W. L. Hamilton, J. Leskovec, D. Jurafsky, ing challenges of graph processing: extended sur- Community interaction and conflict on the web, in: vey, VLDB J. 29 (2020) 595–618. doi:10.1007/ WWW, 2018, pp. 933–943.

s00778-019-00548-x. [16] P. Koutris, J. Wijsen, Consistent query answer[2] A. Atserias, M. Grohe, D. Marx, Size bounds and ing for primary keys and conjunctive queries with query plans for relational joins, SIAM Journal negated atoms, in: PODS, 2018, p. 209–224. doi:10. on Computing 42 (2013) 1737–1767. doi:10.1137/ 1145/3196959.3196982.

110859440. [17] N. Carmeli, N. Tziavelis, W. Gatterbauer, [3] H. Q. Ngo, E. Porat, C. Ré, A. Rudra, Worst-case B. Kimelfeld, M. Riedewald, Tractable orders for optimal join algorithms, J. ACM 65 (2018) 16. direct access to ranked answers of conjunctive doi:https://doi.org/10.1145/3180143. queries, TODS 48 (2023). doi:10.1145/3578517. [4] T. L. Veldhuizen, Triejoin: A simple, worst-case [18] N. Tziavelis, N. Carmeli, W. Gatterbauer, optimal join algorithm, in: ICDT, 2014, pp. 96–106. B. Kimelfeld, M. Riedewald, Eficient comdoi:10.5441/002/icdt.2014.13. putation of quantiles over joins, in: PODS, 2023, p. [5] G. Bagan, A. Durand, E. Grandjean, On acyclic 303–315. doi:10.1145/3584372.3588670. conjunctive queries and constant delay enumera- [19] S. Deep, X. Hu, P. Koutris, Enumeration Algorithms tion, in: International Workshop on Computer Sci- for Conjunctive Queries with Projection, in: ICDT, ence Logic (CSL), 2007, pp. 208–222. doi:10.1007/ volume 186, 2021, pp. 14:1–14:17. doi:10.4230/ 978-3-540-74915-8_18. LIPIcs.ICDT.2021.14. [6] N. Carmeli, S. Zeevi, C. Berkholz, A. Conte, [20] N. Tziavelis, W. Gatterbauer, M. Riedewald, B. Kimelfeld, N. Schweikardt, Answering (unions Toward responsive DBMS: Optimal join algoof) conjunctive queries using random access and rithms, enumeration, factorization, ranking, random-order enumeration, TODS 47 (2022). and dynamic programming, in: ICDE tutoridoi:10.1145/3531055. als, 2022. URL: https://northeastern-datalab. [7] R. Fagin, A. Lotem, M. Naor, Optimal aggregation github.io/responsive-dbms-tutorial/. algorithms for middleware, JCSS 66 (2003) 614–656. doi:10.1109/ICDE53745.2022.00299. doi:10.1016/S0022-0000(03)00026-6. [21] M. Abo Khamis, H. Q. Ngo, R. Pichler, D. Suciu, Y. R. [8] I. F. Ilyas, W. G. Aref, A. K. Elmagarmid, Wang, Convergence of datalog over (pre-) semirSupporting top- join queries in relational ings, in: PODS, 2022, p. 105–117. doi:10.1145/ databases, VLDB J. 13 (2004) 207–221. 3517804.3524140.

doi:10.1007/s00778-004-0128-2. [22] P. Koutris, S. Salihoglu, D. Suciu, Algorithmic [9] A. Natsev, Y.-C. Chang, J. R. Smith, C.-S. Li, J. S. aspects of parallel query processing, in: SIGVitter, Supporting incremental join queries on MOD, 2018, p. 1659–1664. doi:10.1145/3183713. 3197388.