<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Toulouse: Learning Join Order Optimization Policies for Rule-based Data Engines</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonios Karvelas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Foufoulas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alkis Simitsis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Ioannidis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Athena Research Center</institution>
          ,
          <addr-line>Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Athens</institution>
          ,
          <addr-line>Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In recent times, several research works have explored the idea of leveraging machine learning techniques to improve or even replace core components of traditional database architectures, such as the query optimizer and selectivity and cardinality cost estimators. These efforts often rely on existing, cost-based optimizers and cost models to avoid a cold-start, and build on top of the optimizer's decisions. In this paper, we investigate whether learning could also be beneficial in rule-based optimizers for known and unknown workloads alike. As a proof of concept, we use MonetDB, an open-source, column-store analytics data engine, and explore whether a learning model based on Graph Neural Networks that is trained on a cost-based engine, such as PostgreSQL, could improve the decisions of MonetDB's optimizer. Our experimental results reveal deficiencies in MonetDB's query execution plans, especially for queries with long chains of join operators, and potential opportunities in exploiting learning techniques.</p>
      </abstract>
      <kwd-group>
<kwd>Query processing</kwd>
        <kwd>Query optimization</kwd>
        <kwd>Join ordering</kwd>
        <kwd>Machine learning</kwd>
        <kwd>Graph neural networks</kwd>
        <kwd>Reinforcement learning</kwd>
        <kwd>Proximal policy optimization</kwd>
        <kwd>Cost-based optimization</kwd>
        <kwd>Rule-based optimization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Query optimization has been a long-standing challenge from the very early days of data management systems. Traditional optimization techniques to search the space of alternative query execution plans and find the most efficient one include various heuristic and cost-based approaches, rule-based strategies, randomized algorithms, and so on [1, 2].</p>
      <p>Cost-based optimizers depend on accurate cardinality and selectivity estimates, especially for intermediate results, to produce reasonable plans. In general, these techniques are successful given specific assumptions such as attribute value independence, uniformity, data independence, etc. When these assumptions cannot be met, the optimizers typically fall back to an educated guess. In practice, getting such accurate estimates is a significant challenge, and the challenge is more evident in queries having a long chain of join operators, where one should decide in which order the joins should be computed, forming the so-called join-ordering problem. On the other hand, rule-based techniques rely on a set of rules to produce query execution plans, usually with a single pass of the SQL statement. The rules are well-crafted based on experience, theory, and common practice, but they often miss optimization opportunities as they tend to overlook data properties.</p>
      <p>A recent line of research exploits the recent advances in machine learning (ML) technology and explores the potential of so-called learning query optimizers, which aim at learning the behavior of query operators and query patterns over time and tend to learn also from their previous decisions (i.e., execution plans). In the context of learning query optimization, earlier research efforts focus mainly on the problems of join ordering and end-to-end optimization. However, the vast majority of past work focuses on cost-based optimizers, relying on their choices and respective cost models to avoid a cold-start and also to train the proposed models.</p>
      <p>In this work, we present Toulouse, our optimization approach for rule-based data engines, and investigate the feasibility and potential benefit of applying learning techniques to a rule-based query optimizer. Our initial focus is on the join ordering optimization problem. Generally speaking, without a cost model and cost estimates assigned to query plans and query operators, the problem formulation should rely on optimization patterns for the query workloads at hand.</p>
      <p>As a proof of concept, we used MonetDB [3] in our investigation. A query in MonetDB undergoes an optimizer pipeline, which examines and applies a series of optimizer steps [4]. Our preliminary investigation showed that a featurization scheme based solely on the optimizer steps is not sufficient to produce useful learning patterns. Subsequently, we tried to leverage a model trained on a cost-based optimizer and obtained three interesting results. First, MonetDB's optimizer presents significant limitations when it comes to queries having long chains of join operators and thus, it seems that it does not handle the join ordering problem very effectively. Second, a model trained on a cost-based optimizer following a supervised approach behaves reasonably well when used in MonetDB, without requiring significant fine-tuning or massaging of MonetDB's execution runtime and optimizer pipeline. We find this result very interesting, as the alternative would be to develop a brand new rule-based optimizer step and add it to MonetDB's optimizer pipeline. Third, the approach can be further extended to support query and workload agnostic optimization using a more elaborate, reinforcement learning approach.</p>
      <p>To the best of our knowledge, our work is the first attempt to employ learning in rule-based database optimization. Our contributions can be summarized as follows:</p>
      <p>• We argue that training a model using a set of query optimization rules does not seem as effective as relying on a cost model.</p>
      <p>• We show that applying a model trained on a cost-based optimizer to a rule-based system does seem to be effective, especially if the underlying system is not optimized for join ordering.</p>
      <p>• We present a simple, supervised learning approach able to transfer effective schema and workload specific optimization policies to a rule-based system.</p>
      <p>• We present a more elaborate, reinforcement learning approach that transfers effective schema and workload agnostic optimization policies to a rule-based system.</p>
      <p>Outline. Section 2 presents related efforts on learning query optimization. Section 3 describes the main principles and components of Toulouse. Section 4 presents our experimental evaluation. Finally, Section 5 summarizes our findings.</p>
      <p>DataPlat'23: 2nd International Workshop on Data Platform Design, Management, and Optimization, March 28, 2023, Ioannina, Greece. sdi1600060@di.uoa.gr (A. Karvelas); johnfouf@di.uoa.gr (Y. Foufoulas); alkis@athenarc.gr (A. Simitsis); yannis@di.uoa.gr (Y. Ioannidis). © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>There are several approaches to learning query optimization [5]. These include system frameworks, e.g., SageDB [6], that aim at replacing core database components, such as data structures, indices, and query execution, with learned components.</p>
      <p>Other approaches focus on the join ordering problem. DQ [7] and ReJOIN [8] combine reinforcement learning (RL) with a human-engineered cost model to automatically learn search strategies to navigate the space of possible join orderings. RTOS [9] extends these by employing a Tree-LSTM to compute query cost. SkinnerDB [10] uses RL to learn optimal join orders during query execution and enables fast join order switching. AlphaJoin [11] uses Monte Carlo Tree Search (MCTS) for join ordering and an Adaptive Decision Network (ADN) to choose between plans produced by AlphaJoin (for long queries) and PostgreSQL (for short queries).</p>
      <p>There are also attempts at end-to-end learning optimizers. Neo [12] replaces most traditional optimizer components with ML models and deep neural networks. Bao [13] follows a schema and data agnostic approach to learning optimization. It runs a query with a predefined set of hints and keeps the plan ranked first by a tree convolutional neural network. Microlearner [14] divides a complex cloud workload into a hierarchy of subsets, using them to learn micro-models independently and in parallel.</p>
      <p>These methods focus on cost-based optimizers. Although several of these techniques could also be used in our work, we consider them complementary to our goal: to explore whether a learning optimization approach could be beneficial for rule-based systems.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The Toulouse approach</title>
      <p>In this section, we present our attempts towards learning optimization for rule-based query optimization. First, we report on our investigation toward building a learning model directly from query optimization rules. Next, we present a different direction: employ a simple, supervised approach to train a model on a cost-based system and apply it to a rule-based system. Finally, we present a more elaborate, reinforcement learning approach that serves reasonably well previously unseen (i.e., not included in the model training) query workloads.</p>
      <sec id="sec-3-1">
        <title>3.1. Learning optimization rules</title>
        <p>Our first attempt was to learn directly from query optimization rules. Interestingly, this attempt did not produce very positive results.</p>
        <p>Under the hood, MonetDB uses a low level intermediate language called MAL (MonetDB Assembly Language) that encodes the execution complexity of a SQL statement. Roughly speaking, a MAL program is equivalent to a query plan. Instead of using a single optimizer, MonetDB employs a collection of 20+ query optimizer transformers, each responsible for a single task, such as removing scalar expressions or aliases, avoiding multiple calculations of the same operator, range propagation, code parallelization, resource usage optimization, garbage collection, operator rewriting, caching aggregation results, dead code removal, etc. [15]. There is also an optimizer that deals with join paths, i.e., looking for join operators and cascading them into multiple join paths, and considers strategies such as materialization of common subpaths. Additional optimizers can be implemented and added as needed.</p>
        <p>Hence, query optimization in MonetDB is realized as a linear sequence of MAL transformations (optimizers) forming an optimizer pipeline. An optimizer may be called multiple times in the same optimizer pipeline. MonetDB employs a cost model that to date does not use any statistics on the actual data, but relies on the size of rows provided as an input to an optimizer. We found that this feature is not very mature yet and it does not seem to expose the costs to the runtime.</p>
        <p>Inspired by Bao [13], which gets training data by running a query with various hint sets, our first attempt was to explore alternative variations of optimizer pipelines and employ both the optimizers and various common optimizer pipeline patterns as features in our featurization scheme. This can be realized easily, as MonetDB offers the functionality to enable or disable optimizers via a 'set optimizer' command. On the other hand, there are a few constraints to consider. First, there are dependencies among the optimizers; e.g., static evaluation may happen only after constants propagation. In addition, there is a notion of the minimal optimizer pipeline, which specifies a number of absolutely necessary optimizers to run. These constraints limit significantly the degrees of freedom we can consider.</p>
        <p>Our modeling and design efforts, compared to the out-of-the-box MonetDB optimization strategies, revealed limited benefits in generic optimizations for simple SPJ queries including aggregation and sort operations, but the results on join ordering were rather disappointing, as the improvements seemed to be sporadic and random, which renders this path neither robust nor practical.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Employ a cost-based learning model</title>
        <p>We explored next the idea of developing a model based on a cost-based query optimizer and investigating whether this would be effective on a rule-based system. This could be seen as a type of a transfer learning approach. In our current implementation, we used PostgreSQL as the reference cost-based system and MonetDB as the target rule-based system. Hence, our approach could be summarized as 'develop (train) a model that learns how the PostgreSQL optimizer works (with its successes and mistakes) and employ it (inference) on MonetDB, bypassing its optimization choices'.</p>
        <p>Several available methods in the literature are bound to a specific schema and fail to generalize well and/or require expensive training. Still, previous approaches following a graph neural network (GNN) modeling (without this being the only good solution) provide promising results [5]. Inspired by this observation, Toulouse employs a GNN approach to collect structural information about all possible join decisions and learn how to pick the best one in each case, or at least try to approximate it and learn more about the problem through this process.</p>
        <p>Next, we present the two techniques developed in Toulouse, a supervised approach and a reinforcement learning approach. The first relies heavily on training data, and thus it is amenable to specific schema and query workloads. The latter aims at providing a schema and query agnostic solution.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3.2.1. Preliminaries: GNN to the rescue</title>
      <p>Graph Neural Networks (GNN) are a relatively new class of deep learning techniques that aim at solving the problem of having structural data of variable complexity, by aggregating data from immediate neighbors. GNNs have been effective for problems that can be described as graphs: entities that have variable size and structure and where the interactions between them are as important as their features. They can be best understood as message passing networks, where every node exchanges data with its directly connected nodes (see also Figure 1).</p>
      <p>Figure 1: Example message passing network</p>
      <p>In a graph convolutional network (GCN), each layer computes X′ = D̂^(−1/2) Â D̂^(−1/2) X Θ, where the normalized sum calculates the means of the neighbors' features, X is the result of the previous level, Â = A + I is the adjacency matrix with inserted self-loops, D̂ with D̂_ii = Σ_j Â_ij is the diagonal degree matrix, and Θ is the layer's trainable weights.</p>
      <p>GCNII is a continuation of GCN with the added features of initial residual connections and identity mapping, which help tremendously with the problem of oversmoothing (i.e., the more levels in a GNN, the less expressive the later layers are). To support this functionality, we change the GCN formulation as follows: X′ = ((1−α) D̂^(−1/2) Â D̂^(−1/2) X + α X^(0)) ((1−β) I + β Θ), where X^(0) represents the initial features that get added to each layer.</p>
      <sec id="sec-4-1">
        <title>3.2.2. Join optimization as an MDP</title>
        <p>(a) An example query graph. (b) Applying a join decision.</p>
        <p>We model a query as a query graph, where every edge e is an implicit join predicate (Figure 2a). At every step of the process, we choose an edge e = (v1, v2) and the two connected nodes v1 and v2. We remove those two nodes from the graph, add a new node v12, and reconnect it with all the nodes previously connected to either v1 or v2 (Figure 2b).</p>
      </sec>
    </sec>
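    <p>The join step described above (choose an edge, merge its two endpoints into one node, and rewire their neighbors) can be sketched as follows. This is a minimal illustration with plain dictionaries, not the Toulouse implementation; the function name contract_join is ours.</p>

```python
def contract_join(adj, v1, v2):
    """Apply one join decision on a query graph.

    adj maps each node to the set of its neighbors (an undirected
    graph whose edges are implicit join predicates). The chosen
    edge (v1, v2) is contracted: v1 and v2 are removed, a merged
    node "v1+v2" is added and reconnected to every node that was
    previously adjacent to either endpoint.
    """
    merged = v1 + "+" + v2
    neighbors = (adj[v1] | adj[v2]) - {v1, v2}
    # keep all other nodes, dropping references to v1 and v2
    new_adj = {n: (nbrs - {v1, v2}) for n, nbrs in adj.items()
               if n not in (v1, v2)}
    for n in neighbors:
        new_adj[n].add(merged)
    new_adj[merged] = set(neighbors)
    return new_adj

# A 3-table chain query graph A-B-C; joining A and B first
# leaves a single edge between "A+B" and "C".
g = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
g = contract_join(g, "A", "B")
```

    <p>Repeating this contraction until one node remains yields a complete join order, which is exactly the sequence of decisions the learned policy has to make.</p>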
    <sec id="sec-5">
      <title>-</title>
      <p>Similar to earlier efforts (e.g., [7]), we formulate the join-ordering optimization problem as a Markov Decision Process (MDP). Every such step has a cost, and as we describe the process as an MDP, we can assume that it follows Bellman's Principle of Optimality; thus, we can approximate a function, or else a policy, that selects the best action at each step, assuming that all subsequent steps are also chosen optimally.</p>
      <p>Therefore, the problem we deal with is to find a sequence of joins in the query graph that minimizes the total cost, until, after considering all joins needed, there is only one node left. Formally, in a query graph G, with a cost model C, we search for the sequence of joins j1 ∘ j2 ∘ … ∘ jn with the minimum total cost, min_{j1,…,jn} Σ_{i=1}^{n} C(ji).</p>
    </sec>
    <sec id="sec-6">
      <title>3.2.3. A supervised approach</title>
      <p>As a first approach to the problem, Toulouse employs a supervised approach using a simple GNN architecture. For each node, we consider operator (scan or join) types, cardinality, and cost estimates. The architecture uses two Multilayer Perceptron (MLP) encoding layers and a number of GNN layers matching the typical diameter (i.e., the length of the longest shortest path between any two graph nodes) in realistic query graphs, which is also a value that seems to balance nicely the trade-off between oversmoothing and performance boosting.</p>
    </sec>
    <sec id="sec-7">
      <title>3.2.4. A reinforcement learning approach</title>
      <p>Although our supervised approach provides a simple and fast solution, our investigation shows that it does not capture effectively the case of unknown query workloads, i.e., queries and schemas that have not been seen by the model before. Related efforts have dealt with the problem of cost estimation for unseen queries considering transfer-learning-inspired techniques, such as zero-shot learning (e.g., [17]). Although past work has considered zero-shot learning for learned cost prediction, we believe that the basic principles could also apply to the join-ordering problem. Towards this end, we investigate a more complex architecture that combines a Graph Convolutional Network (GCN) with Reinforcement Learning (RL), aiming at providing a more practical solution to the join-ordering problem and developing a model that could also cope with new query workloads.</p>
      <p>Featurization. As we aim at a query and schema agnostic approach, we choose to use as general features as possible. Hence, for each node, we consider:</p>
      <p>• cardinality of an operator (here, scan or join)</p>
      <p>• width; the estimated average width of the result (in bytes)</p>
      <p>• selectivity, for scan operators; for join operators, the average of the two subtrees (multiplying them is prohibited due to vanishing gradient issues)</p>
      <p>• total tuples; can be set to 0 for joins, or the averages (see selectivity above)</p>
      <p>• total pages</p>
      <p>• a boolean that is true for joins, false otherwise</p>
      <p>• a boolean capturing whether the node contributes in an aggregation</p>
      <p>We also consider a number of boolean features indicating whether the query contains specific predicate expressions, such as filterEQ, filterNEQ, filterLike, filterIs, filterIn, filterBetween, filterGTELTE, and so on.</p>
      <p>Finally, we complement our featurization scheme with features related to the edges of the graph, essentially capturing information about the join predicate they represent, and use statistics for the columns to be joined, such as:</p>
      <p>• index; whether an index exists for the column(s)</p>
      <p>• null_frac; the fraction of nulls in the column(s)</p>
      <p>• n_distinct; the number of distinct elements in the column(s)</p>
      <p>• correlation; a PostgreSQL metric showing the statistical correlation between physical row ordering and logical ordering of the column values, which is useful to determine whether, for example, an index scan can be efficient (similar info is captured in most modern databases)</p>
    </sec>
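    <p>The formal objective above (pick the join sequence j1 ∘ … ∘ jn that minimizes the summed cost) can be made concrete with a brute-force baseline. This is our own illustrative sketch with a toy, state-dependent cost function, not Toulouse's learned policy; it shows the quantity the policy approximates without enumeration.</p>

```python
from itertools import permutations

def best_join_sequence(joins, step_cost):
    """Brute-force the objective min over j1..jn of sum_i C(j_i):
    try every permutation of the required joins and keep the
    cheapest. step_cost(join, done) may depend on the joins already
    applied, mimicking the effect of intermediate-result sizes."""
    candidates = []
    for seq in permutations(joins):
        done, total = [], 0.0
        for j in seq:
            total += step_cost(j, done)
            done.append(j)
        candidates.append((total, seq))
    return min(candidates)

# Toy cost model (ours, for illustration): a join applied after a
# selective join is cheaper, so the ordering matters.
sizes = {("A", "B"): 1000.0, ("B", "C"): 10.0}
def step_cost(join, done):
    return sizes[join] / (10 ** len(done))

total, seq = best_join_sequence(list(sizes), step_cost)
# the selective (B, C) join is ordered first
```

    <p>Enumeration explodes factorially with the number of joins, which is precisely why a learned policy that picks one edge per step is attractive.</p>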
    <sec id="sec-14">
      <title>-</title>
      <p>Currently, Toulouse employs the PostgreSQL optimizer as a source of the cost/statistics estimates, but another approach could also be used, e.g., leveraging Bayesian Optimization to build performance models [18].</p>
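      <p>One way to obtain such per-node estimates from PostgreSQL is EXPLAIN (FORMAT JSON), which returns the optimizer's cardinality, width, and cost estimates without running the query. The sketch below only parses such a plan tree; the paper does not detail its extraction code, so the function name and the (commented) psycopg2 usage are our assumptions.</p>

```python
# With a live connection one could do, e.g. (assumption, not from
# the paper):
#   cur.execute("EXPLAIN (FORMAT JSON) " + sql)
#   plan = cur.fetchone()[0][0]["Plan"]

def plan_features(plan):
    """Flatten an EXPLAIN (FORMAT JSON) plan tree into per-node
    features: node type, estimated rows, estimated width (bytes),
    and total cost, i.e., the statistics Toulouse featurizes."""
    node = {
        "type": plan["Node Type"],
        "rows": plan["Plan Rows"],
        "width": plan["Plan Width"],
        "cost": plan["Total Cost"],
    }
    children = plan.get("Plans", [])
    return [node] + [f for c in children for f in plan_features(c)]

# A hand-written example in the shape PostgreSQL produces.
example = {
    "Node Type": "Hash Join", "Plan Rows": 42, "Plan Width": 16,
    "Total Cost": 35.5,
    "Plans": [
        {"Node Type": "Seq Scan", "Plan Rows": 100, "Plan Width": 8,
         "Total Cost": 11.0},
        {"Node Type": "Seq Scan", "Plan Rows": 50, "Plan Width": 8,
         "Total Cost": 10.5},
    ],
}
feats = plan_features(example)
```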
      <p>Architecture. In this approach, Toulouse employs a GCN architecture in combination with the popular Proximal Policy Optimization (PPO) algorithm. In particular, it uses multiple levels of GCNII, combines them, and passes them through two MLP heads (actor and value) that can be used for PPO. The value head represents the cost of the query so far and the actor head represents the probability of choosing a certain join at each step.</p>
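      <p>The actor/value split can be sketched in miniature. This is a deliberately framework-free illustration of how the actor head yields a categorical distribution over candidate joins, sampled during training and taken greedily at inference; the real model uses GCNII layers in PyTorch, and the names below are ours.</p>

```python
import math, random

def softmax(scores):
    """Turn actor-head scores into a categorical distribution
    over the candidate joins."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def pick_action(scores, training, rng=random):
    """During training, sample a join from the categorical
    distribution; during inference, take the argmax, as described
    for the actor head."""
    probs = softmax(scores)
    if training:
        return rng.choices(range(len(probs)), weights=probs)[0]
    return max(range(len(probs)), key=lambda i: probs[i])

# inference is deterministic: the highest-scoring join is chosen
assert pick_action([0.1, 2.0, -1.0], training=False) == 1
```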
      <p>For each step, we maintain two graphs, the one described above and its line graph. A line graph of an undirected graph G is another graph L(G) that is created as follows: for each edge in G, we make a vertex in L(G); for every two edges in G that have a vertex in common, we make an edge between their corresponding vertices in L(G) [19]. In other words, we convert every edge of G to a node in L(G) and every node of G to an edge in L(G). Therefore, we end up with every node carrying its own features (of the edge turned into a node) and the features of the corresponding adjacent nodes that in the line graph were turned into edges. We also concatenate global features, such as the number of nodes, the diameter of the graph, etc., that help enrich the graph representation. In addition, we calculate positional encodings with random walks [20] to encode positional information of graph nodes and make them more expressive. Nearby nodes have similar positional features and distant nodes have dissimilar positional features. Still, we keep the positional encoding separately from the features. Finally, we pass each node through the model shown in Figure 5.</p>
      <p>Figure 5: Network architecture of Toulouse RL approach</p>
      <p>The node features are passed through the layers of GCNII. We pool the sum of each separate layer and concatenate them at the end to compute the value head. The positional features are passed through separate GCNII layers and we keep only the last one, which we concatenate with the layer results (a.k.a. tensors). The layer results are used directly in the actor head and the pooled ones in the value head. The actor head ends up in a categorical distribution. The categorical distribution is used to sample join actions during training and to choose the best action with argmax during inference.</p>
      <p>Reward function. An important part of any reinforcement learning implementation is the reward function. For our model, we work as follows. For the training of our model we use a multiplicity of datasets (databases) and query workloads to achieve as broad and generic learning as possible. (We explain this in more detail in the next section.) Then, we choose random queries from all the available workloads (evenly distributed to avoid any bias toward a specific workload), create the graphs with the tables of all the schemata, add all implicit connections (join predicates), add also the join predicates that we get from equality propagation, and finally, we proceed to use the model for choosing the next best join. Hence, at each step we compute the reward as follows:</p>
      <p>cut_parameter = 10.0
for database, query in queries:
    base_cost = postgres_cost(query)
    while number_of_nodes &gt; 1:
        choose_join(query)
        cost = make_join(query)
        if cost &gt; base_cost * cut_parameter:
            reward = -10.0 * cut_parameter
            done = True
        else:
            reward = log(cost)
            reward = - (reward / reward_running_std)</p>
      <p>The cut_parameter is set at a starting value experimentally configured (e.g., 10-20) and then we slowly reduce it as the training progresses. This helps avoid bad plans diluting our training data: instead of continuing when we consider a join that costs much more than what the query optimizer (e.g., PostgreSQL's optimizer) can do by default, we penalize it appropriately using the cut_parameter and continue with the next query. Otherwise, we log the cost and use running statistics (Welford's algorithm) to normalize it. Another approach would be to penalize more the initial steps of a query, rather than the later ones; e.g., a non-performant join in the first or second steps is worse than a sub-optimal join toward the tail of the chain of join operators.</p>
      <p>It is worth mentioning two advantages we find in favor of the reinforcement learning approach. First, it shows a potential to alleviate a limiting data generation performance challenge we faced with the supervised approach; that is, queries with many joins (e.g., more than 10 joins) result in a huge number of potential join paths, which in turn increases significantly the training time. In addition, the reinforcement learning approach is amenable to online improvement, as we could fine-tune the produced model at runtime using real running times and potentially have a system that continuously learns as query workloads run. We consider this interesting optimization opportunity as future work.</p>
      <p>3.2.5. Application to the rule-based system</p>
      <p>To close the loop, the model that has been trained on a cost-based system is then used (inference) on a rule-based system. Given a new query to run on the rule-based system, Toulouse proposes a plan with a specific join ordering, which in turn executes on the rule-based system bypassing the system's optimizer.</p>
      <p>4. Evaluation</p>
      <p>4.1. Setup and implementation details</p>
      <p>In our experiments, we used the following setup.</p>
      <p>Hardware. Several rule-based systems are used either as server databases deployed on server machines or as embedded databases running on lightweight configurations. We investigated both scenarios using two setups: (S1) 20-core Intel(R) Xeon(R) CPU E5-2630 v4 2.20GHz (3.1GHz max), 142GB memory; and (S2) 4-core AMD Ryzen 5 2500U 2GHz (3.6GHz max), 8GB memory.</p>
      <p>Software. We used PyTorch (1.13.0) with PyTorch Geometric (2.1.0) for the GNNs, GATv2 for the graph layer (although we also successfully tried SageConv and GINConv), Python for SQL query rewriting, PostgreSQL (13.3), and MonetDB 11.41.11.</p>
      <p>Data and preparation. We tested two use case scenarios: (C1) training and inference using two workloads on the same schema, and (C2) training on a multiplicity of schemas and workloads, and inference on a new schema and a new workload. C1 resembles a common use case, which is typically seen in the experimental analysis of other learning optimization approaches. C2 represents a more challenging, but clearly more interesting case scenario.</p>
      <p>Use case C1. Data generation, training, and testing were performed on PostgreSQL on S1, which was tuned for performance. For the graph layer, 20 epochs were enough, with early stopping to the best approximate result. We used two workloads: (W1) the Join Order Benchmark (JOB) [16] queries and data, which is based on the IMDB dataset and comprises 113 queries containing a varying number of joins, ranging from joining 5 to 17 tables; and (W2) a workload we created comprising 1200 queries generated by randomizing various parameters and predicates of the JOB queries, effectively changing several properties such as operator selectivity, size of the intermediate results, etc. For the evaluation, we used W2 and did not consider any of the queries used in the training phase.</p>
      <p>Use case C2. We employed a variety of relational datasets and workloads as described in the DBGen benchmark [21] (available at https://github.com/DataManagementLab/zero-shot-cost-estimation). This collection comprises 21 datasets, including IMDB and IMDB_full, which are the base of the JOB benchmark. Initially, we trained the model on PostgreSQL as described in (C1) on all datasets except the two IMDB ones. Then, we tested the model on both PostgreSQL and MonetDB using the JOB workload (IMDB dataset). In this case, the JOB workload is unknown to the trained model –an experiment designed to test whether our approach could potentially serve as a schema and workload agnostic solution.</p>
      <p>Figure 6: Toulouse supervised: runtime perf and % improvement vs. PostgreSQL (psql) and MonetDB optimizers</p>
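      <p>The reward scheme sketched above (penalize a join exceeding cut_parameter times the reference cost, otherwise log the cost and normalize it by a running standard deviation via Welford's algorithm) can be written out as follows. Variable and class names are ours; this is an illustrative sketch, not the training code.</p>

```python
import math

class Welford:
    """Running mean/variance (Welford's algorithm), used to
    normalize the log-cost rewards online."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
    def add(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)
    def std(self):
        if self.n > 1:
            return math.sqrt(self.m2 / (self.n - 1))
        return 1.0  # no spread observed yet

def step_reward(cost, base_cost, stats, cut_parameter=10.0):
    """Reward for one join decision. A join costing more than
    cut_parameter times the reference (PostgreSQL) cost gets a
    large fixed penalty and ends the episode; otherwise the
    log-cost is normalized by the running std and negated, so
    cheaper joins earn higher reward."""
    if cost > base_cost * cut_parameter:
        return -10.0 * cut_parameter, True   # penalize and stop
    r = math.log(cost)
    stats.add(r)
    return -(r / stats.std()), False
```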
      <p>An illustration of the scenarios tested is as follows: Evaluation on MonetDB (mdb). Toulouse has
learnt eficient join paths from the psql optimizer,
workload database hardware so next, we investigated whether it could leverage
C1: supervised □ □ ■ □ ■ those on MonetDB and on two setups, S1 (same as
□C:2:saRmLe with training■ ■ : dife□ r■ent tha n□■ training psql) and S2 (new). Figures 6b and 6e show the avg
runtime performance with varying # and the
4.2. Model creation and evaluation % of improvement per query pattern, respectively,
for the S1 setup (Figures 6c and 6f show the same
Evaluating the supervised approach. First, we analysis on S2). Note that the more challenging
present the results of our experimental analysis for queries involve 7-10 tables. These run better on the
Toulouse’s supervised approach on the C1 scenario. S1 setup that has more memory and cpus. The less</p>
      <p>For training, we used 200,000 query graphs created by searching recursively in a subset of the JOB queries for valid join combinations, each of which creates a different query graph, hence forming scenarios that require join decisions. Every potential join has an approximate best query runtime –should the join happen– which could be used directly for regression. In our case, we set the join with the minimum runtime to 1 and everything else to 0 for binary classification. Next, we fed this training data to our GNN structure, using in our implementation the Adam optimizer with Binary Cross Entropy loss.</p>
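      <p>The labeling scheme and loss described above can be sketched as follows; the helper names are hypothetical (the actual Toulouse implementation feeds these targets to its GNN), but the logic matches the text: the candidate join with the lowest approximate runtime gets label 1, all others 0, and training minimizes binary cross entropy.</p>

```python
import math

def label_candidate_joins(candidate_runtimes):
    """Turn per-candidate best runtimes into binary targets:
    the fastest candidate join gets 1, all others get 0.
    (Candidates tied for the minimum would all be labeled 1.)"""
    best = min(candidate_runtimes)
    return [1 if t == best else 0 for t in candidate_runtimes]

def bce_loss(predicted_probs, labels):
    """Binary cross entropy over the model's per-join probabilities,
    the quantity optimized (e.g., with Adam) during training."""
    eps = 1e-12  # guard against log(0)
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(predicted_probs, labels)
    ) / len(labels)
```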
      <p>Evaluation on PostgreSQL (psql). We tested the queries with explicit join orders produced by Toulouse vs. queries optimized by the psql optimizer using the W2 workload. We ran each query 6 times and report the average results. Figure 6a shows the avg runtime performance of queries grouped by the # of tables they involve. Figure 6d shows the % of improvement per query pattern. The results show that although Toulouse has learnt the trend of the psql optimizer’s decisions, it still does not outperform psql. This is normal, as we did not try to beat psql.</p>
      <p>Evaluation on MonetDB (mdb). Toulouse has learnt efficient join paths from the psql optimizer, so next, we investigated whether it could leverage those on MonetDB and on two setups, S1 (same as psql) and S2 (new). Figures 6b and 6e show the avg runtime performance with varying # of tables and the % of improvement per query pattern, respectively, for the S1 setup (Figures 6c and 6f show the same analysis on S2). Note that the more challenging queries involve 7-10 tables. These run better on the S1 setup, which has more memory and cpus. The less resource-demanding queries involving 4-7 tables run better on the S2 setup, whose resources suffice for these queries, as it has more modern and faster cpus. The plans specified by Toulouse (with a few exceptions) improve the mdb performance by an average 42% on S1 and 36% on S2. Hence, join path knowledge seems to be effectively transferable. Looking deeper into the mdb plans, we realized that the mdb optimizer does not make optimal decisions for complex queries with many joins. This is an interesting finding towards future research in query optimization: an ML-based technique, trained on a different engine and setup, can effectively complement a traditional optimizer. The alternative, improving the optimizer codebase, would potentially be orders of magnitude more challenging to implement.</p>
      <p>The supervised approach, however, does not perform equally well on the use case C2 (results not shown here), which also adds a different workload to the equation. This motivated us to examine a more complex architecture, with a richer feature set, and a different design to cope with the challenge of optimizing a previously unseen workload.</p>
      <p>Figure 7: Toulouse RL: runtime perf and % improvement vs. PostgreSQL (psql) and MonetDB optimizers</p>
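      <p>Issuing an explicit join order against PostgreSQL, as in the evaluation above, can rely on a standard psql mechanism: with join_collapse_limit set to 1, the planner keeps explicit JOIN clauses in the order they are written. A minimal sketch (hypothetical helper, not the actual Toulouse code) rendering a left-deep chain chosen by a model:</p>

```python
def explicit_join_sql(select_list, base_table, ordered_joins):
    """Render a query whose join order PostgreSQL will keep as written.
    ordered_joins: (table, join_predicate) pairs in the order chosen
    by the learned model, forming a left-deep join chain."""
    sql = f"SELECT {select_list} FROM {base_table}"
    for table, predicate in ordered_joins:
        sql += f" JOIN {table} ON {predicate}"
    # join_collapse_limit = 1 stops psql from reordering explicit JOINs
    return f"SET join_collapse_limit = 1; {sql};"
```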
      <p>Evaluating the reinforcement learning approach. Motivated by the positive results of re-using a cost-based learning model on a rule-based engine, next we increased the challenge to work with an unknown schema and workload. Hence, our second line of experiments focuses on the effectiveness of our reinforcement learning approach. As this approach performs similarly to the supervised one on the C1 scenario, we omit the presentation of results for this case and focus on the more complex, C2 scenario.</p>
      <p>For starters, we trained the model on psql using a similar process as before. In more detail, we altered the query generation code of DBGen [21] to output implicit joins and used 18 different workloads from different database schemas, leaving out the two IMDB schemas completely during training and only performing inference on these two schemas. This way, the learning happens in a schema-agnostic manner, which allows us to test whether it could be transferable to different databases. In each epoch, we collect 6400 steps and train in batches of 64 for 3 mini-epochs.</p>
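      <p>The per-epoch update schedule just described (collect 6400 steps, then train in minibatches of 64 for 3 mini-epochs) is the familiar PPO-style pattern; a generic sketch of that schedule, not the exact Toulouse training loop, is:</p>

```python
import random

def minibatch_schedule(steps, batch_size=64, mini_epochs=3, seed=0):
    """Split the steps collected in one epoch into shuffled minibatches,
    re-visiting the whole buffer once per mini-epoch. Each returned
    batch would drive one gradient update."""
    rng = random.Random(seed)
    batches = []
    for _ in range(mini_epochs):
        order = list(range(len(steps)))
        rng.shuffle(order)  # fresh shuffle per mini-epoch
        for start in range(0, len(order), batch_size):
            batches.append([steps[i] for i in order[start:start + batch_size]])
    return batches

# 6400 collected steps => 100 minibatches per mini-epoch, 300 updates total
```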
      <p>Evaluation on PostgreSQL (psql). We tested the queries produced by Toulouse vs. queries optimized by the psql optimizer using the 113 queries of the JOB benchmark. We ran each query 6 times and report the average results. Figure 7a shows the avg runtime performance of queries grouped by the # of tables they involve. Figure 7d shows the % of improvement per query id. The results show that Toulouse manages to follow the general trend of the psql optimizer’s decisions. Observe that, for some queries, Toulouse RL performs worse than the supervised approach (compare Figures 7d and 6d). This is explained by the fact that in this scenario Toulouse deals with queries and a schema that it has never seen before. Still, it seems to be able to apply effective optimization strategies that it has learnt from the different workloads used in the training phase.</p>
      <p>Evaluation on MonetDB. Next, we evaluate how Toulouse could cope with an unknown workload on two different hardware setups, S1 (same as psql) and S2 (new). Figures 7b and 7e show the avg runtime performance with varying # of tables and the % of improvement per query id, respectively, for the S1 setup (Figures 7c and 7f show the same analysis on S2). Notably, Toulouse manages to optimize almost all queries by a significant fraction despite the fact that the inference is performed with a new workload, on a new database, and on a new hardware setup (S2). Figure 8 shows the average improvement in MonetDB, on both hardware setups, for a varying number of tables (i.e., # of tables) per query. On average, Toulouse improves query performance in MonetDB by 45.68% on S1 (server machine) and 24.4% on S2 (lower end machine).</p>
      <p>Discussion. We measured the robustness of Toulouse’s plans. The standard deviation of query performance is rather low for the entire workload, with the slight deviations effectively caused by typical runtime variability. Looking at the plans, we verified that Toulouse consistently produced the same join paths for each query, having all the other parameters of the experiment unchanged.</p>
      <p>Caveats. Although PostgreSQL is the predominant system used in the majority of related work, both for model training and as a baseline, clearly there are more advanced optimizers in other database systems. We believe that using a more advanced optimizer to train our models would only make our approach perform better, but we leave this as future work. The JOB benchmark (i.e., IMDB schema) has been used extensively in the literature as an analytics workload comprising queries with varying complexity of join chains. We find this a useful first step for our analysis; still, we plan to investigate more real-world workloads in the continuation of our work. Finally, our results reported here are based on our experimentation with MonetDB, an excellent open source analytics database system that comes with one of the most popular and successful rule-based optimizers. However, we plan to extend our investigation to other rule-based engines as well, in an effort to generalize our findings to rule-based systems beyond MonetDB.</p>
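      <p>The robustness measurement described in the discussion above (averaging the 6 executions of each query and checking their spread) can be sketched with a hypothetical helper as:</p>

```python
from statistics import mean, pstdev

def runtime_stats(runs_seconds):
    """Per-query robustness: average runtime over the repeated
    executions (each query was run 6 times) and its population
    standard deviation; a small deviation indicates stable plans."""
    return mean(runs_seconds), pstdev(runs_seconds)
```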
      <p>Figure 8: Toulouse RL: % improvement in MonetDB running on two hw setups S1 (server) and S2 (low end)</p>
      <p>5. Conclusions</p>
      <p>We presented an investigation on whether learning could improve query optimization, and in particular the join ordering, in rule-based systems. Our initial attempt to use rules and optimization strategies as features was not very successful. Still, as we show in this paper, applying a model trained on a cost-based optimizer to a rule-based system for known and unknown workloads alike shows potential and deserves additional research. Finally, our work shows that ML techniques could be a reasonable supplement to potential shortcomings of traditional optimizers without requiring changing their codebase.</p>
      <p>Acknowledgements. This work has been partially supported by EU’s Horizon 2020 projects INODE and HBP-SGA3 with grant agreement numbers 863410 and 945539, respectively.</p>
      <p>References</p>
      <p>[1] L. M. Haas, J. C. Freytag, G. M. Lohman, H. Pirahesh, Extensible query processing in Starburst, in: SIGMOD, 1989.</p>
      <p>[2] G. Graefe, W. J. McKenna, The Volcano optimizer generator: Extensibility and efficient search, in: ICDE, 1993.</p>
      <p>[3] S. Idreos, F. Groffen, N. Nes, S. Manegold, K. S. Mullender, M. L. Kersten, MonetDB: Two decades of research in column-oriented database architectures, IEEE Data Eng. Bull. 35 (2012) 40–45.</p>
      <p>[4] MonetDB Docs, Optimizer pipelines, 2022. https://www.monetdb.org/documentation-Sep2022/admin-guide/performance-tips/optimizer-pipelines/.</p>
      <p>[5] D. Tsesmelis, A. Simitsis, Database optimizers in the era of learning, in: ICDE, 2022.</p>
      <p>[6] T. Kraska, M. Alizadeh, A. Beutel, E. H. Chi, A. Kristo, G. Leclerc, S. Madden, H. Mao, V. Nathan, SageDB: A learned database system, in: CIDR, 2019.</p>
      <p>[7] S. Krishnan, Z. Yang, K. Goldberg, J. M. Hellerstein, I. Stoica, Learning to optimize join queries with deep reinforcement learning, CoRR abs/1808.03196, 2018.</p>
      <p>[8] R. Marcus, O. Papaemmanouil, Deep reinforcement learning for join order enumeration, in: aiDM@SIGMOD, 2018.</p>
      <p>[9] X. Yu, G. Li, C. Chai, N. Tang, Reinforcement learning with Tree-LSTM for join order selection, in: ICDE, 2020.</p>
      <p>[10] I. Trummer, J. Wang, D. Maram, S. Moseley, S. Jo, J. Antonakakis, SkinnerDB: Regret-bounded query evaluation via reinforcement learning, in: SIGMOD, 2019.</p>
      <p>[11] J. Zhang, AlphaJoin: Join order selection à la AlphaGo, in: PVLDB-PhD, 2020.</p>
      <p>[12] R. C. Marcus, P. Negi, H. Mao, C. Zhang, M. Alizadeh, T. Kraska, O. Papaemmanouil, N. Tatbul, Neo: A learned query optimizer, in: PVLDB, 2019.</p>
      <p>[13] R. Marcus, P. Negi, H. Mao, N. Tatbul, M. Alizadeh, T. Kraska, Bao: Making learned query optimization practical, in: SIGMOD, 2021.</p>
      <p>[14] A. Jindal, S. Qiao, R. Sen, H. Patel, Microlearner: A fine-grained learning optimizer for big data workloads at Microsoft, in: ICDE, 2021.</p>
      <p>[15] MonetDB Docs, MAL optimizers, 2022. https://www.monetdb.org/documentation-Sep2022/dev-guide/monetdb-internals/mal-optimizers/.</p>
      <p>[16] V. Leis, A. Gubichev, A. Mirchev, P. A. Boncz, A. Kemper, T. Neumann, How good are query optimizers, really?, Proc. VLDB Endow. 9 (2015) 204–215.</p>
      <p>[17] B. Hilprecht, C. Binnig, One model to rule them all: Towards zero-shot learning for databases, in: CIDR, 2022.</p>
      <p>[18] O. Alipourfard, H. H. Liu, J. Chen, S. Venkataraman, M. Yu, M. Zhang, CherryPick: Adaptively unearthing the best cloud configurations for big data analytics, in: USENIX NSDI, 2017.</p>
      <p>[19] F. Harary, R. Z. Norman, Some properties of line digraphs, Rendiconti del Circolo Matematico di Palermo 9 (1960) 161–168.</p>
      <p>[20] V. P. Dwivedi, A. T. Luu, T. Laurent, Y. Bengio, X. Bresson, Graph neural networks with learnable structural and positional representations, in: ICLR, OpenReview.net, 2022.</p>
      <p>[21] B. Hilprecht, C. Binnig, Zero-shot cost models for out-of-the-box learned cost prediction, Proc. VLDB Endow. 15 (2022) 2361–2374.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>