<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title/>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Steering the PostgreSQL query optimizer using hinting: State-Of-The-Art and open challenges</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jerome Thiessat</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dirk Habich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wolfgang Lehner</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
<aff id="aff0">
          <label>0</label>
          <institution>Dresden Database Research Group, TU Dresden</institution>, <country country="DE">Germany</country>
          <email>Jerome.Thiessat@tu-dresden.de</email>
          <email>Dirk.Habich@tu-dresden.de</email>
          <email>Wolfgang.Lehner@tu-dresden.de</email>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>In the field of query optimization, a tremendous amount of research has been delivered into fixing and adjusting existing optimizer solutions. Moreover, recent work has also shown that substantial performance gain can be achieved by setting query optimizer instructions appropriately. However, the sets of beneficial instructions may vary greatly for each query. In this contribution, we provide a state-of-the-art review on optimizer hinting and present some experimental evaluations showing hints should not be neglected during evaluation based on preliminary assumptions.</p>
      </abstract>
      <kwd-group>
        <kwd>Optimizer Configuration</kwd>
        <kwd>Query Optimization</kwd>
        <kwd>Search Space Traversal</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
        <p>In the vast field of database management system (DBMS) design, query optimization has proven to be one of the backbone components of query performance. Modern query optimizers consist of the following three components [<xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>]: (i) plan enumerator, (ii) cost model, and (iii) cardinality estimator. These components are hierarchically dependent: the plan enumerator relies on the calculations of the cost model, which in turn relies on the cardinality estimator's results. Since traditional optimizers have been shown to produce error-prone estimates [<xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>], current research establishes various approaches for refining plan enumerators [<xref ref-type="bibr" rid="ref6">6, 7</xref>], cost models [8, 9], and cardinality estimation [10, 11, 12]. These approaches all aim at adding, correcting, or substituting existing optimizer components to obtain faster query executions.</p>
      <p>One of the long-established [13], yet only recently popular approaches [<xref ref-type="bibr" rid="ref5">14, 15, 5</xref>] has been the targeted steering of optimizer behavior through configuration options called hints. These hints can generally be categorized in the class of enumerator approaches. However, instead of substituting or correcting an existing enumerator, hints restrict the search space of enumerators, instructing the planner to ignore certain plans. Nevertheless, hints cannot only be used for restricting plan enumeration: recent research has shown that hinting can even be used to indirectly learn optimizer cost models [14, 15].</p>
      <p>Contribution: Our core contributions in this paper are (i) a review of the state-of-the-art research in the field of optimizer hinting and (ii) experimental evaluations of current challenges that need to be tackled to further advance this research field. We do so at the example of the highly popular open-source DBMS PostgreSQL (PSQL).</p>
      <p>Outline: In Section 2, we explain query hinting at the example of PSQL. We go into detail about the query optimizer hints present in PSQL and explain which hints can be observed, how they behave, and how the system's hinting has evolved over the latest versions. Section 3 depicts the state-of-the-art systems for utilizing these hints to harness hidden optimizer potential. We provide insights into leveraging hinting beyond the presented state-of-the-art in Section 4, where we obtain valuable information regarding the full potential of hinting in query optimization based on the popular Join-Order-Benchmark (JOB) [<xref ref-type="bibr" rid="ref3">3</xref>]. Lastly, we open up the discussion in Sections 5 and 6, where we discuss possible highly valuable research directions, hinting in general, and our reflections.</p>
      <p>2. Query Hinting in PostgreSQL</p>
      <p>Modifying the PSQL optimizer can be done mainly by setting Boolean configuration parameters called query planner method configurations<sup>1</sup>. We restrict our contribution to these Boolean hints as they allow us to steer the optimizer plan enumeration phase rather than interfering with the underlying complex cost model. In their nature, these configurations are mostly instructions to force the optimizer to restrict itself to certain operations while traversing a query's plan space. However, since not all of these configurations can be forcefully switched off<sup>2</sup>, the term hint has established itself [<xref ref-type="bibr" rid="ref5">14, 5</xref>]. Additionally, hint set refers to multiple hints, and hinting to the act of instructing the PSQL query optimizer with a hint set.</p>
      <p>Notably, in PSQL, hints are global, which implies that they act upon the optimization of a whole query. This means that, e.g., disabling the hint "Nested Loop Join" will disable the usage of this operator (if possible) for the entire query plan optimization phase.</p>
      <p>Even the oldest still supported version (i.e., version 12) of PSQL has a plethora of these hinting options which may be highly valuable to investigate. A rigorous set of Boolean hints, beginning from PSQL version 12 up to the latest version at the time of this writing (i.e., version 16), can be found in Figure 1. We observe that the base amount of hints is already plentiful. While the core hints<sup>3</sup> of PSQL consist of six traditional hints, namely index, index-only, and sequential scan, as well as hash, merge, and nested loop join, there are multiple other options to be considered.</p>
      <p>[Figure 1: PSQL hint development starting from version 12, grouping the hints into scan, join, partitioning, parallel execution, aggregation, and miscellaneous categories across versions 12 to 16; hints that can only be discouraged rather than fully disabled are marked.]</p>
      <p>Bitmap scans, for example, scan multiple tuple pointers, which are then sorted by their physical location to allow ordered access over all locations. Especially combining multiple bitmaps is easy, as multiple bitmap scans can be combined through simple logical operator combination. TID scans provide fast access with the knowledge of the tuple ids (i.e., physical locations) of a row. Notably, tuple ids provide no identifying behavior like primary keys, as their value may change during the lifespan of their respective database table.</p>
      <p>Additional non-core hints include hints for parallel execution, aggregation, and general miscellaneous settings. Within parallel execution, gather merge indicates the use of the (gather) merge node within a plan that allows child nodes of a query plan, or in fact the whole plan, to be executed in parallel. While the gather node solely merges results, the gather merge node indicates that subnodes output sorted tuples which are then merged in a sort-preserving manner. The parallel append works similarly to the regular append when combining rows from multiple sources. However, instead of processing every child node subsequently in parallel fashion, processes are spread across multiple child nodes to process them simultaneously, rather than allowing fully parallel execution of a single child node. For parallel hash, instead of building a hash table for each process, a common hash table is shared among processes. The last parallel hint consists of async append. This hint allows parallel append plans on foreign tables, supporting sharded sources.</p>
      <p>The aggregation hints consist of hash and presorted aggregation. Hash aggregation allows the use of hash functions for splitting, aggregating, and merging a target column. Presorted aggregation allows the query planner to construct plans that produce rows that are already sorted for further aggregation.</p>
      <p>Within the miscellaneous section are five hints. Materialization enables caching of intermediate results in memory. Sort naturally allows the usage of sort operations within a plan. Geqo decides whether to use the genetic optimizer of PSQL; this optimizer is by default switched on if the number of joins in a query exceeds twelve. Incremental sort enables the usage of an optimized multi-key sorting method that takes advantage of already sorted columns. Memoize encourages the use of nested loop joins, as memoize caches parameterized scans for nested loop joins only.</p>
      <p>Additionally, there are three options for usage in a partitioned table environment. However, as these partitioning hints require the building of partitioned tables, they remain mostly unused. While the ability to prune partitioned tables remains on, as for most hints, the partition-wise hints are off by default, as partitioning requires additional attention and is not part of the default behavior of PSQL.</p>
      <p>Moreover, there are also some hints that cannot be completely switched off, to ensure that a plan can always be created. Switching these hints off highly discourages their use in the planning phase. These hints are marked in red in Figure 1. Moreover, the use of parallel hashing naturally requires hash joins to be enabled.</p>
      <p>Now, we are able to further understand the possible implications of the presented PSQL hints. In the following, we dive deeper into the state-of-the-art that utilizes the presented hints to gain advantages during the optimization of queries.</p>
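      <p>In practice, such a hint set is applied by flipping the corresponding configuration parameters before a query runs. The fragment below is a minimal sketch: the enable_* parameters are PostgreSQL's actual planner method configurations, while the query and table names are merely illustrative.</p>

```sql
BEGIN;
-- Scope the hint set to this transaction only.
SET LOCAL enable_nestloop  = off;  -- discourage nested loop joins
SET LOCAL enable_seqscan   = off;  -- discourage sequential scans
SET LOCAL enable_mergejoin = off;  -- discourage merge joins
-- Inspect the plan the restricted enumerator now produces.
EXPLAIN SELECT * FROM title t JOIN movie_info mi ON t.id = mi.movie_id;
COMMIT;  -- SET LOCAL settings revert at transaction end
</p>
```

      <p>Note that parameters such as enable_seqscan only discourage an operator: the planner still falls back to it when no alternative plan exists, matching the hints marked as not completely switchable above.</p>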
      <sec id="sec-1-1">
        <title>Footnotes</title>
        <p>2: as this may hinder the optimizer to construct a query plan at all. 3: https://www.postgresql.org/docs/16/planner-optimizer.html, accessed: March 4th, 2024.</p>
      </sec>
      <sec id="sec-1-2">
        <title>3. Query Optimization Using Optimizer Hints</title>
        <p>While modifying the core components of an optimizer through different algorithms and machine learning models is currently investigated with great interest [<xref ref-type="bibr" rid="ref6">8, 9, 11, 10, 12, 6, 7</xref>], the steering of optimization using optimizer hints remains a topic of deeper research. However, there are still some major contributors in this field.</p>
        <p>BAO. The first approach is the bandit optimizer (BAO) [14]. BAO is a machine learning model that uses hints to indirectly learn the optimizer's cost model. BAO builds upon a commonly used machine learning technique derived from its successful use in learned query optimization, namely reinforcement learning. At its core, BAO uses a fixed number of hint sets, which are used to obtain different PSQL EXPLAIN plans during optimization. Each of the resulting plans is used as input for a Tree Convolutional Neural Network (TCNN). The TCNN of BAO is built similarly to a regular convolutional neural network, with an additional initial tree flattening layer to transform the input tree. The output of the TCNN is an estimated query cost value that is used to determine which of the input explain plans to use. This allows BAO to implicitly use the hints through PSQL's explain functionality. Lastly, Thompson sampling is used to enable the reinforcement learning character of BAO; it balances exploration and exploitation by guiding the data upon which BAO is trained.</p>
        <p>[Figure 2: Evaluation of FASTgres and BAO on the real-world workload Stack-Overflow [<xref ref-type="bibr" rid="ref5">5</xref>]; speedup factor over the PG default plotted against the training split in percent.]</p>
        <p>However, there are some considerations to account for when using BAO. First, BAO uses only the six main hints of PSQL, which is merely a subset of the already presented hints. Even inside these six hints (i.e., 2<sup>6</sup> = 64 hint set combinations), only a subset of five hint sets is feasible enough that optimization time does not start to exceed the actual runtime. Moreover, BAO uses an indirect approach that necessitates evaluating multiple query plans before being able to decide which one to choose. Naturally, doing so does not scale well with increasing numbers of hint sets. Lastly, BAO's input plans rely on PSQL EXPLAIN plans, which have been shown to produce notoriously error-prone estimates [4, 5].</p>
        <p>Autosteer. As a successor to BAO, the approach of Autosteer [15] was developed. Just like BAO, Autosteer is a learned cost model that uses query hinting. This approach tackles the existing challenge of selecting hint sets, which was solved by manually applying expert knowledge in BAO. Their hint set traversal algorithm consists of iteratively combining efficient query spans [16]. For each combination, they determine the query's runtime and decide whether to keep the combination or not, based on the default query execution time. While this algorithm improves on BAO's fixed hint sets, there are still some challenges.</p>
        <p>First, a rather large assumption is made by postulating that not all hint sets are beneficial. While this may hold true for certain workloads, DBMSs, and specific versions, the benefits of hints cannot be deemed static. Additionally, a key assumption is the prior knowledge of the query span (i.e., the efficient hints) of a query. Such knowledge is usually obtained only through rigorous analysis of query workloads or even individual queries, depending on their versatility. As it is infeasible to have such information a priori, and knowledge from previous queries does not necessarily reflect future ones, we argue that determining query spans efficiently for unseen queries is a challenging and under-investigated task.</p>
        <p>FASTgres. Lastly, there is another approach, named FASTgres [<xref ref-type="bibr" rid="ref5">5</xref>], that breaks with the commonly used reinforcement learning approaches for learned optimizers and optimizer steering. FASTgres settles directly at the plan enumerator by restricting its search space through hinting predictions. FASTgres uses a supervised classification approach to predict the most beneficial hint set directly from incoming queries. Here, queries are predetermined into different query classes called contexts. In FASTgres, contexts can be defined in multiple granularities but are commonly distinguished by their table join groups. Within each context, a supervised gradient boosting model is used to predict a query's best hint set. This approach naturally differs from previous work by using a divide-and-conquer approach to provide locally well-performing models that are robust against data and workload drifts. This robustness is achieved not only by using local models but also by providing an experience-based query anomaly detection that retrains context models once abnormally performing queries are received. Additionally, since a supervised approach is applied, labeled data is required. Such labeled query data is obtained by using a grid search strategy with aggressive timeouts determined by the currently best query-hint-set combination.</p>
        <p>Generally, FASTgres has been shown to provide superior results with the usage of supervised learning for hint set prediction, as observable in Figure 2.</p>
        <p>[Figure 3: Positive impact of hint deactivation, JOB (113 queries); y-axis: occurrence count, x-axis: the individual PSQL hints.]</p>
        <p>While the performance of supervised classification for hint set prediction shows promising results, there are also challenges that need to be overcome. Just like BAO, FASTgres only utilizes six hints. While they traverse the whole search space rather than only a few combinations of hints, there is still the huge challenge of scalability.</p>
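      <p>The exploration-exploitation loop that BAO's Thompson sampling performs over hint sets can be sketched without the TCNN. The stand-in below samples expected runtimes from a Gaussian per hint set instead of a learned cost model; the arm names and runtimes are invented for illustration and are not BAO's actual hint sets.</p>

```python
import random

# Five hint sets over the six core hints, mirroring BAO's fixed-arm setup.
# Names and runtimes are illustrative, not BAO's actual arms.
HINT_SETS = ("default", "no_nestloop", "no_seqscan", "no_mergejoin", "no_hashjoin")

class ThompsonHintPicker:
    """Gaussian Thompson sampling over observed runtimes (lower is better)."""

    def __init__(self, arms, seed=0):
        self.rng = random.Random(seed)
        self.obs = {arm: [] for arm in arms}

    def pick(self):
        def sampled_runtime(arm):
            runs = self.obs[arm]
            if not runs:                      # unexplored arms are tried first
                return float("-inf")
            mean = sum(runs) / len(runs)
            spread = 1.0 / len(runs)          # uncertainty shrinks with evidence
            return self.rng.gauss(mean, spread)
        return min(self.obs, key=sampled_runtime)

    def observe(self, arm, runtime_s):
        self.obs[arm].append(runtime_s)

picker = ThompsonHintPicker(HINT_SETS, seed=1)
counts = {arm: 0 for arm in HINT_SETS}
for _ in range(300):
    arm = picker.pick()
    counts[arm] += 1
    # Stand-in for executing the query under `arm` and timing it:
    picker.observe(arm, {"default": 2.0, "no_nestloop": 1.0}.get(arm, 3.0))
```

      <p>After a short exploration phase, the picker concentrates on the fastest arm while still occasionally probing the others, which is the behavior that steers the data BAO trains on.</p>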
        <p>Even with their provided aggressive timeout strategy, search spaces that grow exponentially have to be evaluated fully. If we observe the Boolean hint sets from PSQL in Table 1, the hint set search space would constitute 2<sup>22</sup> ≈ 4.2 · 10<sup>6</sup> combinations, which is infeasible to label with FASTgres' strategy. Having such scalability restrictions naturally leaves open the question of how much potential is neglected when labeling queries.</p>
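      <p>The scalability gap can be made concrete with quick arithmetic: every Boolean hint doubles the search space, so the six core hints span 64 hint sets while the 22 Boolean hints of version 16 span roughly 4.2 million.</p>

```python
# Every Boolean hint doubles the number of possible hint sets.
core_hints = 6
all_hints = 22                  # Boolean planner hints in PSQL v16 (cf. Figure 1)

core_space = 2 ** core_hints    # 64 combinations, the space BAO/FASTgres work in
full_space = 2 ** all_hints     # 4194304, i.e. roughly 4.2 * 10**6

print(core_space, full_space)   # 64 4194304
```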
        <p>To depict open challenges emerging from the state-of-the-art work on hinted optimizer steering, we now investigate the potential of hinting at the example of PSQL.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Optimization Potential of Query Hinting</title>
      <sec id="sec-3-1">
        <p>Optimizer hinting is a volatile task in its nature. There are many different influencing aspects that need to be considered, such that they become intangible rapidly. For example, when considering the decision of whether to use a hash join in a query or not, the influencing factors can depend on the collected statistics, available memory for building hash tables, possible parallel execution, the system's hardware<sup>4</sup>, the system's build version<sup>5</sup>, and possibly many more. As subsets of these influencing factors are subject to constant change, the proper determination of operator choice for a single query becomes a tedious task that is infeasible to fulfill properly during runtime. With these factors in mind, it is only natural to assume that, by default, every hint set can have an influence on a query. With this general assumption, we now investigate the PSQL query optimizer in further detail. For the following experiments, we evaluated - unless otherwise stated - on a PSQL Docker image that has been properly configured for its hardware (i.e., using PG-Tune<sup>6</sup>) and using an ANALYZE step to build initial database statistics. Additionally, each query-hint-set combination has been run once before measuring, ensuring properly equal pre-warming. The used hardware consists of an Intel Xeon Gold 6216 CPU (Skylake architecture) with 12 cores, 92 GiB of main memory, and 1.8 TiB of HDD storage.</p>
        <p>As we observed in Figure 1, there are 22 hints in the latest version of PSQL (i.e., v16) at the time of writing. Since analyzing these hints can become increasingly time consuming, rather than taking a large workload like Stack-Overflow [14] with 6191 queries, we focus on JOB, which contains real-world data from IMDB and a query workload of 113 analytical queries. Since JOB does not provide partitioned tables by default, the partition hints for pruning, join, and aggregation are neglected in the further analysis.</p>
        <p>In our first experiment, we evaluate whether the option of turning off a hint has a positive impact on the overall runtime of a query or not. For each JOB query, the influence of each hint was measured. The results can be observed in Figure 3 for PSQL version 16.2. Along the x-axis, every hint is displayed. The y-axis shows how many times in the whole workload a positive performance impact was noted by allowing a hint to be switched off from its default setting. Notably, every hint has at least 20 cases out of 113 in which turning it off is beneficial. In detail, allowing nested loop joins to be switched off accounts for 26 positive impacts (i.e., 23%), while allowing bitmap scans to be switched off even accounts for 88 positive impacts (i.e., 78%). This implies that every hint on its own may have useful cases in which its impact cannot be neglected.</p>
        <p>While these insights show that considering every hint is a sensible choice, the exact impact of these hints remains ambiguous. Results of the average speedup gained from these positive impacts can be taken from Figure 4. Here, the y-axis displays the average speedup factor that is obtained throughout the whole workload within the positively impacting hints. Notably, the option to turn off sequential scanning now provides average speedups of factor six, while bitmap scans, with the highest previous occurrence, only provide speedup factors of just above one.</p>
      </sec>
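      <p>The per-hint measurement loop behind Figure 3 can be sketched as follows. The runner below is a stub standing in for actually timing a query against PSQL under a given set of disabled hints; the queries, runtimes, and the single hint name are illustrative.</p>

```python
def positive_impacts(queries, hints, run_query):
    """Count, per hint, the queries that speed up when only that hint
    is switched off; run_query returns a runtime in seconds."""
    counts = {hint: 0 for hint in hints}
    for query in queries:
        default_runtime = run_query(query, disabled=frozenset())
        for hint in hints:
            runtime = run_query(query, disabled=frozenset({hint}))
            if runtime < default_runtime:
                counts[hint] += 1
    return counts

# Stub runtimes: disabling enable_nestloop helps q1 but hurts q2.
TIMINGS = {
    ("q1", frozenset()): 2.0,
    ("q1", frozenset({"enable_nestloop"})): 1.0,
    ("q2", frozenset()): 1.0,
    ("q2", frozenset({"enable_nestloop"})): 5.0,
}

counts = positive_impacts(["q1", "q2"], ["enable_nestloop"],
                          lambda q, disabled: TIMINGS[(q, disabled)])
print(counts)   # {'enable_nestloop': 1}
```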
      <sec id="sec-3-2">
        <p>Footnote 4: i.e., memory, SSD, HDD availability. Footnote 5: i.e., their efficiency in implementing the operator and underlying cost factors. Footnote 6: https://pgtune.leopard.in.ua/, accessed April 25th, 2024.</p>
        <p>[Figure 4: Average speedup per positive occurrence, JOB (113 queries). Figure 5: Average gain per positive occurrence, JOB (113 queries).]</p>
        <sec id="sec-3-2-1">
          <p>Along the y-axis of Figure 5, we display the time savings that can be achieved from allowing a hint to be switched off, at logarithmic scale. Again, contrary to the indications from Figures 3 and 4, we notice that fundamental time savings lie within the nested loop join, index scan, sequential scan, and merge join, which are part of the six core hints. However, we also find fundamental potential in the parallel hashing, Geqo, and memoize options.</p>
          <p>Overall, we observe that within the JOB workload, all hints may vary in occurrence, speedup, and absolute time gain without indicating rules that can be followed. Our empirical studies on other PSQL versions show that even between versions, there are no rules to be delineated that could lead to inferences for future queries.</p>
          <p>When focusing on single queries in the workload, we
obtain a more detailed view about the query variety. We
show an excerpt in Figure 6 due to space restrictions,
while noting that the varying behavior can be observed
throughout the whole workload.</p>
          <p>We notice a variety of different behaviors. Query 1 from Figure 6a contains five joins and four filters, of which two are wildcard filters. The three major hints are sequential scan, nested loop join, and hash join. Having potential speedup factors of over twelve indicates that the default PSQL optimizer had issues correctly determining scan and join operators, favoring sequential scans over index and index-only scans, which indicates too low selectivity estimation for scan operations. Additionally, the potential gain through hash joins indicates that the default optimizer might have estimated too high selectivity join results, leading to the choice of hash joins over nested loop or merge joins. On the other hand, disabling nested loop joins results in catastrophic deterioration of execution time, reinforcing the thought that nested loop joins are the operators that should be predominantly used in this query. Query 8 contains five joins and two filter predicates. Figure 6b displays a scenario where the default optimization is already surprisingly efficient. However, even then, small improvements are still possible on multiple hints, with disabling merge joins and sorting as the predominant ones. Query 32 contains six joins with one filter predicate. Figure 6c shows a result that is quite common in our analysis: there are a lot of hints which carry speedup potential, while some hints may lead to catastrophic execution times with respect to the default evaluation.</p>
          <p>With these experiments, we could show at the example of PSQL and JOB that there is no panacea for finding the best hints of a query or even workload prior to evaluation. This leads us to the conclusion that query hinting is a complex topic that necessitates further investigation to properly infer the usefulness of each query-hint-set combination.</p>
        </sec>
        <sec id="sec-3-2-2">
          <p>These results imply that the occurrence of positive hints does not need to correlate with proper speedups across the workload.</p>
          <p>Moreover, while the speedup factor is a valuable indicator of how much factor-wise gain we can obtain by switching individual hints on or off, another valuable insight can be achieved by looking at the absolute time impact of these hints. Results portraying the absolute gain can be observed in Figure 5.</p>
          <p>[Figure 6: Single hint influence on (a) query 1a, (b) query 8c, and (c) query 32a; x-axes list the individual PSQL hints.]</p>
          <p>5. Future Research</p>
          <p>The potential for future research regarding optimizer hinting is manifold. As we have shown, every hint has to be considered when trying to optimize queries. However, within each query, we have seen that some hints have less influence than others. For example, an algorithmic solution with efficient hint pruning methods and possibly early stopping mechanisms that can be run without prior knowledge of efficient hints may prove promising. By radically reducing the search space, this might allow easier scalability when traversing query-hint-set combinations. Such an algorithm could evaluate each hint set on its own, either from a subtractive (i.e., step-wise disabling) or additive (i.e., step-wise enabling) starting point. Either solution would provide insight into the one-ring neighborhood (i.e., the neighboring hint sets that only differ by one hint being switched off or on) of changing hints from a default situation in which either all hints are switched on or off. Additionally, further steps could be decided based on these initial results. Such further steps can include the use of stopping criteria based on previous results, the current one-ring neighborhood results, previously observed queries, and possibly many more.</p>
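          <p>The one-ring neighborhood described above is cheap to enumerate: flipping exactly one of the 22 hints yields exactly 22 neighbors, regardless of whether the starting point is subtractive (all hints on) or additive (all hints off). A minimal sketch with stand-in hint names:</p>

```python
def one_ring(hint_set, all_hints):
    """All hint sets that differ from hint_set by exactly one hint."""
    neighbors = []
    for hint in all_hints:
        flipped = set(hint_set)
        if hint in flipped:
            flipped.discard(hint)   # subtractive step: switch this hint off
        else:
            flipped.add(hint)       # additive step: switch this hint on
        neighbors.append(frozenset(flipped))
    return neighbors

ALL = [f"hint_{i}" for i in range(22)]   # stand-ins for the 22 Boolean hints
ring = one_ring(frozenset(ALL), ALL)     # subtractive starting point: all on
print(len(ring))   # 22 neighbors, each with 21 hints still enabled
```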
          <p>Lastly, even though current learned models provide
reasonable speedup gains, the development of a tailored
featurization for query hinting seems lucrative, as
currently, only featurization options that are tailored for
cardinality estimation are in use. Potentially, such a
dedicated featurization method may require smaller input
vectors and generalize better, depending on the chosen
representation.</p>
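          <p>As an illustration of what such a dedicated featurization could look like, the sketch below one-hot encodes only plan-space-relevant query properties (joined tables and predicate kinds). The vocabularies and feature choice are entirely hypothetical and are not taken from any of the discussed systems.</p>

```python
def featurize(query_tables, predicate_kinds, table_vocab, predicate_vocab):
    """One-hot encode the tables a query joins and its predicate kinds."""
    table_bits = [1 if t in query_tables else 0 for t in table_vocab]
    pred_bits = [1 if p in predicate_kinds else 0 for p in predicate_vocab]
    return table_bits + pred_bits

TABLES = ["title", "movie_info", "cast_info"]   # illustrative JOB tables
PREDICATES = ["equality", "range", "wildcard"]

vec = featurize({"title", "movie_info"}, {"wildcard"}, TABLES, PREDICATES)
print(vec)   # [1, 1, 0, 0, 0, 1]
```

          <p>Such a representation is far smaller than a cardinality-oriented encoding, since it only has to separate queries whose beneficial hint sets differ.</p>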
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <sec id="sec-4-1">
        <p>Lastly, we summarize our findings in this contribution. We showed that there is already valuable research in the field of query optimizer hinting that focuses on a small set of hints that is deemed feasible. Additionally, we could show at the example of JOB and PSQL that hinting is a manifold challenge that cannot be heuristically reduced to a mere subset of hints but must be considered as a whole to be able to extract all possible gains. Moreover, we showed that promising hints cannot be simply predetermined without extensive analysis and that the impact of hint sets varies between different queries and also across different PSQL versions.</p>
        <p>Even though [15] provides an algorithmic solution for their hint set traversal problem, there are aspects that still do not suffice for a scalable traversal solution. As shown in our evaluations, the usage of a predetermined set of hints for initial evaluation, namely query spans, is not derivable beforehand and, thus, impractical. We could show that the research field for hint set traversal offers plenty of optimization potential for future work.</p>
        <p>References (continued):
[6] R. Marcus, O. Papaemmanouil, Deep reinforcement learning for join order enumeration, in: aiDM@SIGMOD, 2018, pp. 3:1–3:4.
[7] S. Krishnan, Z. Yang, K. Goldberg, J. M. Hellerstein, I. Stoica, Learning to optimize join queries with deep reinforcement learning, CoRR abs/1808.03196 (2018).
[8] R. Marcus, P. Negi, H. Mao, C. Zhang, M. Alizadeh, T. Kraska, O. Papaemmanouil, N. Tatbul, Neo: A learned query optimizer, Proc. VLDB Endow. 12 (2019) 1705–1718.
[9] Z. Yang, W. Chiang, S. Luan, G. Mittal, M. Luo, I. Stoica, Balsa: Learning a query optimizer without expert demonstrations, in: SIGMOD, 2022, pp. 931–944.
[10] M. Stillger, G. M. Lohman, V. Markl, M. Kandil, LEO - DB2's learning optimizer, in: VLDB, 2001, pp. 19–28.
[11] A. Kipf, T. Kipf, B. Radke, V. Leis, P. A. Boncz, A. Kemper, Learned cardinalities: Estimating correlated joins with deep learning, in: CIDR, 2019.
[12] L. Woltmann, C. Hartmann, M. Thiele, D. Habich, W. Lehner, Cardinality estimation with local deep learning models, in: aiDM@SIGMOD, 2019, pp. 5:1–5:8.
[13] D. K. Burleson, Oracle high-performance SQL tuning, McGraw-Hill, Inc., 2001.
[14] R. Marcus, P. Negi, H. Mao, N. Tatbul, M. Alizadeh, T. Kraska, Bao: Making learned query optimization practical, SIGMOD Rec. 51 (2022) 6–13.
[15] C. Anneser, N. Tatbul, D. E. Cohen, Z. Xu, P. Pandian, N. Laptev, R. Marcus, Autosteer: Learned query optimization for any SQL database, Proc. VLDB Endow. 16 (2023) 3515–3527.
[16] P. Negi, M. Interlandi, R. Marcus, M. Alizadeh, T. Kraska, M. T. Friedman, A. Jindal, Steering query optimizers: A practical take on big data workloads, in: SIGMOD, 2021, pp. 2557–2569.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Selinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Astrahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Chamberlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Lorie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Price</surname>
          </string-name>
          ,
          <article-title>Access path selection in a relational database management system</article-title>
          ,
          <source>in: SIGMOD, ACM</source>
          ,
          <year>1979</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Perron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Shang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kraska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stonebraker</surname>
          </string-name>
          ,
          <article-title>How I learned to stop worrying and love reoptimization</article-title>
          ,
          <source>in: ICDE</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1758</fpage>
          -
          <lpage>1761</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>V.</given-names>
            <surname>Leis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gubichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mirchev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kemper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <article-title>How good are query optimizers, really?</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>9</volume>
          (
          <year>2015</year>
          )
          <fpage>204</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Leis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Radke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gubichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mirchev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Boncz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kemper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <article-title>Query optimization through the looking glass, and what we found running the join order benchmark</article-title>
          ,
          <source>VLDB J</source>
          .
          <volume>27</volume>
          (
          <year>2018</year>
          )
          <fpage>643</fpage>
          -
          <lpage>668</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Woltmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Thiessat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hartmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Habich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lehner</surname>
          </string-name>
          ,
          <article-title>FASTgres: Making learned query optimizer hinting effective</article-title>
          ,
          <source>Proc. VLDB Endow</source>
          .
          <volume>16</volume>
          (
          <year>2023</year>
          )
          <fpage>3310</fpage>
          -
          <lpage>3322</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Marcus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Papaemmanouil</surname>
          </string-name>
          ,
          <article-title>Deep reinforcement learning for join order enumeration</article-title>
          ,
          <source>in: aiDM@SIGMOD</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>