A Bi-objective Optimization Framework for Query Plans Piotr Przymus1 , Krzysztof Kaczmarski2 , and Krzysztof Stencel3 1 Nicolaus Copernicus University, Poland, eror@mat.umk.pl 2 Warsaw University of Technology, Poland, k.kaczmarski@mini.pw.edu.pl 3 The University of Warsaw, Poland, stencel@mimuw.edu.pl Abstract. Graphics Processing Units (GPU) have significantly more applications than just rendering images. They are also used in general- purpose computing to solve problems that can benefit from massive par- allel processing. However, there are tasks that either hardly suit GPU or fit GPU only partially. The latter class is the focus of this paper. We elaborate on hybrid CPU/GPU computation and build optimisation methods that seek the equilibrium between these two computation plat- forms. The method is based on heuristic search for bi-objective Pareto optimal execution plans in presence of multiple concurrent queries. The underlying model mimics the commodity market where devices are pro- ducers and queries are consumers. The value of resources of computing devices is controlled by supply-and-demand laws. Our model of the opti- mization criteria allows finding solutions of problems not yet addressed in heterogeneous query processing. Furthermore, it also offers lower time complexity and higher accuracy than other methods. 1 Introduction General-Purpose computing on Graphics Processing Units (GPGPU) involves utilization of graphics processing units (GPU) in tasks traditionally handled by central processing units (CPU). GPUs offer a notable processing power for streams. Execution of database queries is an example of a successful application of GPGPU. The current research focuses on using the GPU as a co-processor [5]. GPU as co-processor may accelerate numerous database computations, e.g. rela- tional query processing, query optimization, database compression or supporting time series databases [5,13,14]. An application of GPU requires transferring data from the CPU memory to the graphical device memory. The data transfer is usually time-consuming. It may diminish the gain of the acceleration credited to GPU. This situation can be improved by using lightweight compression methods that can significantly reduce the costs associated with communication [13,14]. However, this does not solve all the problems. In particular, GPU is optimized for numerical computation. Thus, only selected operations will benefit from GPU. Small data sets are another problem. For such sets the data transfer may dominate processing time and A Bi-objective Optimization Framework for Query Plans 343 destroy the performance gain. Therefore, joint processing capabilities of both CPU and GPU are worth considering. Furthermore, as it is common to have more than one GPU in a computer, a potential use of various GPU devices should be considered. This type of query plans is called heterogeneous. The previous research efforts focused on the creation of query plans based on a cost model. This approach finds plans with the best throughput. However, it does not allow modelling all phenomena that can occur in heterogeneous sys- tems. Performing a query as soon as possible is not always cost effective [8]. For this reason, we propose a query processing model based on concepts of markets that are known to be suitable for describing the interactions in a heterogeneous world. They have already gained a considerable interest in the context of task processing in heterogeneous systems [6]. In market models, manufacturers (pro- cessing devices) compete with each other for customers (query plans). Similar competition occurs among customers. In this paper, we propose a query optimization model based on the commod- ity market. A query plan is bi-objectively optimized to minimize: the processing time and the value of consumed resources. For the user, a small difference in execution time can be negligible. Thus, it is worth optimizing a query, so that the execution time satisfies the user while other costs are minimized. In this case, the cost may be, e.g. the responsiveness of the system, power consumption, heat production, etc. One can also consider expressing the cost in financial terms. 2 Preliminaries 2.1 GPU and Heterogeneous query processing From the parallel processing’s point of view, CPU accompanied by a GPU co- processor is a shared nothing architecture. A GPU card has its own memory or a separate area in the CPU main memory. Thus, the data has to be explicitly transferred from the CPU main memory to the GPU main memory. Similarly, the results produced by GPU have to be transferred back to the CPU main memory. This data transfer often introduces significant overhead. Thus, it is im- portant to include the transfer cost in the total execution time of an operation. This cost is also a component of the execution time prediction. Contemporary computer systems often include more than one GPU. Then, it is possible to combine multiple computational units in a single query plan. Such plans are called heterogeneous query processing. Each device may have a different communication cost (e.g. PCIe or shared memory) with the CPU main memory. Furthermore, devices can often communicate directly between each other. Therefore, the main problem of heterogeneous query processing is the construction of such a query plan that uses only computational units from which query performance will benefit most and yet will minimize used resources. Bress et. al. [5] identified problems of hybrid (CPU/GPU) query processing which are also true in heterogeneous query processing: 344 P. Przymus, K. Kaczmarski, K. Stencel Problem 1 Execution Time Prediction - as multiple database operations may be executed concurrently it is hard to predict influence of concurrent tasks on execution times. Problem 2 Critical Query - since the GPU memory, the concurrent GPU ker- nels execution and the PCIe bus bandwidth are all limited, only the critical queries should be selected to use GPU (i.e., queries that benefit from GPU usage and are important from global perspective). Problem 3 Optimization Impact - as concurrent heterogeneous queries will in- fluence each other, it is important to consider this aspect in the planning process. 2.2 Commodity market approach in query processing context In this paper we address these problems by showing that they may be solved by applying a supply-and-demand pricing model taken from a commodity market. In such a market resource owners (processing devices) price their assets and charge their customers (queries) for consumed resources. Other pricing models may also be used [6]. In the supply-and-demand model when supply (available resources) or de- mand (needed resources) changes, the prices will be changed until an equilibrium between supply and demand is found. Typically the value of a resource is influ- enced by: its strength, physical cost, service overhead, demand and preferences [6]. A consumer may be charged for various resources like CPU cycles, memory used, the bus usage or the network usage. Typically, a broker mediates between the resource owners and the consumer. The resource owners announce their val- uation and the resource quality information (e.g. estimated time) in response to the broker’s enquiry. Then, the broker selects resources that meet the con- sumer utility function and objectives, like cost and estimated time constraints or minimization of one of the objectives. 2.3 Bi-objective optimization Bi-objective optimization is a problem where optimal decisions need to be taken in the presence of trade-offs between two conflicting objectives. It is a special case of multiple criteria decision making. Typically there are no solutions that meets all objectives. Thus, a definition of an optimum solution set should be established. In this paper we use the predominant Pareto optimality [11]. Given a set of choices and a way of valuing them, the Pareto set consists of choices that are Pareto efficient. A set of choices is said to be Pareto efficient if we cannot find a reallocation of those choices such that the value of a single choice is improved without worsening values of others choices. As bi-objective query optimization is NP-hard, we need an approximate solution [12]. 3 Heterogeneous Query Planer The main aim of the new query planner is to propose a solution to the prob- lems listed in Section 2.1, i.e., Execution Time Prediction, Critical Query and A Bi-objective Optimization Framework for Query Plans 345 Optimization Impact. Furthermore, this planner also addresses heterogeneous GPU cards and distributed processing. In this paper we propose a method to build heterogeneous query plans based on the economics of commodity markets. It is characterized by the fact that the resource producers determine the cost of their resources and resource consumers jostle for resources. Furthermore, the resources owners provide information on the quality of their resources, i.e., the estimated processing time. 3.1 Notation Table 1 contains a summary of the notation used in this paper. Assume a set of units U , a logical query sequence QSlog and a dataset D. The goal is to build a heterogeneous query sequence. Let QShet be a heterogeneous query sequence de- fined in Equation (1). Let Dk+1 be the data returned by an operation Aouki (Dk ). k The first row of QShet is created by replacing each operation oi ∈ QSlog with an algorithm Aouij ∈ APoi . The second row is created by inserting an operation Muk ,uk0 (D) that copies the output of an algorithm on the unit u to the input of the current algorithm on the unit u0 . Symbol Description U = {u1 , u2 , . . . un } set of computational units available to process data D dataset Muk ,uk0 (D) if uk 6= uk0 move D from uk to uk0 else pass oi ∈ O database operation oi from set of operations O Aouik algorithm that computes the operation oi on uk APoi = {Aoui1 , Aoui2 , . . . , Aouin } algorithm pool for the operation oi trun (Aouij , D) estimated run time of the algorithm Aouij on the data D tcopy (Mui ,uj , D)) estimated copy time of the data D from ui to uj crun (Aouij , D) estimated run cost of the algorithm Aouij on the data D ccopy (Mui ,uj , D) estimated copy cost of the data D from ui to uj QSlog = o1 o2 . . . on logical query sequence QShet heterogeneous query sequence see Eq. 1 ft , g t estimated algorithm run time and copy time see Sec. 3.2 fc , g c estimated algorithm run cost and copy cost see Sec. 3.3 fb , g b estimated algorithm run and copy bi-objective scalar- ization see Sec. 3.4 Fx (QShet ) sum of fx and gx over columns of QShet where x ∈ {t, c, b} see Eq. 2 Table 1: Symbols used in the definition of our optimisation model Aou1i1 (D1 ), Aou2i2 (D2 ), Aou3i3 (D3 ), . . . , Aounin (Dn )   QShet = (1) Mu ,ui1 (D1 ), Mui1 ,ui2 (D2 ), Mui2 ,ui3 (D3 ), . . . , Muin ,u∗ (Dn ) ∗ 346 P. Przymus, K. Kaczmarski, K. Stencel X X Fx (QShet ) = fx (Aou , D) + gx (Mu0 ,u00 , D) (2) Ao u (D) Mu0 ,u00 (D) 3.2 Single Objective Heterogeneous Query Planer Procedure OptimalSeq(QSlog , u∗ , x) Input: QSlog = o1 o2 o3 . . . on - logical query sequence, u∗ - base unit, x ∈ { t - time, c - cost, b - bioptimization } - optimization type Result: QShybrid 1 seq_list = []; 2 for u in U do 3 Qu = Su (QSlog , u∗ ); 4 QFu = Fx (Qu ) ; /* e.g. Ft (Qu ) */ 5 append (u, Qu , QFu ) to Seq_list; 6 end 7 QShybrid = pop minimum Qu (by QFu ) sequence from Seq_list; 8 for (u, Qu , QFu ) in Seq_list do 9 A, B, C = DiffSeq (QShybrid , Qu , u, x); 10 val, start, end = M axSubseq(A, B, C); 11 if val > 0 then 12 Qhybrid (start : end) = Qu (start : end) ; /* subarray subsitution */ 13 end 14 end 15 return Qbase In this section, we introduce the algorithm that searches for a heterogeneous query plan, i.e., a plan that operates on more than two devices. For simplicity let us assume that x = t, ft (Aou , Di ) = trun (Aou , Di ) and gt (Mu,u0 , D) = tcopy (Mu,u0 , D). Later in this article we will define functions fc , gc and fb , gb to fit the model of the commodity market and the bi-objective optimization. Let Su (QSlog , u∗ ) return such QShet that each operation oi ∈ QSlog is replaced with an algorithm from unit u algorithm pool, i.e., Aoui ∈ APoi and the base device is set to u∗ . Note that there is a specially designated computing unit u∗ from which the processing starts. It also collects the data in the end of processing, since the GPU computing is controlled by a CPU side program. The algorithm OptimalSeq starts by creating a query sequence for each com- puting unit and estimating the processing cost for each item of this sequences (lines 2-6). Next, one sequence (which minimizes Fx (Qu )) is selected as the base sequence. It will be improved in later steps (line 7). Then, the algorithm iterates over remaining query sequences in QShybrid in order to find such segments in the remaining query sequences which improve original sequence (by replacing A Bi-objective Optimization Framework for Query Plans 347 Procedure DiffSeq(seqbase , sequ , u, x) Input: seqbase - base sequence, sequ - unit query sequence, u - sequ unit, x ∈ { t - time, c - cost, b - bioptimization } - optimization type Result: A - operations improvement array; B, C - copy to/from unit arrays 1 A, B, C = [], [], []; 2 for i in enumerate columns seqbase do 3 Aoub (Di ), Muf ,ut (Di ) = seqbase [i]; 4 Aou (Di ), Mu,u (Di ) = sequ [i]; 5 append fx (Aoub , Di ) − fx (Aou , Di ) to A ; /* e.g. ft (Aou , D) */ 6 append gx (Muf ,u , Di ) to B ; /* e.g. gt (Mu,u0 , D) */ 7 append gx (Mu,ut , Di ) to C; 8 end 9 return A, B, C corresponding segment of the original sequence). This is done by calculating the improvement and copy cost arrays in DiffSeq and finding maximal sequence segment in MaxSubseq. A following variant of the proposed algorithm should also be considered. Suppose that only one query sequence segment may be in- serted (i.e., choose one sequence segment from remaining k − 1 sequences with the biggest), this minimizes number of involved computational units and reduces overall communication costs. The procedure DiffSeq simply calculates element wise difference between two query sequences fx (Aoub , Di ) − fx (Aou , Di ) and copy costs from/to unit. The pro- cedure MaxSubseq is based on Kadane’s algorithm for maximum subarray prob- lem [1]. It scans through the improvement array, computing at each position the maximum subsequence ending at this position. This subsequence is either empty or consists of one more element than the maximum subsequence ending at the previous position. Additionally, the copy to and copy from costs are included in the calculation of the maximum subsequence (B and C arrays). The algorithm returns the maximum improvement for a subsequence (which may be zero if the subsequence does not improve the original query), the start and end items of subsequence. The complexity of OptimalSeq is O(k ∗ n) where k is the number of devices (usually small) and n is the number of operations of the sequence. Su ,Fx , DiffSeq and MaxSubseq have the complexity O(n). 3.3 Economics in Heterogeneous Environment To cope with the problems mentioned in Section 2.1, additional criteria are necessary – in this work an approach based on a simple economic model is proposed. Each consumer (client) has a query budget that can be used to pay for the resources used to process queries. Each computational unit is a service provider (producer) of services available in units algorithm pool APui . Each service provider establishes its own pricing for execution of any service from APui . Pricing of the service depends on: 348 P. Przymus, K. Kaczmarski, K. Stencel Procedure MaxSubseq(A, B, C) Input: A - operations improvement array; B, C - copy to/from unit arrays Result: maximum_improvment, start, end 1 max_ending_here = max_so_far = 0 ; 2 begin = tbegin = end = 0 ; 3 for i, x in enumerate(A) do 4 max_ending_here = max(0, max_ending_here + x) ; 5 if max_ending_here = 0 then 6 tbegin = i ; 7 max_ending_here -= B[i]; 8 end 9 if max_ending_here - C[i] >= max_so_far then 10 begin = tbegin ; 11 end = i ; 12 max_so_far = max_ending_here ; 13 end 14 end 15 return max_so_far - C[end], begin, end – the estimation of needed resources (the size of the data D, the performance of the task Aouii ), – pricing of needed resources (the load of device ui – the greater the load on the device, the higher cost of using the device), – the preference of the device (e.g. device may prefer larger jobs and/or tasks that give a greater acceleration on the GPU). First, pricing for using the resources of computational unit is established. This depends on the previous load of the device: the higher demand for computational unit, the higher price for using it. This is a periodic process which calculates prices every ∆tup seconds by calculating computational unit price Pu . Let 0 < Lcurr < 1, 0 < Lprev < 1 be current and previous computational unit load factors. Additionally, let Lth be a threshold below which prices should decrease, and Pmin be the minimal price. Then the price is calculated using the following formula4 : ( ∆Pu max(Pmin , Pu · (1 + (1−∆U ) ) if (∆P > 0 ∧ ∆U > 0) ∨ (∆P < 0), Pu := (3) Pu otherwise, where ∆Pu = Lcurrent − Lthreshold and ∆Uu = Lcurrent − LP revious . This is similar to the dynamic pricing model proposed in [16] with exception to the ∆Pu pricing formula i.e., we use max(Pmin , Pu ·(1+ (1−∆U ) ) instead of max(Pmin , Pu · (1 + ∆Pu )), this modification reduces the excessive growth of prices. To reflect the preference of the device in price we need to define a function re- turning speedup factor between base device u∗ (defined in the previous section) 4 Slightly abusing notation we will also denote the new price by Pu . A Bi-objective Optimization Framework for Query Plans 349 and current device: speedup(Aou , Di ) = trun (Aou∗ , Di )/trun (Aou , Di ). Then we de- #Di #Di fine a cost function as crun (Aou , Di ) = speedup(A o ,D ) · Pu , where speedup(Ao ,D ) i i u u part combines the estimation of needed resources and the preference of the de- vice. A computational unit with high speedup on given operation will get a discount per data size when pricing this operation. Similarly, operations with a lower speedup factor will be charged more per quantity. Additionally it is ob- served [13] that often speedup depends on the size of processed data (usually low speed-up on small datasets) so discount depends on data size. It is also important to include cost of data transfer, let us define it as ( 0 if u,u’ share memory ccopy (Mu,u0 , D) := #Di bandwidth(#Di ,u,u0 ) · (Pu + Pu )/2 otherwise 0 where bandwidth returns estimated bytes per second between u and u0 compu- tational units. If direct data transfer is not available between u and u0 devices, then transit device will be used (e.q. two GPU cards without direct memory access will communicate using CPU RAM). Now let fc (Aou , Di ) = crun (Aou , Di ) and gc (Mu,u0 , D) = ccopy (Mu,u0 , D). A solution minimizing the cost may be found under the previous assumptions and using procedure OptimalSeq. 3.4 Bi-objective Heterogeneous Query Planer As finding Pareto optimal bi-objective query plan is NP-hard (bi-objective short- est path problem) [12], we will use previously described OptimalSeq single ob- jective approximation algorithm and extend it to bi-objective case. We will use a priori articulation of preference approach which is often applied to multi-objective optimization problems. It may be realized as the scalarization of objectives, i.e., all objective functions are combined to form a single function. In this work we will use weighted product method, where weights express user preference [11]. Let us define: fb (Aou , D) = crun (Aou , D)wc · trun (Aou , D)wt , gb (Mu,u0 , D) = ccopy (Mu,u0 , D)wc · tcopy (Mu,u0 , D)wt . where wt and wc are weights which reflect how important cost and time is (the bigger the weight the more important the feature – values of fb , gb are higher than 1). It is worth to mention that a special case with wt = wc = 1 (i.e., without any preferences) is equivalent to Nash arbitration method (or objective product method) [11]. 4 Preliminary Experimental Results 4.1 Simulation settings In order to evaluate this model we prepared a proof of concept and evaluated it us- ing custom developed simulation environment. Simulation environment was de- 350 P. Przymus, K. Kaczmarski, K. Stencel Device o1 o2 o3 o4 o5 o6 Option CPU1 CPU2 GPU1 GPU2 GPU1 20 11 6 0.38 14 15 Unit threshold 0.75 0.75 0.4 0.4 GPU2 5 11 6 0.33 4.66 5 Unit minimal price 5 5 70 70 CPU2 1 1 1.09 1 1.27 1.36 (b) Pricing model configuration (a) Average speedup of operation oi on given device compared to CPU1 Table 2: Simulation environment configuration veloped using Python and SimPy framework. All presented experiments are de- rived from simulation. There where four devices defined in environment: CPU1, CPU2, GPU1, GPU2. Data transfer bandwidth between CPU* ↔ GPU* was measured on real system, bandwidth of GPU1 ↔ GPU2 was calculated using CPU1 as transit device. Following weights in bi-objective scalarization were used wt = wc = 1 (i.e., without any preferences setting). Other settings of the sim- ulation environment are gathered in Tables 2b and 2a. Simulation environment generates new query sequences, when spawn event occurs. Spawn event is gener- ated randomly in a fixed interval and generates randomly set of query sequences (with fixed maximum). Every query sequence consists of maximally six oper- ations and operates on random data volume. In the simulation the processed data size has a direct (linear) influence on processing speed. Each device has got a limited number of resources; a database operation can be performed only if needed resources are available. In other cases the operation is waiting. After generating desired number of query sequences the simulation stops spawning of new tasks and waits until all generated query sequences are processed. 4.2 Simulation Results Figure 1a presents simulated execution time of three scheduling frameworks processing a pool of generated sequences of queries. To each generated query sequence an optimization criterion (time, cost or bi-optimization) was assigned with equal probability 1/3. Optimization criteria are only used if query schedul- ing framework supports it, otherwise default criteria is used. All scheduling frameworks process exactly the same pool of generated query sequences. Compared frameworks are based on OptimalSeq algorithm but use different objective function. Time objective planner uses only ft and gt functions as opti- mization criteria; this means that it has no idea on load of each of devices. Self Tuning Planner is based on idea presented in Breß et. al. [3], i.e. it maintains a list of observed execution times on data D for each algorithm Aouij . Observations are interpolated (using e.q. cubic splines) to form new estimated execution time function. And finally Bi-objective planner is a proof of concept implementation of the model described in this work. A Bi-objective Optimization Framework for Query Plans 351 (a) Simulated efficiency of Bi-objective (b) Efficiency of Bi-objective Heteroge- Heterogeneous Query Planer, Time based neous Query Planer for various optimiza- Query Planer and Self-Tuning Query Plan- tion tasks ner (c) Simulated load of devices (0 < load < 1) (d) Pricing of device Fig. 1: Simulation results. Note that time is in simulation ticks. As expected Time Objective Planner is the slowest one since it has no knowl- edge on the current load of devices. A solution suggested in [3] performs better. However, there are two problems with this approach: first it adds an additional overhead due to the interpolation of the observed execution times [3]; Secondly as may be observed in 1a it takes some time before it adapts to a new situa- tion (in early stage it performs similarly to the Time Objective Planner). This is due the fact that it does not immediately react to load change of the device. Instead, it has to gather enough observations before adapting. The best perfor- mance is gained when using Bi-objective planner, this is due to the three types of optimization and the cost model which assures proper load balancing. As our framework support different types of optimization in the Figure 1b, we present an impact of optimization type on processing performance. As it may be observed, time optimization is the most appropriate for query processing with high priority or with execution time constraint (like interactive queries or ad hoc data mining). Cost optimization is appropriate for operations with low priority or without time constraint (like batch processing or periodic jobs). Optimization 352 P. Przymus, K. Kaczmarski, K. Stencel of both cost and time (without preferences) leads to moderate processing speed but with better load balancing which is discussed later. As proposed economic model is an important part of presented framework, in Figure 1 interaction between device load 1c and established device pricing 1d is illustrated. Notice how increased load influences unit pricing according to the formula 3. It is worth noting that pricing model may be tuned for specific applications (see Table 2b for this simulation settings). 4.3 Discussion In Section 2.1 we cite three challenges of Hybrid Query Processing initially pre- sented in Breß et.al. [3]. As our bi-objective optimization framework was designed in order to address this challenges, an evaluation in the context of the former mentioned problems is needed. We address Critical Query problem by allowing different optimization targets for queries. Choosing time optimisation allows to a priori articulate importance of a query. Also, the bi-objective optimization tends to promote queries which may gain more on particular devices (due to the cost- delay trade-off and the fact that the cost objective is designed to promote tasks with greater speed-up 3.3). The problem of Execution Time Prediction is ad- dressed indirectly with bi-objective optimisation. This is because the bi-objective optimisation combines the cost objective function, which uses a current device load when pricing a device, with the execution time objective. So in most cases it is preferred to optimize both cost and time (without preferences towards any) through time/cost trade-off. Lastly different types of optimization apply also to Optimization Impact challenge. Choosing optimization criteria specifies a pos- sible impact on other queries. Although, the preliminary results are promising and seem to confirm this, an extended evaluation is needed in future. 5 Related Work Multiobjective query optimization was considered i.a. in Stonebraker et.al. [17] where a wide-area distributed database system (called Mariposa) was presented. An economic auction model was used as cost model. To process a query a user supplied a cost-delay trade-off curve. Because defining this kind of input data was problematic Papadimitriou et.al. proposed a new approach where an algorithm for finding -Pareto optimal solutions was presented. The solution was that a user would manually choose one of presented solutions. This work differs both in an optimisation method and an economic model involved. In our framework a user supplies an optimization objective for a query a priori (time, cost or bi-objective). Also as our model addresses the optimisation of co-processing interaction a simpler commodity market model could be used instead of a bidding model. An extended overview on utilization of a GPU as a coprocessor in database operations may be found in [3]. Breß et. al. [3] proposed a framework for op- timisation of hybrid CPU/GPU query plans and present two algorithms for A Bi-objective Optimization Framework for Query Plans 353 constructing hybrid query sequences. The first algorithm selected the fastest al- gorithm for every element of a query sequence (including the cost of transfer between devices) with complexity O(n). Unfortunately, this algorithm had two flaws [3]: the constructed plan could generate too frequent data transfers be- tween devices, which may significantly affect the performance of data processing and also an optimal plan was not always generated. To overcome those prob- lems they proposed the second algorithm. It searched for a continuous segment of operations on GPU that could improve the base CPU sequence. In order to find an optimal solution this algorithm generates all possible GPU sequences. Its complexity is obviously higher: O(n2 ). Our work extends this approach by allow- ing possible many various co-processing devices (Heterogeneous Query Planer in Section 3.1). Secondly our work incorporates commodity market model as well as bi-objective optimisation for better performance overcoming problems men- tioned in 2.1. Additionally, the algorithm OptimalSeq presented in our work may be used to produce a similar solution as the second algorithm by Breß et.al.[3] but with better complexity (in case of two devices O(n)). It is worth to mention two surveys: the first one describing economic models in grid computing [6] and the second one describing methods for multi-objective optimisation [11]. 6 Conclusions and Future Work In this paper, we proposed a bi-objective optimization framework for heteroge- neous query plans. We also presented an algorithm for creating query sequences in a heterogeneous environment with a single objective. This algorithm may be used to construct query sequences similar to [3] but with better complexity. For the purposes of this bi-objective optimization we designed a model including time and cost objectives function. The cost objective function and pricing model is build on foundations of commodity market economic model. The preliminary experiments are very promising. We achieved good load balancing of the simulated devices combined with better optimization results. In future work, an extended evaluation of the presented framework is needed, including; examination of parameters’ influence on the model behaviour, careful assessment against Hybrid Query challenges. Another interesting field is exten- sion of this model beyond CPU/GPU co-processing. Finally, the framework will be evaluated in a prototype time-series database [14,13]. References 1. J. Bentley. Programming pearls: algorithm design techniques. Commun. ACM, 27(9):865–873, Sept. 1984. 2. S. Breß, F. Beier, H. Rauhe, E. Schallehn, K.-U. Sattler, and G. Saake. Auto- matic selection of processing units for coprocessing in databases. In Advances in Databases and Information Systems, pages 57–70. Springer, 2012. 354 P. Przymus, K. Kaczmarski, K. Stencel 3. S. Breß, I. Geist, E. Schallehn, M. Mory, and G. Saake. A framework for cost based optimization of hybrid cpu/gpu query plans in database systems. Control and Cybernetics, pages 27–35, 2013. 4. S. Breß, S. Mohammad, and E. Schallehn. Self-tuning distribution of db-operations on hybrid cpu/gpu platforms. Grundlagen von Datenbanken, CEUR-WS, pages 89–94, 2012. 5. S. Breß, E. Schallehn, and I. Geist. Towards optimization of hybrid cpu/gpu query plans in database systems. In New Trends in Databases and Information Systems, pages 27–35. Springer, 2013. 6. R. Buyya, D. Abramson, J. Giddy, and H. Stockinger. Economic models for re- source management and scheduling in grid computing. Concurrency and compu- tation: practice and experience, 14(13-15):1507–1542, 2002. 7. W. Fang, B. He, and Q. Luo. Database compression on graphics processors. Pro- ceedings of the VLDB Endowment, 3(1-2):670–680, 2010. 8. D. Florescu and D. Kossmann. Rethinking cost and performance of database systems. ACM Sigmod Record, 38(1):43–48, 2009. 9. M. J. Franklin, B. T. Jónsson, and D. Kossmann. Performance tradeoffs for client- server query processing. In ACM SIGMOD Record, volume 25, pages 149–160. ACM, 1996. 10. D. Kossmann. The state of the art in distributed query processing. ACM Com- puting Surveys (CSUR), 32(4):422–469, 2000. 11. R. T. Marler and J. S. Arora. Survey of multi-objective optimization methods for engineering. Structural and multidisciplinary optimization, 26(6):369–395, 2004. 12. C. H. Papadimitriou and M. Yannakakis. Multiobjective query optimization. In Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 52–59. ACM, 2001. 13. P. Przymus and K. Kaczmarski. Dynamic compression strategy for time series database using gpu. In New Trends in Databases and Information Systems. 17th East-European Conference on Advances in Databases and Information Systems September 1-4, 2013 - Genoa, Italy, 2013. 14. P. Przymus and K. Kaczmarski. Time series queries processing with gpu support. In New Trends in Databases and Information Systems. 17th East-European Con- ference on Advances in Databases and Information Systems September 1-4, 2013 - Genoa, Italy, 2013. 15. A. Raith and M. Ehrgott. A comparison of solution strategies for biobjective shortest path problems. Computers & Operations Research, 36(4):1299–1331, 2009. 16. O. O. Sonmez and A. Gursoy. Comparison of pricing policies for a computa- tional grid market. In Parallel Processing and Applied Mathematics, pages 766–773. Springer, 2006. 17. M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell, C. Staelin, and A. Yu. Mariposa: a wide-area distributed database system. The VLDB Journal, 5(1):48–63, 1996. 18. T. Westmann, D. Kossmann, S. Helmer, and G. Moerkotte. The implementation and performance of compressed databases. ACM SIGMOD Record, 29(3):55–67, 2000.