Temporal Preprocessor: Towards Temporal Applications Development

© Boris Kostenko
Moscow State University
bkostenko@acm.org
Ph.D. advisor: Sergey D. Kuznetsov

Proceedings of the Spring Young Researcher's Colloquium On Database and Information Systems SYRCoDIS, Moscow, Russia, 2007

Abstract

Effective querying and managing of temporal databases represent an unanswered challenge to the modern research community. In this paper, we introduce a temporal preprocessor that can be used to aid in the creation of temporal applications and for testing various methods and approaches in temporal database implementations. We outline our proposal, current results, and directions for further research and development.

1 Introduction

Most database applications that are widely used in everyday life manage time-related data. Nearly any reservation, scheduling, planning, monitoring, or management application is an example. Databases that store time-varying data are called temporal databases.

During our research in the area of temporal databases, we decided to create a temporal extension that would meet the following requirements. First, it must provide an effective solution for developing temporal applications over commercial database management systems (DBMS) with minimal additional effort from a developer. Second, the solution must be flexible, so that we can easily modify, test, and compare different approaches and methods of implementing temporal queries.

Temporal databases have been the focus of much research work, and many solutions have been proposed so far. Most temporal database system prototypes were based on the layered approach, in which a special layer, or stratum, is established between the temporal user application and an ordinary database management system. Such a layer converts temporal queries received from a user into statements in SQL (or any other query language) and passes them to the DBMS. After that, the corresponding result from the database is translated back into a temporal one and returned to the user. We also selected the layered approach, but we tried to integrate our layer into the source code of the application and/or database system. Thanks to this integration, one can be sure that the resulting application works as efficiently as if a developer had coded it by hand. The integration can be done by running a temporal preprocessor or compiler on the application source code. We decided to start with the PHP programming language because it has clear syntax and semantics and is powerful enough to manipulate arrays. Moreover, it has dynamic declarations, so we can create functions and execute them at runtime.

2 Related works

During the last two decades, much research has been conducted in the area of temporal databases. Some of it dealt with the design of temporal databases; other work proposed methods of effective indexing, implementation of join statements, and so on. One of the first temporal query languages, TQuel, and its partial implementation for the Ingres DBMS were introduced in [2]. Later, many papers introduced temporal query languages that usually extend the SQL query language [3, 6]. However, there have always been two approaches to implementing a temporal database management system: either build it from scratch or create temporal middleware. The concepts of the second (layered) approach are discussed in detail in [7, 8]. In [1] the author gives direct solutions for expressing temporal queries in SQL. However, these “SQL” solutions are not always good because of the many nested selects, joins, unions, and dynamic table manipulations.

Unlike [2, 3, 6], we do not start with a temporal query language definition. Instead, we provide query functions for the most useful temporal requests, so a developer just needs to provide a temporal clause in addition to an ordinary SQL query. On the other hand, we have additional opportunities for optimizing particular queries. Another distinguishing feature of the proposal is its flexibility, both for a developer, who can alter a predefined algorithm at any time, and for a researcher, who can easily add new methods and check them instantly.

A publicly available working temporal DBMS, TimeDB [4, 5], is implemented on top of IBM Cloudscape 10 or Oracle 10g as a layer between the user application and the underlying DBMS. Thus, it needs additional resources to keep it running, in comparison with the compile-time solution proposed in this paper.

We do not want to oppose our proposal to temporal DBMSs. Our solution complements them in areas where developers need to create their own temporal applications over a relational DBMS, because no adequate product exists on the market or a temporal DBMS cannot be used for some reason. Moreover, our preprocessor can be used together with a temporal DBMS to help reduce errors in source code (and at runtime if desired) and to allow developers to formulate query statements clearly in the way they prefer.

3 Architecture and implementation issues

The proposed extension has a three-module architecture (see Figure 1). Temporal queries are processed in the following sequence: query > parser >> parsed query > optimizer >> script (plan) > code generator >> source code. The parser receives a query from a user and converts it to the internal representation format (parsed query). Next, the optimizer analyzes the parsed query and selects a processing method according to the database table definitions. The optimizer generates a script (plan) that describes how to get the temporal query result from the current DBMS. After that, the code generator takes the script and produces source code. The returned source code performs the specified temporal query on top of an existing relational database. This source code is stored as a function, and a developer just needs to replace the initial query execution with a call of this function. We use placeholders in temporal queries so each function call can be parametrized according to the current program state.

[Figure 1. Preprocessor architecture. Data flow: Query, Parser, Parsed query, Optimizer, Script (plan), Code generator, Source code. Other labels: Rules, Templates, Database metadata, DBMS.]

In the introduced architecture, each module performs a single step independently of the other modules and steps. Thus, we can easily extend, modify, or replace any module as long as we follow the input/output format conventions. For example, we can extend the code generator and get a temporal processor for another programming language, including DBMS languages. Changes to the optimizer lead us to different methods of handling temporal queries as well.

Here we need to note that a considerable part of the logic is encoded in “Rules” for the optimizer and “Templates” for the code generator. Because we selected PHP as our first target language, and PHP can dynamically create a function from its source code and call it afterwards, we can perform all needed temporal tasks at runtime. Thus, we can create a ready-to-use temporal database extension for PHP. A developer just needs to include a library file, provide the temporal processor with table definitions that include field names and their types (including temporal ones), and call the required temporal queries through the temporal processor runtime proxy.

As noted above, many studies of temporal database extensions started with the introduction of a query language extension and then demonstrated how different temporal requests could be expressed with this query language. Unlike them, we chose another approach. We assume that a developer knows what information he wants to retrieve from a database. Actually, he does not even need a database query result: the main aim of data retrieval is to populate some structures or arrays in a program with certain data. Moreover, it is often more natural for a developer to formulate his data requirements in natural language than to translate them into a query language and then translate the query execution results back. We also suggest that more specific optimizations are better than general ones. Therefore, we introduce not a query language but query functions.

To better understand the difference between them, consider the following two “queries”: query("select * from employees where company_id = 17") and query_with_company_id("select * from employees", 17). The results of these two calls will be identical (the function names are self-explanatory), but in general the second approach gives a more stable and less error-prone solution. Moreover, we can use the query query_employees_table_with_company_id(17), which is even more specific. In most cases the latter variants are preferable, because they are more obvious to a developer, offer more opportunities for optimization, and leave less chance of wrong usage. So a developer can create such parameterized query functions for frequently used queries. In any case, one can always use the most general query function to express a query.

In the case of temporal databases, we use several predefined query functions to formulate statements that use time and time relations. We selected the most useful and most frequently used queries and created convenient functions for them. Almost all “single state” requests can be performed with the query_on_time() function, which limits a query to a specified point in time. In order to use relations and aggregate functions on a timeline, we need to add functions like query_following_events() and query_timeline_aggregates(). To successfully execute queries with projections and database modifications, we needed a set of functions for set operations on time intervals.
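As an illustration of the “single state” function described above, the following minimal sketch shows one way a function like query_on_time() could restrict an ordinary query to a point in time. It is written in Python with SQLite rather than PHP. The column names valid_from and valid_to, the half-open interval convention, the sample data, and the function signature are all assumptions made for this example, not the preprocessor's actual interface.

```python
import sqlite3


def query_on_time(conn, base_sql, at, params=()):
    """Run base_sql restricted to the rows that are valid at time point `at`.

    Assumption: every row returned by base_sql exposes valid_from and
    valid_to columns (hypothetical names) forming a half-open valid-time
    interval [valid_from, valid_to).
    """
    # Wrap the ordinary query as a subquery and add the temporal clause.
    sql = (
        f"SELECT * FROM ({base_sql}) AS snapshot "
        "WHERE valid_from <= ? AND ? < valid_to"
    )
    # Placeholders of the base query come first, then the time point twice.
    return conn.execute(sql, (*params, at, at)).fetchall()


# Hypothetical sample data: ISO date strings compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "name TEXT, company_id INTEGER, valid_from TEXT, valid_to TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Ann", 17, "2005-01-01", "2006-01-01"),
        ("Ann", 18, "2006-01-01", "9999-12-31"),
        ("Bob", 17, "2005-06-01", "9999-12-31"),
    ],
)

rows = query_on_time(
    conn, "SELECT * FROM employees WHERE company_id = ?", "2005-07-01", (17,)
)
print(sorted(row[0] for row in rows))
```

The developer supplies only the ordinary query and the time point. The temporal clause is added by the function, which is the division of labour the query-function approach aims for.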
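One common set operation on time intervals is merging overlapping or adjacent valid-time intervals into a normalized form. The sketch below shows this in Python under stated assumptions: intervals are half-open (start, end) pairs, and the function name coalesce is hypothetical rather than taken from the preprocessor. In a real table the merge would be applied separately to each group of rows with equal non-temporal attributes.

```python
from datetime import date


def coalesce(intervals):
    """Merge overlapping or adjacent half-open intervals [start, end).

    Returns a sorted list of disjoint intervals covering the same time.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            # A gap before this interval: start a new one.
            merged.append((start, end))
    return merged


# Hypothetical valid-time intervals for one employee at one company.
history = [
    (date(2005, 6, 1), date(2006, 1, 1)),
    (date(2005, 1, 1), date(2005, 6, 1)),
    (date(2007, 1, 1), date(2007, 3, 1)),
]
print(coalesce(history))
```

Sorting first means each interval only has to be compared with the last merged one, so the merge is a single pass after the sort.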
The interval set operation functions give us the ability to fold time intervals and merge them to achieve a normalized state of the valid-time intervals in the database if needed.

There are two options in querying with functions: one can use only the simple query function and formulate every query in full, or formulate a query only once, create a query function for it, and after that call this function with actual parameters. For temporal queries, however, we know how to process temporal columns. That is why we provide some predefined query functions, require less information about particular queries from a developer, and apply some optimizations.

The proposed temporal preprocessor is responsible for implicit generation of query functions based on user queries. A developer can provide more specific “Rules” and “Templates” and achieve better performance, or just use the standard ones.

4 Results achieved and future work

The research and implementation are not finished yet, so we cannot provide performance graphs or comparison charts for the different methods. However, current results show that we have already achieved simplicity of use and flexibility for further research and experiments. We also noted that the development of temporal applications is sped up, because less time is required to formulate correct queries and test them.

Parser facilities are limited now, and we plan to extend them in order to parse more complex temporal queries. Another goal is the implementation of code generators for more programming languages, so that we will be able to compare the performance of various methods. We also need to add reliable performance counters and find an adequate formula to estimate and compare the effectiveness of different approaches. Further research on how to retrieve information grouped by timeline fields, and how to aggregate it better, is desired as well.

5 Conclusion

In this paper, we introduced a solution for effective and easy development of temporal applications on top of existing relational DBMSs. The proposed solution also provided us with an opportunity to test and compare different models and approaches in temporal database implementation. The achieved results gave us new ideas and showed possibilities of success in further research and development.

References

[1] Richard T. Snodgrass. Developing Time-Oriented Database Applications in SQL. Morgan Kaufmann Publishers, Inc., San Francisco, July 1999, 504+xxiii pages.
[2] Richard T. Snodgrass. The Temporal Query Language TQuel. ACM Transactions on Database Systems 12(2), June 1987, pp. 247–298.
[3] R.T. Snodgrass, M.H. Boehlen, C.S. Jensen, and A. Steiner. Adding Valid Time to SQL/Temporal. Change proposal, ANSI X3H2-96-501r2, ISO/IEC JTC 1/SC 21/WG 3 DBL-MAD-146r2, November 1996.
[4] TimeDB - A Bitemporal Relational DBMS. Web site, May 2005. http://timeconsult.com/Software/Software.html
[5] TimeDB - A Bitemporal Relational DBMS. TimeDB release version 2.2. (Zip-archive, Java and JDBC), May 2005. http://timeconsult.com/Software/TimeDB 2.2.zip
[6] David Toman. Point-Based Temporal Extension of Temporal SQL. In Proceedings of the 5th International Conference on Deductive and Object-Oriented Databases, pages 103-121, 1997.
[7] Kristian Torp, Christian S. Jensen, and Michael Böhlen. Layered Temporal DBMS: Concepts and Techniques. In Database Systems for Advanced Applications '97, Proceedings of the Fifth International Conference on Database Systems for Advanced Applications (DASFAA), pages 371-380, 1997.
[8] K. Torp, C. S. Jensen, and R. T. Snodgrass. Stratum Approaches to Temporal DBMS Implementation. In Proceedings of IDEAS, Cardiff, Wales, pages 4-13, 1998.