Towards a grammar formalism for retrieving information on the Semantic Web ∗ Eunhye Shin Sujin Yoo Seongbin Park Korea university Korea university Korea university dc20005245@korea.ac.kr mynameislydia@korea.ac.kr hyperspace@korea.ac.kr ABSTRACT The motivation of using a two-level grammar is that it al- In this paper, we report our ongoing research on a grammar lows to specify context-sensitive information in an intuitive formalism for retrieving information on the Semantic Web. way. Our approach allows a user to express facts that con- We view the Semantic Web as a collection of databases and tain certain parameters which reflect structures of databases use a two-level grammar by which one can specify context- that constitute the Semantic Web. In other words, the for- sensitive constraints that need to be satisfied. To retrieve malism allows users to specify certain facts together with information, a user can simply specify keywords and our placeholders that can be instantiated using the data stored system can show a result that is a string derived using the in databases. two-level grammar. The structure of this paper is as follows. Section 2 describes related research works. Section 3 explains the idea behind Categories and Subject Descriptors our approach using illustartive examples. Section 4 describes H.3.3 [Information Search and Retrieval]: Query for- the structure of the system that we implemented. Section 5 mulation; D.4 [Information Systems Applications]: Mis- concludes the paper and discusses research directions. cellaneous; H.5.4 [Hypertext/Hypermedia]: Architectures 2. RELATED WORKS 1. INTRODUCTION There are two research areas that are related to our re- The Semantic Web is an environment where information is search. One is research about two-level grammar and the represented in a machine understandable way. One way other is retrieving information on the Semantic Web. Two to view the Semantic Web is that it consists of databases level grammar was introduced to define the syntax of AL- that can help computational agents perform various kinds GOL 68 by van Wijngaarden [3]. There are two types of of tasks [1]. rules in a two-level grammar. One is a metarule and the other is a hyperrule. A metarule is a context-free produc- In this paper, we propose an approach to write context- tion rule and it can provide possible values for a metanotion sensitive constraints using a grammar in order to retrieve in a hyperrule. A hyperrule can describe context-sensitive information on the Semantic Web. To this end, we use a conditions and this is a machanism by which we can model two-level grammar that is a 6-tuple (M, V, T, RM , RV , S), constraints in a database or between databases. A two-level where M is a finite set of metanotions, V is a finite set grammar has been used in specifying the syntax of a natural of syntactic variables such that M ∩ V = ∅, T is a finite language which reflects grammatical constraints [4]. It has subset of V + , RM is a finite set of metarules X → Y , where been also applied to define a programming language [2, 5]. X ∈ M , Y ∈ (M ∪ V )∗ or Y is a regular expression, and One way to retrieve information on the Semantic Web is to for all W ∈ M , (M, V, RM , W ) is a collection of context- use a query language, but traditional database query lan- free grammar and regular expression rules, RV is a finite guages are not appropriate and users need a semantic query set of hyperrules of the form H0 → H1 , H2 , · · · , Hm , where language such as SPARQL [6, 7]. In the mean time, for end m ≥ 1 and H0 ∈ (M ∪ V )+ , Hi ∈ (M ∪ V )∗ for i ≥ 1, Hi users, a system such as SPARK [8] that can convert keyword is a hypernotion, and S is a string of positive length over queries into SPARQL queries can be helpful. M ∪ V , respectively [2]. ∗Corresponding author 3. ILLUSTRATIVE EXAMPLES In this section, we show how we can describe context-sensitive information using a two-level grammar. A string that can be derived using the grammar corresponds to certain infor- mation that can result from combining data that exist in the databases. There are two types of examples. The first exam- ple shows the case where we use one database that contains some number of tables. The second example shows how we can use two databases. DIR 2013, April 26, 2013, Delft, The Netherlands. 3.1 Example 1 Figure 1 shows the ER-Diagram of an example database (Travel database). There are five tables, where PK refers to Table 3: Showplace Table a primary key and FK refers to a foreign key, respectively. S_id P_id Name Open_time 401 300 Buckingham Palace 11pm(10pm, Sun) 420 302 Hokkaido 11pm Table 4: Package Table P_id A_id Name cost 300 204 15 days in West Europe $2700 302 200 Hot Spring in Tokyo $1400 ⇒ 2012 − 7 − 2, 1001 1001 Sujin travels S ID S P A ⇒ 2012 − 7 − 2, Sujin travels S ID S P A ⇒ · · · ⇒ 2012 − 7 − 2, Sujin travels Hokkadio with Hot Figure 1: Travel database Spring in T okyo package in Hana tour agency. Tables 1 to 5 show data stored in the database tables. Figure 2 shows how parameters in hyperrules can be instan- tiated and our system shows the string at the end of the derivation. The derivation starts by using the first hyper- Table 1: Itinerary Table rule, START : I U S P A, where each of I U S P A is replaced I_id U_id Neccessary_time date by the corresponding metarule; i.e., I is replaced by DATE, 350 1002 2 hours 2012-8-1 U ID, U is replaced by U ID U NAME travels S ID, etc. A 353 1001 5 hours 2012-7-2 hyperrule which starts with “where” is applied in order to check context-sensitivity. For example, the U ID from I (i.e., DATE, U ID) and the U ID from U (i.e., U ID U NAME travels S ID) disappear when the hyperrule, “where U ID is Table 2: User information Table U_id S_id Name Address U ID :true” is applied. 1001 420 Sujin Yangcheon-gu, Seoul 1002 401 Bob Gangnam-gu, Seoul 3.2 Example 2 In order to show how the same approach can be used with multiple databases, we added a university database that Given this database, we can write metarules and hyperrules consists of two tables. In addition, we modified the travel so that a string whose structure looks ’( ), ( ) travels ( ) with database. Customers in the travel database are students in ( ) package in ( ) agency.’ can be derived as follows, where the university database. Buy table contains the information ( ) is a placeholder that can be instantiated with the data about package purchasing for each customer and user name stored in the database tables. of travel database corresponds to student name of university database (figure 3). The metarule is as follows. Tables 6 to 8 show the data contained in the database tables. I :: DATE, U ID Given these databases, we can write metarules and hyper- U :: U ID U NAME t r a v e l s S ID rules so that a string whose structure looks ’( ) is a Korea S :: S ID S NAME with P ID University student, majors in ( ) and takes a trip with ( ) P :: P ID P NAME pack age i n A ID package.’ can be dervied as follows, where ( ) is a place- A :: A ID A NAME agency . holder that can be instantiated with the data stored in the database tables. The hyperrule is as follows. The metarule is as follows. START : I U S P A where U ID i s U ID : true RELATION : : USER STUDENT where S ID i s S ID : true UNI : : UNI .NAME DEPART where P ID i s P ID : true TRAVEL : : TRAVEL.NAME PACKAGE where A ID i s A ID : true Assuming that the tables contain data shown in table 1 to Table 5: Agent Table table 5, a possible derivation looks as follows. A_id Name Address Phone_num ST ART ⇒ I U S P A ⇒ DAT E, U ID U S P A 200 Hana tour Seocho-gu, Seoul 02-993-2941 ⇒ 2012 − 7 − 2, 1001 U S P A 204 E agent Bucheon, Gyeonggi 031-424-4421 ⇒ 2012 − 7 − 2, 1001 U ID U N AM E travels S ID S P A Table 6: Buy Table B_id U_id P_id Date 465 1001 300 2011.11.01 274 1002 302 2011.10.10 Figure 2: Derivation of a string Table 7: Student Table S_id P_id D_id Name Phone_num ID_num 1 150 100 Sujin Yoo 243-5678 051901 2 150 100 Bob 234-5784 011901 U NAME o f TRAVEL.NAME and U NAME o f USER : true (3) The following hyperrules verify whether data is con- nected correctly in Univ DB. D ID o f UNI .NAME i s same D ID o f DEPART i s same , D ID o f UNI .NAME and D ID o f DEPART : true (4) The following hyperrules verify whether data is con- nected correctly in Travel DB. P ID o f TRAVEL.NAME i s same P ID o f PACKAGE, P ID o f BUY i s same P ID o f PACKAGE, P ID o f TRAVEL.NAME, P ID o f PACKAGE and P ID o f BUY : true (5) The following hyperrule verifies whether purchasing data is connected correctly in Travel DB. Figure 3: University database and modified Travel where U ID i s U ID : t r u e database Figure 4 shows how parameters in hyperrules can be instan- STUDENT : : S Name tiated and our system shows the string at the end of this USER : : U Name figure. UNI .NAME : : S Name i s a Korea U n i v e r s i t y s t u d e n t , majors i n D ID 4. TLG SYSTEM DEPART : : D ID D Name Our system (TLG sysetm) has been implemented using Java TRAVEL.NAME : : U NAME and t a k e s a t r i p with and HSQLDB system [9]. The operation starts by taking a U ID BUY P ID keyword from a user. The Searching scale setting module BUY : : U ID P ID assigns a column according to the keyword. The Result PACKAGE : : P ID P NAME pack age . creating module matches the keyword against data in the assigned column. Finally, the Result printing module shows the result from result creating module as a string. Figure There are five types of hyperrules. 5 shows the structure of the TLG system, where numbers inside small circles correspond to steps involved in sequence. (1) The following hyperrule is the start rule. START : UNI RELATION TRAVEL 5. CONCLUSIONS (2) The following hyperrules verify whether student name and user name are the same or not. Table 8: Department Table S NAME o f UNIV.NAME i s same S NAME o f STUDENT D_id Name Phone_num o f RELATION, U NAME o f TRAVEL.NAME i s same 100 Computer Science Education 32-1234 U NAME o f USER o f RELATION, S NAME o f STUDENT 102 Mathematics 32-3456 i s same U NAME o f USER, S NAME o f STUDENT, Semantic Web browser [11] and how to extend the idea of a Semantic Web expression [12] in the context of linked data [13]. 6. REFERENCES [1] C.C. Marshall and F.M. Shipman. Which Semantic Web?. In Proceedings of the 14th ACM Conference on hypertext and hypermedia, 2003. [2] B. Edupuganty and B.R. Bryant. Two-level Grammar as a Functional Programming Language. In The computer journal, vol 32, no 1, 1989. [3] A.van Wijngaarden. Report on the Algorithmic language ALGOL 68. Numer 14: 79-218, 1969. [4] B.R. Bryant, D, Johnson, and B. Edupuganty. Formal specification of natural language syntax using two-level grammar. In Proceedings of the 11th International Figure 4: Derivation of a string Conference on Computational Lnguistics, 1986. [5] J. Maluszynski. Towards a Programming Language Based on the Notion of Two-Level Grammar In Theoretical Computer Science, 13-43, 1984. [6] R. Fikes, P. Hayes, and I. Horrocks. OWL-QL-a language for deductive query answering on the Semantic Web In Journal of Web Semantics: Science, Services and Agents on the World Wide Web. vol 2, issue 1, 19-29, 2004. [7] M. Arenas, C. Gutierrez, D.P. Miranker, J. Pérez, and J.F. Sequeda. Querying Semantic Data on the Web. In SIGMOD Record, vol. 41, no. 4, 2012. [8] Q. Zhou, C. Wang, M. Xiong, H. Wang, and Y .Yu. SPARK: adapting keyword query to semantic search In Proceedings og the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference, 694-707, 2007. [9] HSQLDB www.hsqldb.org [10] M.A. Rodriguez and J. Bollen. Modelling Computations in a Semantic Network. http://arxiv.org/abs/0706.0022, 2007 [11] Y. Kim, S, Yoo, and S. Park. A Semantic Web browser for novice users. In Proceedings of the 6th Figure 5: The structure of the system International Conference on Complex, Intelligent, and Software Intensitve Systems, 2012. [12] E. Shin and S. Park. Two level grammar and the In this paper, we report our ongoing research on how a two- Semantic Web. In Proceedings of the International level grammar can be used to retrieve information on the Conference on Applied and Theoretical Information Semantic Web. We view the Semantic Web as a collection Systems Research, 2012. of databases and constraints existing in the colecction can [13] C. Bizer, T. Heath, and T. Berners-Lee. Linked Data - be specified using hyperrules of the formalism that we pro- The Story So Far. In International Jounral on posed. Semantic Web and Information Systems, 2009. The motivation of the current research is that once we have a declarative description about the Semantic Web using a formal grammar, it becomes possible to process the Semantic Web. In other words, a part of the Semantic Web can be fed into a computer program as an input and the program can parse the input and perform some task. This is in line with the goal of utilizing the Semantic Web as a representation medium for computations [10]. Currently, we are implementing a system that parses expres- sions defined using a two-level grammar and shows deriva- tion results. We are also working on ways by which the information on the Semantic Web can be exploited using a