Semantic, Digitization, Design and Implementation of Ontology in Social Internet-Services

Nazish Mumtaz1, Abida Begum1, Bushra Gul1, Salma Noor1, Roman Odarchenko2, Igor Machalin2 and Olha Saliieva3

1 Shaheed Benazir Bhutto Women University, Peshawar, Pakistan
2 National Aviation University, Kyiv, Ukraine
3 Vinnytsia National Technical University, Vinnytsia, Ukraine
odarchenko.r.s@ukr.net

Abstract. Due to natural disasters, climatic changes, and a lack of resources at the local heritage preservation authorities, Pakistan is losing some very important historical records. Heritage preservation is of immense importance for keeping our history and culture available to subsequent generations. With recent advances in technology, the digitization of heritage has become cost-efficient and can be adopted by the relevant departments with the help of the right expertise. This research uses the ideas of an interoperable framework, machine readability, and semantic representation to provide access to digitally preserved information on the life history of Buddha, drawn from Gandhara artwork preserved at the Peshawar museum. To enable semantic interoperability, we propose the Buddha domain ontology. It makes it possible to extract meaningful and relevant data from a semantically represented repository. The work opens ways to explore the life history of Buddha and the data related to it in creative ways, and it gives both the specialist and the novice web user easy access to the desired information.

Keywords: Ontology, Semantic interoperability, Semantic web, Cultural heritage, Buddha Ontology.

1 Introduction

Civilization marks the early period of the human way of life, which is passed on from generation to generation. The history of that way of life, and of the things related to it, is a source of information, especially about what caused human life to change.
The early history of Pakistan begins with the Indus valley civilization, also known as the Harappa civilization. Seven historic periods have been identified at Mohenjo-Daro, among them an early period (3800 to 2500 BC), a middle period (2500 to 1700 BC), and a late period (1700 to 1300 BC). Harappa was the largest city of this civilization and Mohenjo-Daro the second. The presence of these large cities reveals that the area was densely populated. Food was produced in large quantities because the ground was fertile. The evidence also points to large forests, and the animal shapes on utensils, together with the bones found, show that there were large numbers of animals, which were used for agriculture and transportation. The archeology of Harappa shows that there was a large mill for grinding wheat. The main means of transportation were boats, horses, and camels. The leisure and industrial life of the Indus valley civilization is reflected in its pottery, instruments, terracotta statues, and marble statues [1].

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CMiGIN-2019: International Workshop on Conflict Management in Global Information Networks.

The aim of this research is to preserve the history of Gandhara art. Pakistan needs this kind of research because floods, earthquakes, terrorism, and similar hazards are erasing our history, so there is a need to preserve the historically rich culture of Pakistan. Due to modernization and insufficient care from the local heritage preservation authorities, Pakistan is losing some very important historical records. This research supports the idea of protecting history. If heritage preservation is not given its due importance soon, subsequent generations will not even be able to explore this heritage through archeology.
With current advances in information technology, the digitization of any type of information has become very affordable and can be adopted by the relevant departments with ease. This will not only preserve our past and our changing culture but also create job opportunities. The basic outcome of this research targets the preservation of Gandhara art history by the best possible means. We have therefore chosen 90 Buddha artifacts from Gandhara art to be semantically digitized by implementing an ontology. The ontology addresses the preservation problem by providing better access and knowledge discovery, and it supports sharing information among people and reusing domain knowledge [2].

1.1 Ontologies

In philosophy, ontology is the branch of metaphysics concerned with the nature and relations of being. In computing, an ontology determines what is of significance in a domain and how information about it is arranged. An ontology is an explicit specification of a conceptualization [3]. Ontologies are used to describe the concepts in a specific domain and the relationships between these concepts.

1.2 Uses of Ontologies

Communication. An ontology provides a shared framework within an organization that reduces conceptual confusion and so improves communication among people [3].

Interoperability. In practice, ontologies provide enterprise models and multi-agent designs for building integrated toolkits [4].

Reliability. An ontology improves the reliability of a system [3].

Reusability. An ontology should support reuse, so that different system models can be imported or exported [4].

The problem addressed here is to preserve the history and culture of Pakistan through semantic digitization of the Peshawar museum. The Indus valley civilization is important and renowned all over the world. The valley was densely populated because of its fertile ground, and it was famous for trade: merchants came from different cities to exchange merchandise and stayed.
A further problem is that Pakistan is not in the mainstream of global research, so this project is an initiative to place Pakistan in global research. To this end we chose ninety Buddha artifacts for semantic digitization and implemented an ontology that supports the semantic digitization (converting analog data into machine-readable form) of the Peshawar museum. The ontology addresses this problem by providing better access and knowledge discovery, and it supports sharing information among people and reusing domain knowledge.

2 Related Work

The World Wide Web contains millions of documents and allows people to retrieve information from database catalogs. A search engine is needed to extract this information. Nowadays there are many search engines, but retrieving meaningful information with them remains difficult. To reduce this problem and extract meaningful information intelligently, we use semantic web technology to provide a better search engine.

2.1 Semantic Web

The project of Michele R. Ramos [3] designs and validates a generalized ontology of life events. It builds on previous approaches such as the prosopography project. Traditional prosopography studies social trends across cultures and time periods in order to understand a society. The prosopography project database is specific to custody, property, income, and ethnic or social relationships, and it cannot support the generalized context of life events. The biography light ontology supports general biographic text and provides an open outline for linking biographic events with diverse resources of the semantic web. The Biography vocabulary distinguishes only four basic events: birth, marriage, death, and bio Event.

The semantic web is a collection of information linked in such a way that it is easily processable by machines on a global scale.
“Semantic web is an extension of the current web” [3]. The goal of the semantic web is machine-processable and understandable information on the web. Its purpose is to share and reuse information on the web. This literature review introduces semantic web technology in support of an intelligent search engine [1], [2].

Fig. 1. Semantic Web Framework

Semantic web technology plays an important role in making the meaning of data precise. Data on the semantic web is represented using the W3C standards RDF (Resource Description Framework) and OWL (Web Ontology Language), which are used to represent ontologies. This technology supports interoperability, automation, and reuse of data. The current web contains large databases, but the data that users provide is difficult for machines to understand. As information is added to the web, a research problem arises for search engines: how can a search engine retrieve meaningful information intelligently? The semantic web addresses this problem by providing meaningful information through SPARQL queries and domain ontologies [4].

3 Motivation

The aim of this paper is to preserve history by the best possible means using information technology. To achieve this, the available records will be converted into some form of data, which will be made available through a user-friendly GUI. The prototype preservation system can be developed in the following steps:

• Digitization of available heritage resources;
• Conversion of the database of manual records into digital form;
• Discovery of new knowledge and its digitization with references;
• Development of a database and a front end or API for online access;
• Exploration of methods to improve accessibility and usability.
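The conversion and access steps above can be pictured with a minimal sketch in Python. It assumes that each museum record is available as a simple row of fields and turns it into subject-predicate-object triples, the data model that RDF uses; a small pattern matcher then stands in for a SPARQL basic graph pattern. The namespace `http://example.org/buddha#`, the field names, and the two sample rows are hypothetical and are not taken from the museum database or from the system described in this paper.

```python
# Hypothetical namespace; the paper does not specify the URIs it uses.
BASE = "http://example.org/buddha#"


def record_to_triples(record):
    """Turn one relational row (a dict) into subject-predicate-object triples."""
    subject = BASE + "object/" + str(record["id"])
    triples = []
    for field, value in record.items():
        if field == "id" or value in (None, ""):
            continue  # skip the key column and fields with no value
        triples.append((subject, BASE + field, value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None plays the role of a variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Invented sample rows, for illustration only.
records = [
    {"id": 1, "title": "Birth of Siddhartha", "material": "schist", "description": ""},
    {"id": 2, "title": "The Great Departure", "material": "schist", "description": None},
]

# Build a small in-memory triple store from all rows.
store = [t for r in records for t in record_to_triples(r)]

# "Which objects are made of schist?" expressed as a triple pattern.
schist_objects = [s for s, _, _ in match(store, p=BASE + "material", o="schist")]
```

Fields with no value are skipped rather than stored as empty triples, which is one possible way to handle incomplete records; a production system would more likely rely on an RDF store and a real SPARQL engine than on this in-memory list.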
Museum digitization motivates API developers to create an API for the Buddha Gallery of the Peshawar museum. With it, developers will be able to create meaningful applications that present the objects in the form of a narration. Through such an application, visitors will take an interest in history and increase their knowledge of the history of the subcontinent, and in this way the civilization of the Indus valley will be long lasting.

4 Proposed System Architecture

To reduce the problems of the existing system, we have designed and implemented an ontology of the Buddha life story. We use semantic web technology to semantically digitize the selected objects.

Fig. 2. Proposed System (components shown: Semantic Search Engine, Museum API, SPARQL query, Buddha Ontology).

The Buddha ontology contains thirteen concepts that describe the life events of Gautama Buddha. We use our own vocabularies to describe the relationships between the instances of each class. We use the ARC2 triple store to map the Buddha ontology into RDF format. The ARC2 triple store provides a SPARQL endpoint interface for extracting data with the SPARQL query engine.

The proposed system shows the flow of data and how the system works. The ARC2 store loads the Protégé ontology and stores it in the form of triples. The data is taken from FileMaker Pro into MySQL and loaded into the ARC2 triple store, where the information is stored in RDF format. The SPARQL query engine extracts the data from the ARC2 triple store using the SPARQL query language and provides the user interface. See Fig. 3.

Fig. 3. Design Scenario of Buddha Ontology

4.1 Work Flow Development Process of Ontology

We developed an ontology of the Gautama Buddha life story by the following process. The purpose of developing this ontology is to narrate the life events of Gautama Buddha, or “Siddhartha”. We will use this ontology for an application that we will create in future work. The following steps were taken to implement the Buddha ontology:

• Domain.
• Enumerated terms.
• Class hierarchy.
• Properties.
• Individuals.

Fig. 4. Ontology development work flow

5 Advantages of Proposed System

After semantic digitization, the data is available online on the web, where it can easily be accessed and reused. The system makes it easy for users to access data, improves search engine capability, and provides interoperability between applications. Users can use the Buddha ontology to create suitable applications.

6 Implementation of Buddha Ontology

In this section we discuss the ontology development process, ontology mapping, the relational database, and the XAMPP server installation steps. The implementation requires the following parts.

Fig. 5 shows the ontology mapping process using the ARC2 triple store. First, data is taken from FileMaker Pro to create a relational database on a MySQL server. To store this information in the form of triples, the ARC2 triple store takes the Protégé ontology and the relational database and stores them in RDF format. The SPARQL endpoint queries the RDF data in the ARC2 triple store through the SPARQL query engine.

Fig. 5. Ontology Mapping Process

6.1 Ontologies Development

Problem Contextualization. The developed ontology is useful for several reasons: to share a common structure of information among people, to reuse domain information, to make domain assumptions explicit, to analyze domain knowledge, and to separate domain knowledge from operational knowledge. In [5], the authors use Protégé for developing an ontology. The artificial intelligence literature contains many definitions of ontology. Developing an ontology involves defining the concepts, classes, and properties of a domain and arranging the classes in a hierarchy. Ontology development is an iterative process, and there are many methodologies for it. An ontology is a model of the reality of the world. The seven steps of developing an ontology are as follows:

• Domain.
• Scope.
• Competency questions.
• Enumerated terms.
• Class hierarchy.
• Properties.
• Individuals.

These steps are important during the ontology development process. The domain of an ontology describes what the ontology covers. The scope explains the use of the ontology for an application. Competency questions are the questions that the developed ontology should be able to answer. Enumerated terms are important for selecting properties and classes. The class hierarchy is the set of classes selected from the terms for use in the ontology. Properties are likewise selected from the enumerated terms [6].

Dataset. For this project we took data from the Peshawar museum, choosing 90 objects from the Gandhara art gallery to narrate the life story of Gautama Buddha [10].

Shortcomings. The data we collected from the Peshawar museum database for semantic digitization is not complete, because many fields are missing for each object. The metadata of each object is not properly defined, the descriptions of objects are too short, and objects are not properly labeled. To overcome these problems, we collected 90 objects for semantic digitization and described them on the semantic web for reuse [11].

Mining of Plain Texts. Text mining is the process of extracting information from documents (e.g., PDF, XML, plain text). We use the GATE software for text mining to extract information from the document “Buddha”. The Buddha document is a plain-text document for knowledge discovery, which we created by appending all the information from the museum records about the Buddha gallery, including the Buddha artifacts that represent Gandhara stone art.

6.2 Buddha Ontology

We plan to use the Buddha ontology for an application that narrates the life story of Siddhartha. The existing concepts of the Buddha life story are not meaningful on their own, but our ontology generates new meaningful concepts related to the objects. The aim is to cover the different events in the life of Buddha and to define the concept behind each term used in the story.
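The development steps listed in Section 6.1 (class hierarchy, properties, individuals, and competency questions) can be illustrated with a minimal sketch in Python. The class names `ex:Event`, `ex:LifeEvent`, `ex:Birth`, `ex:Enlightenment`, and `ex:Artifact`, the property `ex:depicts`, and the individuals are hypothetical; they are not the thirteen concepts of the Buddha ontology, which are given in the figures. Only `rdf:type` and `rdfs:subClassOf` are standard RDF and RDFS terms.

```python
SUBCLASS_OF = "rdfs:subClassOf"  # standard RDFS term
TYPE = "rdf:type"                # standard RDF term

# Hypothetical classes, property, and individuals, written as triples.
triples = {
    ("ex:Birth", SUBCLASS_OF, "ex:LifeEvent"),
    ("ex:Enlightenment", SUBCLASS_OF, "ex:LifeEvent"),
    ("ex:LifeEvent", SUBCLASS_OF, "ex:Event"),
    ("ex:panel_01", TYPE, "ex:Artifact"),
    ("ex:birth_event", TYPE, "ex:Birth"),
    ("ex:panel_01", "ex:depicts", "ex:birth_event"),
}


def superclasses(cls, graph):
    """Collect every ancestor of cls by following rdfs:subClassOf transitively."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls, graph):
    """Individuals typed as cls directly or as any of its subclasses."""
    result = set()
    for s, p, o in graph:
        if p == TYPE and (o == cls or cls in superclasses(o, graph)):
            result.add(s)
    return result


def artifacts_depicting(event_cls, graph):
    """Competency question: which artifacts depict an event of the given class?"""
    events = instances_of(event_cls, graph)
    return sorted(s for s, p, o in graph if p == "ex:depicts" and o in events)
```

Asking `artifacts_depicting("ex:LifeEvent", triples)` finds `ex:panel_01` even though the panel is linked only to an `ex:Birth` event, because the class hierarchy is taken into account. This is the kind of answer a competency question is meant to check; an OWL reasoner or a SPARQL engine would normally perform this inference instead of hand-written code.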
6.3 Class Hierarchy

We determine the classes from the important enumerated terms of the Buddha life story, as shown in Fig. 6.

Fig. 6. Class Hierarchy of Buddha Ontology.

6.4 Ontology

Fig. 7. Buddha Ontology

6.5 Natural Language Process

The NLP process comprises the main steps shown in Fig. 8.

Fig. 8. NLP Process

NLP Steps on Buddha Document. The steps are as follows:

1. Load the Buddha document in GATE.
2. Create a corpus for the Buddha document (a corpus is a collection of documents, useful when processing multiple documents).
3. Select ANNIE (A Nearly-New Information Extraction system) and run the application with the Buddha corpus.
4. Select the processing resources (a set of parameters).
5. Run the pipeline with the selected resources (a pipeline is a set of selected parameters that are set for the document).
6. Create a data store (a data store is used to store annotated documents).
7. Store the Buddha document in the data store.

Fig. 9. NLP on Buddha Documents

6.6 Results of NLP on Buddha Document

Information extraction (IE) in GATE extracts facts and structured information from the Buddha document collection. IE returns knowledge in the form of named entities. Named entity recognition identifies names in text and classifies them into predefined categories of interest: persons; organizations (companies, government organizations, committees, etc.); locations (cities, countries, rivers, etc.); date and time expressions; addresses; sentences; splitters; and various other types as appropriate.

Fig. 10. Information Extraction from Buddha Document

Fig. 11. Information Extraction from Buddha Document

7 Ontology Evaluation and Validation

Ontology evaluation and validation are important during ontology development to verify that the ontology fulfills the requirements and to check for incompleteness and redundancy of classes and instances.

Fig. 12.
Buddha Ontology Validation and Evaluation

8 Conclusion

This work will be useful for reuse, will support better exploration of the artifacts and new knowledge discovery, and will eventually serve as a major contribution for the Peshawar museum API developers. The Buddha ontology makes this history more explicit and will help in developing the API, which is part of our future work. The ontology of the Buddha life story covers the main gallery of the Peshawar museum; it is useful for semantic digitization, and the relational database has been mapped into RDF data. The common information about the Buddha life story has been identified, which is useful for the preservation of Gandhara art and for developing, in future work, the application that narrates the life of Buddha. The Buddha ontology covers the events of the whole life of Buddha. It is related to the biography light event ontology but is specific to the life events of Buddha. The ontology was passed through the GATE software, which generates new concepts.

References

1. Jacob, Elin K. "Ontologies and the semantic web." Bulletin of the American Society for Information Science and Technology 29, no. 4 (2003): 19-22.
2. Xu, Xin, and Hubo Cai. "Semantic frame-based information extraction from utility regulatory documents to support compliance checking." In Advances in Informatics and Computing in Civil and Construction Engineering, pp. 223-230. Springer, Cham, 2019.
3. Khan, Shakeel Ahmad, and Rubina Bhatti. "Semantic Web and ontology-based applications for digital libraries: An investigation from LIS professionals in Pakistan." The Electronic Library 36, no. 5 (2018): 826-841.
4. Endara, Lorena, G. Burleigh, Laurel Cooper, Pankaj Jaiswal, and Marie-Angélique Laporte. "A natural language processing pipeline to extract phenotypic data from formal taxonomic descriptions with a focus on flagellate plants." (2018).
5. Noy, Natalya F., and Deborah L. McGuinness. "Ontology development 101: A guide to creating your first ontology." 2001. See http://protege.stanford.edu/publications (2004).
6. Malik, Sanjay Kumar, Nupur Prakash, and S. A. M. Rizvi. "Ontology Creation towards an Intelligent Web: Some Key Issues Revisited." International Journal of Engineering and Technology 3, no. 1 (2011): 44.
7. van Rotterdam, Jeroen M., Michael Mohen, Ravi Ranjan Jha, and Sreecharan Shroff. "Memory centric database architecture." U.S. Patent 10,049,041, issued August 14, 2018.
8. Khobragade, P. V., and Nilesh Uke. "Cogent Sharing of Covert File Using Audio Cryptographic Scheme." International Journal of Applied Information Systems 1, no. 8 (2012): 1-4.
9. Beno, Martin, Erwin Filtz, Sabrina Kirrane, and Axel Polleres. "Doc2RDFa: Semantic Annotation for Web Documents." (2019).
10. Cressey, Gillian. Diaspora youth and ancestral homeland: British Pakistani/Kashmiri youth visiting kin in Pakistan and Kashmir. Brill, 2006.
11. Al-Azzeh, J. S., M. Al Hadidi, R. Odarchenko, S. Gnatyuk, Z. Shevchuk, and Z. Hu. "Analysis of self-similar traffic models in computer networks." International Review on Modelling and Simulations 10(5), pp. 328-336, 2017.
12. Odarchenko, R., A. Abakumova, O. Polihenko, and S. Gnatyuk. "Traffic offload improved method for 4G/5G mobile network operator." Proceedings of 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET-2018), pp. 1051-1054, 2018.
13. Hassan, Z., R. Odarchenko, S. Gnatyuk, A. Zaman, and M. Shah. "Detection of Distributed Denial of Service Attacks Using Snort Rules in Cloud Computing & Remote Control Systems." Proceedings of the 2018 IEEE 5th International Conference on Methods and Systems of Navigation and Motion Control, October 16-18, 2018, Kyiv, Ukraine, pp. 283-288.
14. Zaliskyi, M., R. Odarchenko, S. Gnatyuk, Yu. Petrova, and A. Chaplits. "Method of traffic monitoring for DDoS attacks detection in e-health systems and networks." CEUR Workshop Proceedings, Vol. 2255, pp. 193-204, 2018.