Semantic Hadith: Leveraging Linked Data Opportunities for Islamic Knowledge

Amna Basharat, Dept. of Computer Science, University of Georgia, Athens, GA, 30605 USA, amnabash@uga.edu
Bushra Abro, Dept. of Computer Science, Islamic International University, Islamabad, Pakistan, bushraabro@hotmail.com
I. Budak Arpinar, Dept. of Computer Science, University of Georgia, Athens, GA, 30605 USA, budak@uga.edu
Khaled Rasheed, Dept. of Computer Science, University of Georgia, Athens, GA, 30605 USA, khaled@uga.edu

Copyrights held by author/owner(s). WWW2016 Workshop: Linked Data on the Web (LDOW2016), Montreal, Canada.

ABSTRACT

While the linked data paradigm has gathered much attention over recent years, the domain of Islamic knowledge has yet to capitalize on its full potential. The web-scale integration of Islamic texts and knowledge sources at large is currently not well facilitated. The two primary sources of Islamic legislation are the Qur’an and the Hadith (collections of Prophetic narrations); together they form the foundation for anyone wanting to learn Islam. This paper presents ongoing design and development efforts to semantically model and publish the Hadith, which holds a primary position as the next most important knowledge source after the Qur’an. We present the design of a linked data vocabulary not only for publishing these narrations as linked data, but also for linking these narrations with the verses of the Qur’an. We establish how the links between the Hadith and the Qur’anic verses may be captured and published using this vocabulary, as derived from the secondary and tertiary sources of knowledge. We present detailed insights into the potential, the design considerations and the use cases of publishing this wealth of knowledge as linked data.

CCS Concepts

• Information systems → Multilingual and cross-lingual retrieval; Information extraction; • Computing methodologies → Ontology engineering;

Keywords

linked data; hadith; Quran; Qur’an; semantic web; Islamic knowledge

1. INTRODUCTION

The vast body of Islamic creed and legislation is derived from, and based primarily on, the two most fundamental sources of Islam: the Qur’an and the Sunnah (way of life) of the Prophet Muhammad. The latter is contained within the vast body of Hadith literature [22]. Formally, the Hadith is defined as the (recorded) narrations of the sayings and deeds of the Prophet Muhammad.

Our research is primarily motivated by the need to overcome the inherent knowledge acquisition bottleneck in creating semantic content for semantic applications. We have established how this is particularly true for knowledge-intensive domains such as the domain of Islamic knowledge, which has failed to capitalize on the promised potential of the semantic web and linked data technology; standardized web-scale integration of the available knowledge resources is currently not facilitated at a large scale [7].

1.1 Background Context and Motivation

1.1.1 Importance of Hadith

To understand the importance of Hadith, the principles of Qur’anic understanding and the science of tafseer, or exegesis, must be considered. The verses in the Qur’an cannot be understood in isolation. The Hadith are used to illustrate the historical context, the reasons for revelation, and the elaboration of essential concepts that may not be directly evident. This important principle has been adopted by scholars across centuries to write scholarly commentaries and explanations. In fact, it is a necessary condition for producing an accurate tafseer of the Qur’an, as explained in detail by Philips [30].

To explain this principle, consider as an example Figure 1, a derived snapshot taken from QuranComplex (http://qurancomplex.gov.sa/Quran/Targama/Targama.asp), the official manuscript, with a translation and a commentary, provided by the Kingdom of Saudi Arabia. The snapshot shows two verses from the first chapter of the Qur’an. The translation is annotated with a commentary (given in the footnotes in this case) in order to provide additional details where important. It is worth noticing that most authentic and reliable commentaries draw knowledge from the sources of Hadith. In the case of this snapshot, verse 2 contains an annotation which provides an elaboration based on an authentic Hadith from one of the many collections of Hadith, called Sahih Bukhari, which is known to be the most authentic and reliable Hadith collection.

Figure 1: A snapshot of a typical Qur’anic Commentary

1.1.2 Motivation: Potential for Knowledge Formalization and Linking

Hundreds of thousands of Qur’anic commentaries have been produced over the last few centuries, in various languages, that draw upon and rely heavily on the Hadith sources to provide an interpretation of the Qur’anic verses. Given this fact, the potential for knowledge formalization and linking is not only evident; it cannot be overemphasized. Formally modeling this wealth of knowledge and the links would enable new ways of research, knowledge discovery and synthesis, which is the very motivation for this research. However, realizing this vision across the plethora of Islamic resources is a mammoth task. We outline some of the key challenges below.

1.1.3 Challenges in Interlinking Islamic Knowledge Sources

There have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran (http://datahub.io/dataset/semanticquran) [34] and QuranOntology (http://www.quranontology.com) [16].

However, there are no known publicly available sources of data or vocabularies published as linked data for the Hadith. There are a number of well-known Hadith repositories that allow browsing and searching the hadith collections, sunnah.com and dorar.net being the most prominent ones. We review some of the state of the art in computational approaches applied to Hadith texts in Section 6. Here, we would like to emphasize that interlinking the Qur’anic verses and the Hadith is a non-trivial task. We summarize some factors that make this extremely challenging. Most of the classical sources do not use a standardized numbering scheme for the Hadith. This is contrary to the Qur’anic verses, which have a standardized numbering scheme. There are multiple sources of the Hadith, which may have different levels of authenticity, a matter of discussion beyond the scope of this paper. Despite the fact that most Hadith collections have now been classified into authenticity categories, the mapping of this classification to the sources that cite them is only possible if the Hadith are extracted and linked in a formalized manner. In addition, the Hadith are of varying length, and oftentimes the commentator or the tafsir scholar will only quote a part of the Hadith or make a passing reference to it, making it extremely difficult to trace the original Hadith being cited. Furthermore, several Hadith may have common portions of narrations, which makes it all the more challenging to identify exactly which Hadith is being quoted or referred to. We believe that a knowledge formalization and linking mechanism, using the linked data standards, is the way forward for solving at least some of these challenges.

1.2 Contributions of the Paper

In this paper we make the following contributions:

• We provide the first of its kind linked data model, called Semantic Hadith, for publishing Hadith as Linked Data and for linking with other key knowledge sources in the Islamic domain, primarily the Qur’an.

• We present a classification of the various levels of links that may potentially be established between the Hadith, the Qur’an and other data sets on the linked data cloud. This classification spans various levels of granularity. We highlight the linking challenges and design issues with each one and present potential modeling solutions.

• We provide a knowledge extraction, linking and publishing framework that may be reused for publishing similar knowledge and linking it with the existing linked data cloud. We present our preliminary implementation of this framework.

2. ONTOLOGY FOR SEMANTIC HADITH

We first present an illustration of the structure of the Hadith, and then detail the design of the ontology for the Semantic Hadith.

2.1 Hadith Structure

Figure 2 shows a sample of a Hadith taken from sunnah.com (http://sunnah.com/bukhari/1). A given Hadith has two main parts: the actual narration or content portion of the Hadith, called the Matan, and the chain of narrators (reporters) through whom the narration has been transmitted and then recorded, traditionally known as the Sanad or simply the chain of narrators. The Sanad is a chronological chain of narrators, each mentioning the one from whom he heard the Hadith, all the way to the prime narrator of the Matan, followed by the Matan itself [32]. The Sanad plays the most important role in determining the authenticity of the Hadith, which is the most crucial indicator scholars resort to when determining whether to accept or reject a Hadith.

Figure 2: A Sample Hadith Snapshot

2.2 Ontology Schema

Figure 3 shows the conceptual model for publishing Hadith data on the LOD cloud. Here we summarize the key entities and relations that we chose to include in the conceptual design model of the Semantic Hadith ontology schema.

Figure 3: High Level Design of Semantic Hadith Ontology

• Hadith: This is the central entity in the domain model. Since there has been no standardized numbering scheme for the Hadith since the beginning, a few alternate numbering schemes may be encountered; therefore the provision to include alternate numberings is made.

• Matn: This is primarily a textual entity, which contains the main narration of the Hadith, without the chain of narrators or the Sanad.

• Narrator: A Narrator is essentially a Person, with the special role of a narrator of the Hadith. One narrator may have many Hadith attributed to him or her. If a narrator is the root narrator of the Hadith, then the Hadith is usually attributedTo him/her. This is shown by the relation between the Hadith and Narrator. Notice that in Figure 2 the English translation does not provide the entire NarratorChain; rather, it only provides the name of the narrator to whom the Hadith is attributed. However, this is not the case for the Arabic (original) version of the Hadith, which usually contains the entire chain of narrators. The chain is often omitted in books to simplify the hadith text for the reader and make it more meaningful and relevant. However, the NarratorChain is considered indispensable for determining the validity and authenticity of the Hadith, especially if no other validation source is mentioned.

• Sanad (NarratorChain): This is an entity which contains a reference to a Narrator entity, and a level, which indicates the position of the narrator in the chain. The same narrators may appear in many chains.

• HadithClass: This indicates the authenticity level of the Hadith. These are detailed in [32].

• HadithChapter, HadithBook and HadithCollection: These are entities meant for the structural organization of the Hadith. A Hadith is a part of a Chapter, which usually contains thematically co-related collections of Hadith. Chapters are collected in Books, and Books are compiled as a Collection or Volume.

2.3 Vocabulary Design

We choose the hvoc prefix for the SemanticHadith vocabulary, as in the domain model. We also ensure reuse of well-established linked data vocabularies such as FOAF (http://xmlns.com/foaf/spec/) [10], SKOS (https://www.w3.org/2004/02/skos/) [27], and DublinCore (http://dublincore.org/) [35]. We also provide equivalence relations where applicable. Some of the most relevant equivalence relations are with the bibo ontology (http://bibliontology.com).

3. LINK MODELING AND DESIGN ISSUES

One of the most important constituents in the design of Semantic Hadith is facilitating the interlinking of knowledge at various levels. We have earlier described the Macro-Structure for Islamic Knowledge in [7]. We distinguish between the nature of links based on the level of granularity at which they are modeled. A Macro-level link is considered to be one where the source entity is either at the level of a Verse in the Qur’an or a Hadith in a Hadith Collection. If a link is established for a group of Verses or Hadith, then it is also considered to be at the Macro-level. A Micro-level link is at a sub-verse, sub-Hadith, word or phrase level. Within the scope of this paper, we detail only the Macro-level links of the most essential types.

3.1 Hadith-to-Hadith Links

An essential type of link to be established is one whereby one Hadith is linked or related to another Hadith. This could be done for Hadith which are part of the same collection, or between Hadith that are part of different collections. These relations may be of the following primary types: 1) Two Hadith may be considered to be related if they have the same ’sanad’. 2) Two Hadith may be considered to be related if they have the same ’matan’. Note that two Hadith may occur in the same collection, in two different chapters, under different thematic categorizations; however, they may be enumerated or numbered differently. Therefore, by asserting such Hadith as similar/related or identical, we aim to make these links explicit. Oftentimes, the same Hadith may be made part of a different collection, and therefore asserting an identity link becomes crucial. This is illustrated in Figure 4.

Figure 4: Conceptual Design model for Hadith-Hadith Relationship

To handle the annotations between two Hadith, we define an entity called HadithRelation, for which the source and destination represent the two ends of the relation. The relation would often have a common Theme. The RelationType indicates whether the two Hadith are similar, indicated by Identity as the RelationType, or whether one Hadith elaborates another, indicated by Elaboration, and so forth. These relation types are not exhaustive and may be iteratively refined.

3.2 Linking the Qur’an and Hadith

One of the most significant aspects is linking the Hadith dataset with the verses in the existing Qur’an datasets. We distinguish two types of relationships that may occur between the Qur’an and the Hadith: 1) There may be Verses, the entirety or part of which may be ’Cited’ or quoted in a Hadith. This is the most direct kind of relation that exists between a Hadith and a Verse. 2) The other relations are those that can only be derived from scholarly commentaries. The design and modeling issues for both these types are delineated below.

3.2.1 Verse to Hadith Links based on Direct Citations

A direct link between a Hadith and a Verse is characterized as one whereby a Hadith contains within its main body a complete verse or a meaningful portion of it. This is modeled in Figure 5. A Citation entity is created, which is a specific reference to a relation with its source as a Hadith and its destination as the Verse, indicating that it is the Hadith that encapsulates the Verse. It is considered important that we characterize the CitationType as either Complete, Partial or In-Direct. A Complete citation includes the entire verse in the body of the Hadith, and the Verse is quoted as such. A Partial citation may only contain part of the Verse in the body of the Hadith. To indicate this, the sub-verse entity is introduced, which identifies the part of the Verse citedIn the Hadith. This is indicated by the relation characterizedBy. It is important to annotate and capture the sub-verse, since different portions of the same verse may be linked to different Hadith.

Figure 5: Conceptual Design model for Hadith-Verse Relationship

3.2.2 Verse to Hadith Links based on Scholarly Commentaries

Another important type of link to be established between the Hadith and the Qur’anic verses is shown in the model conceptualized in Figure 6. This is based on the earlier motivation, provided on the basis of Figure 1. In this type of relation, we create an entity Verse-Hadith-Relation. In this case, the source is a Verse and the destination is a Hadith. The reason is that the Hadith will always be used to elaborate or provide the context for the verse in any given commentary or book of exegesis. The RelationType may be provided. In this relation type, the most important aspect is establishing the source of the authority of the relation. This is established by the relation uponAuthorityOf with a Scholar and a relation establishedIn with a Book. The Book is naturally authoredBy the Scholar to whom the relation is attributed.

Figure 6: Conceptual Design model for Verse to Hadith Relationship based on Scholarly Commentaries

3.3 Linking Hadith with other Datasources

We aim to make provision for linking the Semantic Hadith with other available datasources in the LOD cloud. We present a high level view of the linked cloud model for Islamic knowledge in Figure 7. We also mention those datasources which, although not directly available on the LOD cloud, present potential for linking.

Figure 7: A view of the proposed and available Linked Data Cloud for Islamic Knowledge Sources

3.3.1 Linking with Existing Datasources in the LOD Cloud

The two available datasources to which the Semantic Hadith is linked are QuranOntology and SemanticQuran. SemanticQuran links itself to DBpedia (http://dbpedia.org) and Wiktionary (http://wiktionary.dbpedia.org/). To begin with, links would be established between entities in the Hadith and the ones in these two datasets. For this, information extraction would be carried out. There are some important datasources which are not directly part of the linked data cloud but have been made available through QuranOntology and SemanticQuran. These are shown in Figure 7, namely: QuranyTopics (http://quranytopics.appspot.com), QuranCorpus (corpus.quran.com), and Tanzil (tanzil.net).

One essential linking aspect would be to thematically map the QuranyTopics to those of HadithTopics.

3.3.2 Linking with other sources

There are other datasources that we plan to link with in the future. These include a scholars database from a source such as MuslimScholarsDatabase (http://muslimscholars.info) or eNarrator (Hadith Isnad Ontology) [32] [4]. The major limitation is that these sources are not currently available in Linked Data format. However, they present huge potential for linking.

4. LINKED DATA PUBLISHING FRAMEWORK AND IMPLEMENTATION

In order for Semantic Hadith to become a de facto standard and an integral part of the emerging Semantic Web and the LOD cloud for the Islamic knowledge domain, we also aimed at providing a reusable framework for publishing available Hadith-based knowledge sources as linked data. This is shown in Figure 8. As elaborated in Section 1.1.3, there are multiple hadith repositories available. Therefore, this reusable framework will help multiple hadith publishers not only to expose their data, but also to establish equivalence links with other repositories. This would be essential towards realizing the vision of linked Islamic knowledge as presented in [7].

Figure 8: Linked Data Generation and Publishing Framework for Semantic Hadith

4.1 Overview of the Framework

The key stages of the framework shown in Figure 8 include:

1) Data Selection, where the data source is selected;
2) Vocabulary Design and Selection, where conceptual and formal knowledge modelling is carried out;
3) Knowledge Extraction, where the process of information and knowledge extraction is carried out;
4) RDF Generation, where the extracted knowledge is converted into the RDF format;
5) Publishing, Linking and Validation, which makes the converted RDF data available via a SPARQL endpoint; and
6) Consumption, the last stage, where the dataset now available as linked data may be consumed by applications.

4.2 Implementation Details

We provide some key details of the ongoing implementation process: the dataset used for publishing as linked data, and the knowledge extraction and linking mechanism. We summarize some key results and also highlight some challenges and limitations faced in the implementation process.

4.2.1 Data Sources

As the first Hadith repository to be annotated using the Semantic Hadith Model, we have taken the data of Sunnah.com, which is a structured data repository of some of the most well known and authentic collections of Hadith. The foremost collections are those of Sahih Bukhari and Sahih Muslim. Altogether, there are 11 collections in this dataset, with over 25,000 Hadith.

4.2.2 Knowledge Extraction and Linking

For the initial implementation, we focused on extracting some of the key relations explained earlier.

We extracted Verse-Hadith links from the QComplex Commentary (http://qurancomplex.gov.sa/Quran/Targama/Targama.asp). This is one of the few datasources from which we were able to extract numbered hadith references that could be automatically mapped to the hadith collections available to us. An example of such a reference is shown in Figure 1. A pattern extraction module was designed to parse the contents of the commentary. The content of the verses, translation and footnotes was segmented. The mapping between the verses and the corresponding footnotes was easy, given the direct correlation. Pattern matching was then applied to extract the collection name, volume number and hadith number. This was then mapped to the numbers in our hadith collection. This can be challenging at times, because not all hadith collections use the same numbering convention. In such a case, it is non-trivial to map the hadith citation to the corresponding hadith in the repository, and human intervention is required for validation. We were able to obtain and validate some 300 verse to hadith relations. Since the commentary is not a detailed one, with comments only sparingly included as footnotes to the verse translations, it was expected that this number would be small.

We also performed text mining on the Arabic text of the hadith data to obtain the Hadith-Verse citations, as described in Section 3.2.1. For this, we developed a verse-extraction component, which implements sub-string matching in order to detect complete or partial verses that may be cited in a given hadith. This is not trivial for several reasons. Different verses span different lengths in the Qur’an. While some may be as long as an entire page of a standard-size book, others may be as short as one or two words. Therefore, determining whether a verse is actually being quoted or cited in a hadith requires further validation. Even applying a threshold relative to the length of the verse is not an optimal solution. Setting a substantial minimal length was considered, but this may not guarantee comprehensive coverage. For the first prototype, only 1,325 expert-validated links were asserted. In the sunnah.com data, these links may be found as hyperlinks to the verses on the site quran.com.

In addition, similarity computation algorithms were devised to extract Hadith-Hadith similarity relations. The 4,973 relations listed in Table 2 are strongly similar Hadith that have at least 60% of their text in common. However, the challenge with this approach is that it cannot distinguish whether the similarity is in the Sanad, the Matan or both. The more meaningful similarities of interest are in the Matan of the hadith. In future experiments, we aim to segment the Sanad and the Matan and extract the respective similarity relations. While the similarity threshold for the current approach only took into consideration the common substring, we plan to conduct experiments with more meaningful similarity measures such as Cosine, Jaccard and the Pearson correlation coefficient, as done in our work for Qur’anic verses [8].

4.2.3 Results

Based on the dataset and experiments carried out, we summarize some of the dataset and link statistics in Tables 1 and 2. Table 1 summarizes the statistics for some of the key entities present in the dataset.

Table 2 provides the raw count of the candidate relations extracted under the different categories mentioned. It must be noted, however, that the relations are not classified according to any of the parameters mentioned in the design. It is also worth mentioning that some of these relations may actually be symmetric.

Table 1: Entity Statistics in the Semantic Hadith Dataset (Sunnah.com)

Entity                     Count
No. of Collections         11
No. of Books               311
No. of Hadith (Arabic)     25,934
No. of Hadith (English)    18,040
No. of Chapters            8,968

Table 2: Link Statistics in the Semantic Hadith Dataset

Link Type                             Count
Hadith-Hadith Relation                4,973
Hadith-Verse Relation (Citations)     1,325
Verse-Hadith Relation (Scholarly)     313

4.3 Existing Limitations and Proposed Solutions

[...] link extraction and validation. There is an obvious lack of structured knowledge sources with well marked citations. Therefore, the Verse-Hadith links are extremely difficult to extract using computational means alone. Human contribution is a must. For this purpose, we intend to pursue a crowdsourcing approach, based on our prior work [6]. We intend to use crowdsourcing and human computation methods not only for knowledge acquisition, but also for knowledge validation. In fact, we believe a hybrid human-machine computation methodology to be the only means of fulfilling the vision for linked Islamic knowledge at scale, while ensuring the desired reliability and authenticity.

5. PROSPECTIVE APPLICATIONS

The most significant benefit of realizing the linked data vision for Islamic knowledge sources will be in enabling semantics-driven distributed knowledge search and retrieval. Most current applications in the Islamic domain offer only limited provision for semantic and conceptual search and retrieval beyond traditional keyword-based searches over a single repository. With the Semantic Hadith model, first of their kind tools will now be possible that let Qur’an and Hadith repositories be queried and searched in a federated manner.

[...] the Semantic Hadith and Semantic Quran datasets. Given that a Verse-Hadith relation exists within the Semantic Hadith dataset, this query retrieves the Arabic and English texts for the respective verse.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hvoc:
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX qvoc:

select ?hadith_text ?surahNo ?verseNo
       ?ayahText ?ayahEng
WHERE {
  ?verse hvoc:isRelatedTo ?hadith ;
         hvoc:verseNo ?verseNo ;
         hvoc:surahNo ?surahNo .
  ?hadith hvoc:hadithId ?hId ;
          hvoc:hadithText ?hadith_text .

  SERVICE {
    ?s qvoc:chapterIndex ?surahNo ;
       qvoc:verseIndex ?verseNo ;
       rdfs:label ?ayahText ;
       rdfs:label ?ayahEng .
    FILTER (lang(?ayahEng) = "en" &&
            lang(?ayahText) = "ar")
  }}

This could be taken to another level by adding another level of federation and querying the themes of the verse from the QuranOntology.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hvoc:
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX qur:

select ?hadith_text ?surahNo ?verseNo ?tname
WHERE {
  ?verse hvoc:isRelatedTo ?hadith ;
         hvoc:verseNo ?verseNo ;
         hvoc:surahNo ?surahNo .
  ?hadith hvoc:hadithId ?hId ;
          hvoc:hadithText ?hadith_text .

  SERVICE {
    ?verse qur:DiscussTopic ?t .
    ?t rdfs:label ?tname .
    FILTER(LANGMATCHES(LANG(?tname), "ar"))
  }}

This could be further enhanced by automated interlinking with other available datasources on the linked data cloud, as envisioned in Figure 7. For instance, once the available hadith are annotated with the events, places or people they mention, they may be linked to the available entities in DBpedia. This would enable richer knowledge discovery and retrieval for a range of applications.

We expect that, using this model, more hadith and Qur’anic exegesis repositories that rely on and cite the hadith sources heavily will be published in the linked data format. This will enable the design and development of enhanced learning tools for the Islamic domain, which will provide efficient and personalized access to primary sources of knowledge, ensuring reliability and authenticity. Given that these tools will give better access to meaningfully interlinked knowledge, it will require less effort to find resources and access knowledge beyond books. More content, both classical and contemporary, would become discoverable.

6. RELATED WORK

The linked data approach has emerged as the de facto standard for sharing data on the web. It provides a set of best practices for publishing and connecting structured data on the web [9]. The linked data design issues provide guidelines on how to use standardized web technologies to set data-level links between data from different sources [23]. Increased interest in the LOD has been seen in various sectors, e.g. education [11], [31], scientific research [3], libraries [28], [25], government [12], [24], [33], cultural heritage [26] and many others; however, the religious sector has yet to capitalize on the power of linked open data.

Research in computational informatics applied to Islamic knowledge has primarily centered around morphological annotation of the Qur’an [13], [14], ontology modeling of the Qur’an [1], [5], [15], [36], [37], and Arabic natural language processing [15]. The LOD take-up in the area of Islamic knowledge has been extremely limited. As mentioned earlier, there have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran (http://datahub.io/dataset/semanticquran) [34] and QuranOntology (http://www.quranontology.com) [16].

Much of the work in the Hadith sciences has focused on automating the extraction of the chain of narrators. Some works in this regard include [20], [18], [17], [19], [21]. There are also works on mining the hadith for indexing and classification [2], [29]. Some recent efforts have attempted to model the hadith as semantic ontologies [4] [32]. However, these efforts have focused on annotating the different constituents of the hadith. None of these datasources are available as open source.

Our work is the first of its kind to propose a linked data based model for linking the hadith with the Qur’an. This linked knowledge forms a vital backbone to enable better integration and discovery of knowledge sources.

7. CONCLUSIONS AND FUTURE WORK

In this paper we presented the design and development of our Semantic Hadith framework, which aims to provide the foundation for semantically interlinking the most important Islamic knowledge sources using the linked data standards. We presented the design of the Semantic Hadith Ontology and explained the nature of its links with other data sources. The implementation still needs to mature. The validation of the links and extracted knowledge is a huge challenge that we are looking into. We are investigating crowdsourcing models for knowledge acquisition and validation at scale.

8. ACKNOWLEDGEMENTS

We are thankful to the authors of sunnah.com for providing us with the valuable datasource to carry out this research. We would also like to acknowledge the efforts of Mr. Muhammad Shoaib (Jeju National University, South Korea) in extending his help with some of the experiments.

9. REFERENCES

[1] H. S. Al-Khalifa, M. Al-Yahya, A. Bahanshal, I. Al-Odah, and N. Al-Helwah. An approach to compare two ontological models for representing quranic words. In Proceedings of the 12th International Conference on Information Integration and Web-based Applications and Services, pages 674–678. ACM.
[2] K. A. Aldhlan and A. M. Zeki. Datamining and islamic knowledge extraction: alhadith as a knowledge resource. In Information and Communication Technology for the Muslim World (ICT4M), 2010 International Conference on, pages 21–25. IEEE, 2010.
[3] T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. Pettifer, and D. Thorne. Utopia documents: linking scholarly literature with research data. Bioinformatics, 26(18):568–574, 2010.
[4] A. Azmi and N. B. Badia. itree-automating the construction of the narration tree of hadiths (prophetic traditions). pages 1–7, 2010.
[5] S. Baqai, A. Basharat, H. Khalid, A. Hassan, and S. Zafar. Leveraging semantic web technologies for standardized knowledge modeling and retrieval from the holy qur’an and religious texts. In Proceedings of the 7th International Conference on Frontiers of Information Technology, FIT ’09, pages 42:1–42:6, New York, NY, USA, 2009. ACM.
[6] A. Basharat, I. B. Arpinar, S. Dastgheib, U. Kursuncu, K. Kochut, and E. Dogdu. Semantically enriched task and workflow automation in crowdsourcing for linked data management. International Journal of Semantic Computing, 8(04):415–439, 2014.
[7] A. Basharat, K. Rasheed, and I. B. Arpinar. Towards linked open islamic knowledge using human computation and crowdsourcing. In Proceedings of the International Conference on Islamic Applications in Computer Science And Technology, 2015.
[8] A. Basharat, D. Yasdansepas, and K. Rasheed. Comparative study of verse similarity for multi-lingual representations of the qur’an. In Proc. on the Int. Conference on Artificial Intelligence (ICAI), pages 336–343, 2015.
[9] C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. International Journal on Semantic Web and Information Systems, 5(3):1–22, 2009.
[10] D. Brickley and L. Miller. Foaf vocabulary specification 0.98. Namespace document, 9, 2012.
[11] S. Dietze, S. Sanchez-Alonso, H. Ebner, H. Q. Yu, D. Giordano, I. Marenzi, and B. P. Nunes. Interlinking educational resources and the web of data: a survey of challenges and approaches. Program-Electronic Library and Information Systems, 47(1):60–91, 2013.
[12] L. Ding, T. Lebo, J. S. Erickson, D. DiFranzo, G. T. Williams, X. Li, J. Michaelis, A. Graves, J. G. Zheng, Z. Shangguan, J. Flores, D. L. McGuinness, and J. A. Hendler. Twc logd: A portal for linked open government data ecosystems. Journal of Web Semantics, 9(3):325–333, 2011.
[13] K. Dukes, E. Atwell, and A. M. Sharaf. Syntactic annotation guidelines for the quranic arabic dependency treebank. In Proceedings of the
[...] application to hadith indexing. In Applications of Digital Information and Web Technologies, 2008. ICADIWT 2008. First International Conference on the, pages 107–112. IEEE, 2008.
[22] S. Hasan. An introduction to the science of Hadith. Al-Quran Society, 1994.
[23] T. Heath and C. Bizer. Linked data: Evolving the web into a global data space. Synthesis lectures on the semantic web: theory and technology, 1(1):1–136, 2011.
[24] J. Hendler, J. Holm, C. Musialek, and G. Thomas. Us government linked open data: semantic.data.gov. IEEE Intelligent Systems, 27(3):0025–31, 2012.
[25] L. C. Howarth. Frbr and linked data: Connecting frbr and linked data. Cataloging and Classification Quarterly, 50(5-7):763–776, 2012.
[26] J. Marden, C. Li-Madeo, N. Whysel, and J. Edelstein. Linked open data for cultural heritage: evolution of an information technology. pages 107–112, 2013.
[27] A. Miles, B. Matthews, M. Wilson, and D. Brickley. Skos core: Simple knowledge organisation for the web. In Proceedings of the 2005 International Conference on Dublin Core and Metadata Applications: Vocabularies in Practice, DCMI ’05, pages 1:1–1:9. Dublin Core Metadata Initiative, 2005.
[28] E. Miller and M. Westfall. Linked data and libraries. The Serials Librarian, 60(1-4):17–22, 2011.
[29] M. Naji Al-Kabi, G. Kanaan, R. Al-Shalabi, S. I. Al-Sinjilawi, and R. S. Al-Mustafa. Al-hadith text classifier. Journal of Applied Sciences, 5:584–587, 2005.
[30] A. A. B. Philips. Usool at-Tafseer: The Methodology of Qur’aanic Explanation. AS Noordeen, 2002.
International Conference on Language Resources and [31] N. Piedra, E. Tovar, R. Colomo-Palacios, Evaluation, LREC 2010, 17-23 May 2010, Valletta, J. Lopez-Vargas, and J. A. Chicaiza. Consuming and Malta, 2010. producing linked open data: the case of [14] K. Dukes and N. Habash. Morphological annotation of opencourseware. Program: electronic library and quranic arabic. In Proceedings of the International information systems, 48(1):16–40, 2014. Conference on Language Resources and Evaluation, [32] Y. M. D. Rebhi S. Baraka. Building hadith ontology LREC 2010, 17-23 May 2010, Valletta, Malta, 2010. to support the authenticity of isnad. International [15] A. Farghaly and K. Shaalan. Arabic natural language Journal on Islamic Applications in Computer Science processing: Challenges and solutions. ACM And Technology, 2(1):25–39, 2014. Transactions on Asian Language Information [33] N. Shadbolt, K. O’Hara, T. Berners-Lee, N. Gibbins, Processing, 8:1–22, 2009. H. Glaser, and W. Hall. Linked open government [16] A. Hakkoum and S. Raghay. Ontological approach for data: Lessons from data. gov. uk. IEEE Intelligent semantic modeling and querying the qur’an. In Systems, 27(3):16–24, 2012. Proceedings of the International Conference on Islamic [34] M. A. Sherif and A.-C. N. Ngomo. Semantic Quran - a Applications in Computer Science And Technology, multilingual resource for natural-language processing. 2015. Semantic Web, 6(4):339–345, 2015. [17] F. Harrag. Text mining approach for knowledge [35] S. Weibel, J. Kunze, C. Lagoze, and M. Wolf. Dublin extraction in sahih al-bukhari. Computers in Human core metadata for resource discovery. Technical report, Behavior, 30:558–566, 2014. 1998. [18] F. Harrag, E. El-Qawasmeh, and A. M. S. Al-Salman. [36] A. R. Yauri, R. A. Kadir, A. Azman, and M. A. A. Extracting Named Entities from Prophetic Narration Murad. Quranic-based concepts: Verse relations Texts (Hadith), pages 289–297. Springer, 2011. extraction using manchester owl syntax. 
In [19] F. Harrag and A. Hamdi-Cherif. Uml modeling of text Information Retrieval and Knowledge Management mining in arabic language and application to the (CAMP), 2012 International Conference on, pages prophetic traditions hadiths. pages 11–20, 2007. 317–321. IEEE. [20] F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmeh. [37] A. R. Yauri, R. A. Kadir, A. Azman, and M. A. A. Information retrieval architecture for hadith text Murad. Quranic verse extraction base on concepts mining. Journal of Digital Information Management, using owl-dl ontology. Research Journal of Applied 6(6), 2008. Sciences, Engineering and Technology, [21] F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmeh. 6(23):4492–4498, 2013. Vector space model for arabic information retrieval –