=Paper= {{Paper |id=Vol-1593/article-14 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-1593/article-14.pdf |volume=Vol-1593 |dblpUrl=https://dblp.org/rec/conf/www/BasharatAAR16 }} ==None== https://ceur-ws.org/Vol-1593/article-14.pdf
     Semantic Hadith: Leveraging Linked Data Opportunities
                     for Islamic Knowledge

                                   Amna Basharat                           Bushra Abro
                              Dept. of Computer Science            Dept. of Computer Science
                                University of Georgia            Islamic International University
                               Athens, GA, 30605 USA                  Islamabad, Pakistan
                               amnabash@uga.edu                  bushraabro@hotmail.com

                                  I. Budak Arpinar                      Khaled Rasheed
                              Dept. of Computer Science           Dept. of Computer Science
                                University of Georgia               University of Georgia
                               Athens, GA, 30605 USA               Athens, GA, 30605 USA
                                  budak@uga.edu                         khaled@uga.edu

ABSTRACT
While the linked data paradigm has gained much attention in recent years, the domain of Islamic knowledge has yet to capitalize on its full potential. The web-scale integration of Islamic texts and knowledge sources at large is currently not well facilitated. The two primary sources of Islamic legislation are the Qur’an and the Hadith (collections of Prophetic narrations), which form the foundation for anyone wanting to learn Islam. This paper presents ongoing design and development efforts to semantically model and publish the Hadith, which holds a primary position as the next most important knowledge source after the Qur’an. We present the design of a linked data vocabulary for publishing these narrations as linked data, and describe the mechanism for linking these narrations with the verses of the Qur’an. We establish how the links between the Hadith and the Qur’anic verses may be captured and published using this vocabulary, as derived from the secondary and tertiary sources of knowledge. We present detailed insights into the potential, the design considerations and the use cases of publishing this wealth of knowledge as linked data.

CCS Concepts
•Information systems → Multilingual and cross-lingual retrieval; Information extraction; •Computing methodologies → Ontology engineering;

Keywords
linked data; hadith; Quran; Qur’an; semantic web; Islamic knowledge

1. INTRODUCTION
The vast body of Islamic creed and legislation derives from, and is based primarily on, the two most fundamental sources of Islam: the Qur’an and the Sunnah (way of life) of the Prophet Muhammad. The latter is contained within the vast body of Hadith literature [22]. Formally, the Hadith is defined as the (recorded) narrations of the sayings and deeds of the Prophet Muhammad.

Our research is primarily motivated by the need to overcome the inherent knowledge acquisition bottleneck in creating semantic content for semantic applications. We have established how this is particularly true for knowledge-intensive domains such as the domain of Islamic knowledge, which has failed to capitalize on the promised potential of the semantic web and linked data technology; standardized web-scale integration of the available knowledge resources is currently not facilitated at a large scale [7].

1.1 Background Context and Motivation

1.1.1 Importance of Hadith
To understand the importance of Hadith, the principles of Qur’anic understanding and the science of tafseer, or exegesis, must be considered. The verses in the Qur’an cannot be understood in isolation. The Hadith are used to illustrate the historical context, the reasons for revelation and the elaboration of essential concepts that may not be directly evident. This important principle has been adopted by scholars across centuries to write scholarly commentaries and explanations. In fact, it is a necessary condition for producing an accurate tafseer of the Qur’an, as explained in detail by Philips [30].

To explain this principle, consider as an example Figure 1, a derived snapshot taken from QuranComplex¹, the official manuscript, with a translation and a commentary, provided by the Kingdom of Saudi Arabia. The snapshot shows two verses from the first chapter of the Qur’an. The translation is annotated with a commentary (given in the footnotes in this case) in order to provide additional details where important. It is worth noticing that most authentic and reliable commentaries draw knowledge from the sources of Hadith. In the case of this snapshot, verse 2 contains an annotation which provides an elaboration based on an authentic Hadith, from one of the many collections

Copyrights held by author/owner(s). WWW2016 Workshop: Linked Data on the Web (LDOW2016), Montreal, Canada
¹ http://qurancomplex.gov.sa/Quran/Targama/Targama.asp
                               Figure 1: A snapshot of a typical Qur’anic Commentary


of Hadith, called Sahih Bukhari, which is known to be the most authentic and reliable Hadith collection.

1.1.2 Motivation: Potential for Knowledge Formalization and Linking
There are hundreds and thousands of Qur’anic commentaries produced over the last few centuries, in various languages, that draw upon and rely heavily on the Hadith sources to provide an interpretation of the Qur’anic verses. Given this fact, the potential for knowledge formalization and linking is not only evident; it cannot be overemphasized. Formally modeling this wealth of knowledge and the links would enable new ways of research, knowledge discovery and synthesis, which is the very motivation for this research. However, realizing this vision across the plethora of Islamic resources is a mammoth task. We present some of the key challenges below.

1.1.3 Challenges in Interlinking Islamic Knowledge Sources
There have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with Semantic Hadith are SemanticQuran² [34] and QuranOntology³ [16].

However, there are no known publicly available sources of data or vocabularies published as linked data for the Hadith. There are a number of well-known Hadith repositories that support browsing and searching the hadith collections, sunnah.com and dorar.net being the most prominent ones.

We review some of the state of the art in computational approaches applied to Hadith texts in Section 6. Here, we would like to emphasize that interlinking the Qur’anic verses and the Hadith is a non-trivial task. We summarize some factors that make this extremely challenging. Most of the classical sources do not use a standardized numbering scheme for the Hadith. This is in contrast to the Qur’anic verses, which have a standardized numbering scheme. There are multiple sources of the Hadith, which may have different levels of authenticity, a matter of discussion beyond the scope of this paper. Despite the fact that most Hadith collections have now been classified into authenticity categories, the mapping of this classification to the sources that cite them is only possible if the Hadith are extracted and linked in a formalized manner. In addition, the Hadith are of varying length, and oftentimes the commentator or the tafsir scholar will only quote a part of the Hadith or make a passing reference to it, making it extremely difficult to trace the original Hadith being cited. Moreover, several Hadith may share common portions of narration, which makes it all the more challenging to identify exactly which Hadith is being quoted or referred to. We believe that a knowledge formalization and linking mechanism, using the linked data standards, is the way forward for solving at least some of these challenges.

1.2 Contributions of the Paper
In this paper we make the following contributions:

• We provide the first of its kind linked data model, called Semantic Hadith, for publishing Hadith as Linked Data and for linking with other key knowledge sources in the Islamic domain, primarily the Qur’an.

• We present a classification of the various levels of links that may potentially be established between the Ha-

² http://datahub.io/dataset/semanticquran
³ http://www.quranontology.com
                                           Figure 2: A Sample Hadith Snapshot


dith, the Qur’an and other data sets on the linked data cloud. This classification spans various levels of granularity. We highlight the linking challenges and design issues with each one and present potential modeling solutions.

• We provide a knowledge extraction, linking and publishing framework that may be reused for publishing similar knowledge and linking it with the existing linked data cloud. We present our preliminary implementation of this framework.

2. ONTOLOGY FOR SEMANTIC HADITH
We first present an illustration of the structure of the Hadith, and then detail the design of the ontology for Semantic Hadith.

2.1 Hadith Structure
Figure 2 shows a sample of a Hadith taken from sunnah.com⁴. A given Hadith has two main parts: the actual narration, or content portion of the Hadith, is called the Matan, and the chain of narrators (reporters) through whom the narration has been transmitted and then recorded is traditionally known as the Sanad, or simply the chain of narrators. The Sanad is a chronological chain of narrators, each mentioning the one from whom he heard the Hadith, all the way to the prime narrator of the Matan, followed by the Matan itself [32]. The Sanad plays the most important role in determining the authenticity of the Hadith, which is the most crucial indicator scholars resort to when determining whether to accept or reject a Hadith.

2.2 Ontology Schema
Figure 3 shows the conceptual model for publishing Hadith data on the LOD cloud. Here we summarize the key entities and relations that we chose to include in the conceptual design model of the Semantic Hadith ontology schema.

• Hadith: This is the central entity in the domain model. Since there has been no standardized numbering scheme for the Hadith from the beginning, a few alternate numbering schemes may be encountered; provision is therefore made to include alternate numberings.

• Matn: This is primarily a textual entity, which contains the main narration of the Hadith, without the chain of narrators or the Sanad.

• Narrator: A Narrator is essentially a Person, with the special role of a narrator of the Hadith. One narrator may have many Hadith attributed to him or her. If a narrator is the root narrator of the Hadith, then the Hadith is usually attributedTo him/her. This is shown by the relation between the Hadith and Narrator. Notice that in Figure 2 the English translation does not provide the entire NarratorChain; rather, it only provides the name of the narrator to whom the Hadith is attributed. However, this is not the case for the Arabic (original) version of the Hadith, which usually contains the entire chain of narrators. The chain is often omitted in books to simplify the hadith text for the reader and make it more meaningful and relevant. However, the NarratorChain is considered indispensable for determining the validity and authenticity of the Hadith, especially if no other validation source is mentioned.

• Sanad (NarratorChain): This is an entity which will contain reference to a Narrator entity, and a level,

⁴ http://sunnah.com/bukhari/1
                              Figure 3: High Level Design of Semantic Hadith Ontology
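
As a rough illustration of the conceptual model in Figure 3, the sketch below represents the main entities as Python dataclasses and serializes one Hadith as Turtle. Only the hvoc prefix and the attributedTo relation come from the text; the namespace IRI, the remaining property names (matn, alternateNumber, hadithClass, chainLink, narrator, level) and the sample identifiers are assumptions made purely for illustration, not the published vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical namespace IRI; the paper names the `hvoc` prefix but not its IRI.
HVOC = "http://example.org/hvoc#"


@dataclass
class Narrator:
    """A Person in the role of Hadith narrator."""
    narrator_id: str
    name: str


@dataclass
class ChainLink:
    """One entry of a Sanad (NarratorChain): a narrator and its level."""
    narrator: Narrator
    level: int  # position of the narrator in the chain


@dataclass
class Hadith:
    hadith_id: str
    matn: str                                  # narration text without the chain
    attributed_to: Optional[Narrator] = None   # root narrator
    sanad: List[ChainLink] = field(default_factory=list)
    alternate_numbers: List[str] = field(default_factory=list)
    hadith_class: Optional[str] = None         # authenticity level


def literal(text: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes, quotes, newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_turtle(h: Hadith) -> str:
    """Serialize one Hadith as Turtle; property names are illustrative."""
    props = [f"hvoc:matn {literal(h.matn)}"]
    if h.attributed_to is not None:
        props.append(f"hvoc:attributedTo hvoc:{h.attributed_to.narrator_id}")
    for number in h.alternate_numbers:
        props.append(f"hvoc:alternateNumber {literal(number)}")
    if h.hadith_class is not None:
        props.append(f"hvoc:hadithClass {literal(h.hadith_class)}")
    # Chain entries are emitted in chain order as blank nodes.
    for link in sorted(h.sanad, key=lambda entry: entry.level):
        props.append(
            f"hvoc:chainLink [ hvoc:narrator hvoc:{link.narrator.narrator_id} ; "
            f"hvoc:level {link.level} ]"
        )
    body = " ;\n    ".join(props)
    return (
        f"@prefix hvoc: <{HVOC}> .\n\n"
        f"hvoc:{h.hadith_id} a hvoc:Hadith ;\n    {body} ."
    )


if __name__ == "__main__":
    # Dummy data only; no real narration text is used here.
    root = Narrator("narrator-1", "Example Narrator")
    sample = Hadith("hadith-1", "placeholder narration text", attributed_to=root,
                    sanad=[ChainLink(root, 1)], alternate_numbers=["1"])
    print(to_turtle(sample))
```

Keeping the level on the chain entry rather than on the Narrator reflects the remark that the same narrator may appear in many chains at different positions.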


which will indicate the sequence of the narrator in the chain. The same narrators may appear in many chains.

• HadithClass: This indicates the authenticity level of the Hadith. These are detailed in [32].

• HadithChapter, HadithBook and HadithCollection: These are entities meant for structural organization of the Hadith. A Hadith is a part of a Chapter, which usually contains thematically related collections of Hadith. Chapters are collected in Books and Books are compiled as a Collection or Volume.

2.3 Vocabulary Design
We choose the hvoc prefix for the SemanticHadith vocabulary, as in the domain model. We also ensure reuse of well-established linked data vocabularies such as FOAF⁵ [10], SKOS⁶ [27], and DublinCore⁷ [35]. We also provide equivalence relations where applicable. Some of the most relevant equivalence relations are with the bibo ontology⁸.

3. LINK MODELING AND DESIGN ISSUES
One of the most important aspects of the design of Semantic Hadith is facilitating the interlinking of knowledge at various levels. We have earlier described the Macro-Structure for Islamic Knowledge in [7]. We distinguish between the nature of links based on the level of granularity at which they are modeled. A Macro-level link is considered to be one where the source entity is either at the level of a Verse in the Qur’an or a Hadith in a Hadith Collection. If a link is established for a group of Verses or Hadith, then it will also be considered to be at the Macro-level. A Micro-level link will be at a sub-verse, sub-Hadith, word or phrase level. Within the scope of this paper, we detail only the Macro-level links of the most essential types.

3.1 Hadith-to-Hadith Links
An essential type of link to be established is one whereby one Hadith is linked or related to another Hadith. This could be done for Hadith that are part of the same collection, or between Hadith that are part of different collections. These relations may be of the following primary types: 1) Two Hadith may be considered to be related if they have the same ’sanad’. 2) Two Hadith may be considered to be related if they have the same ’matan’. Note that two Hadith may occur in the same collection, in two different chapters, under different thematic categorizations, and yet be enumerated or numbered differently. Therefore, by asserting such Hadith as similar/related or identical, we aim to make these links explicit. Oftentimes, the same Hadith may be made part of a different collection, and asserting an identity link therefore becomes crucial. This is illustrated in Figure 4. To handle the annotations between two Hadith, we define an entity called HadithRelation, for which the source and destination represent the two ends of the relation. The relation would often have a common Theme. The RelationType indicates whether the two Hadith are similar, indicated by Identity as the RelationType, or one Hadith may elab-

⁵ http://xmlns.com/foaf/spec/
⁶ https://www.w3.org/2004/02/skos/
⁷ http://dublincore.org/
⁸ http://bibliontology.com
Figure 4: Conceptual Design model for Hadith-Hadith Relationship
Figure 5: Conceptual Design model for Hadith-Verse Relationship
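
The relation entities conceptualized in Figures 4 to 6 can be sketched as plain data types. The following is a minimal illustration only: the entity, type and relation names (HadithRelation, Citation, RelationType, CitationType, uponAuthorityOf, establishedIn) follow the text, while the Python field names, the string identifiers such as "bukhari:1" and "1:2", and the rule that a Partial citation must carry a sub-verse are assumptions of this sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RelationType(Enum):
    """Relation types named in the text; the list is not exhaustive."""
    IDENTITY = "Identity"
    ELABORATION = "Elaboration"


class CitationType(Enum):
    COMPLETE = "Complete"
    PARTIAL = "Partial"
    INDIRECT = "In-Direct"


@dataclass(frozen=True)
class HadithRelation:
    """Hadith-to-Hadith link: source and destination are Hadith identifiers."""
    source: str
    destination: str
    relation_type: RelationType
    theme: Optional[str] = None


@dataclass(frozen=True)
class Citation:
    """Direct citation: the source Hadith encapsulates the destination Verse."""
    source_hadith: str
    destination_verse: str           # hypothetical "chapter:verse" identifier
    citation_type: CitationType
    sub_verse: Optional[str] = None  # the cited portion of the verse

    def __post_init__(self) -> None:
        # Assumed rule: a Partial citation is only useful if the part is recorded.
        if self.citation_type is CitationType.PARTIAL and not self.sub_verse:
            raise ValueError("a Partial citation must identify the cited sub-verse")


@dataclass(frozen=True)
class VerseHadithRelation:
    """Commentary-based link: the source is a Verse, the destination a Hadith."""
    source_verse: str
    destination_hadith: str
    upon_authority_of: str           # the Scholar asserting the relation
    established_in: str              # the Book, authored by that Scholar
    relation_type: Optional[RelationType] = None


if __name__ == "__main__":
    print(Citation("bukhari:1", "1:2", CitationType.COMPLETE))
```

Making the authority and the book mandatory fields on the commentary-based relation mirrors the emphasis on establishing the source of authority for each such link.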


orate another, indicated by Elaboration, and so forth. These relation types are not exhaustive and may be iteratively refined.

3.2 Linking the Qur’an and Hadith
One of the most significant aspects of linking the Hadith dataset is linking it with the verses in the existing Qur’an datasets. We distinguish two types of relationships that may occur between the Qur’an and the Hadith: 1) There may be Verses that are ’Cited’ or quoted, in whole or in part, in a Hadith. This is the most direct kind of relation that exists between a Hadith and a Verse. 2) The other relations are those that can only be derived from scholarly commentaries. The design and modeling issues for both these types are delineated further.

3.2.1 Verse to Hadith Links based on Direct Citations
A direct link between a Hadith and a Verse is characterized as one whereby a Hadith contains within its main body a complete verse or a meaningful portion of it. This is modeled in Figure 5. A Citation entity is created, which is a specific reference to a relation with its source as a Hadith and its destination as the Verse, indicating that it is the Hadith that encapsulates the Verse. It is considered important that we characterize the CitationType as either Complete, Partial or In-Direct. A Complete citation will include the entire verse in the body of the Hadith, and the Verse will be quoted as such. A Partial citation may only contain part of the Verse in the body of the Hadith. To indicate this, the sub-verse entity is introduced, which will identify the part of the Verse citedIn the Hadith. This is indicated by the relation characterizedBy. It is important to annotate and capture the sub-verse, since different portions of the same verse may be linked to different Hadith.

3.2.2 Verse to Hadith Links based on Scholarly Commentaries
Another important type of link to be established between the Hadith and the Qur’anic verses is shown in the model conceptualized in Figure 6. This is based on the earlier motivation, provided on the basis of Figure 1. In this type of relation, we create an entity Verse-Hadith-Relation. In this case, the source is a Verse and the destination is a Hadith. The reason is that the Hadith will always be used to elaborate or provide the context for the verse in any given commentary or book of exegesis. The RelationType may be provided. In this relation type, the most important aspect is establishing the source of the authority of the relation. This is established by the relation uponAuthorityOf with a Scholar and a relation establishedIn with a Book. The Book is naturally authoredBy the Scholar to whom the relation is attributed.

3.3 Linking Hadith with other Datasources
We aim to provide for linking Semantic Hadith with other available datasources in the LOD cloud. We present a high-level view of the linked cloud model for Islamic knowledge in Figure 7. We also mention those datasources which, although not directly available on the LOD cloud, present potential for linking.

3.3.1 Linking with Existing Datasources in the LOD Cloud
The two available datasources to which Semantic Hadith is linked are QuranOntology and SemanticQuran. SemanticQuran links itself to DBPedia⁹ and Wiktionary¹⁰. To begin with, links would be established between entities in the Hadith and the ones in these two datasets. For this infor-

⁹ http://dbpedia.org
¹⁰ http://wiktionary.dbpedia.org/
Figure 6: Conceptual Design model for Verse to Hadith Relationship based on Scholarly Commentaries

mation extraction would be carried out. There are some important datasources which are not directly part of the linked data cloud but have been made available through QuranOntology and SemanticQuran. These are shown in Figure 7, namely: QuranyTopics (http://quranytopics.appspot.com), QuranCorpus¹¹, and Tanzil¹².

An essential linking aspect would be to thematically map the QuranyTopics to those of HadithTopics.

3.3.2 Linking with other sources
There are other datasources that we plan to link with in the future. These include a scholars database from a source such as MuslimScholarsDatabase¹³ or eNarrator (Hadith Isnad Ontology) [32] [4]. The major limitation is that these sources are not currently available in Linked Data format. However, they present huge potential for linking.

4. LINKED DATA PUBLISHING FRAMEWORK AND IMPLEMENTATION
In order for Semantic Hadith to become a de facto standard and an integral part of the emerging Semantic Web and the LOD cloud for the Islamic knowledge domain, we also aimed at providing a reusable framework for publishing available Hadith-based knowledge sources as linked data. This is shown in Figure 8. As elaborated in Section 1.1.3, there are multiple hadith repositories available. Therefore, this reusable framework will enable multiple hadith publishers not only to expose their data, but also to establish equivalence links with other repositories. This would be essential towards realizing the vision of linked Islamic knowledge as presented in [7].

4.1 Overview of the Framework
The key stages of the framework shown in Figure 8 include: 1) Data Selection, where the data source is selected; 2) Vocabulary Design and Selection, where conceptual and formal knowledge modelling is carried out; 3) Knowledge Extraction, where the process of information and knowledge extraction is carried out; 4) RDF Generation, where the extracted knowledge is converted into the RDF format; 5) Publishing, Linking and Validation, where the converted RDF data is made available via a SPARQL endpoint; and 6) Consumption, the last stage, where the dataset now available as linked data may be consumed by applications.

4.2 Implementation Details
We provide some key details of the ongoing implementation process: the dataset used for publishing as linked data, and the knowledge extraction and linking mechanism. We summarize some key results and also highlight some challenges and limitations faced in the implementation process.

4.2.1 Data Sources
As the first Hadith repository to be annotated using the Semantic Hadith Model, we have taken the data of Sunnah.com, which is a structured data repository of some of the most well-known and authentic collections of Hadith. The foremost collections are those of Sahih Bukhari and Sahih Muslim. Altogether, there are 11 collections in this dataset, with over 25,000 Hadith.

4.2.2 Knowledge Extraction and Linking
For the initial implementation, we focused on extracting some of the key relations explained earlier.

We extracted Verse-Hadith Links from QComplex Commentary¹⁴. This is one of the only data sources through which we were able to extract numbered hadith references, which could be automatically mapped to the hadith collections available to us. An example of such a reference is shown in Figure 1. A pattern extraction module was designed to parse the contents of the commentary. The content of the verses, translation and the footnotes were segmented. The mapping between the verses and the corresponding footnotes was easy, given the direct correlation. Pattern matching was then applied to extract the collection name, volume number and the hadith number. This was then mapped to the numbers in our hadith collection. This can be challenging at times, because not all hadith collections use the same type of numbering convention. In such a case, it is non-trivial to map the hadith citation to the corresponding hadith in the repository, and human intervention is required for validation. We were able to obtain and validate some 300 verse to hadith relations. Since the commentary is not a detailed one, and comments are only sparingly included as footnotes to the verse translations, it was expected that this number would be small.

We also performed text mining on the Arabic text of the hadith data to obtain the Hadith-Verse citations, as described in Section 3.2.1. For this, we developed a verse-extraction component, which performs sub-string matching in order to detect complete or partial verses that may be cited in a given hadith. This is not trivial for several reasons. Different verses span different lengths in

¹¹ corpus.quran.com
¹² tanzil.net
¹³ http://muslimscholars.info
¹⁴ http://qurancomplex.gov.sa/Quran/Targama/Targama.asp
     Figure 7: A view of the proposed and available Linked Data Cloud for Islamic Knowledge Sources
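
The pattern-matching step of Section 4.2.2, which pulls the collection name, volume number and hadith number out of commentary footnotes, might look roughly like the sketch below. The citation shape "(Sahih Al-Bukhari, Vol. 6, Hadith No. 1)" is an assumption about how the footnotes are written, and REF_PATTERN, extract_refs, resolve and the index layout are hypothetical names introduced here; the paper does not describe its actual patterns.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class HadithRef(NamedTuple):
    collection: str
    volume: int
    number: int


# Assumed citation shape, e.g. "(Sahih Al-Bukhari, Vol. 6, Hadith No. 1)".
REF_PATTERN = re.compile(
    r"\(\s*(?P<collection>[A-Za-z][A-Za-z' -]*?)\s*,\s*"
    r"Vol\.?\s*(?P<volume>\d+)\s*,\s*"
    r"Hadith\s+No\.?\s*(?P<number>\d+)\s*\)",
    re.IGNORECASE,
)


def extract_refs(footnote: str) -> List[HadithRef]:
    """Return every numbered hadith reference found in one footnote."""
    return [
        HadithRef(m["collection"].strip(), int(m["volume"]), int(m["number"]))
        for m in REF_PATTERN.finditer(footnote)
    ]


def resolve(ref: HadithRef,
            index: Dict[Tuple[str, int, int], str]) -> Optional[str]:
    """Map a reference to a local hadith identifier.

    `index` is a hypothetical lookup table keyed by
    (lower-cased collection name, volume, number). A result of None means
    the numbering could not be mapped and the link needs manual validation.
    """
    return index.get((ref.collection.lower(), ref.volume, ref.number))


if __name__ == "__main__":
    note = "See (Sahih Al-Bukhari, Vol. 6, Hadith No. 1) for details."
    for ref in extract_refs(note):
        print(ref, resolve(ref, {}))
```

Returning None instead of guessing keeps unmapped references visible, which matches the observation that differing numbering conventions make human validation necessary.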


the Qur’an. While some may be as long as an entire page’s       Tables 1 and 2. Table 1 summarizes the statistics for some
length of a standard book size, others may be as short as one   of the key entities present in the dataset.
or two words. Therefore, in order to determine, whether the        Table 2 provides the raw count for the candidate relations
verse is actually being quoted or cited in a hadith requires    extracted under the different categories mentioned. It must
further validation. Even applying a threshold, relative to      be noted however, that the relations are not classified ac-
the length of the verse, is not an optimal solution. Setting    cording to any of the parameters mentioned in the design.
a substantial minimal length was considered, but this may       It is also worth mentioning that some of these relations may
not guarantee a comprehensive coverage. For the first proto-    actually be symmetric.
type, only 1,325 expert validated links were asserted. In the
sunnah.com data, these links may be found as hyperlinks to
the verses on the site quran.com.                               Table 1: Entity Statistics in the Semantic Hadith
   In addition, similarity computation algorithms were de-      Dataset(Sunnah.com)
vised to extract Hadith-Hadith similarity relations. The
4,973 relations, listed in Table 2 are strongly similar Ha-                   Entity                   Count
dith that have at least 60% of text in common. However,                       No of Collections        11
the challenge with this approach is that, it cannot be dis-                   No of Books              311
tinguished if the similarity is in the Sanad or the Matan or                  No of Hadith(Arabic)     25,934
both. The more meaningful similarities that are of interest                   No of Hadith(English)    18,040
are in the Matan of the hadith. In future experiments, we aim                 No of Chapters           8,968
to segment the Sanad and the Matan and extract respective
similarity relations. While the similarity threshold for the
current approach only took into consideration the common
substring, we plan to conduct experiments with more mean-       Table 2:    Link Statistics in the Semantic Hadith
ingful similarity measures such as Cosine, Jaccard and Pear-    Dataset
son correlation coefficient, as done in our work for Qur’anic
verses [8].                                                             Link Type                            Count
                                                                        Hadith-Hadith Relation               4,973
4.2.3    Results                                                        Hadith-Verse Relation(Citations)     1,325
  Based on the dataset and experiments carried out, we                  Verse-Hadith Relation(Scholarly)     313
summarize some of the dataset and link statistics in the
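The verse-extraction step described in the experiments can be illustrated with a minimal sketch. This is not the authors' component: the function names, the `min_words` and `min_coverage` parameters and the toy strings are assumptions made purely for illustration. It looks for the longest run of consecutive words that a verse shares with a hadith, and reports a candidate only when that run is long enough in absolute terms and relative to the verse length.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip combining marks (e.g. Arabic diacritics) and
    punctuation, and return the remaining words as a list."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^\w\s]", " ", no_marks).lower().split()


def citation_candidates(hadith_text, verses, min_words=3, min_coverage=0.5):
    """Return (verse_id, coverage) pairs for verses that appear to be quoted,
    completely or partially, in the hadith. Both thresholds are hypothetical."""
    hadith_words = normalize(hadith_text)
    found = []
    for verse_id, verse_text in verses.items():
        verse_words = normalize(verse_text)
        if not verse_words:
            continue
        matcher = SequenceMatcher(None, verse_words, hadith_words, autojunk=False)
        match = matcher.find_longest_match(0, len(verse_words), 0, len(hadith_words))
        coverage = match.size / len(verse_words)
        if match.size >= min_words and coverage >= min_coverage:
            found.append((verse_id, round(coverage, 2)))
    return found


# Toy placeholder strings, not real verse or hadith text.
toy_verses = {
    "v1": "one two three four five six",
    "v2": "red blue",
    "v3": "north south east west",
}
toy_hadith = "He then recited: One, two three four and left."
print(citation_candidates(toy_hadith, toy_verses))
```

A very short verse such as `v2` can never satisfy `min_words=3`, even when it is quoted in full. This is the coverage problem noted above, and it is one reason the output should be treated as candidates for expert validation rather than as asserted links.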
               Figure 8: Linked Data Generation and Publishing Framework for Semantic Hadith
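The Hadith-Hadith similarity experiment can be sketched in a similar way. The paper does not specify how "60% of text in common" is measured, so the definition below (longest common word run as a share of the shorter text) is an assumption, as are all function names and the toy data. Jaccard and cosine similarity, two of the measures named as future work, are included for comparison. The Pearson correlation coefficient is omitted.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def common_substring_ratio(a, b):
    """Longest common run of words, as a share of the shorter text (assumed definition)."""
    wa, wb = a.split(), b.split()
    if not wa or not wb:
        return 0.0
    matcher = SequenceMatcher(None, wa, wb, autojunk=False)
    match = matcher.find_longest_match(0, len(wa), 0, len(wb))
    return match.size / min(len(wa), len(wb))


def jaccard(a, b):
    """Jaccard similarity over the sets of words in each text."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity over raw word-count vectors."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_pairs(hadiths, threshold=0.6):
    """Return unordered id pairs whose common-substring ratio meets the threshold.
    Each pair is emitted once, because the relation is symmetric."""
    ids = sorted(hadiths)
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if common_substring_ratio(hadiths[first], hadiths[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Toy placeholder texts, not real hadith.
toy_hadiths = {"h1": "a b c d e", "h2": "x a b c d", "h3": "p q r s t"}
print(similar_pairs(toy_hadiths))
```

Applying the same functions separately to segmented Sanad and Matan texts would give the per-part similarity relations described as future work. The segmentation itself is outside the scope of this sketch.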


4.3    Existing Limitations and Proposed Solutions
[...] link extraction and validation. There is an obvious lack of structured knowledge sources with well-marked citations. Therefore, the Verse-Hadith links are extremely difficult to extract by computational means alone. Human contribution is a must. For this purpose, we intend to pursue a crowdsourcing approach, based on our prior work [6]. We intend to use crowdsourcing and human computation methods not only for knowledge acquisition, but also for knowledge validation. In fact, we believe a hybrid human-machine computation methodology to be the only means of fulfilling the vision for linked Islamic knowledge at scale, while ensuring the desired reliability and authenticity.

5.    PROSPECTIVE APPLICATIONS
   The most significant benefit of realizing the linked data vision for Islamic knowledge sources will be in enabling semantics-driven distributed knowledge search and retrieval. Most current applications in the Islamic domain offer only limited provision for semantic and conceptual search and retrieval beyond traditional keyword-based searches over a single repository. With the Semantic Hadith model, first-of-their-kind tools become possible that let Qur’an and Hadith repositories be queried and searched in a federated manner.

[...] the Semantic Hadith and Semantic Quran datasets. Given that a Verse-Hadith relation exists within the Semantic Hadith dataset, this query retrieves the Arabic and English texts for the respective verse.

PREFIX rdfs: 
PREFIX hvoc: 
PREFIX dcterms: 
PREFIX qvoc: 

select ?hadith_text ?surahNo ?verseNo
  ?ayahText ?ayahEng
WHERE {
  ?verse hvoc:isRelatedTo ?hadith ;
         hvoc:verseNo ?verseNo ;
         hvoc:surahNo ?surahNo .
  ?hadith hvoc:hadithId ?hId ;
          hvoc:hadithText ?hadith_text .

  SERVICE  {
    ?s qvoc:chapterIndex ?surahNo ;
       qvoc:verseIndex ?verseNo ;
       rdfs:label ?ayahText ;
       rdfs:label ?ayahEng .
    FILTER (lang(?ayahEng) = "en" &&
            lang(?ayahText) = "ar")
  }
}

   This could be taken further by adding another level of federation and querying the themes of the verse from the QuranOntology.

PREFIX rdfs: 
PREFIX hvoc: 
PREFIX dcterms: 
PREFIX qur: 

select ?hadith_text ?surahNo ?verseNo
  ?tname
WHERE {
  ?verse hvoc:isRelatedTo ?hadith ;
         hvoc:verseNo ?verseNo ;
         hvoc:surahNo ?surahNo .
  ?hadith hvoc:hadithId ?hId ;
          hvoc:hadithText ?hadith_text .

  SERVICE  {
    ?verse qur:DiscussTopic ?t .
    ?t rdfs:label ?tname .
    FILTER(LANGMATCHES(LANG(?tname), "ar"))
  }
}

  This could be further enhanced by automated interlinking with other available data sources on the linked data cloud, as envisioned in Figure 7. For instance, once the available hadith are annotated with the events, places or people they mention, they may be linked to the corresponding entities in DBpedia. This would enable richer knowledge discovery and retrieval for a range of applications.
  We expect that, using this model, more hadith and Qur’anic exegesis repositories, which also rely on and heavily cite the hadith sources, will be published in the linked data format. This will enable the design and development of enhanced learning tools for the Islamic domain, which will provide efficient and personalized access to primary sources of knowledge while ensuring reliability and authenticity. Given that these tools will give better access to meaningfully interlinked knowledge, it will require less effort to find resources and access knowledge beyond books. More content, both classical and contemporary, would become discoverable.

6.      RELATED WORK
   The linked data approach has emerged as the de facto standard for sharing data on the web. It provides a set of best practices for publishing and connecting structured data on the web [9]. The linked data design issues provide guidelines on how to use standardized web technologies to set data-level links between data from different sources [23]. Increased interest in LOD has been seen in various sectors, e.g. education [11], [31], scientific research [3], libraries [28], [25], government [12], [24], [33], cultural heritage [26] and many others. However, the religious sector has yet to cash in on the power of linked open data.
   Research in computational informatics applied to Islamic knowledge has primarily centered around morphological annotation of the Qur’an [13], [14], ontology modeling of the Qur’an [1], [5], [15], [36], [37], and Arabic natural language processing [15]. The LOD take-up in the area of Islamic knowledge has been extremely limited. As mentioned earlier, there have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran 15 [34] and QuranOntology 16 [16].

15 http://datahub.io/dataset/semanticquran
16 http://www.quranontology.com

   Much of the work in the Hadith sciences has focused on automating the extraction of the Chain of Narrators. Some works in this regard include [20], [18], [17], [19], [21]. There is also work on mining the hadith for indexing and classification [2], [29]. Some recent efforts have attempted to model the hadith as semantic ontologies [4], [32]. However, these efforts have focused on annotating the different constituents of the hadith. None of these data sources are available as open source.
   Our work is the first of its kind to propose a linked data based model for linking the hadith with the Qur’an. This linked knowledge forms a vital backbone to enable better integration and discovery of knowledge sources.

7.   CONCLUSIONS AND FUTURE WORK
   In this paper we presented the design and development of our Semantic Hadith framework, which aims to provide the foundation for semantically interlinking the most important Islamic knowledge sources using the linked data standards. We presented the design of the Semantic Hadith Ontology and explained the nature of its links with other data sources. The implementation still needs to mature. The validation of the links and extracted knowledge is a huge challenge we are looking into. We are investigating crowdsourcing models for knowledge acquisition and validation at scale.

8.   ACKNOWLEDGEMENTS
  We are thankful to the authors of sunnah.com for providing us with the valuable data source to carry out this research. We would also like to acknowledge the efforts of Mr. Muhammad Shoaib (Jeju National University, South Korea) in extending his help with some of the experiments.

9.   REFERENCES
 [1] H. S. Al-Khalifa, M. Al-Yahya, A. Bahanshal, I. Al-Odah, and N. Al-Helwah. An approach to compare two ontological models for representing quranic words. In Proceedings of the 12th International Conference on Information Integration and Web-based Applications and Services, pages 674–678. ACM.
 [2] K. A. Aldhlan and A. M. Zeki. Datamining and islamic knowledge extraction: alhadith as a knowledge resource. In Information and Communication Technology for the Muslim World (ICT4M), 2010 International Conference on, pages 21–25. IEEE, 2010.
 [3] T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. Pettifer, and D. Thorne. Utopia documents: linking scholarly literature with research data. Bioinformatics, 26(18):568–574, 2010.
 [4] A. Azmi and N. B. Badia. itree-automating the construction of the narration tree of hadiths (prophetic traditions). pages 1–7, 2010.
 [5] S. Baqai, A. Basharat, H. Khalid, A. Hassan, and S. Zafar. Leveraging semantic web technologies for standardized knowledge modeling and retrieval from the holy qur’an and religious texts. In Proceedings of the 7th International Conference on Frontiers of Information Technology, FIT ’09, pages 42:1–42:6, New York, NY, USA, 2009. ACM.
 [6] A. Basharat, I. B. Arpinar, S. Dastgheib, U. Kursuncu, K. Kochut, and E. Dogdu. Semantically enriched task and workflow automation in crowdsourcing for linked data management.
     International Journal of Semantic Computing, 8(04):415–439, 2014.
 [7] A. Basharat, K. Rasheed, and I. B. Arpinar. Towards linked open islamic knowledge using human computation and crowdsourcing. In Proceedings of the International Conference on Islamic Applications in Computer Science And Technology, 2015.
 [8] A. Basharat, D. Yasdansepas, and K. Rasheed. Comparative study of verse similarity for multi-lingual representations of the qur’an. In Proc. on the Int. Conference on Artificial Intelligence (ICAI), pages 336–343, 2015.
 [9] C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. International Journal on Semantic Web and Information Systems, 5(3):1–22, 2009.
[10] D. Brickley and L. Miller. Foaf vocabulary specification 0.98. Namespace document, 9, 2012.
[11] S. Dietze, S. Sanchez-Alonso, H. Ebner, H. Q. Yu, D. Giordano, I. Marenzi, and B. P. Nunes. Interlinking educational resources and the web of data a survey of challenges and approaches. Program-Electronic Library and Information Systems, 47(1):60–91, 2013.
[12] L. Ding, T. Lebo, J. S. Erickson, D. DiFranzo, G. T. Williams, X. Li, J. Michaelis, A. Graves, J. G. Zheng, Z. Shangguan, J. Flores, D. L. McGuinness, and J. A. Hendler. Twc logd: A portal for linked open government data ecosystems. Journal of Web Semantics, 9(3):325–333, 2011.
[13] K. Dukes, E. Atwell, and A. M. Sharaf. Syntactic annotation guidelines for the quranic arabic dependency treebank. In Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, 17-23 May 2010, Valletta, Malta, 2010.
[14] K. Dukes and N. Habash. Morphological annotation of quranic arabic. In Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010, 17-23 May 2010, Valletta, Malta, 2010.
[15] A. Farghaly and K. Shaalan. Arabic natural language processing: Challenges and solutions. ACM Transactions on Asian Language Information Processing, 8:1–22, 2009.
[16] A. Hakkoum and S. Raghay. Ontological approach for semantic modeling and querying the qur’an. In Proceedings of the International Conference on Islamic Applications in Computer Science And Technology, 2015.
[17] F. Harrag. Text mining approach for knowledge extraction in sahih al-bukhari. Computers in Human Behavior, 30:558–566, 2014.
[18] F. Harrag, E. El-Qawasmeh, and A. M. S. Al-Salman. Extracting Named Entities from Prophetic Narration Texts (Hadith), pages 289–297. Springer, 2011.
[19] F. Harrag and A. Hamdi-Cherif. Uml modeling of text mining in arabic language and application to the prophetic traditions hadiths. pages 11–20, 2007.
[20] F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmeh. Information retrieval architecture for hadith text mining. Journal of Digital Information Management, 6(6), 2008.
[21] F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmeh. Vector space model for arabic information retrieval – application to hadith indexing. In Applications of Digital Information and Web Technologies, 2008. ICADIWT 2008. First International Conference on the, pages 107–112. IEEE, 2008.
[22] S. Hasan. An introduction to the science of Hadith. Al-Quran Society, 1994.
[23] T. Heath and C. Bizer. Linked data: Evolving the web into a global data space. Synthesis lectures on the semantic web: theory and technology, 1(1):1–136, 2011.
[24] J. Hendler, J. Holm, C. Musialek, and G. Thomas. Us government linked open data: semantic.data.gov. IEEE Intelligent Systems, 27(3):0025–31, 2012.
[25] L. C. Howarth. Frbr and linked data: Connecting frbr and linked data. Cataloging and Classification Quarterly, 50(5-7):763–776, 2012.
[26] J. Marden, C. Li-Madeo, N. Whysel, and J. Edelstein. Linked open data for cultural heritage: evolution of an information technology. pages 107–112, 2013.
[27] A. Miles, B. Matthews, M. Wilson, and D. Brickley. Skos core: Simple knowledge organisation for the web. In Proceedings of the 2005 International Conference on Dublin Core and Metadata Applications: Vocabularies in Practice, DCMI ’05, pages 1:1–1:9. Dublin Core Metadata Initiative, 2005.
[28] E. Miller and M. Westfall. Linked data and libraries. The Serials Librarian, 60(1-4):17–22, 2011.
[29] M. Naji Al-Kabi, G. Kanaan, R. Al-Shalabi, S. I. Al-Sinjilawi, and R. S. Al-Mustafa. Al-hadith text classifier. Journal of Applied Sciences, 5:584–587, 2005.
[30] A. A. B. Philips. Usool at-Tafseer: The Methodology of Qur’aanic Explanation. AS Noordeen, 2002.
[31] N. Piedra, E. Tovar, R. Colomo-Palacios, J. Lopez-Vargas, and J. A. Chicaiza. Consuming and producing linked open data: the case of opencourseware. Program: electronic library and information systems, 48(1):16–40, 2014.
[32] Y. M. D. Rebhi S. Baraka. Building hadith ontology to support the authenticity of isnad. International Journal on Islamic Applications in Computer Science And Technology, 2(1):25–39, 2014.
[33] N. Shadbolt, K. O’Hara, T. Berners-Lee, N. Gibbins, H. Glaser, and W. Hall. Linked open government data: Lessons from data.gov.uk. IEEE Intelligent Systems, 27(3):16–24, 2012.
[34] M. A. Sherif and A.-C. N. Ngomo. Semantic Quran - a multilingual resource for natural-language processing. Semantic Web, 6(4):339–345, 2015.
[35] S. Weibel, J. Kunze, C. Lagoze, and M. Wolf. Dublin core metadata for resource discovery. Technical report, 1998.
[36] A. R. Yauri, R. A. Kadir, A. Azman, and M. A. A. Murad. Quranic-based concepts: Verse relations extraction using manchester owl syntax. In Information Retrieval and Knowledge Management (CAMP), 2012 International Conference on, pages 317–321. IEEE.
[37] A. R. Yauri, R. A. Kadir, A. Azman, and M. A. A. Murad. Quranic verse extraction base on concepts using owl-dl ontology. Research Journal of Applied Sciences, Engineering and Technology, 6(23):4492–4498, 2013.