=Paper=
{{Paper
|id=Vol-1593/article-14
|storemode=property
|title=None
|pdfUrl=https://ceur-ws.org/Vol-1593/article-14.pdf
|volume=Vol-1593
|dblpUrl=https://dblp.org/rec/conf/www/BasharatAAR16
}}
==None==
Semantic Hadith: Leveraging Linked Data Opportunities for Islamic Knowledge

Amna Basharat, Dept. of Computer Science, University of Georgia, Athens, GA, 30605 USA, amnabash@uga.edu
Bushra Abro, Dept. of Computer Science, Islamic International University, Islamabad, Pakistan, bushraabro@hotmail.com
I. Budak Arpinar, Dept. of Computer Science, University of Georgia, Athens, GA, 30605 USA, budak@uga.edu
Khaled Rasheed, Dept. of Computer Science, University of Georgia, Athens, GA, 30605 USA, khaled@uga.edu
ABSTRACT
While the linked data paradigm has gathered much attention in recent years, the domain of Islamic knowledge has yet to capitalize on its full potential. The web-scale integration of Islamic texts and knowledge sources at large is currently not well facilitated. The two primary sources of Islamic legislation are the Qur’an and the Hadith (collections of Prophetic narrations), which form the foundation for anyone wanting to learn Islam. This paper presents ongoing design and development efforts to semantically model and publish the Hadith, which holds a primary position as the next most important knowledge source after the Qur’an. We present the design of a linked data vocabulary for publishing these narrations as linked data, and describe the mechanism for linking these narrations with the verses of the Qur’an. We establish how the links between the Hadith and the Qur’anic verses may be captured and published using this vocabulary, as derived from secondary and tertiary sources of knowledge. We present detailed insights into the potential, the design considerations and the use cases of publishing this wealth of knowledge as linked data.

CCS Concepts
• Information systems → Multilingual and cross-lingual retrieval; Information extraction; • Computing methodologies → Ontology engineering;

Keywords
linked data; hadith; Quran; Qur’an; semantic web; Islamic knowledge

Copyrights held by author/owner(s). WWW2016 Workshop: Linked Data on the Web (LDOW2016), Montreal, Canada.

1. INTRODUCTION
The vast body of Islamic creed and legislation derives from, and is based primarily on, the two most fundamental sources of Islam: the Qur’an and the Sunnah (way of life) of the Prophet Muhammad. The latter is contained within the vast body of Hadith literature [22]. Formally, the Hadith is defined as the (recorded) narrations of the sayings and deeds of the Prophet Muhammad.
Our research is primarily motivated by the need to overcome the inherent knowledge acquisition bottleneck in creating semantic content for semantic applications. We have established how this is particularly true for knowledge-intensive domains such as the domain of Islamic knowledge, which has failed to capitalize on the promised potential of the semantic web and linked data technology; standardized web-scale integration of the available knowledge resources is currently not facilitated at a large scale [7].

1.1 Background Context and Motivation

1.1.1 Importance of Hadith
To understand the importance of Hadith, the principles of Qur’anic understanding and the science of tafseer, or exegesis, must be considered. The verses in the Qur’an cannot be understood in isolation. The Hadith are used to illustrate the historical context, the reasons for revelation and the elaboration of essential concepts that may not be directly evident. This important principle has been adopted by scholars across centuries to write scholarly commentaries and explanations. In fact, it is a necessary condition for producing an accurate tafseer of the Qur’an, as explained in detail by Philips [30].
To explain this principle, consider Figure 1, a derived snapshot taken from QuranComplex1, the official manuscript, with a translation and a commentary, provided by the Kingdom of Saudi Arabia. The snapshot shows two verses from the first chapter of the Qur’an. The translation is annotated with a commentary (given in the footnotes in this case) in order to provide additional details where important. It is worth noticing that most authentic and reliable commentaries draw knowledge from the sources of Hadith. In the case of this snapshot, verse 2 contains an annotation which provides an elaboration based on an authentic Hadith, from one of the many collections

1 http://qurancomplex.gov.sa/Quran/Targama/Targama.asp
Figure 1: A snapshot of a typical Qur’anic Commentary
of Hadith, called Sahih Bukhari, which is known to be the most authentic and reliable Hadith collection.

1.1.2 Motivation: Potential for Knowledge Formalization and Linking
Hundreds, if not thousands, of Qur’anic commentaries have been produced over the last few centuries, in various languages, that draw upon and rely heavily on the Hadith sources to provide an interpretation of the Qur’anic verses. Given this fact, the potential for knowledge formalization and linking is not only evident; it cannot be overemphasized. Formally modeling this wealth of knowledge and its links would enable new ways of research, knowledge discovery and synthesis, which is the very motivation for this research. However, realizing this vision across the plethora of Islamic resources is a mammoth task. We present some key challenges below.

1.1.3 Challenges in Interlinking Islamic Knowledge Sources
There have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran 2 [34] and QuranOntology 3 [16].

2 http://datahub.io/dataset/semanticquran
3 http://www.quranontology.com

However, there are no known publicly available sources of data or vocabularies published as linked data for the Hadith. There are a number of well known Hadith repositories that allow browsing and searching the hadith collections, sunnah.com and dorar.net being the most prominent ones.
We review some of the state of the art in computational approaches applied to Hadith texts in Section 6. Here, we would like to emphasize that interlinking the Qur’anic verses and the Hadith is a non-trivial task. We summarize some factors that make this extremely challenging. Most of the classical sources do not use a standardized numbering scheme for the Hadith. This is in contrast to the Qur’anic verses, which have a standardized numbering scheme. There are multiple sources of the Hadith, which may have different levels of authenticity, a matter of discussion beyond the scope of this paper. Although most Hadith collections have now been classified into authenticity categories, mapping this classification to the sources that cite them is only possible if the Hadith are extracted and linked in a formalized manner. In addition, the Hadith are of varying length, and oftentimes the commentator or tafsir scholar will only quote a part of the Hadith or make a passing reference to it, making it extremely difficult to trace the original Hadith being cited. Furthermore, several Hadith may share common portions of narration, which makes it all the more challenging to identify which exact Hadith is being quoted or referred to. We believe that a knowledge formalization and linking mechanism, using the linked data standards, is the way forward for solving at least some of these challenges.

1.2 Contributions of the Paper
In this paper we make the following contributions:

• We provide the first of its kind linked data model, called Semantic Hadith, for publishing Hadith as Linked Data and for linking with other key knowledge sources in the Islamic domain, primarily the Qur’an.

• We present a classification of the various levels of links that may potentially be established between the Hadith, the Qur’an and other data sets on the linked data cloud. This classification spans various levels of granularity. We highlight the linking challenges and design issues with each one and present potential modeling solutions.

• We provide a knowledge extraction, linking and publishing framework that may be reused for publishing similar knowledge and linking it with the existing linked data cloud. We present our preliminary implementation of this framework.

2. ONTOLOGY FOR SEMANTIC HADITH
We first present an illustration of the structure of the Hadith, and then detail the design of the ontology for Semantic Hadith.

2.1 Hadith Structure
Figure 2 shows a sample of a Hadith taken from sunnah.com4. A given Hadith has two main parts: the actual narration or content portion of the Hadith, called the Matan, and the chain of narrators (reporters) through whom the narration has been transmitted and then recorded, traditionally known as the Sanad or simply the chain of narrators. The Sanad is a chronological chain of narrators, each mentioning the one from whom he heard the Hadith, all the way to the prime narrator of the Matan, followed by the Matan itself [32]. The Sanad plays the most important role in determining the authenticity of the Hadith, which is the most crucial indicator scholars resort to when determining whether to accept or reject a Hadith.

4 http://sunnah.com/bukhari/1

Figure 2: A Sample Hadith Snapshot

2.2 Ontology Schema
Figure 3 shows the conceptual model for publishing Hadith data on the LOD cloud. Here we summarize the key entities and relations that we chose to include in the conceptual design model of the Semantic Hadith ontology schema.

• Hadith: This is the central entity in the domain model. Since there has been no standardized numbering scheme for the Hadith from the beginning, a few alternate numbering schemes may be encountered; therefore provision is made to include alternate numberings.

• Matn: This is primarily a textual entity, which contains the main narration of the Hadith, without the chain of narrators or the Sanad.

• Narrator: A Narrator is essentially a Person, with the special role of a narrator of the Hadith. One narrator may have many Hadith attributed to him or her. If a narrator is the root narrator of the Hadith, then the Hadith is usually attributedTo him/her. This is shown by the relation between the Hadith and Narrator. Notice that in Figure 2 the English translation does not provide the entire NarratorChain; it only provides the name of the narrator to whom the Hadith is attributed. However, this is not the case for the Arabic (original) version of the Hadith, which usually contains the entire chain of narrators. The chain is often omitted in books to simplify the hadith text for the reader and make it more meaningful and relevant. However, the NarratorChain is considered indispensable for determining the validity and authenticity of the Hadith, especially if no other validation source is mentioned.

• Sanad(NarratorChain): This is an entity which will contain a reference to a Narrator entity, and a level,
Figure 3: High Level Design of Semantic Hadith Ontology
which will indicate the sequence of the narrator in the chain. The same narrator may appear in many chains.

• HadithClass: This indicates the authenticity level of the Hadith. These are detailed in [32].

• HadithChapter, HadithBook and HadithCollection: These are entities meant for structural organization of the Hadith. A Hadith is a part of a Chapter, which usually contains thematically related collections of Hadith. Chapters are collected in Books and Books are compiled as a Collection or Volume.

2.3 Vocabulary Design
We choose the hvoc prefix for the SemanticHadith vocabulary, as in the domain model. We also ensure reuse of well established linked data vocabularies such as FOAF5 [10], SKOS6 [27], and DublinCore7 [35]. We also provide equivalence relations where applicable. Some of the most relevant equivalence relations are with the bibo ontology8.

5 http://xmlns.com/foaf/spec/
6 https://www.w3.org/2004/02/skos/
7 http://dublincore.org/
8 http://bibliontology.com

3. LINK MODELING AND DESIGN ISSUES
One of the most important constituents in the design of Semantic Hadith is facilitating the interlinking of knowledge at various levels. We have earlier described the Macro-Structure for Islamic Knowledge in [7]. We distinguish between links based on the level of granularity at which they are modeled. A Macro-level link is one where the source entity is either at the level of a Verse in the Qur’an or a Hadith in a Hadith Collection. If a link is established for a group of Verses or Hadith, it is also considered to be at the Macro-level. A Micro-level link is at a sub-verse, sub-Hadith, word or phrase level. Within the scope of this paper, we detail only the Macro-level links of the most essential types.

3.1 Hadith-to-Hadith Links
An essential type of link to be established is one whereby one Hadith is linked or related to another Hadith. This may be done for Hadith that are part of the same collection, or between Hadith that are part of different collections. These relations may be of the following primary types: 1) Two Hadith may be considered related if they have the same ’sanad’. 2) Two Hadith may be considered related if they have the same ’matan’. Note that two Hadith may occur in the same collection, in two different chapters, under different thematic categorizations, and yet be enumerated or numbered differently. Therefore, by asserting these Hadith as similar/related or identical, we aim to make these links explicit. Oftentimes, the same Hadith may be made part of a different collection, and asserting an identity link then becomes crucial. This is illustrated in Figure 4.
To handle the annotations between two Hadith, we define an entity called HadithRelation, for which the source and destination represent the two ends of the relation. The relation would often have a common Theme. The RelationType indicates whether the two Hadith are similar, indicated by Identity as the RelationType, or whether one Hadith elaborates another, indicated by Elaboration, and so forth. These relation types are not exhaustive and may be iteratively refined.

Figure 4: Conceptual Design model for Hadith-Hadith Relationship

3.2 Linking the Qur’an and Hadith
One of the most significant aspects of linking the Hadith dataset is linking it with the verses in the existing Qur’an datasets. We distinguish two types of relationships that may occur between the Qur’an and the Hadith: 1) There may be Verses, the whole or part of which may be ’Cited’ or quoted in a Hadith. This is the most direct kind of relation that exists between a Hadith and a Verse. 2) The other relations are those that can only be derived from scholarly commentaries. The design and modeling issues for both these types are delineated below.

3.2.1 Verse to Hadith Links based on Direct Citations
A direct link between a Hadith and a Verse is characterized as one whereby a Hadith contains within its main body a complete verse or a meaningful portion of it. This is modeled in Figure 5. A Citation entity is created, which is a specific reference to a relation with its source as a Hadith and its destination as the Verse, indicating that it is the Hadith that encapsulates the Verse. It is considered important that we characterize the CitationType as either Complete, Partial or In-Direct. A Complete citation will include the entire verse in the body of the Hadith and the Verse will be quoted as such. A Partial citation may only contain part of the Verse in the body of the Hadith. To indicate this, the sub-verse entity is introduced, which will identify the part of the Verse citedIn the Hadith. This is indicated by the relation characterizedBy. It is important to annotate and capture the sub-verse, since there may be portions of the same verse that are linked to different Hadith.

Figure 5: Conceptual Design model for Hadith-Verse Relationship

3.2.2 Verse to Hadith Links based on Scholarly Commentaries
Another important type of link to be established between the Hadith and the Qur’anic verses is shown in the model conceptualized in Figure 6. This is based on the earlier motivation, provided on the basis of Figure 1. In this type of relation, we create an entity Verse-Hadith-Relation. In this case, the source is a Verse and the destination is a Hadith. The reason is that the Hadith will always be used to elaborate or provide the context for the verse in any given commentary or book of exegesis. The RelationType may be provided. In this relation type, the most important aspect is establishing the source of the authority of the relation. This is established by the relation uponAuthorityOf with a Scholar and a relation establishedIn with a Book. The Book is naturally authoredBy the Scholar to whom the relation is attributed.

Figure 6: Conceptual Design model for Verse to Hadith Relationship based on Scholarly Commentaries

3.3 Linking Hadith with other Datasources
We aim to make provision for linking Semantic Hadith with other available datasources in the LOD cloud. We present a high level view of the linked cloud model for Islamic knowledge in Figure 7. We also mention those datasources which, although not directly available on the LOD, present potential for linking.

3.3.1 Linking with Existing Datasources in the LOD Cloud
The two available datasources to which Semantic Hadith is linked are QuranOntology and SemanticQuran. SemanticQuran links itself to DBPedia 9 and Wiktionary 10. Links would be established between entities in the Hadith and those in these two datasets to begin with. For this, information extraction would be carried out. There are some important datasources which are not directly part of the linked data cloud but have been made available through QuranOntology and SemanticQuran. These are shown in Figure 7, namely: QuranyTopics (http://quranytopics.appspot.com), QuranCorpus11, and Tanzil12.
One essential linking aspect would be to thematically map the QuranyTopics to those of HadithTopics.

9 http://dbpedia.org
10 http://wiktionary.dbpedia.org/
11 corpus.quran.com
12 tanzil.net

3.3.2 Linking with other sources
There are other datasources that we plan to link with in the future. These include a scholars database from a source such as MuslimScholarsDatabase 13 or eNarrator (Hadith Isnad Ontology) [32] [4]. The major limitation is that these sources are not currently available in Linked Data format. However, they present huge potential for linking.

13 http://muslimscholars.info

4. LINKED DATA PUBLISHING FRAMEWORK AND IMPLEMENTATION
In order for Semantic Hadith to become a de facto standard and an integral part of the emerging Semantic Web and the LOD cloud for the Islamic Knowledge domain, we also aimed at providing a reusable framework for publishing available Hadith based knowledge sources as linked data. This is shown in Figure 8. As elaborated in Section 1.1.3, there are multiple hadith repositories available. Therefore, this reusable framework will help multiple hadith publishers not only to expose their data, but also to establish equivalence links with other repositories. This would be essential towards realizing the vision of linked Islamic knowledge as presented in [7].

4.1 Overview of the Framework
The key stages of the framework shown in Figure 8 include: 1) Data Selection, where the data source is selected; 2) Vocabulary Design and Selection, where conceptual and formal knowledge modelling is carried out; 3) Knowledge Extraction, where information and knowledge extraction is carried out; 4) RDF Generation, where the extracted knowledge is converted into the RDF format; 5) Publishing, Linking and Validation, where the converted RDF data is made available via a SPARQL endpoint; and 6) Consumption, the last stage, where the dataset now available as linked data may be consumed by applications.

4.2 Implementation Details
We provide some key details of the ongoing implementation process: the dataset used for publishing as linked data, and the knowledge extraction and linking mechanism. We summarize some key results and also highlight some challenges and limitations faced in the implementation process.

4.2.1 Data Sources
As the first Hadith repository to be annotated using the Semantic Hadith Model, we have taken the data of Sunnah.com, which is a structured data repository of some of the most well known and authentic collections of Hadith. The foremost collections are those of Sahih Bukhari and Sahih Muslim. Altogether, there are 11 collections in this dataset, with over 25,000 Hadith.

4.2.2 Knowledge Extraction and Linking
For the initial implementation, we focused on extracting some of the key relations explained earlier.
We extracted Verse-Hadith links from the QComplex Commentary14. This is one of the only data sources from which we were able to extract numbered hadith references that could be automatically mapped to the hadith collections available to us. An example of such a reference is shown in Figure 1. A pattern extraction module was designed to parse the contents of the commentary. The content of the verses, translation and footnotes was segmented. The mapping between the verses and the corresponding footnotes was easy, given the direct correlation. Pattern matching was then applied to extract the collection name, volume number and hadith number. This was then mapped to the numbers in our hadith collection. This can be challenging at times, because not all hadith collections use the same numbering convention. In such a case, it is non-trivial to map the hadith citation to the corresponding hadith in the repository, and human intervention is required for validation. We were able to obtain and validate some 300 verse to hadith relations. Since the commentary is not a detailed one, with comments only sparingly included as footnotes to the verse translations, it was expected that this number would be small.

14 http://qurancomplex.gov.sa/Quran/Targama/Targama.asp

We also performed text mining on the Arabic text of the hadith data to obtain the Hadith-Verse citations described in Section 3.2.1. For this, we developed a verse-extraction component, which treats the task as a sub-string matching problem in order to detect complete or partial verses that may be cited in a given hadith. This is not trivial for several reasons. Different verses span different lengths in
Figure 7: A view of the proposed and available Linked Data Cloud for Islamic Knowledge Sources
the Qur’an. While some may be as long as an entire page of a standard book size, others may be as short as one or two words. Therefore, determining whether a verse is actually being quoted or cited in a hadith requires further validation. Even applying a threshold relative to the length of the verse is not an optimal solution. Setting a substantial minimal length was considered, but this may not guarantee comprehensive coverage. For the first prototype, only 1,325 expert validated links were asserted. In the sunnah.com data, these links may be found as hyperlinks to the verses on the site quran.com.
In addition, similarity computation algorithms were devised to extract Hadith-Hadith similarity relations. The 4,973 relations listed in Table 2 are strongly similar Hadith that have at least 60% of their text in common. However, the challenge with this approach is that it cannot distinguish whether the similarity is in the Sanad, the Matan, or both. The more meaningful similarities of interest are those in the Matan of the hadith. In future experiments, we aim to segment the Sanad and the Matan and extract the respective similarity relations. While the similarity threshold for the current approach only took into consideration the common substring, we plan to conduct experiments with more meaningful similarity measures such as Cosine, Jaccard and the Pearson correlation coefficient, as done in our work for Qur’anic verses [8].

4.2.3 Results
Based on the dataset and experiments carried out, we summarize some of the dataset and link statistics in Tables 1 and 2. Table 1 summarizes the statistics for some of the key entities present in the dataset.
Table 2 provides the raw count for the candidate relations extracted under the different categories mentioned. It must be noted, however, that the relations are not classified according to any of the parameters mentioned in the design. It is also worth mentioning that some of these relations may actually be symmetric.

Table 1: Entity Statistics in the Semantic Hadith Dataset (Sunnah.com)
Entity | Count
No. of Collections | 11
No. of Books | 311
No. of Hadith (Arabic) | 25,934
No. of Hadith (English) | 18,040
No. of Chapters | 8,968

Table 2: Link Statistics in the Semantic Hadith Dataset
Link Type | Count
Hadith-Hadith Relation | 4,973
Hadith-Verse Relation (Citations) | 1,325
Verse-Hadith Relation (Scholarly) | 313
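The verse-extraction step of Section 4.2.2 can be illustrated with a minimal Python sketch. It is not the authors' component: the function name `find_citation`, the `Citation` record, the whitespace tokenization and the `min_tokens` cut-off are all assumptions made for illustration, and real Arabic text would additionally need diacritic and orthographic normalization, which is omitted here. The sketch looks for the longest run of consecutive verse words inside a hadith, labels the hit Complete or Partial (the CitationType values of Section 3.2.1), and flags very short verses for expert review, reflecting the observation that short matches need further validation.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Hypothetical record for a detected Hadith-Verse citation."""
    verse_id: str
    citation_type: str   # "Complete" or "Partial"
    matched_text: str    # the sub-verse actually found in the hadith
    needs_review: bool   # True when the verse is too short to trust the match


def find_citation(hadith_text: str, verse_id: str, verse_text: str,
                  min_tokens: int = 3) -> Optional[Citation]:
    """Detect whether a verse (or a contiguous part of it) occurs in a hadith.

    Assumption: both texts are already normalized and can be split on
    whitespace; min_tokens is an illustrative minimal match length.
    """
    hadith_tokens = hadith_text.split()
    verse_tokens = verse_text.split()
    if not hadith_tokens or not verse_tokens:
        return None

    # Longest run of consecutive words shared by the hadith and the verse.
    matcher = SequenceMatcher(None, hadith_tokens, verse_tokens, autojunk=False)
    match = matcher.find_longest_match(0, len(hadith_tokens), 0, len(verse_tokens))

    if match.size == len(verse_tokens):
        # The whole verse is quoted; very short verses are flagged for experts.
        return Citation(verse_id, "Complete", " ".join(verse_tokens),
                        needs_review=len(verse_tokens) < min_tokens)
    if match.size >= min_tokens:
        sub_verse = verse_tokens[match.b:match.b + match.size]
        return Citation(verse_id, "Partial", " ".join(sub_verse), needs_review=False)
    return None
```

Returning the matched sub-verse alongside the type mirrors the sub-verse entity of the model, since different portions of one verse may be cited in different Hadith. A fixed `min_tokens` trades coverage for precision, which is the limitation noted in Section 4.2.2.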
Figure 8: Linked Data Generation and Publishing Framework for Semantic Hadith
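The Hadith-Hadith similarity extraction described in Section 4.2.2 can be sketched in Python as follows. The paper does not specify how "at least 60% of text in common" is computed, so the measure below (longest shared run of words divided by the length of the shorter text) is an assumption, as are the function names; the Jaccard function illustrates one of the alternative measures the authors plan to evaluate. Only unordered pairs are compared, since such similarity relations are symmetric.

```python
from difflib import SequenceMatcher
from itertools import combinations


def common_run_ratio(text_a: str, text_b: str) -> float:
    """Longest shared run of consecutive words, relative to the shorter text.

    Assumption: this stands in for the common-substring criterion; the
    exact definition used for the 60% threshold is not given in the paper.
    """
    tokens_a, tokens_b = text_a.split(), text_b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    matcher = SequenceMatcher(None, tokens_a, tokens_b, autojunk=False)
    match = matcher.find_longest_match(0, len(tokens_a), 0, len(tokens_b))
    return match.size / min(len(tokens_a), len(tokens_b))


def jaccard(text_a: str, text_b: str) -> float:
    """Jaccard similarity over the sets of words in the two texts."""
    set_a, set_b = set(text_a.split()), set(text_b.split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def similar_pairs(texts, threshold=0.6, measure=common_run_ratio):
    """Return (id_a, id_b, score) for every unordered pair at or above threshold."""
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(texts.items()), 2):
        score = measure(text_a, text_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs
```

Running `similar_pairs` separately on segmented Sanad and Matan texts would address the limitation that a whole-text score cannot tell where the overlap lies. The pairwise loop is quadratic in the number of Hadith, so a collection of this size would likely need blocking or indexing in practice.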
4.3 Existing Limitations and Proposed Solutions
[...] link extraction and validation. There is an obvious lack of structured knowledge sources with well marked citations. Therefore, the Verse-Hadith links are extremely difficult to extract using computational means alone. Human contribution is a must. For this purpose, we intend to pursue a crowdsourcing approach, based on our prior work [6]. We intend to use crowdsourcing and human computation methods not only for knowledge acquisition, but also for knowledge validation. In fact, we believe a hybrid human-machine computation methodology to be the only viable means of fulfilling the vision for linked Islamic knowledge at scale, while ensuring the desired reliability and authenticity.

5. PROSPECTIVE APPLICATIONS
The most significant benefit of realizing the linked data vision for Islamic knowledge sources will be enabling semantics driven distributed knowledge search and retrieval. Most current applications in the Islamic domain provide only limited provision for semantic and conceptual search and retrieval beyond traditional keyword based searches over a single repository. With the Semantic Hadith model, first-of-their-kind tools become possible that let Qur’an and Hadith repositories be queried and searched in a federated manner.
[...] the Semantic Hadith and Semantic Quran datasets. Given that a Verse-Hadith relation exists within the Semantic Hadith dataset, this query retrieves the Arabic and English texts for the respective verse.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hvoc:
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX qvoc:

select ?hadith_text ?surahNo ?verseNo
?ayahText ?ayahEng
WHERE {
?verse hvoc:isRelatedTo ?hadith;
hvoc:verseNo ?verseNo ;
hvoc:surahNo ?surahNo .
?hadith hvoc:hadithId ?hId;
hvoc:hadithText ?hadith_text .

SERVICE {
?s qvoc:chapterIndex ?surahNo;
qvoc:verseIndex ?verseNo;
rdfs:label ?ayahText;
rdfs:label ?ayahEng.
FILTER (lang(?ayahEng) = "en" &&
lang(?ayahText) = "ar")
}}

This could be taken to another level by adding another level of federation and querying the themes of the verse from the QuranOntology.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX hvoc:
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX qur:

select ?hadith_text ?surahNo ?verseNo
?tname
WHERE {
?verse hvoc:isRelatedTo ?hadith;
hvoc:verseNo ?verseNo ;
hvoc:surahNo ?surahNo .
?hadith hvoc:hadithId ?hId;
hvoc:hadithText ?hadith_text .

SERVICE {
?verse qur:DiscussTopic ?t.
?t rdfs:label ?tname.
FILTER(LANGMATCHES(LANG(?tname), "ar"))
}}

This could be further enhanced by automated interlinking with other available datasources on the linked data cloud, as envisioned in Figure 7. For instance, once the available hadith are annotated with the events, places or people they mention, they may be linked to the available entities in DBpedia. This would enable richer knowledge discovery and retrieval for a range of applications.
We expect that, using this model, more hadith and Qur’anic exegesis repositories, which also rely on and heavily cite the hadith sources, will be published in the linked data format. This will enable the design and development of enhanced learning tools for the Islamic domain, which will provide efficient and personalized access to primary sources of knowledge, ensuring reliability and authenticity. Given that these tools will give better access to meaningfully interlinked knowledge, it will require less effort to find resources and access knowledge beyond books. More content, both classical and contemporary, would become discoverable.

6. RELATED WORK
The linked data approach has emerged as the de facto standard for sharing data on the web. It provides a set of best practices for publishing and connecting structured data on the web [9]. The linked data design issues provide guidelines on how to use standardized web technologies to set data-level links between data from different sources [23]. Increased interest in the LOD has been seen in various sectors, e.g. education [11], [31], scientific research [3], libraries [28], [25], government [12], [24], [33], cultural heritage [26] and many others; however, the religious sector has yet to capitalize on the power of linked open data.
Research in computational informatics applied to Islamic knowledge has primarily centered around morphological annotation of the Qur’an [13], [14], ontology modeling of the Qur’an [1], [5], [15], [36], [37], and Arabic natural language processing [15]. The LOD take-up in the area of Islamic knowledge has been extremely limited. As mentioned earlier, there have been some recent efforts to publish Islamic knowledge as linked data on the Linked Open Data (LOD) cloud. These efforts primarily focus on the Qur’an. The two datasets that we consider in our research and attempt to link with our Semantic Hadith research are SemanticQuran 15 [34] and QuranOntology 16 [16].

15 http://datahub.io/dataset/semanticquran
16 http://www.quranontology.com

Much of the work in the Hadith sciences has focused on automating the extraction of the Chain of Narrators. Some works in this regard include [20], [18], [17], [19], [21]. There is also work on mining the hadith for indexing and classification [2], [29]. Some recent efforts have attempted to model the hadith as semantic ontologies [4] [32]. However, these efforts have focused on annotating the different constituents of the hadith. None of these datasources are available as open source.
Our work is the first of its kind to propose a linked data based model for linking hadith with the Qur’an. This linked knowledge forms a vital backbone to enable better integration and discovery of knowledge sources.

7. CONCLUSIONS AND FUTURE WORK
In this paper we presented the design and development of our Semantic Hadith framework, which aims to provide the foundation for semantically interlinking the most important Islamic knowledge sources using the linked data standards. We presented the design of the Semantic Hadith Ontology and explained the nature of its links with other data sources. The implementation still needs to mature. The validation of the links and extracted knowledge is a huge challenge that we are looking into. We are investigating crowdsourcing models for knowledge acquisition and validation at scale.

8. ACKNOWLEDGEMENTS
We are thankful to the authors of sunnah.com for providing us with the valuable datasource to carry out this research. We would also like to acknowledge the efforts of Mr. Muhammad Shoaib (Jeju National University, South Korea) in extending his help with some of the experiments.

9. REFERENCES
[1] H. S. Al-Khalifa, M. Al-Yahya, A. Bahanshal, I. Al-Odah, and N. Al-Helwah. An approach to compare two ontological models for representing quranic words. In Proceedings of the 12th International Conference on Information Integration and Web-based Applications and Services, pages 674–678. ACM.
[2] K. A. Aldhlan and A. M. Zeki. Datamining and islamic knowledge extraction: alhadith as a knowledge resource. In Information and Communication Technology for the Muslim World (ICT4M), 2010 International Conference on, pages 21–25. IEEE, 2010.
[3] T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. Pettifer, and D. Thorne. Utopia documents: linking scholarly literature with research data. Bioinformatics, 26(18):568–574, 2010.
[4] A. Azmi and N. B. Badia. itree-automating the construction of the narration tree of hadiths (prophetic traditions). pages 1–7, 2010.
[5] S. Baqai, A. Basharat, H. Khalid, A. Hassan, and S. Zafar. Leveraging semantic web technologies for standardized knowledge modeling and retrieval from the holy qur’an and religious texts. In Proceedings of the 7th International Conference on Frontiers of Information Technology, FIT ’09, pages 42:1–42:6, New York, NY, USA, 2009. ACM.
[6] A. Basharat, I. B. Arpinar, S. Dastgheib, U. Kursuncu, K. Kochut, and E. Dogdu. Semantically enriched task and workflow automation in crowdsourcing for linked data management.
International Journal of Semantic Computing, application to hadith indexing. In Applications of
8(04):415–439, 2014. Digital Information and Web Technologies, 2008.
[7] A. Basharat, K. Rasheed, and I. B. Arpinar. Towards ICADIWT 2008. First International Conference on
linked open islamic knowledge using human the, pages 107–112. IEEE, 2008.
computation and crowdsourcing. In Proceedings of the [22] S. Hasan. An introduction to the science of Hadith.
International Conference on Islamic Applications in Al-Quran Society, 1994.
Computer Science And Technology, 2015. [23] T. Heath and C. Bizer. Linked data: Evolving the web
[8] A. Basharat, D. Yasdansepas, and K. Rasheed. into a global data space. Synthesis lectures on the
Comparative study of verse similarity for multi-lingual semantic web: theory and technology, 1(1):1–136, 2011.
representations of the qur’an. In Proc. on the Int. [24] J. Hendler, J. Holm, C. Musialek, and G. Thomas. Us
Conference on Artificial Intelligence (ICAI), pages government linked open data: semantic. data. gov.
336–343, 2015. IEEE Intelligent Systems, 27(3):0025–31, 2012.
[9] C. Bizer, T. Heath, and T. Berners-Lee. Linked data - [25] L. C. Howarth. Frbr and linked data: Connecting frbr
the story so far. International Journal on Semantic and linked data. Cataloging and Classification
Web and Information Systems, 5(3):1–22, 2009. Quarterly, 50(5-7):763–776, 2012.
[10] D. Brickley and L. Miller. Foaf vocabulary [26] J. Marden, C. Li-Madeo, N. Whysel, and J. Edelstein.
specification 0.98. Namespace document, 9, 2012. Linked open data for cultural heritage: evolution of an
[11] S. Dietze, S. Sanchez-Alonso, H. Ebner, H. Q. Yu, information technology. pages 107–112, 2013.
D. Giordano, I. Marenzi, and B. P. Nunes. Interlinking [27] A. Miles, B. Matthews, M. Wilson, and D. Brickley.
educational resources and the web of data a survey of Skos core: Simple knowledge organisation for the web.
challenges and approaches. Program-Electronic Library In Proceedings of the 2005 International Conference
and Information Systems, 47(1):60–91, 2013. on Dublin Core and Metadata Applications:
[12] L. Ding, T. Lebo, J. S. Erickson, D. DiFranzo, G. T. Vocabularies in Practice, DCMI ’05, pages 1:1–1:9.
Williams, X. Li, J. Michaelis, A. Graves, J. G. Zheng, Dublin Core Metadata Initiative, 2005.
Z. Shangguan, J. Flores, D. L. McGuinness, and J. A. [28] E. Miller and M. Westfall. Linked data and libraries.
Hendler. Twc logd: A portal for linked open The Serials Librarian, 60(1-4):17–22, 2011.
government data ecosystems. Journal of Web [29] M. Naji Al-Kabi, G. Kanaan, R. Al-Shalabi, S. I.
Semantics, 9(3):325–333, 2011. Al-Sinjilawi, and R. S. Al-Mustafa. Al-hadith text
[13] K. Dukes, E. Atwell, and A. M. Sharaf. Syntactic classifier. Journal of Applied Sciences, 5:584–587, 2005.
annotation guidelines for the quranic arabic [30] A. A. B. Philips. Usool at-Tafseer: The Methodology
dependency treebank. In Proceedings of the of Qur’aanic Explanation. AS Noordeen, 2002.
International Conference on Language Resources and [31] N. Piedra, E. Tovar, R. Colomo-Palacios,
Evaluation, LREC 2010, 17-23 May 2010, Valletta, J. Lopez-Vargas, and J. A. Chicaiza. Consuming and
Malta, 2010. producing linked open data: the case of
[14] K. Dukes and N. Habash. Morphological annotation of opencourseware. Program: electronic library and
quranic arabic. In Proceedings of the International information systems, 48(1):16–40, 2014.
Conference on Language Resources and Evaluation, [32] Y. M. D. Rebhi S. Baraka. Building hadith ontology
LREC 2010, 17-23 May 2010, Valletta, Malta, 2010. to support the authenticity of isnad. International
[15] A. Farghaly and K. Shaalan. Arabic natural language Journal on Islamic Applications in Computer Science
processing: Challenges and solutions. ACM And Technology, 2(1):25–39, 2014.
Transactions on Asian Language Information [33] N. Shadbolt, K. O’Hara, T. Berners-Lee, N. Gibbins,
Processing, 8:1–22, 2009. H. Glaser, and W. Hall. Linked open government
[16] A. Hakkoum and S. Raghay. Ontological approach for data: Lessons from data. gov. uk. IEEE Intelligent
semantic modeling and querying the qur’an. In Systems, 27(3):16–24, 2012.
Proceedings of the International Conference on Islamic [34] M. A. Sherif and A.-C. N. Ngomo. Semantic Quran - a
Applications in Computer Science And Technology, multilingual resource for natural-language processing.
2015. Semantic Web, 6(4):339–345, 2015.
[17] F. Harrag. Text mining approach for knowledge [35] S. Weibel, J. Kunze, C. Lagoze, and M. Wolf. Dublin
extraction in sahih al-bukhari. Computers in Human core metadata for resource discovery. Technical report,
Behavior, 30:558–566, 2014. 1998.
[18] F. Harrag, E. El-Qawasmeh, and A. M. S. Al-Salman. [36] A. R. Yauri, R. A. Kadir, A. Azman, and M. A. A.
Extracting Named Entities from Prophetic Narration Murad. Quranic-based concepts: Verse relations
Texts (Hadith), pages 289–297. Springer, 2011. extraction using manchester owl syntax. In
[19] F. Harrag and A. Hamdi-Cherif. Uml modeling of text Information Retrieval and Knowledge Management
mining in arabic language and application to the (CAMP), 2012 International Conference on, pages
prophetic traditions hadiths. pages 11–20, 2007. 317–321. IEEE.
[20] F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmeh. [37] A. R. Yauri, R. A. Kadir, A. Azman, and M. A. A.
Information retrieval architecture for hadith text Murad. Quranic verse extraction base on concepts
mining. Journal of Digital Information Management, using owl-dl ontology. Research Journal of Applied
6(6), 2008. Sciences, Engineering and Technology,
[21] F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmeh. 6(23):4492–4498, 2013.
Vector space model for arabic information retrieval –