<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Semantic Representation of Igbo Text Using Knowledge Graph</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Nkechi</forename><forename type="middle">J</forename><surname>Ifeanyi-Reuben</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Computer Science Department</orgName>
								<orgName type="institution">Nnamdi Azikiwe University Awka</orgName>
								<address>
									<country key="NG">Nigeria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Patience</forename><forename type="middle">Usoro</forename><surname>Usip</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Computer Science Department</orgName>
								<orgName type="institution">University of Uyo</orgName>
								<address>
									<settlement>Uyo</settlement>
									<country key="NG">Nigeria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Semantic Representation of Igbo Text Using Knowledge Graph</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">2163B6A1F3249AD464555EEAF1B53EEA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T08:18+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Igbo Language</term>
					<term>Text Representation</term>
					<term>Text Classification</term>
					<term>Ontology</term>
					<term>Knowledge Graph</term>
					<term>Artificial Intelligence</term>
					<term>Compound Word</term>
					<term>Semantics</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>With the fast growth of Artificial Intelligence and its application in different areas of Natural Language Processing, semantic representation contributes immensely to smoothing the progress of many automated language processing applications. Semantic representation returns the meaning of a text as it may be understood by humans. Although semantic representation is very useful for several applications, no semantic model has been proposed for the Igbo language, even though the use of Igbo in text-based applications such as text mining, information retrieval and natural language processing is on the increase. Igbo uses compounding in its word formation, and word ordering plays a major role in the language. The uncertainty in dealing with these compound words has made the representation of Igbo text very difficult, so a smart data representation model is needed to enhance the efficiency and effectiveness of Igbo text-based applications. This paper presents an analysis of language classification with respect to Igbo and its compounding nature, and describes a smart model for text representation using a Knowledge Graph. The model creates a smart data repository that captures the real-world usage of text and its entangled context relationships. The proposed Igbo Knowledge Graph (IKG) text representation model was used in an Igbo text classification system. The performance of the classification system is measured by computing the precision, recall and F1-measure of the results obtained on bigram, semantic-based and unigram represented textual documents. The classification on semantic-based represented text has the highest degree of exactness (precision), showing that classification on semantic-based represented Igbo text outperforms classification on bigram and unigram represented texts.
Semantic-based text representation using a knowledge graph is highly recommended for any Igbo text-based system. It enables automated reasoning and addresses the challenges that arise from the Igbo language peculiarities of compounding, word ordering and collocations.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IWMSW-2022: International Workshop on Multilingual Semantic Web, Co-located with the KGSWC-2022, November 21-23, 2022, Madrid, Spain</head><p>* Corresponding author. † These authors contributed equally. Email: nj.ifeanyi-reuben@unizik.edu.ng (N. J. Ifeanyi-Reuben); patienceusip@uniuyo.edu.ng (P. U. Usip). ORCID: 0000-0002-6516-5194 (P. U. Usip).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Text representation is the selection of appropriate features to represent a document <ref type="bibr" target="#b0">[1]</ref>. The way in which text is represented has a strong effect on the performance of any text-based application <ref type="bibr" target="#b1">[2]</ref>, and it is strongly influenced by the language of the text. The spread of Information Technology (IT) in real-life activities has helped incorporate the Igbo language into text-based applications such as text creation, web creation, text mining, information retrieval and natural language processing. This research improves the existing work of <ref type="bibr" target="#b2">[3]</ref> on the analysis and representation of Igbo text for text-based systems by incorporating the semantic representation of the text, in order to create detailed notations that accurately convey its meaning. Semantic representation of textual documents is very rich and is adopted in many applications of Natural Language Processing (NLP), such as machine translation, information retrieval, question answering, text classification, sentiment analysis, text summarisation and text extraction. It reflects the meaning of the text as it may be understood by humans, and thus contributes to facilitating various automated language processing applications. The research in <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b5">6]</ref> emphasized that the semantic representation of Arabic text can facilitate several natural language processing applications such as text summarization and textual entailment. 
Semantic representation can be achieved using a Knowledge Graph (KG). A Knowledge Graph is a way to represent and organize data that is more efficient and easier to modify, use and understand <ref type="bibr" target="#b6">[7]</ref>. It is also referred to as a collection of interlinked descriptions of concepts, entities, relationships and events via linking and semantic metadata, providing a framework for data integration, unification, analytics and sharing. With the widespread growth of Igbo data on the Web, the need for efficient methods to retrieve and organize valuable information from these big, noisy data has increased. This research presents an Igbo Knowledge Graph (IKG) for representing data created in the Igbo language, for better performance of any Igbo text-based application. This smart representation will be useful for many purposes such as question answering, summarization and information retrieval. The model chosen by the researchers will also help to discover unidentified facts and concealed knowledge that may exist in the lexical and semantic relations of an Igbo text corpus.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.">Language classification</head><p>A language is a method of communication between individuals who share a common code, in the form of symbols <ref type="bibr" target="#b7">[8]</ref>. In linguistics, there are two kinds of language classification: genetic (or genealogical) and typological. Genetic, also known as genealogical, classification groups languages into families according to their degree of diachronic relatedness; examples of languages grouped this way are German, English, Dutch, Swedish, Norwegian, Danish, Irish, Welsh, Breton, etc. Typological classification groups languages into types according to their structural characteristics, which can be phonological, morphological or syntactic. Agglutinative languages, one such morphological type, form words by agglutination; examples are Igbo, Turkish, Finnish, Japanese, etc. <ref type="bibr" target="#b8">[9]</ref> Igbo Language: The Igbo language is an agglutinative language, one that forms words by combining smaller morphemes into compound words. It is one of the three major languages in Nigeria and is largely spoken by the people in the eastern part of the country. Igbo has many dialects; the standard Igbo is used formally and is adopted for this research. The current Igbo orthography <ref type="bibr" target="#b7">[8]</ref> is based on Standard Igbo. Orthography is the set of conventions for writing a language. Standard Igbo has thirty-six (36) letters (a, b, ch, d, e, f, g, gb, gh, gw, h, i, ị, j, k, kw, kp, l, m, n, nw, ny, ṅ, o, ọ, p, r, s, sh, t, u, ụ, v, w, y, z). Igbo has a large number of compound words. A compound word is a word that has more than one root and can be made from a combination of nouns, pronouns or adjectives. Ifeanyi-Reuben et al. <ref type="bibr" target="#b7">[8]</ref> studied Igbo compound words and categorized them as follows: i. 
Nominal (NN) Compound Words: A nominal compound word is formed by the combination of two or more nouns. Nominal compound words are written separately regardless of the semantic status of the nouns in Igbo. Examples of Igbo nominal compound words are: nwa akwụkwọ -student; onye nkuzi -teacher; ama egwuregwu -stadium; ụlọ ọgwụ -hospital; ụlọ akwụkwọ -school. ii. Agentive Compound Words: In an agentive compound word, one or more nouns express the meaning of the agent, the doer of the action. Igbo agentive compound words are written separately irrespective of their translations in English. They can also be referred to as VN (Verb-Noun) compound words. Examples: oje ozi -messenger; oti ịgba -drummer. iii. Igbo Duplicated Compound Words: Igbo duplicated compound words are formed by the repetition of the exact word two or more times to express a variety of meanings. For example: ọsọ ọsọ -quickly; mmiri mmiri -watery; ọbara ọbara -reddish. iv. Igbo Coordinate Compound Words: This type of compound word is formed by the combination of two or more words joined by the Igbo conjunction "na", meaning "and" in English. All Igbo compound words of this category are written separately. Examples: Ezi na ụlọ -family; okwu na ụka -quarrel. v. Igbo Proper Compound Words: This category of Igbo compound words includes personal names, place names, and club names. All words in this category are written together regardless of how long they may be. Examples: Uchechukwu; Ngozichukwuka; Ifeanyichukwu. vi. Igbo Derived Compound Words: Derived Igbo compound words are words derived from verbs or phrases; their roots are written together. Example: Dinweụlọ -landlord. Igbo, being an agglutinative language, has a huge number of compound words and can be referred to as a language of compound words. The proposed Igbo Knowledge Graph representation will consider this peculiarity to get a good result. 
</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Works</head><p>Zhang, Yoshida and Tang <ref type="bibr" target="#b9">[10]</ref> studied and compared the performance of TF*IDF and LSI (Latent Semantic Indexing), together with multiple words, for text representation. They used Chinese and English corpora to assess the techniques in information retrieval and text categorization. Their results showed that LSI produced the greatest performance in retrieving English documents and the best performance for Chinese text categorization. Chih-Fong <ref type="bibr" target="#b10">[11]</ref> improved and applied Bag of Words (BoW) to image annotation. Image annotation is used to assign keywords to images automatically, with the images represented by characteristics such as colour, texture and shape. This is applied in Content-Based Image Retrieval Systems (CBIRS), where image retrieval is based on indexed image features. Usip and Ntekop <ref type="bibr" target="#b11">[12]</ref> posited that ontology is a necessary technology tool for easy and intelligent reasoning with knowledge. Being the underlying schema for every knowledge graph, ontology allows this study to improve the existing work of Ifeanyi-Reuben et al. <ref type="bibr" target="#b7">[8]</ref> by adding intelligence using a Knowledge Graph. Ontology-driven applications for multilinguality were described by Usip and Ekpenyong <ref type="bibr" target="#b12">[13]</ref>. Etaiwi and Awajan <ref type="bibr" target="#b13">[14]</ref> proposed SemG-TS, a novel semantic graph embedding-based abstractive text summarization model for the Arabic language that employs a deep neural network to generate abstractive summaries. Their results show that the SemG-TS model outperforms the popular baseline word embedding technique, word2vec.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>Most of the concerns in any text-based system are attributable to text representation, given the peculiarities of the natural language involved. In this section, we propose an efficient and effective model for representing Igbo text that can be adopted by any text-based system. Text representation is the process of transforming an unstructured Igbo textual document into a form suitable for automatic processing. It is a vital step in text processing because it affects the overall performance of the system. The proposed approach for the Igbo text representation process is shown in Figure <ref type="figure" target="#fig_1">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Text Preprocessing</head><p>Text preprocessing involves the tasks performed on text to convert the original natural language text into a structure ready for processing; it performs very important functions in Igbo text processing. Text Normalization: In the normalization process, we transform the Igbo textual document into a format that makes its contents consistent and convenient for efficient processing. We convert all text to lower case and remove diacritics and noisy data, where noisy data is assumed to be data that is not in the Igbo dataset. Text Tokenization: Tokenization is the task of analyzing or separating text into a sequence of discrete tokens (words). Igbo Stop-words Removal: Stop-words are language-specific functional words; the most frequently used words in a language, which usually carry no information. There is no specific number of stop-words that every Natural Language Processing (NLP) tool should have; most language stop-words are pronouns, prepositions and conjunctions. This task removes the stop-words in Igbo text. Some Igbo stop-words are shown in Figure <ref type="figure" target="#fig_2">2</ref>.</p><p>In the proposed system, a stop-word list is created, saved in a file named "stop-words" and loaded into the system whenever the task is to be performed. </p></div>
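The three preprocessing steps above can be sketched as a small pipeline. This is an illustrative sketch, not the paper's implementation: the stop-word entries are assumptions (the proposed system loads its full list from the "stop-words" file), and diacritic stripping is done via Unicode decomposition.

```python
import re
import unicodedata

# Hypothetical stop-word list; the paper loads its full list from a
# "stop-words" file, so these entries are illustrative assumptions.
IGBO_STOPWORDS = {"na", "ka", "ma", "nke"}

def normalize(text: str) -> str:
    """Lower-case the text and strip diacritics (combining marks)."""
    text = text.lower()
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def tokenize(text: str) -> list[str]:
    """Separate text into a sequence of discrete word tokens."""
    return re.findall(r"[^\W\d_]+", text, flags=re.UNICODE)

def preprocess(text: str) -> list[str]:
    """Normalization -> tokenization -> stop-word removal."""
    return [t for t in tokenize(normalize(text)) if t not in IGBO_STOPWORDS]
```

For example, `preprocess("Ezi na ụlọ")` normalizes to "ezi na ulo", tokenizes it, and drops the conjunction "na".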
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Knowledge Graph Text Representation</head><p>Knowledge graphs combine characteristics of several data management paradigms:</p><p>• Database: The data can be explored via structured queries.</p><p>• Graph: The data can be analyzed as any other network data structure.</p><p>• Knowledge base: The model bears formal semantics, which can be used to interpret the data and infer new facts. The Igbo Knowledge Graph will provide a good framework for Igbo data integration, unification, linking and reuse.</p></div>
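The database-style paradigm above can be illustrated with a minimal triple store supporting structured queries. The entities and relation names below are illustrative assumptions drawn from the paper's own compound-word examples, not the actual IKG schema.

```python
# Minimal sketch of a knowledge graph as (subject, predicate, object) triples.
# Entities and relations are illustrative assumptions from the paper's examples.
TRIPLES = {
    ("onye nkuzi", "hasCategory", "Nominal"),
    ("onye nkuzi", "hasMeaning", "teacher"),
    ("onye nkuzi", "hasRoot", "onye"),
    ("onye nkuzi", "hasRoot", "nkuzi"),
    ("onye", "hasMeaning", "person"),
    ("nkuzi", "hasMeaning", "teach"),
}

def query(subject=None, predicate=None, obj=None):
    """Structured query over the triple store; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]
```

A query such as `query("onye nkuzi", "hasRoot")` retrieves the roots of the compound, showing how the same data serves both as a graph and as a queryable database.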
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Sample Igbo Text and the Corresponding Proposed Knowledge Graph</head><p>Given the examples of Igbo compound words in Table 1, it is observed that the actual meaning of a semantically correct Igbo compound word is not the same as the combined meanings of its roots after decomposition into single Igbo words. Hence the need for the compound word categorization.</p><p>Following the categorization of the Igbo compound words, a knowledge graph representation of Igbo words and the various categories is given in Figure <ref type="figure" target="#fig_4">3</ref>.</p><p>The underlying ontology used as the schema for the knowledge graph holds the domain knowledge, which includes the bilingual corpora of Igbo single words and their English meanings, the n-gram modeling feature and the resulting Igbo compound words classified according to the compound word categorization.</p><p>From the knowledge graph, the relationships among the various Igbo compound words, single Igbo words and their English meanings can be determined and used towards the construction of a semantically correct bilingual Igbo-English dictionary consisting of both single and compound Igbo words. With the knowledge graph, missing links between Igbo compound and single words can be determined at a glance, for proper restructuring and fixing, to produce a semantically correct Igbo word. </p></div>
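The missing-link check described above can be sketched as a graph traversal. The triple layout is our illustrative assumption (mirroring Table 1's "Egbe igwe" example), not the paper's implementation.

```python
# Sketch: find compound words whose roots lack an English-meaning edge in the
# graph, i.e. "missing links" to be restructured. Data layout is an assumption.
TRIPLES = {
    ("egbe igwe", "hasRoot", "egbe"),
    ("egbe igwe", "hasRoot", "igwe"),
    ("egbe", "hasMeaning", "gun"),
    # note: no ("igwe", "hasMeaning", ...) triple -> a missing link
}

def missing_links(triples):
    """Return (compound, root) pairs where the root has no hasMeaning edge."""
    with_meaning = {s for (s, p, o) in triples if p == "hasMeaning"}
    return sorted(
        (s, o)
        for (s, p, o) in triples
        if p == "hasRoot" and o not in with_meaning
    )
```

Here the check flags the root "igwe" of "egbe igwe" as lacking an English gloss, the kind of gap a bilingual Igbo-English dictionary builder would then fill.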
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">System Performance Evaluation</head><p>The system performance is evaluated by computing the precision, recall and F1-measure. Precision is defined as the quotient of the total TPs and the sum of the total TPs and FPs; it is known as a measure of correctness.</p><formula xml:id="formula_0">Precision = TP / (TP + FP)<label>(1)</label></formula><p>Recall of the classification system is defined as the quotient of the total TPs and the sum of the total TPs and FNs; it measures completeness.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Recall = TP / (TP + FN)</head><p>(2) The F1-measure is a single metric that combines precision and recall. A high F1-measure indicates that the overall performance of the text classification system is high.</p><formula xml:id="formula_1">F1-Measure = (2 * Precision * Recall) / (Precision + Recall)<label>(3)</label></formula><formula xml:id="formula_2">= 2TP / (2TP + FP + FN)<label>(4)</label></formula><p>In summary, the computation of precision, recall and F1-measure requires four input parameters: TP, FP, TN and FN. i. TP -the number of text documents correctly assigned to a document class. ii. FP -the number of text documents wrongly assigned to a document class. iii. FN -the number of text documents wrongly rejected from a document class. iv. TN -the number of text documents correctly rejected from a document class. These parameters are input to the evaluator and are obtained from the classification result.   </p></div>
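Equations (1)-(4) translate directly into code. As a sanity check, a classifier that assigns 8 documents correctly and 2 incorrectly while rejecting none (precision 0.80 at recall 1.00, as in the bigram result reported later) gives an F1-measure of about 0.89:

```python
def precision(tp: int, fp: int) -> float:
    """Exactness: TP / (TP + FP), per equation (1)."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Completeness: TP / (TP + FN), per equation (2)."""
    return tp / (tp + fn)

def f1_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)
```

Note that TN does not enter any of the three formulas; it is recorded by the evaluator but only affects accuracy-style metrics.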
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Result Analysis</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Conclusion</head><p>An improved intelligent approach for representing Igbo text documents using a Knowledge Graph model, considering the agglutinative nature of the Igbo language, is proposed. This is to solve the issues of collocations, compounding and word ordering that play major roles in the language, thereby making the representation semantically enriched. The model is implemented and evaluated using an Igbo text classification system. The model will be of high commercial potential value and will be useful in any text-based intelligent system for the language. It will also motivate other researchers to develop interest in doing more research on Igbo language processing, to the benefit of people and society.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>Ifeanyi-Reuben et al. [chidiebere2020analysis] present the analysis of an Igbo language text document and describe its representation with the word-based n-gram model. The results show that the bigram and trigram text representation models perform better than the unigram model. Wael and Arafat [3] proposed a graph-based semantic representation model for Arabic text that aims to extract the semantic relations between Arabic words. The results proved that the proposed graph-based model is able to enhance the performance of the textual entailment recognition task in comparison to other baseline models.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Igbo Text Representation Process.</figDesc><graphic coords="5,89.29,84.19,507.75,353.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Sample of Igbo Stop-words list.</figDesc><graphic coords="6,166.39,84.19,262.50,90.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4</head><label>4</label><figDesc>Figure 4 is the designed Igbo knowledge graph model showing all the processes employed to represent an Igbo textual document based on its semantics (reasoning) using a knowledge graph.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Knowledge Graph Representation of Sample Igbo Compound.</figDesc><graphic coords="8,89.29,84.19,430.50,280.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Igbo Knowledge Graph Model</figDesc><graphic coords="9,89.29,84.19,726.00,459.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 5</head><label>5</label><figDesc>Figure 5 displays the text classification module of the system used to test the effectiveness of the proposed model. The result obtained in text classification on Igbo text represented semantically using the knowledge graph is compared with the results obtained with the unigram and bigram text representation models. Table 3 and Figure 6 show the classification performance measure result and chart respectively. The recall, precision and F1 for bigram-represented Igbo text are 1.00, 0.80 and 0.89 respectively; for semantic-based Igbo text, 1.00, 0.90 and 0.95; and for unigram-represented Igbo text, 1.00, 0.62 and 0.82. Recall evaluates the degree of completeness. Igbo text classification on text represented with the three models (bigram, semantic-based and unigram) has the same level of recall (completeness), meaning that all the text documents given to the classifier were assigned a label. Precision measures the degree of exactness; the classification with the semantic-based representation has the highest degree of exactness (0.90). Table 2 gives the summary of the classification results obtained with the bigram, semantic-based and unigram text representations. A total of 10 test documents were used for the experiment. With the bigram representation, eight documents were correctly assigned a class label while two were incorrectly assigned. With the semantic-based representation using the knowledge graph, nine documents were correctly assigned a class label while one was incorrectly assigned. With the unigram representation, seven documents were correctly assigned a class label while three were incorrectly assigned.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head></head><label></label><figDesc>The performance was measured by computing the classification accuracy of the bigram, semantic-based and unigram represented texts. The results showed that classification performed on semantic-based represented text has higher performance than on bigram and unigram represented texts, demonstrating that a high-quality text representation model certainly boosts the performance of NLP tasks.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Igbo Compound Words<ref type="bibr" target="#b7">[8]</ref> </figDesc><table><row><cell>Igbo Compound Words</cell><cell>Meaning</cell><cell>Roots and meaning</cell><cell>Compound Word Category</cell></row><row><cell></cell><cell></cell><cell>Onye -Person</cell><cell></cell></row><row><cell>Onye nkuzi</cell><cell>Teacher</cell><cell>Nkuzi -Teach</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>Ezi -surrounding</cell><cell></cell></row><row><cell></cell><cell></cell><cell>Na -and</cell><cell></cell></row><row><cell>Ezi na ụlọ</cell><cell>Family</cell><cell>ụlọ -house</cell><cell>Coordinate</cell></row><row><cell></cell><cell></cell><cell>Ojiiego -use money</cell><cell></cell></row><row><cell>Ojiiegoachọego</cell><cell>businessman</cell><cell>achọego -find money</cell><cell>Derived</cell></row><row><cell></cell><cell></cell><cell>ụgbọ -vessel</cell><cell></cell></row><row><cell>ụgbọ ala</cell><cell>Car, motor</cell><cell>ala -land (road)</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>Egbe -gun</cell><cell></cell></row><row><cell>Egbe igwe</cell><cell>Thunder</cell><cell>Igwe -sky</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>Iri -ten</cell><cell></cell></row><row><cell>Iri abụọ</cell><cell>Twenty</cell><cell>Abụọ -two</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>Ode -Write</cell><cell></cell></row><row><cell>Ode akwụkwọ</cell><cell>Secretary</cell><cell>Akwụkwọ -book</cell><cell>Agentive</cell></row><row><cell></cell><cell></cell><cell>Ebere -mercy</cell><cell></cell></row><row><cell>Eberechukwu</cell><cell>God's mercy</cell><cell>Chukwu -God</cell><cell>Proper</cell></row><row><cell>Mmiri mmiri</cell><cell>Watery</cell><cell>Mmiri -water</cell><cell>Duplicate</cell></row><row><cell>ọcha ọcha</cell><cell>Whitish</cell><cell>ọcha -white</cell><cell>Duplicate</cell></row><row><cell></cell><cell></cell><cell>Onye -person</cell><cell></cell></row><row><cell>Onye nchekwa</cell><cell>Administrator</cell><cell>Nchekwa -protect</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>Kọmputa -Computer</cell><cell></cell></row><row><cell>Kọmputa Nkunaka</cell><cell>Laptop</cell><cell>Nkunaka -Handcarry</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>ọkpụ -mold</cell><cell></cell></row><row><cell>Ọkpụ ụzụ</cell><cell>Blacksmith</cell><cell>ụzụ -clay</cell><cell>Agentive</cell></row><row><cell></cell><cell></cell><cell>Nche -protect</cell><cell></cell></row><row><cell>Nche anwụ</cell><cell>Umbrella</cell><cell>Anwụ -sun</cell><cell>Agentive</cell></row><row><cell></cell><cell></cell><cell>Onyonyo -screen</cell><cell></cell></row><row><cell>Onyonyo kọmputa</cell><cell>Monitor</cell><cell>Kọmputa -computer</cell><cell>Nominal</cell></row><row><cell></cell><cell></cell><cell>Okwu -speech</cell><cell></cell></row><row><cell>Okwu ntughe</cell><cell>Password</cell><cell>Ntughe -opening</cell><cell>Nominal</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3</head><label>3</label><figDesc>Classification performance measure results for the bigram, semantic-based and unigram representations.</figDesc><table /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Acknowledgments</head><p>The authors wish to express gratitude to the anonymous reviewers of this work for their useful comments and contributions, which helped enhance the quality of this paper.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0" />			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Text classification improved through multigram models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-T</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th ACM international conference on Information and knowledge management</title>
				<meeting>the 15th ACM international conference on Information and knowledge management</meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="672" to="681" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Representation quality in text classification: An introduction and experiment</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">D</forename><surname>Lewis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Speech and Natural Language: Proceedings of a Workshop Held at Hidden</title>
				<meeting><address><addrLine>Valley, Pennsylvania</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1990">June 24-27, 1990</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Graph-based arabic text semantic representation</title>
		<author>
			<persName><forename type="first">W</forename><surname>Etaiwi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Awajan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="page">102183</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Tools, Languages, Methodologies for Representing Semantics on the Web of Things</title>
		<author>
			<persName><forename type="first">S</forename><surname>Tiwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Siarry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mehta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jabbar</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2022">2022</date>
			<publisher>John Wiley &amp; Sons</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">AISHE-Onto: A semantic model for public higher education universities</title>
		<author>
			<persName><forename type="first">R</forename><surname>Panchal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Swaminarayan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tiwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ortiz-Rodriguez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The 22nd Annual International Conference on Digital Government Research</title>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="545" to="547" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">MEXIN: Multidialectal ontology supporting NLP approach to improve government electronic communication with the Mexican ethnic groups</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ortiz-Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tiwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Panchal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Medina-Quintero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Barrera</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The 23rd Annual International Conference on Digital Government Research</title>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="461" to="463" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">A</forename><surname>Ahmed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">N</forename><surname>Al-Aswadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Noaman</surname></persName>
		</author>
		<title level="m">Arabic knowledge graph construction: a close look in the present and into the future</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
		<respStmt>
			<orgName>Journal of King Saud University-Computer and Information Sciences</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">U</forename><surname>Chidiebere</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tunde</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2009.06376</idno>
	<title level="m">Analysis and representation of Igbo text document for a text-based system</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The writing of standard Igbo</title>
		<author>
			<persName><forename type="first">M</forename><surname>Onukawa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Readings in Citizenship Education</title>
				<editor>
			<persName><forename type="first">O</forename><forename type="middle">O</forename><surname>Okereke</surname></persName>
		</editor>
				<meeting><address><addrLine>Okigwe</addrLine></address></meeting>
		<imprint>
			<publisher>Wythem Publishers</publisher>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Text classification based on multi-word with support vector machine</title>
		<author>
			<persName><forename type="first">W</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yoshida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Tang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge-Based Systems</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="page" from="879" to="886" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Bag-of-words representation in image annotation: A review</title>
		<author>
			<persName><forename type="first">C.-F</forename><surname>Tsai</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Scholarly Research Notices</title>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">The use of ontologies as efficient and intelligent knowledge management tool</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">U</forename><surname>Usip</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ntekop</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Future Technologies Conference (FTC), IEEE</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="626" to="631" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Towards ontology-driven application for multilingual speech language therapy</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">U</forename><surname>Usip</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Ekpenyong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Human Language Technologies for Under-Resourced African Languages</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="85" to="101" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">SemG-TS: Abstractive Arabic text summarization using semantic graph embedding</title>
		<author>
			<persName><forename type="first">W</forename><surname>Etaiwi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Awajan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Mathematics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page">3225</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
