Knowledge gap discovery: A case study of Wikidata Millenio Ramadizsa1 , Fariz Darari1,2 , Werner Nutt3 and Simon Razniewski4 1 Faculty of Computer Science, Universitas Indonesia, Indonesia 2 Tokopedia-UI AI Center of Excellence, Indonesia 3 Free University of Bozen-Bolzano, Italy 4 Bosch Center for AI, Germany Abstract Society, science, and economy are becoming more and more data-driven, and therefore the study of gaps in knowledge gains importance. The arguably most prominent public source of structured knowledge is Wikidata, which contains impressive amounts of knowledge, but nonetheless comes with surprising gaps. In this paper we propose a framework for identifying class-level knowledge gaps in Wikidata, based on the concepts of gap properties, i.e., properties that mostly exist for prominent entities, but are missing in the tail, and the gap property ratio. We conduct an analysis for a varied set of 20 classes, and show that our framework can discover unexpected knowledge gaps, that may guide contributors towards addressing them. 1. Introduction Society, science, and economy are getting increasingly reliant on data in decision making. While the overall trend is that data and knowledge are vastly and constantly collected, stored, and processed, this does not happen at an equal rate: domains, topics, or subjects receive uneven coverage. These imbalances have fueled a whole research field that uncovers them, most notably in terms of coverage of genders [1, 2, 3], citizenships [4], and individual entity assertions [5]. Whether observed imbalances really reflect a bias of editors or imbalances of the real-world, is often difficult to disentangle [6], and one should therefore be cautious with drawing conclusions. Nonetheless, awareness of knowledge gaps, and their characterization, is a first step towards investigating possible root causes, and is therefore a crucial task. In this paper, we propose to identify and characterize knowledge gaps of classes in knowledge graphs via the concept of gap properties, which are properties, that are frequently present among the “information-richest” entities in a class, but largely absent from the “poor” ones. Gap properties can then be used (𝑖) to identify classes with large gaps (that is, classes, where most properties are gap properties), and (𝑖𝑖) to characterize which properties constitute the imbalances. Our contributions are three-fold: 1. We introduce the concept of gap properties for knowledge graphs, and show how they can be used to identify and characterize class-level knowledge gaps in knowledge graphs. Wikidata’23: Wikidata workshop at ISWC 2023 © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings 2. We perform a case study on 20 Wikidata classes, showing that gap properties provide an intuitive way to understand knowledge imbalances. 3. We analyze the nature of knowledge gaps, finding that on general classes, these mostly relate to generic, logically existing properties, that could in principle be added, while on specific classes, optional achievements stand more out. 2. Related Work Inequality, bias, and fairness Observing, explaining, and criticizing inequality are fun- damental to human society [7, 8]. Notably, the Gini coefficient [8] has become a widespread tool to measure economic inequality in a single number. Inequalities and biases also affect fairness in the information society: it is a widely held belief that inequalities are able to reinforce themselves, e.g., in terms of representations of genders, ethnicities, or educational backgrounds, in various professional roles. As such, society may consider it important to identify inequalities, and to actively tackle them. Bias and gaps in Wikipedia and Wikidata With the emergence of public world knowledge repositories, the question of their fairness and gaps has emerged. The Wikimedia Foundation considers knowledge gaps a core challenge in their 2030 agenda [9]. A considerable set of studies have focused not only on gender imbalances in Wikipedia [1, 2, 3], but also languages [10], nationalities [4], or individuals [5]. Most interestingly, a study by Abian et al. suggested a methodology to disentangle editor-based gaps from readership-interest-based gaps [6]. The structured format of Wikidata makes statistical analyses especially easy, yet also com- plicates their interpretation, since gaps may merely stem from technical reasons. A set of tools for helping editors with knowledge gaps in Wikidata have been developed. COOL-WD can be used to display insights about the completeness of subject-property-pairs [11]. ReCoin enables completeness information relative to other entities in the same class [5]. It adds a traffic-light-style status indicator, and lists frequent absent properties. ProWD is a framework and web application tool for profiling the completeness of Wikidata [12, 13]. It notably provides information about group-level knowledge distribution via the Gini index. Our present proposal builds upon the ProWD framework: While the Gini index in ProWD can only provide numeric insights into imbalances, with gap properties, we aim to characterize them. 3. Formal Framework Knowledge Graphs Knowledge graphs (KGs) organize assertions about entities, such as Ada Lovelace worked as a computer scientist, or Ada Lovelace was born on 10 December 1815. They are stored as triples like (AdaLovelace, occupation, ComputerScientist), or (AdaLovelace, dateOfBirth, “10 December 1815”). The triples are made up of three kinds of building blocks: items, properties, and literals. In the triples above, AdaLovelace and ComputerScientist are items, “10 December 1815” is a literal, and occupation and dateOfBirth are properties. Items represent entities, such as people, cities, organizations, and concepts. Literals are scalar values, such as numbers, strings, or dates. Properties are used to assert that entities are related to other entities or atomic values. Such triples are called statements. They have the generic form (𝑠, 𝑝, 𝑜), where 𝑠 plays the role of the subject of the statement, 𝑝 of the predicate, and 𝑜 of the object. Subjects are always items, predicates are properties, while objects can be either items or literals. A KG is a set of such triples. We refer the reader to [14] for the Resource Description Framework (RDF), a standard data model for KGs. In Wikidata, items and literals can be freely created by KG editors. The set of properties, however, is carefully administered. New properties can only be created by a community consensus, since properties constitute the language in which statements are formulated. Classes KGs have special items that represent classes. Examples in Wikidata are Human, Country, Painting, or Object. That an item 𝑎 is an instance of a class 𝐶 is expressed by the statement (𝑎, instanceOf, 𝐶). In this paper we apply a more general notion of class. For us, a class is defined by a class item and possibly additional conditions. Technically, this is achieved by means of queries of the form (?𝑣, 𝑃 ), consisting of a set 𝑃 of triple patterns (i.e., like triples but with the addition of variables) [15], one of which is (?𝑣, instanceOf, 𝐶), plus a projection ?𝑣 onto a single variable. As an example, we can define the class of computer scientists by the query (?𝑣, { (?𝑣, instanceOf, Human), (?𝑣, occupation, ComputerScientist) }). Instantiating that class over the Wikidata graph returns as instances Ada Lovelace and Tim Berners-Lee, among others. We can make such a subclass more specific by adding more triple patterns, defining for instance all computer scientists of a certain nationality. Information Wealth of Entities Let 𝐺 be a KG and 𝑎 an item in 𝐺. Moreover, let 𝑇𝑎 be the set of triples where 𝑎 appears as the subject. In a way, the set 𝑇𝑎 comprises all the information about (the entity represented by) 𝑎 that is provided by 𝐺: it constitutes the wealth of information about 𝑎 in 𝐺. For example, the wealth of Ada Lovelace in Wikidata comprises all Wikidata triples with Ada Lovelace appearing in the subject position, and that the triples describe a range of Ada Lovelace’s properties from image and birth name to audio and plaque image. To gauge the information wealth of 𝑎 in 𝐺, we can can apply various measures to 𝑇𝑎 . A straightforward measure is the cardinality of 𝑇𝑎 . Another one is the number of properties occurring in 𝑇𝑎 . While the first measure can be skewed by a large number of statements about the same property, the second reflects the variety of information about an entity. It seems natural to expect that often all or at least most true statements about a specific property of an item have been entered into a KG if the property is present. For this reason, we concentrate on the number of distinct properties occurring in 𝑇𝑎 , denoted as np(𝑎), as the measure of information wealth of an entity 𝑎. For a given class 𝐶, one can study how the values np(𝑎) are distributed for the items 𝑎 in 𝐶. In previous work [13], the distribution of information wealth within classes of items has been analyzed in terms of Gini coefficients. By adopting the Gini coefficient formula for income distribution [16], one may measure the degree of inequality of the information wealth of items in a class (e.g., computer scientist, sovereign state), ranging from 0.0 (i.e., the perfect equality) to 1.0 (i.e., the perfect inequality). However, the Gini coefficient alone does not tell in which way statements of poor items differ from those of rich ones, which is the focus of the present paper. Properties Associated with Information Richness and Poverty The concepts below are defined with respect to a given class 𝐶 in a KG 𝐺. We consider 𝐶 as the set of its instances. To keep things simple, we do not mention 𝐶 explicitly, but assume that it is clear from the context. We can sort the elements 𝑎 of 𝐶 in ascending order with respect to np(𝑎). For each integer (𝑘) (𝑘) 𝑘 > 0, we divide 𝐶 into 𝑘 quantiles with respect to this order, which we denote as 𝑄1 , . . . , 𝑄𝑘 . Typical values of 𝑘 are 4, 5, and 10, where we speak of quartiles, quintiles, or deciles. We want to find out which properties are typical for which quantile, in particular, which properties are (𝑘) (𝑘) typical for the top and bottom quantile 𝑄𝑘 and 𝑄1 for a given 𝑘. We often denote the top quantile as rich and the bottom quantile as poor. We adapt some vocabulary from the field of association rule mining. We denote the cardinality of a set 𝑋 as |𝑋|. Let 𝑝 be a property of the KG 𝐺. The domain of 𝑝 is the set of items that appear as subject of some statement with predicate 𝑝, that is dom(𝑝) = { 𝑎 | (𝑎, 𝑝, 𝑜) ∈ 𝐺 for some item or literal 𝑜 }. Suppose 𝑆 is some subset of interest of 𝐶, for instance a union of 𝑘-quantiles like 𝑆 = poor∪rich or simply 𝑆 = 𝐶. The support of 𝑝 relative to 𝑆 is the proportion of items in 𝑆 that are in the domain of 𝑝, that is, |dom(𝑝) ∩ 𝑆| supp𝑆 (𝑝) = . |𝑆| (𝑘) Let 𝑄 be some quantile 𝑄𝑖 of 𝐶. We call 𝑄 → 𝑝 an association rule between 𝑄 and the property 𝑝. For example, rich → dateOfBirth is such a rule. The confidence in the rule 𝑄 → 𝑝 is defined as |dom(𝑝) ∩ 𝑄| conf (𝑄 → 𝑝) = . |𝑄| Clearly, conf (𝑄 → 𝑝) and supp𝑄 (𝑝) are numerically equivalent. Finally, we introduce the lift of the rule 𝑄 → 𝑝 relative to 𝑆, where 𝑄 ⊆ 𝑆. This is defined as conf (𝑄 → 𝑝) lift𝑆 (𝑄 → 𝑝) = . supp𝑆 (𝑝) Note that the definition of relative lift puts an upper bound on the value liftpoor∪rich (rich → 𝑝). The lift is maximal if all items in the rich quantile have property 𝑝 and none in the poor quantile. Then conf (rich → 𝑝) = 1 while supppoor∪rich (𝑝) = 0.5, so that liftpoor∪rich (rich → 𝑝) = 2. The other extreme occurs if no rich item has property 𝑝 while some poor items do have 𝑝. In that case conf (rich → 𝑝) = 0, hence liftpoor∪rich (rich → 𝑝) = 0. Gap Properties We are especially interested in properties that occur with some minimum frequency in a class, and are abnormally frequent within the richest decile. Concretely, we call a property 𝑝 a gap property if supp𝑝𝑜𝑜𝑟∪rich (𝑝) ≥ 0.1 and liftpoor∪rich (rich → 𝑝) ≥ 1.5. The support threshold is defined as such in order to eliminate a fair number of spurious properties, that is, properties that might appear as gap properties by accident. On the other hand, the lift threshold is determined based on the middle point of the spectrum between total equality (i.e., lift of 1.0) and total inequality (i.e., lift of 2.0). To be able to compare gaps between classes in a normalized context, we also introduce the gap property ratio (GPR), which we define as the fraction of properties 𝑝 with supppoor∪rich (𝑝) ≥ 0.1 that are gap properties. 4. Experimental Evaluation Experimental Setup We aim to conduct a gap analysis for a varied set of 20 classes and show that gap properties can provide insights to understand knowledge imbalances for a real-world knowledge graph. We use an RDF version of a Wikidata dump dated September 30, 2020. More specifically, we take the truthy subset of that dump, focusing on a direct representation of Wikidata assertions without considering qualifiers and references.1 Moreover, we omit external identifier properties2 as this allows us to concentrate on properties that describe and link entities from within Wikidata (as opposed to external data sources). Without the removal of such properties, our gap analysis would tend to be heavily biased towards well-known entities in relation to external parties. As formalized in Section 3, in our experiments we divide our classes of interest into 10 quantiles (that is, deciles) and perform a gap analysis relative to the poor and rich quantiles. Our experiment program is developed using Java for the data preprocessing part, and Python for the analysis part. We rely on the Java-based Apache Jena library3 for processing our RDF data and making it available through SPARQL querying [15]. Once we get hold of the data, we then use our Python program to analyze gap properties. The program is based on SPARQLWrapper4 for querying RDF, pandas5 for data analysis, and tqdm6 for tracking the running progress of the program. Our code for the experiment is available at https://github.com/millerama17/ association-analysis. 4.1. Gap Property Analysis We report on an overview of our gap property analysis over 20 Wikidata classes: human, com- puter scientist (CS), American CS, German CS, Indonesian CS, football player (FP), Bundesliga FP, Premier League FP, language, painting, painting at the Museum of Modern Art (MoMA), painting at the Louvre, public university, sovereign state, gene, galaxy, taxon, movie, song, and 1 https://www.wikidata.org/wiki/Wikidata:Database_download 2 https://www.wikidata.org/wiki/Wikidata:External_identifiers 3 https://jena.apache.org/ 4 https://github.com/RDFLib/sparqlwrapper 5 https://pandas.pydata.org/ 6 https://pypi.org/project/tqdm/ Table 1 Top 10 Most Frequent Gap Properties from 20 Wikidata Classes Rank Gap Property Count 1 image 12 2 Commons category 11 3 name in native language 7 4 award received 6 5 date of death 6 6 family name 6 7 languages spoken, written or signed 6 8 official website 6 9 place of birth 6 10 described by source 5 Table 2 Top 10 Most Frequent Gap Properties from Human-associated Classes: Human, Computer Scientist (CS), American CS, German CS, Indonesian CS, Football Player (FP), Bundesliga FP, Premier League FP Rank Gap Property Count 1 Commons category 7 2 image 7 3 name in native language 7 4 award received 6 5 date of death 6 6 family name 6 7 languages spoken, written or signed 6 8 place of birth 6 9 official website 5 10 place of death 4 Table 3 Top 10 Most Frequent Gap Properties from Non-Human Classes: Language, Painting, Painting at the Museum of Modern Art, Painting at the Louvre, Public University, Sovereign State, Gene, Galaxy, Taxon, Movie, Song, Star Rank Gap Property Count 1 genre 5 2 image 5 3 title 5 4 Commons category 4 5 catalog code 3 6 main subject 3 7 topic’s main category 3 8 coordinate location 2 9 native label 2 10 subclass of 2 star. The selection of the classes is based on the following considerations: human vs. non-human, class vs. subclass, and (in terms of Gini coefficient) equal vs. inequal. The complete list of the gap properties of the 20 classes is available at https://bit.ly/GPR20Classes. Table 1 ranks the 10 most frequent gap properties out of the 20 classes. Furthermore, Table 2 and Table 3 show the 10 most common gap properties for human-associated classes and non-human classes, respectively. Characterization of gap properties Among the 20 classes examined, a few intriguing observations emerge. In Table 1, “image” (P18) and “Commons category” (P373) are the two most frequent gap properties. Both properties are similar in the sense that they are related to providing images or multimedia files for Wikidata items and could in principle apply to all 20 classes. By being gap properties, it means that the properties “image” and “Commons category” are commonly found in rich Wikidata items but not so in poor ones. The rest of the gap properties in Table 1 are dominated by properties to describe humans as can be confirmed by the content of Table 2 about the top 10 most frequent gap properties from human-associated classes. As for Table 2, we gather a number of remarks. The most notable gap property is “place of birth” (P19), which in principle indeed every human possesses. The designation of “date of death” (P570), “family name” (P734), and “name in native language” (P1559) as gap properties is likely due to the fact that not all humans have passed away, not every culture adheres to the tradition of family names, and some individuals do not have a name variation in their native language. Table 3 lists top 10 most frequent gap properties for non-human classes. The properties “genre” (P136), “image” (P18), and “title” (P1476) are among the most common gaps. Note that “genre” and “title” (P1476) are gap properties for classes related to creative work like painting (and its subclasses), movie, and song. The gap property of “catalog code” (P528) is exclusive to paintings (and its subclasses). Moreover, “coordinate location” (P625) is found to be a gap property for public university and, interestingly, language. Gap properties on classes vs. subclasses We also discuss gap properties in the context of class-subclass relationships. Take, for example, the class of computer scientist. There are 11 gap properties of human that are also gap properties of computer scientist, such as “award received” (P166), “country of citizenship” (P27), and “image” (P18). In total, exactly half of all gap properties of computer scientist inherit from those of human. Gap properties of human that are not found in computer scientist include “sport” (P641) and, interestingly, “date of birth” (P569). On the other side, gap properties of computer scientist that are not found in human are, e.g., “Erdős number” (P2021), “doctoral student” (P185), and “field of work” (P101). It is apparent, therefore, that gap properties on more specific subclasses more often are properties that describe specific achievements inside the class, which not necessarily every class member possesses. In contrast, gap properties on more general classes more typically reflect just KG incompleteness. We now take another example, the class of painting. Out of 11 gap properties of painting at MoMA, as many as 9 are inherited from those of painting. Furthermore, the class of painting at Figure 1: Gap Property Ratio (GPR) vs. Gini Coefficient for 20 Wikidata Classes the Louvre inherits 7 gap properties from the class of painting; this accounts for 64% of all gap properties in the class of painting at Louvre. Gap properties existing in painting at MoMA but not existing in painting are “copyright holder” (P3931) and “country of origin” (P495), while gap properties found in painting at the Louvre but not found in painting are “Commons category” (P373), “depicts Iconclass notation” (P1257), “exhibition history” (P608), and “movement” (P135). Again, we find that several of these specialized properties do not apply to every item, i.e., they do not necessarily reflect an actionable incompleteness of the investigated knowledge graph. Gap Property Ratio Figure 1 reveals a clear positive correlation between the Gap Property Ratio (GPR) and the Gini coefficient for each class.7 Specifically, a higher GPR value corresponds to a higher Gini coefficient value. To quantify this correlation, we apply the Spearman correlation, resulting in the value of 0.95.8 Given such a high correlation value, it is evident that the GPR can be effectively used to gauge the level of imbalance within a Wikidata class. Another interesting finding relates to the relationship between classes and their subclasses. Subclasses tend to exhibit lower GPRs and Gini coefficients compared to their superclasses. For instance, the classes “computer scientist” and “football player” possess lower GPR and Gini coefficient values than the broader class of “human”. Similarly, more specific classes such as “American CS”, “German CS”, “Indonesian CS”, “Bundesliga FP”, and “Premier League FP” show lower GPR and Gini coefficients than their more general classes. This pattern likely arises because general classes encompass a wider array of entities from diverse backgrounds, thus leading to larger gaps. Conversely, filtered or specific classes consist of entities from a more homogeneous group, usually sharing a greater number of common properties. 4.2. Case Studies We investigate more deeply gaps in the classes of computer scientist and sovereign state. We choose these two classes to showcase an imbalanced class vs. a more balanced one, which we 7 We refer the reader to https://bit.ly/GPR20Classes for details about the GPR and Gini coefficient of each class. 8 We choose the Spearman method because it does not rely on linearity nor normality. Table 4 Top 10 Gap Properties from Computer Scientists and Sovereign States Rank Computer Scientist Sovereign State 1 notable work coordinates of geographic center 2 name in native language median income 3 Erdős number patron saint 4 place of death seal description 5 doctoral student category of people buried here 6 Commons category archives at 7 described by source compulsory education (maximum age) 8 residence age of candidacy 9 languages spoken, written or signed studied by 10 member of water as percent of area will later show through our gap property analysis. Computer Scientist Table 4 lists top 10 gap properties of computer scientists as well as sovereign states, ordered in descending order by the lift values. There are a few noteworthy gap properties from computer scientists. Examples include academic-related properties such as “notable work” (P800), “Erdős number” (P2021), and “doctoral student” (P185). This implies that “poor” computer scientists have no information about, e.g., their notable work. Indeed, the property of “notable work” heavily depends on the popularity of computer scientists (and hence, less popular computer scientists might not have any notable work). "place of death" (P20) and "residence" (P551) are common properties for humans (and hence for computer scientists), yet not all humans have passed away, and not all humans (esp. non-public figures) are willing to share their residence information (due to privacy issues). The computer scientist class has 22 gap properties out of 27 properties with support ≥ 0.1. The GPR value is therefore 22/27, or approximately 0.81. It has a pretty high GPR which means that most properties of computer scientists are gap properties. Sovereign State As shown in Table 4, the top-2 gap properties of sovereign states are “coordinates of geographic center” (P5140) and “median income” (P3529), which are actually attributes that all sovereign states should have in the real world. Next, the property of “patron saint” (P417) is however only applicable to countries related to Christianity. Other properties rarely owned by “poor” sovereign states but frequently occurring in “rich” ones include “seal description” (P418),9 “category of people buried here” (P1791), and “archives at” (P485). In an absolute number, there are much higher gap properties in sovereign states compared to computer scientists, that is, 60 vs. 22, respectively. This, however, does not mean that the sovereign state class is more imbalanced than computer scientist. On the contrary, due to the larger number of sovereign state properties with support ≥ 0.1, totaling at 146 properties, the GPR of the sovereign state class is indeed pretty low, that is, 60/146, or approximately 0.41 (as opposed to 0.81, the GPR of the computer scientist class). 9 It is now called “has seal, badge, or sigil”. As lessons learned, imbalances in Wikidata classes can be effectively measured using gap properties and the GPR. Gap properties are useful for identifying which properties constitute the wealth separation between the poor and rich groups in a class, while the GPR provides a tool for comparing the gap level among classes in a normalized context. Findings from our framework may raise the prevasive issue of knowledge gaps, that the problem is real, and actions can be taken by the editors and community to mitigate such an issue. Our framework highlights the essential properties that need data completion, ruling out irrelevant ones, such as “military rank” (P410) in the computer scientist class. Our analysis allows us to understand knowledge gaps and spark initiatives to manage them effectively, promoting accuracy and efficiency in data completion. 5. Conclusions We have proposed a framework to discover knowledge gaps in Wikidata classes. We have introduced the concepts of gap properties and the gap property ratio (GPR) that can be useful to give insights as to which properties constitute the gaps between the poor and rich groups of a Wikidata class and measure (and compare) the gap level among classes. Our experimental evaluation of gap analysis over 20 classes of Wikidata has shown that knowledge gaps do exist and that awareness to such an issue can be approached scientifically. Especially to Wikidata researchers and contributors, this tool can help them address this phenomenon more swiftly and accurately by identifying the essential properties in Wikidata that constitute imbalances and addressing them accordingly. Acknowledgement This work has been partially supported by the project CONFUCIUS, funded by the Free Univer- sity of Bozen-Bolzano. References [1] J. Reagle, L. Rhue, Gender Bias in Wikipedia and Britannica, International Journal of Communication 5 (2011) 1138–1158. [2] C. Wagner, D. Garcia, M. Jadidi, M. Strohmaier, It’s a Man’s Wikipedia? Assessing Gender Inequality in an Online Encyclopedia, in: ICWSM, 2015. [3] F. Tripodi, Ms. Categorized: Gender, notability, and inequality on Wikipedia, New Media & Society 25 (2023) 1687–1707. [4] Z. Shaik, F. Ilievski, F. Morstatter, Analyzing Race and Citizenship Bias in Wikidata, in: MASS, 2021. [5] V. Balaraman, S. Razniewski, W. Nutt, Recoin: Relative Completeness in Wikidata, in: WWW (Companion Volume), 2018. [6] D. Abián, A. Meroño-Peñuela, E. Simperl, An Analysis of Content Gaps Versus User Needs in the Wikidata Knowledge Graph, in: ISWC, 2022. [7] K. Marx, Das Kapital, Verlag von Otto Meissner, 1867. [8] C. Gini, Variabilità e Mutabilità: Contributo allo Studio delle Distribuzioni e delle Relazioni Statistiche, P. Cuppini, 1912. [9] M. Redi, I. Johnson, M. Gerlach, L. Zia, Address Knowledge Gaps, Three Years On, Wiki- media Foundation, 2022. [10] J. M. Dolmaya, Expanding the sum of all human knowledge: Wikipedia, translation and linguistic justice, The Translator 23 (2017) 143–157. [11] R. E. Prasojo, F. Darari, S. Razniewski, W. Nutt, Managing and Consuming Completeness Information for Wikidata using COOL-WD, in: COLD@ISWC, 2016. [12] A. Wisesa, F. Darari, A. Krisnadhi, W. Nutt, S. Razniewski, Wikidata Completeness Profiling Using ProWD, in: K-CAP, 2019. [13] N. H. Ramadhana, F. Darari, P. O. H. Putra, W. Nutt, S. Razniewski, R. I. Akbar, User- Centered Design for Knowledge Imbalance Analysis: A Case Study of ProWD, in: VOILA@ISWC, 2020. [14] G. Schreiber, Y. Raimond (Eds.), RDF 1.1 Primer, W3C Working Group Note, 24 June 2014. https://www.w3.org/TR/rdf11-primer/. [15] S. Harris, A. Seaborne (Eds.), SPARQL 1.1 Query Language, W3C Recommendation, 21 March 2013. https://www.w3.org/TR/sparql11-query/. [16] A. Sen, On Economic Inequality, Oxford University Press, 1997.