=Paper=
{{Paper
|id=Vol-2293/jist2018pd_paper7
|storemode=property
|title=Evaluating Semantic Interoperability of Government Open Data Portals
|pdfUrl=https://ceur-ws.org/Vol-2293/jist2018pd_paper7.pdf
|volume=Vol-2293
|authors=Cathy S. Lin,Hsin-Chang Yang
|dblpUrl=https://dblp.org/rec/conf/jist/LinY18
}}
==Evaluating Semantic Interoperability of Government Open Data Portals==
Cathy S. Lin [0000-0002-7976-3523] and Hsin-Chang Yang [0000-0001-5851-2760]

National University of Kaohsiung, Kaohsiung 811, Taiwan

{cathy,yanghc}@nuk.edu.tw

Abstract. Open data emerged in the last decade as governments worldwide endeavored to publish the data they own for the sake of transparency, effectiveness, and social good. Many applications and use cases of open government data have been reported, revealing its potential. Most users access these data via portals established by governments at different levels. Aspects such as usability, accessibility, and openness should be considered in building such portals to provide users with high-quality data and services. In this study, we applied a set of measures covering the semantic aspects of open data portal design and maintenance to evaluate the quality of government open data portals. Results for several major national-level portals are reported.

Keywords: Government Open Data, Open Data Portal, Open Data Quality, Portal Quality Evaluation, Semantic Interoperability.

===1 Introduction===
Driven by the trend of open data, public-sector authorities release their data actively or proactively and play the role of data providers, producing an enormous amount of government open data (GOD). In this way, governments can improve many aspects of governance, such as transparency, accountability, innovative and intelligent services, and government-citizen interactivity, through the use of open data. It is essential and common to provide a portal, generally implemented as a Web service, for accessing the large quantity of data maintained by their owners. Many governments have established open data portals as the volume of GOD has increased drastically in recent years. As a Web service, an open data portal should meet requirements in various aspects to provide quality services.
The debate on the architecture and functions of open data portals is still ongoing, since standards and good practices on these subjects are lacking. Schemes for evaluating portal quality are thus required by data providers building their portals. The quality assessment should take into account several aspects such as functionality, usability, accessibility, and openness. Several schemes have been devised to tackle this problem (Máchová & Lnénicka, 2017).

Data semantics is important information that can be applied in many domains such as information retrieval, e-commerce, and knowledge acquisition. An open data portal that provides datasets together with rich semantic information will be helpful for service providers and application developers. Thus, it is crucial to assess the semantic interoperability of an open data portal, defined here as the completeness and ease of implementation and access of the data semantics. However, current practices in open data evaluation have paid little attention to this issue. In this work, we make an initial attempt at evaluating the semantic interoperability of open data portals.

===2 Quality Assessment Measures for Open Data Portals===
Open data portals are essential infrastructure for disseminating open government data. Governments publish their data on these portals and allow individuals or organizations to use them freely to create value. Currently, several measures and metrics have been proposed to ensure the effectiveness of such portals (Máchová & Lnénicka, 2017). According to Máchová & Lnénicka’s survey, the United Kingdom ranks first, followed by India and the United States. However, Taiwan was ranked 21st out of 89, which contrasts with the results of the Global Open Data Index<sup>1</sup> and the Open Data Barometer<sup>2</sup> (Brown, 2017). Unfortunately, the semantic interoperability of open data portals has rarely been discussed.
To evaluate the semantic interoperability of open data portals, we observed that two major aspects should be addressed. First, the descriptive semantics of datasets should be provided. This descriptive information could be stored as metadata, which could easily be made to conform to existing standards. Second, the portals should provide a way for users to utilize such data semantics. In this work, we identified 22 indices that could be used to evaluate the semantic interoperability of open data portals. Table 1 lists the proposed indices. These indices were categorized into two classes, namely ‘semantic description’ and ‘semantic operation’, which reflect the two aspects mentioned above. We believe that these indices reflect, to some extent, semantic information in terms of both content and access. Evaluations on such indices may give an insight into the semantic interoperability of open data portals.

Table 1. List of indices for evaluating semantic interoperability of GOD portals.
{| class="wikitable"
! Number !! Category !! Description
|-
| 1 || rowspan="12" | Semantic description || Each dataset includes at least a title and a description
|-
| 2 || A creation date is given for each dataset
|-
| 3 || The most recent date on which the dataset was changed, updated, or modified is given for each dataset
|-
| 4 || The identity and role of the agency and person responsible for each dataset is specified
|-
| 5 || The datasets are categorized
|-
| 6 || The datasets are tagged with keywords
|-
| 7 || Each dataset is accompanied by a reference to the language used
|-
| 8 || If a dataset refers to a specific range of time, its temporal coverage is specified
|-
| 9 || If a dataset refers to a specific geographical area, its spatial coverage is specified
|-
| 10 || The descriptive record contains a direct link to the URL of the data
|-
| 11 || The name and email address of the publisher of a dataset are provided
|-
| 12 || The frequency with which the dataset is updated is provided
|-
| 13 || rowspan="10" | Semantic operation || Each dataset is accompanied by a descriptive record
|-
| 14 || Metadata associated with each dataset is available in a standard format
|-
| 15 || Metadata describing the datasets is structured in a standard way
|-
| 16 || Any vocabularies used within the dataset are identified and documented
|-
| 17 || Topics are linked to published vocabularies and taxonomies
|-
| 18 || Each dataset is given a unique identifier
|-
| 19 || A checksum and/or signature is available to verify the validity of each file
|-
| 20 || Data adheres to the defined syntax of any specified vocabularies
|-
| 21 || It is possible to query data and metadata in accordance with standards of the web of data (Linked Open Data)
|-
| 22 || Data sources linked from a dataset are reported
|}

1 https://index.okfn.org/

2 https://opendatabarometer.org/

===3 Evaluations on Semantic Interoperability of GOD Portals===
As a preliminary validation of the proposed metrics, we used the indices in Table 1 to evaluate some open data portals. In this study, six examiners were assembled to assess the portals using the checklist during June 2015.
These examiners were all graduate students who had taken open-data-related courses for at least two semesters. They were also trained in how to apply the indices. We translated the indices into Chinese and gave two hours of instruction on the meaning of each index. Examiners then assessed every portal independently. For each index, a five-level Likert scale was adopted, where a score of 5 means that the function or requirement is fully implemented or applicable and 1 means the function or requirement is missing entirely or inapplicable. The scores for each index were then aggregated over all examiners to give its final score. We evaluated three GOD portals, namely those of the US<sup>3</sup>, the UK<sup>4</sup>, and Taiwan<sup>5</sup>, in the preliminary experiments. The result is shown in Table 2. We only show the overall result of the two categories due to space limitations. The evaluation shows that the US portal outperformed the other two in both categories. However, all portals received lower ratings in the ‘semantic operation’ category, reflecting the fact that the implementation of functionality and support for standards such as linked data (Berners-Lee, 2006) for access and usage of semantic information were still immature for these portals.

Table 2. The result of the semantic interoperability assessment. TW, US, and UK stand for the official GOD portals of the Taiwan, US, and UK governments, respectively. We only show the average scores of each theme due to space limitations.

{| class="wikitable"
! Category !! TW !! US !! UK
|-
| Semantic description || 3.11 || 3.73 || 3.48
|-
| Semantic operation || 2.07 || 3.47 || 2.09
|}

===4 Conclusions===
In this study, we devised a set of criteria to assess the quality of three major government open data portals with respect to their semantic interoperability. Two aspects of semantic interoperability were defined to measure the descriptive and operational nature of the semantics provided in the portals. A total of 22 indices were suggested.
We believe that this is the first attempt to assess the semantic interoperability of open data portals. Preliminary experiments on three major government open data portals, namely those of the US, the UK, and Taiwan, were conducted. The US portal topped the other two according to the result.

===References===
1. Berners-Lee, T. (2006). Linked Data. Retrieved from http://www.w3.org/DesignIssues/LinkedData.html

2. Brown, C. (2017). Open Source Data: Just How Open Is Taiwan’s Government? Retrieved from https://www.ketagalanmedia.com/2017/07/27/open-source-data-just-open-taiwans-government/

3. Máchová, R., & Lnénicka, M. (2017). Evaluating the Quality of Open Data Portals on the National Level. Journal of Theoretical and Applied Electronic Commerce Research, 12(1), pp. 21-41.

3 https://www.data.gov/

4 https://data.gov.uk/

5 https://data.gov.tw/
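
The scoring procedure described in Section 3 (each examiner rates every index on a five-level Likert scale, the ratings are aggregated per index, and Table 2 reports the average per category) could be sketched roughly as follows. This is only an illustrative sketch: the paper does not state the exact aggregation function, so the arithmetic mean is assumed here; the ratings are invented and are not the study's data; and the names `ratings`, `category_of`, `index_scores`, and `category_scores` are hypothetical. Only four indices, taken from the two ends of Table 1, are used to keep the example short.

```python
from statistics import mean

# Hypothetical ratings: index number -> one Likert score (1-5) per examiner.
# These values are invented for illustration; they are NOT the paper's data.
ratings = {
    1: [5, 4, 5, 4, 5, 4],
    2: [3, 3, 4, 3, 3, 2],
    21: [2, 1, 2, 2, 1, 1],
    22: [3, 2, 2, 3, 2, 3],
}

# Category of each index used in this sketch.
category_of = {
    1: "Semantic description",
    2: "Semantic description",
    21: "Semantic operation",
    22: "Semantic operation",
}


def index_scores(ratings):
    """Aggregate examiner ratings into one score per index (mean is assumed)."""
    for idx, scores in ratings.items():
        if not all(1 <= s <= 5 for s in scores):
            raise ValueError(f"index {idx}: Likert scores must be in 1..5")
    return {idx: mean(scores) for idx, scores in ratings.items()}


def category_scores(per_index, category_of):
    """Average the per-index scores within each category."""
    grouped = {}
    for idx, score in per_index.items():
        grouped.setdefault(category_of[idx], []).append(score)
    return {cat: round(mean(vals), 2) for cat, vals in grouped.items()}


per_index = index_scores(ratings)
per_category = category_scores(per_index, category_of)
print(per_category)
```

Running one such computation per portal would yield one column of a table shaped like Table 2. A different aggregation (for example, the median) could be swapped in by replacing `mean` if that better matches the intended procedure.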