Folksonomies behind the scenes

                      Leyla Jael García-Castro¹, Alexander García²

             ¹ Universität der Bundeswehr München, Werner-Heisenberg-Weg 39,
               85779 Neubiberg, Germany
               w31blega@unibw.de
             ² University of Arkansas for Medical Sciences, Biomedical Informatics
               agarcia@uams.edu / alexgarciac@gmail.com


       Abstract. In this position paper we analyze the similarities amongst
       folksonomies, semantic wikis, and ontology building; we also propose an
       alignment and orchestration of ontologies representing these scenarios. We
       argue that such alignment enables a more direct application of folksonomy-
       based approaches over these set-ups. The rationale behind folksonomies is
       shared across environments such as collaborative ontology building and
       semantic wikis. The three of them aim to facilitate knowledge sharing across
       communities; a social environment in which individual and community
       objectives are achieved supports them all.
       Keywords: Ontology engineering, social web, semantic web, folksonomy, wiki


1    Introduction

Social tagging systems (STS) have become increasingly popular within the Web 2.0
era; they allow users to freely associate terms, i.e. tags, with resources. Tags support a
variety of tasks such as information retrieval, personal organization strategies, and
share-ability. Conceptual structures emerging from STS are known as folksonomies
[1, 2]; they have been used mainly to improve the retrieval of tagged resources [3-5]
as well as to discover shared conceptualizations and make explicit the semantics
behind tags [6, 7]. Simplicity and immediate benefits for end users, e.g. bookmarks
available online, are part of the rationale behind the fast adoption of STS [8].
   Ontologies are shared conceptualizations that aim to represent an abstraction of a
particular domain [9, 10]. Ontologies play a central role in the Semantic Web because
they are intended to enable data and information exchange in a machine-accessible
format, thereby establishing common vocabularies and semantic interpretations of
concepts [11]. Whilst agreements in folksonomies are implicit and mainly reached
through common use and popularity, agreements in ontologies are explicit,
documented, and supported by evidence.
   Wikis enable users to collaboratively create, share, and edit information via a
browser interface, thus the final content is the result of everybody’s effort [12].
Semantic Wikis, introduced in 2004 [13], aim to facilitate the integration of ontology
content into Wikis as well as to support the evolution of knowledge: moving from
term lists to logical constraints while granting users freedom over the creative
process. Most of the existing semantic Wikis rely on RDF and mainly support
subject-predicate-object structures [13]. Semantic Wikis exhibit a structure similar to
that of folksonomies, supporting both semantic annotations and selection within
documents: a semantic annotation takes as subject a document or a portion of it and
relates it to an object by means of a semantic qualifier, e.g. skos:broader. This
structure could also
be extracted from links in traditional Wikis; in this case the annotations would
establish a relatedTo relation.
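   As a minimal sketch of this structure, and not the implementation of any
particular wiki engine, the following Python fragment represents a semantic
annotation as a subject-predicate-object triple and derives generic relatedTo
statements from plain wiki links. The Triple type, the function names, and the
wiki: identifiers are hypothetical and serve only as illustration.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # a document or a portion of it
    predicate: str  # the semantic qualifier, e.g. skos:broader
    obj: str        # the resource the subject is related to


def annotate(subject: str, qualifier: str, obj: str) -> Triple:
    """Build one semantic annotation as a subject-predicate-object triple."""
    return Triple(subject, qualifier, obj)


def triples_from_links(page: str, linked_pages: List[str]) -> List[Triple]:
    """Plain wiki links carry no qualifier, so each one becomes relatedTo."""
    return [Triple(page, "relatedTo", target) for target in linked_pages]


# Hypothetical wiki pages, used only for illustration.
semantic = annotate("wiki:Margherita", "skos:broader", "wiki:Pizza")
plain = triples_from_links("wiki:Margherita", ["wiki:Tomato", "wiki:Mozzarella"])

for triple in [semantic, *plain]:
    print(triple.subject, triple.predicate, triple.obj)
```

Both kinds of statement share one shape, which is why a semantic wiki and a
traditional wiki can be treated uniformly once their links are read as triples.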
   In this position paper we analyze the similarities amongst folksonomies, semantic
wikis, and ontology building; we also propose an alignment and orchestration of
ontologies representing these scenarios. We argue that such alignment enables a more
direct application of folksonomy-based approaches over these set-ups.


2    Folksonomies behind the scenes

Annotation Ontology. The Annotation Ontology (AO) [14] represents the annotation
process within social environments. AO is built upon the Annotea Project
(http://www.w3.org/2001/Annotea/); it is also compatible with Newman’s tagging
ontology (www.holygoat.co.uk/projects/tags/), the Meaning of a Tag ontology
(http://moat-project.org/), and the Simple Knowledge Organization System (SKOS)
(http://www.w3.org/2004/02/skos/). AO supports both free and semantic annotations;
the latter are called qualified annotations in AO. It enables users to freely attach
terms to resources (free annotations) as well as terms related to ontological entities
(semantic annotations). Annotations can be attached to the entire resource as well as to
portions of it, e.g. text, images, or tables. As annotations on specific parts of a
document do not necessarily apply to the whole document, implementations should
enable users to define whether or not such annotations are also global. AO also
supports the curation process over the annotations and offers different types of
annotations such as notes, comments, errata, etc. It offers
provenance support by reusing the Provenance Authoring and Versioning ontology
(PAV, http://swan.mindinformatics.org/spec/1.2/pav.html).
Collaborative Ontology Building. The Changes and Annotations Ontology (ChAO)
[15] provides a model to track the modifications on ontology classes, properties and
instances. It contains two main classes: Change, which represents changes (add, edit,
and delete) in the ontology, and Annotation, which stores related information such as
comments, examples, explanations and votes. ChAO is currently in use by the
collaborative Protégé project
(http://protegewiki.stanford.edu/wiki/Collaborative_Protege). The ontology building
process entails negotiation practices, i.e. ontologies are social agreements to
accomplish shared objectives. When building ontologies, people pursue (i) retrieving
related information, (ii) sharing information, and (iii) improving and broadening both
knowledge and performance. Interestingly, these are also the motivations for tagging
resources [16]. We consider the ontology building process as a structured folksonomy
in which the main document being tagged is the one representing the ontology; also,
the participants are aware of the purpose of their contributions, i.e. annotations.
    Mapping ChAO and AO makes it possible to bring the flexibility of folksonomies
into the ontology building process; Fig. 1 shows the proposed mapping. Users in
ChAO are identified by user names or accounts whilst AO uses foaf:Agent for that
purpose; this brings benefits such as a unique URI to identify a contributor
participating in different ontology developments, regardless of the methodology or
editor. AO can also facilitate reusing information from other folksonomies as it is
compatible with Newman’s ontology, MOAT, and SKOS.


Fig. 1 AO and ChAO mapping

   Fig. 2 shows an annotation from the ChAO and AO perspectives: a user named
Daniel works on the Pizza ontology document; he creates a property #hasTopping,
chao:Property_Created, on a chao:Ontology_Component, and adds an annotation,
chao:Annotation, explaining why the property was created. From the AO perspective
Daniel is represented as a foaf:Agent; he creates an annotation, ao:Annotation,
corresponding to a creation, ao:hasTopic, on a portion of the ontology named
#hasTopping, ann:context. In AO, the property #hasTopping is represented by means
of an XPointer (www.w3.org/TR/xptr/) selector, i.e. an element in an XML document.
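   The AO side of this example can be sketched as a small set of triples. This is
only one possible reading of the scenario described above: the ex: identifiers,
the XPointer expression, the annotation text, and the choice of pointing
ao:hasTopic at chao:Property_Created are assumptions made for illustration,
while the property names themselves are the ones mentioned in this paper.

```python
# Hypothetical identifiers; none of them come from a real dataset.
ANNOTATION = "ex:annotation-1"
DANIEL = "ex:daniel"
PIZZA_DOC = "ex:pizza.owl"
# Illustrative XPointer expression selecting the element for #hasTopping.
SELECTOR = "ex:pizza.owl#xpointer(//*[@rdf:ID='hasTopping'])"

ao_view = [
    (DANIEL, "rdf:type", "foaf:Agent"),
    (ANNOTATION, "rdf:type", "ao:Annotation"),
    (ANNOTATION, "pav:createdBy", DANIEL),
    (ANNOTATION, "ao:onDocument", PIZZA_DOC),
    (ANNOTATION, "ann:context", SELECTOR),
    # Assumption: the topic of the annotation is the ChAO change type.
    (ANNOTATION, "ao:hasTopic", "chao:Property_Created"),
    # Assumption: free-text explanation of why the property was created.
    (ANNOTATION, "ao:body", "hasTopping links a pizza to its toppings"),
]


def describe(graph, subject):
    """Collect every predicate/object pair stated about one subject."""
    return {p: o for s, p, o in graph if s == subject}


summary = describe(ao_view, ANNOTATION)
for predicate, value in summary.items():
    print(predicate, "->", value)
```

Keeping the contributor, the document, and the selected portion as separate
statements is what later allows the same annotation to be queried by person, by
ontology, or by entity.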


Fig. 2 Changes on an ontology component, ChAO and AO perspectives

   During ontology building, contributors perform activities such as adding,
editing, and deleting ontological entities; those activities produce classes, properties
or axioms that can be seen as portions of the ontology, easily identified by a URI.
Annotations such as comments, notes, and votes are attached to particular entities.
Consequently, it is possible to identify a contributor attaching annotations to pieces of
a document, i.e. the ontology; this process is fully supported by AO. Combining
ChAO, AO, PAV and FOAF makes it possible to use SPARQL in order to answer
questions such as who has worked on this class, which ontologies Andy and Tony
have contributed to, or what ontologies have been created for a particular domain.
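   The kind of question listed above can be illustrated with a small Python
sketch. It does not run SPARQL; the match helper is a hypothetical stand-in for
a basic graph pattern in which None plays the role of a variable, and the ex:
data are invented for the example. In practice the same patterns would be
written as SPARQL queries against an RDF store.

```python
def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a variable."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Invented annotations by two contributors on two ontology documents.
graph = [
    ("ex:a1", "pav:createdBy", "ex:andy"),
    ("ex:a1", "ao:onDocument", "ex:pizza.owl"),
    ("ex:a1", "ann:context", "ex:pizza.owl#Topping"),
    ("ex:a2", "pav:createdBy", "ex:tony"),
    ("ex:a2", "ao:onDocument", "ex:wine.owl"),
    ("ex:a2", "ann:context", "ex:wine.owl#Grape"),
    ("ex:a3", "pav:createdBy", "ex:tony"),
    ("ex:a3", "ao:onDocument", "ex:pizza.owl"),
    ("ex:a3", "ann:context", "ex:pizza.owl#Topping"),
]


def who_worked_on(graph, entity):
    """Who has worked on this class? Join ann:context with pav:createdBy."""
    annotations = {s for s, _, _ in match(graph, p="ann:context", o=entity)}
    return sorted({o for s, _, o in match(graph, p="pav:createdBy")
                   if s in annotations})


def ontologies_of(graph, agent):
    """Which ontologies has this agent contributed to?"""
    annotations = {s for s, _, _ in match(graph, p="pav:createdBy", o=agent)}
    return sorted({o for s, _, o in match(graph, p="ao:onDocument")
                   if s in annotations})


print(who_worked_on(graph, "ex:pizza.owl#Topping"))  # ['ex:andy', 'ex:tony']
print(ontologies_of(graph, "ex:tony"))  # ['ex:pizza.owl', 'ex:wine.owl']
```

Both questions reduce to a join on the annotation identifier, which is possible
because contributor, document, and context are all attached to the same
annotation resource.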
Wikis. SweetWiki [17] proposes an ontology to represent the wiki structure; concepts
include document, page, tag, link, backward link, contributor, version, attached file,
etc. These concepts are found in both semantic and non-semantic wikis; most of them
are also covered by well-known vocabularies such as Dublin Core, SKOS, SIOC, and
FOAF. Semantic annotations within semantic wikis follow the structure
subject-predicate-object [13]; by using AO, it is possible to model those annotations
as qualified ones: the contributor corresponds to the annotator (pav:createdBy and
foaf:Agent), the subject to the wiki page or a portion of it (ao:onDocument and
ann:context), the topic to the type of the annotation (skos:broader,
dcterms:isVersionOf, sioc:attachment, etc.), and the object to the annotation (ao:body).
AO currently supports semantic annotations corresponding to skos:exactMatch,
skos:closeMatch, skos:broader, and skos:narrower; however, it is possible to extend
AO in such a way that other qualifiers are also allowed. This extension is based on
Hypertag [18] and consists of a new type of annotation, Relationship, that relates either
two resources, or one resource as subject and the other one as object; it also enables
either reusing existing relations, e.g. dcterms:isVersionOf, or creating new ones.
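   The correspondence between a wiki annotation and an AO qualified annotation
can be sketched as a small conversion function. This is one reading of the
mapping described above rather than a normative encoding: the wiki_to_ao
function, its parameters, and the ex: and wiki: identifiers are hypothetical,
and attaching the wiki predicate through ao:hasTopic is an assumption.

```python
def wiki_to_ao(annotation_id, contributor, page, predicate, obj, portion=None):
    """Rewrite one wiki subject-predicate-object statement as AO-style triples."""
    triples = [
        (contributor, "rdf:type", "foaf:Agent"),
        (annotation_id, "rdf:type", "ao:Annotation"),
        (annotation_id, "pav:createdBy", contributor),   # the annotator
        (annotation_id, "ao:onDocument", page),          # the wiki page
        (annotation_id, "ao:hasTopic", predicate),       # assumed placement
        (annotation_id, "ao:body", obj),                 # the object
    ]
    if portion is not None:
        # Only present when the annotation targets a portion of the page.
        triples.append((annotation_id, "ann:context", portion))
    return triples


# Hypothetical annotation on a whole page and on one section of it.
whole_page = wiki_to_ao("ex:annotation-7", "ex:leyla", "wiki:Margherita",
                        "skos:broader", "wiki:Pizza")
section = wiki_to_ao("ex:annotation-8", "ex:leyla", "wiki:Margherita",
                     "dcterms:isVersionOf", "wiki:Margherita_2009",
                     portion="wiki:Margherita#History")

for triple in whole_page + section:
    print(triple)
```

Treating ann:context as optional mirrors the distinction between annotations on
an entire resource and annotations on a portion of it.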


3    Conclusions and Future Work

We have presented an alignment between folksonomies and collaborative ontology
building that facilitates collaboration across decentralized settings, keeping pace with
the dynamics of evolving domains, and monitoring the quality and consistency of the
model by using the wisdom of crowds. The proposed alignments will likely facilitate
the use of folksonomy-based approaches in wiki environments and vice versa, as well
as the emergence of knowledge from both of them. Semantic annotations and
provenance do not resolve all the semantic shortcomings of folksonomies; however,
they reduce the gap between the social and the semantic web. The proposed alignment
makes it easy to integrate knowledge gathered from social platforms into knowledge
elicitation phases in ontology development methodologies. The alignment fits into
methodologies with a collaborative component that reuse non-structured or
semi-structured existing knowledge, such as Mature Project [19], NeOn Methodology
[20], and Melting Point [21]. Integrated into the Mature Project, our approach makes it
easier to consolidate and axiomatize the ontology built by the community as it includes
semantic links, thereby facilitating the extraction of hierarchies as well as ad hoc
relationships and mappings. Similarly, our approach facilitates reusing and
reengineering non-ontological resources, one of the phases proposed by the NeOn
methodology, as well as the conceptualization activity proposed in the Melting Point.
The aforementioned methodologies reuse knowledge while our alignment facilitates
adding and extracting semantics from social environments. This combination makes
possible the evolution of the extracted ontology as it becomes part of the
ontologies used to qualify annotations; thus, new mappings and relations can emerge.
In this way, users contribute to the ontology building process without being aware
of the process in which they are taking part.
   We have identified three scenarios that could benefit from such an alignment,
improving the way that content from folksonomies is currently exploited. The first
scenario belongs to the bioinformatics domain and is related to the collaborative
annotation of proteins. In such a scenario, documents representing protein sequences
are enhanced by semantic annotations that can be applied to the whole sequence or a
portion of it as well as to other annotations on the protein. In this way, users provide
content in the form of annotations, facilitating the publication of experimental data
related to proteins. It also enables the immediate discovery of such information as
annotations are modeled with AO and linked to protein-specific vocabularies; thus,
it will be available as part of the Linked Open Data cloud. It will also facilitate
ontology evolution by using an AO extension that enables the representation of
relationships.
   The second scenario belongs to the biological domain and is related to annotations in
laboratory notebooks. Tags4Labs [22] is a prototype supporting the annotation of
experimental data for some of the processes routinely run at the Center for
International Tropical Agriculture (CIAT) biotechnology laboratory. With the
proposed alignment it will also be possible to use annotations as an enrichment
mechanism for those ontologies being used to annotate experimental procedures. The
third scenario belongs to the medical domain and is related to the annotation of
medical images. Ceballos et al. [23] (http://72.167.51.20:8888/webprotege/) propose
an environment in which medical images can be annotated with ontological terms or
just by “tagging”. With the proposed alignment it will also be possible to enrich
existing ontologies by capturing the evidence behind a “tag” so that ontology
engineers can decide on the inclusion of the term in the ontology. Also, other users
will be able to access such information, making it easier for them to evaluate the
relevance of the term and its corresponding use.
   Documents should be able to “know about” their own content so that automated
processes “know what to do” with them. With the proposed alignment we aim to make
this possible, supporting both knowledge discovery and knowledge emergence.


References

1. Jäschke, R., Hotho, A., Schmitz, C., Ganter, B., Stumme, G.: Discovering shared
conceptualizations in folksonomies. Web Semantics: Science, Services and Agents on the
World Wide Web 6 (2008) 38-53
2. Helic, D., Strohmaier, M., Trattner, C., Muhr, M., Lerman, K.: Pragmatic evaluation of
folksonomies. International World Wide Web Conference. ACM, Hyderabad, India (2011)
3. Begelman, G., Keller, P., Smadja, F.: Automated Tag Clustering: Improving search and
exploration in the tag space. World Wide Web Conference - Collaborative Web Tagging
Workshop, Scotland (2006)
4. Heymann, P., Garcia-Molina, H.: Collaborative Creation of Communal Hierarchical
Taxonomies in Social Tagging Systems Stanford University (2006)
5. Yeung, C.A., Gibbins, N., Shadbolt, N.: Understanding the Semantics of Ambiguous Tags in
Folksonomies. International Workshop on Emergent Semantics and Ontology Evolution, Korea
(2007)
6. Angeletou, S., Sabou, M., Specia, L., Motta, E.: Bridging the Gap Between Folksonomies
and the Semantic Web: An Experience Report. European Semantic Web Conference - Bridging
the Gap between Semantic Web and Web 2.0 Workshop, Austria (2007)
7. Van Damme, C., Hepp, M., Siorpaes, K.: FolksOntology: An Integrated Approach for
Turning Folksonomies into Ontologies. European Semantic Web Conference - Workshop
"Bridging the Gap between Semantic Web and Web 2.0", Austria (2007)
8. Hotho, A., Jäschke, R., Schmitz, C., Stumme, G.: Information Retrieval in Folksonomies:
Search and Ranking. The Semantic Web: Research and Applications (2006) 411-426
9. Gruber, T.: What is an Ontology? (1992) Retrieved May 23, 2009, from
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
10. Guarino, N., Poli, R., Gruber, T.: Toward Principles for the Design of Ontologies Used for
Knowledge Sharing. International Workshop on Formal Ontology. Kluwer Academic
Publishers (1993)
11. Gendarmi, D., Lanubile, F.: Community-Driven Ontology Evolution Based on
Folksonomies. On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops,
Vol. 4277. Springer (2006)
12. Schaffert, S., Gruber, A., Westenthaler, R.: A Semantic Wiki for Collaborative Knowledge
Formation. Semantics 2005, Vienna, Austria (2005)
13. Kuhn, T.: AceWiki: A Natural and Expressive Semantic Wiki. Semantic Web User
Interaction, Florence, Italy (2008)
14. Ciccarese, P., Ocana, M., Garcia Castro, L.J., Das, S., Clark, T.: An Open Annotation
Ontology for Science on Web 3.0. BMC Bioinformatics (accepted) (2010)
15. Noy, N., Chugh, A., Liu, W., Musen, M.: A Framework for Ontology Evolution in
Collaborative Environments (2006)
16. Braun, S., Schmidt, A., Walter, A., Nagypal, G., Zacharias, V.: Ontology Maturing: a
Collaborative Web 2.0 Approach to Ontology Engineering. International World Wide Web
Conference - Workshop on Social and Collaborative Construction of Structured Knowledge
(CKC), Canada (2007)
17. Buffa, M., Gandon, F.: SweetWiki: semantic web enabled technologies in Wiki.
International Symposium on Wikis. ACM, Odense, Denmark (2006)
18. García-Castro, L.J., Hepp, M., García, A.: Tags4Tags: Using Tagging to Consolidate Tags.
International Conference on Database and Expert Systems Applications, Linz, Austria (2009)
19. Braun, S., Schmidt, A., Walter, A., Nagypal, G., Zacharias, V.: Ontology Maturing: a
Collaborative Web 2.0 Approach to Ontology Engineering. International World Wide Web
Conference - Workshop on Social and Collaborative Construction of Structured Knowledge
(CKC), Canada (2007)
20. Suárez-Figueroa, M., Dellschaft, K., Montiel-Ponsoda, E., Villazón-Terrazas, B., Yufei, Z.,
Aguado-de-Cea, G., García, A., Fernández-López, M., Gómez-Pérez, A., Espinoza, M., Sabou,
M.: NeOn Methodology for Building Contextualized Ontology Networks (NeOn Deliverable
D5.4.1.). NeOn Project (2008)
21. Garcia, A., O’Neill, K., Garcia, L.J., Lord, P., Stevens, R., Corcho, O., Gibson, F.:
Developing Ontologies within Decentralised Settings. In: Chen, H., Wang, Y., Cheung, K.-H.
(eds.): Semantic e-Science, Vol. 11. Springer US (2010) 99-139
22. Garcia Castro, A., Giraldo, O., Garcia Castro, L.J.: Annotating experimental records using
ontologies. International Conference on Biomedical Ontology, Buffalo, NY, USA (2011)
23. Ceballos, O., Garcia Castro, A., Garcia Castro, L.J., Millan, M.: Anotación semántica de
imágenes médicas. Acta Biológica Colombiana 15 (2010)