=Paper=
{{Paper
|id=Vol-273/paper-7
|storemode=property
|title=Fostering knowledge evolution through community-based participation
|pdfUrl=https://ceur-ws.org/Vol-273/paper_48.pdf
|volume=Vol-273
|dblpUrl=https://dblp.org/rec/conf/www/GendarmiAL07
}}
==Fostering knowledge evolution through community-based participation==
Domenico Gendarmi, University of Bari, Dipartimento di Informatica, Via E. Orabona, 4 - 70125 Bari, +390805442286, gendarmi@di.uniba.it

Fabio Abbattista, University of Bari, Dipartimento di Informatica, Via E. Orabona, 4 - 70125 Bari, +390805443298, fabio@di.uniba.it

Filippo Lanubile, University of Bari, Dipartimento di Informatica, Via E. Orabona, 4 - 70125 Bari, +390805443261, lanubile@di.uniba.it
ABSTRACT

The ontology development process is typically led by a single expert or a small group of experts, with users mostly playing a passive role. Such an elitist approach to building ontologies hinders the primary purpose of large-scale knowledge sharing. Collaborative tagging systems have emerged as a new web annotation method with appealing features that encourage users to collaboratively organize information through their own metadata. Collaborative tagging shifts the creation of metadata for indexing web resources from an individual professional activity to a collective endeavor, where every user is a potential contributor.

In this paper we introduce an approach to knowledge evolution which aims to exploit the ability of collaborative tagging to foster community members’ participation in order to move forward an initial knowledge structure. We present user scenarios about how subscribers of a scientific digital library might play the role of knowledge organizers through personal organization and sharing of citations of interest.

Categories and Subject Descriptors

H.3.5 Online Information Services, H.3.7 Digital Libraries, H.5.3 Group and Organization Interfaces.

General Terms

Design, Human Factors.

Keywords

Community, knowledge evolution, collaborative tagging.

Copyright is held by the author/owner(s).
WWW 2007, May 8-12, 2007, Banff, Canada.

1. INTRODUCTION

Knowledge is strongly tied up with cognitive and social aspects, as the management of knowledge occurs within an intricately structured social context. Human and social factors involved in the development and exchange of knowledge have a heavy impact on the design of systems supporting knowledge management [16]. Collaborative knowledge construction takes place when multiple participants contribute to the growth of interpretations on a shared information base, simultaneously extended by information seeking and transformations [15].

In order to help community members construct knowledge from their own personal perspectives while also negotiating shared understanding, two needs have to be addressed. First, people need ways to find and work with information that matches their personal needs, interests, and capabilities. Second, people need to bring together their individual knowledge to build a shared understanding and collaborative outcomes [14]. This can be accomplished by the Semantic Web, whose main goal is to enable computers and people to work in cooperation [1].

Ontologies play a relevant role within the Semantic Web vision, because they make it possible to cope with heterogeneous representations of web resources, providing a common understanding of a domain to be shared among human beings and software agents [6]. The domain model implicit in an ontology can be taken as a unified structure for giving information a common representation and semantics [2]. However, the ontology development process is typically led by a single expert or a small group of experts, with users mostly playing a passive role. Such an elitist approach to building ontologies hinders the primary purpose of large-scale knowledge sharing.

The achievement of widespread participation in the ontology development process is often hampered by entry barriers, like the lack of easy-to-use and intuitive tools for ontology contribution. Barriers to active participation, combined with traditional top-down approaches to building ontologies, force users to conform to an undesirable knowledge representation. Such an imposition weakens common ground and increases the likelihood that the ontology will not be widely used.

Ontologies need to change as fast as the parts of the world they describe [7]. However, changes have to be captured and applied by skilled knowledge engineers, preferably the original creators of the ontology. This is a bottleneck which causes unacceptable delays in the ontology maintenance process.

A reasonable assumption on how to reduce maintenance costs is to spread the burden across users. In fact, given the Web's fractal nature, costs might decrease as ontology users increase in number [13]. Community participation in ontology development has already been identified as a way to build more complete and up-to-date structured knowledge [19]. Besides being groups of users with common interests, communities can then be considered as the top layer of the Semantic Web architecture [12].

This paper describes our vision for enabling a community of autonomous users to cooperate in a dynamic and open environment, collectively evolving an initial knowledge structure. Participants can organize some piece of knowledge according to a self-established vocabulary, building up personal taxonomies for searching and browsing through their own information spaces. By sharing portions of their knowledge, users can also create
connections and negotiate meaning with people having similar interests.

The main goals of the proposed approach are: (1) to allow users to organize personal information spaces, starting from a prearranged knowledge structure; and (2) to take advantage of users’ contributions to better reflect the community evolution of a shared knowledge structure.

The rest of the paper is organized as follows. Section 2 provides background information about collaborative tagging systems. In Section 3 we describe our approach to community-based evolution through a specific context, a scientific digital library, and a number of user scenarios. Section 4 summarizes related work that can be seen as complementary to our approach. Finally, Section 5 draws conclusions and points out some challenges we are going to address in the near future.

2. COLLABORATIVE TAGGING SYSTEMS

One of the major obstacles hindering the widespread adoption of controlled vocabularies is the constant growth of available content, which outpaces the ability of any single authority to create and index metadata. In such contexts collaborative tagging represents a potential solution to the vocabulary problem [4].

Collaborative tagging has emerged as a new social-driven annotation method, as it shifts the creation of metadata for describing web resources from an individual professional activity to a collective endeavor, where every user is a potential contributor.

Figure 1 shows a conceptual model of collaborative tagging, according to UML notation [3], with tags seen as association classes between users and resources. Users can label any resource with whatever tag they think appropriate and, vice versa, resources can be annotated with any tag by any user. Users are able to share both resources and tags within a community, leading to a network of users, resources and tags with a flat structure and no limits in evolution.

Figure 1. Conceptual model of collaborative tagging

Collaborative tagging systems exhibit other interesting benefits, such as their ability to adhere to the user's personal way of thinking. The absence of forced restrictions on the allowed terms, as well as the lack of syntax to learn, can significantly shorten the learning curve. Collaborative tagging systems also create a strong sense of community amongst their users, allowing them to realize how others have categorized the same resource or how the same tag has been used to label different resources. This immediate feedback leads to an attractive form of asynchronous communication through metadata [10]. There is no need to establish a common agreement on the meaning of a tag because it gradually emerges with the use of the system. Marginal opinions can coexist with popular ones without disrupting the implicit emerging consensus on the meaning of the terms.

The main drawbacks with tags concern semantic and cognitive issues, such as polysemy, synonymy and basic level variation [5]. Polysemy occurs when the same term is used for tags employed with different meanings. The polysemy problem affects query results by returning potentially related but often inappropriate resources. Polysemy is occasionally equated with homonymy; however, polysemous words have different but related senses, while homonyms have multiple, unrelated meanings.

Synonymy takes place when different terms are used for tags having the same meaning. Synonymous tags are another source of ambiguity, severely hindering the discovery of all the relevant resources which are available in a tagging system. Polysemy and synonymy represent two critical aspects of a search, as they respectively affect precision and recall, which are typically used for evaluating information retrieval systems.

A further relevant problem, concerning the cognitive aspect of categorization, is the basic level variation of tags. Terms used to describe a resource can vary along a continuum of specificity ranging from very general to particularly specific. Different users can use terms at different levels of abstraction to describe the same resource, leading to low recall in retrieving resources.

Collaborative tagging is also referred to as "folksonomy". Coined by Thomas Vander Wal, who combined the words "folk" and "taxonomy", this term refers to a taxonomy created by common people [17]. However, taxonomies are hierarchical structures of classifications with parent-child relationships among concepts.

While it is well-known that search and retrieval are facilitated by structured subject headings, the tags which form a folksonomy are just flat terms. Besides the previous drawbacks, the lack of a structure is one of the main aspects which severely weaken information retrieval in a collaborative tagging system.

3. OUR APPROACH TO COMMUNITY KNOWLEDGE EVOLUTION

In this section we lay out our approach for applying collaborative tagging techniques to support the evolution of a knowledge structure adopted for the classification of a large amount of digital resources.

We first briefly introduce a scientific digital library that we have selected as an application context. Then we present the knowledge evolution process from a user perspective.

3.1 Approach Context

As an illustrative context for our approach, we consider the digital library of the Association for Computing Machinery (ACM).

The ACM Guide to Computing Literature is an index to computing literature from over 3000 publishers, containing over 750,000 citations of books, journal articles, conference proceedings, doctoral and master’s theses, and technical reports. Citations can be browsed by publication type, author name, as well as authors’ keywords and classification terms from the ACM taxonomy, named The Computing Classification System.

The ACM Guide to Computing Literature is part of the services offered by the ACM Portal. Portal subscribers can create any number of binders, which are personal collections of citations with links to the publication source through the Digital Object
Identifier (DOI) bookmark, and the full text if the citation is published by ACM itself. When creating their binders, users choose whether to keep them private or share them with other selected users or, more generally, the public.

3.2 User Perspective

According to our approach, the interaction process of a user with a digital library can be characterized as a three-step iteration (Figure 2).

1. Selection. It involves discovering and choosing a specific citation in the whole repository. This step is already available in a common digital library.
2. Organization. It involves creating and structuring a personal information space according to individual interests. This step goes beyond current opportunities because it allows users not only to store collections of citations of interest but also to group them using the desired metadata and structure.
3. Sharing. It involves making public some selected collections and corresponding metadata in order to support a community knowledge evolution.

To explain how our approach can affect the user experience, we next present a scenario for each step.

Figure 2. Three-step iteration

3.2.1 Selection

John is an ACM member with a web account on the Portal. As an assignment, he has to write a state-of-the-art review about collaborative tagging systems. He is not looking for well-known papers; rather, his goal is to explore the recent bibliography on this specific topic to discover new scientific articles he could find interesting to read.

In order to find citations within the ACM Portal, John has two options: he can perform a search (basic or advanced), or he can browse the repository in several different ways. For example, he can browse through the Guide using index terms of the ACM taxonomy or he can browse through the Digital Library according to the kinds of publications. However, due to the limitations of the current taxonomy in organizing citations, especially for articles about recent topics such as collaborative tagging, John prefers to use the search feature.

John performs a simple query, within the Guide, using the phrase collaborative tagging as keywords. A list of results showing a set of basic information (e.g. title, authors, publishers, year of publication) for each matching citation is presented to John, ordered by relevance. John can then select a specific citation to let the system display additional information related to that article (e.g. abstract, references, index terms, collaborative colleagues). After exploring some results in more detail, John finds the article named “Usage patterns of collaborative tagging systems” to be of interest. John wants to save it into his own personal information space using the “Save this Article to a Binder” feature (Figure 3).

Figure 3. Detailed page of the selected citation

3.2.2 Organization

John now has to choose the name of the binder in which to save the selected citation. This name represents the label of a specific category playing the role of a virtual folder in which to store a collection of citations. In choosing the name John is supported by a suggestion feature providing a set of potential binder names. In this case some suggested binder names can be collaborative tagging systems, delicious studies and social bookmarking analyses. John chooses to store the citation in a binder named tagging patterns.

Saving an article into a virtual personal space is a sign of real interest in the citation, hence we can assume that John is willing to provide the metadata he considers most appropriate for annotating the selected citation. However, to avoid burdening John’s experience, authoring metadata has to remain as simple as in collaborative tagging systems.

The task assigned to John is just to browse a space of suggested metadata, pointing out his favorites and possibly proposing new ones. Through the DOI, the system is able to uniquely identify the selected citation, and a large set of metadata related to that article can be retrieved from different systems freely available on the web. For example, for the selected citation the system could retrieve keywords from ACM, as well as tags from services like CiteULike, BibSonomy and Connotea (Figure 4).

Figure 4. Retrieved metadata of the selected citation

Using a filtering process that discards useless keywords or tags, such as those occurring in isolation, and groups very similar ones, this space of metadata can be normalized in order to help John in the browsing task (Figure 5).
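The filtering step above is described only in outline. The following Python sketch shows one way it might work, under stated assumptions: the function names, the string-similarity measure (`difflib.SequenceMatcher`), and both thresholds are illustrative choices, not part of the approach described in this paper.

```python
from collections import Counter
from difflib import SequenceMatcher


def normalize(tag: str) -> str:
    """Lowercase a tag and unify separators so trivial variants collide."""
    return " ".join(tag.lower().replace("-", " ").replace("_", " ").split())


def build_metadata_space(tags, min_count=2, similarity=0.8):
    """Group very similar tags, then discard those that remain isolated.

    `min_count` and `similarity` are assumed thresholds chosen for illustration.
    Returns a dict mapping a representative tag to its total count.
    """
    counts = Counter(normalize(t) for t in tags)
    groups = {}  # representative tag -> accumulated count
    # Visit frequent tags first so that they become the group representatives.
    for tag, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for rep in groups:
            if SequenceMatcher(None, tag, rep).ratio() >= similarity:
                groups[rep] += n  # merge into an existing, similar tag
                break
        else:
            groups[tag] = n  # no similar tag seen so far: start a new group
    # Drop tags that still occur in isolation after grouping.
    return {tag: n for tag, n in groups.items() if n >= min_count}
```

Called on tags gathered from several services, the function returns a smaller space in which spelling variants are merged under their most frequent form. A production system would more likely group tags by stemming or by lexical resources than by raw string similarity.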
Figure 5. Space of metadata

While browsing, John can select a metadata item and, simply by picking it, state his agreement or disagreement (e.g. Y/N). In this case, browsing the space in Figure 5, John selects classification and expresses an agreement with such a term.

Using a lexical resource, such as WordNet, a search for the possible senses associated with the selected term can be performed. Four senses are retrieved from WordNet for the noun classification and John disambiguates these senses by selecting the first one (Figure 6). Furthermore, WordNet can provide synonyms, hypernyms and hyponyms related to the selected sense (Figure 7). The system can thus map the term chosen by John to a corresponding concept including relationships with other related concepts.

Figure 6. Senses for the term classification

John now has to decide the best position, within the ACM taxonomy, where to put the concept corresponding to the selected term classification. In such a task John can be supported by the system through some recommendations suggesting possible relevant parts of the taxonomy where the concept could already exist or where the concept could be inserted.

Figure 7. Synonyms, hypernyms and hyponyms for the selected sense of the term classification

For example, a possible suggestion can be to attach the new concept as a child of information storage (Figure 8). If John approves this suggestion, a relationship between information storage and classification will be added and the new taxonomy will be stored in John’s personal information space. From now on, the digital library will keep track of new concepts in John’s personal taxonomy, and additions of new concepts will be checked to avoid inconsistencies. The selected citation will be automatically classified in John’s personal space, according to the new concept just added (Figure 9).

While browsing the space of metadata, John can select and agree with another term, such as collaborative tagging, which might not have any associated sense in WordNet. In this case John does not have to disambiguate any sense but he has to provide a brief description of the concept. In either case, John has to find the right place in the taxonomy where to insert the concept corresponding to the selected term.

John can also disagree with a term in the space of metadata; in such a situation he can optionally propose new terms. Proposing a new term leads to the same scenario as choosing an existing one in the space of metadata.

Figure 8. Suggested taxonomy branch where to attach the concept associated with the term classification
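The attachment step in this scenario adds classification under information storage, checks the addition for consistency, and then classifies the citation. It can be illustrated with a small data structure. This is only a sketch: the class name `Taxonomy`, its methods, the `"root"` label, and the specific checks (unknown parent, duplicate concept) are assumptions made for illustration, since the paper does not specify which inconsistencies the system checks.

```python
class Taxonomy:
    """A personal taxonomy in which each concept has at most one parent."""

    def __init__(self, root):
        self.parent = {root: None}  # concept -> parent concept
        self.citations = {}         # concept -> set of citation identifiers

    def attach(self, concept, parent):
        """Add `concept` as a child of `parent`, rejecting inconsistent additions."""
        if parent not in self.parent:
            raise ValueError(f"unknown parent concept: {parent!r}")
        if concept in self.parent:
            raise ValueError(f"concept already present: {concept!r}")
        self.parent[concept] = parent

    def path(self, concept):
        """Return the chain of concepts from the root down to `concept`."""
        chain = []
        while concept is not None:
            chain.append(concept)
            concept = self.parent[concept]
        return chain[::-1]

    def classify(self, citation_id, concept):
        """File a citation under an existing concept."""
        if concept not in self.parent:
            raise ValueError(f"unknown concept: {concept!r}")
        self.citations.setdefault(concept, set()).add(citation_id)


# Hypothetical usage mirroring the scenario; "root" and "citation-1" are placeholders.
taxonomy = Taxonomy("root")
taxonomy.attach("information storage", "root")
taxonomy.attach("classification", "information storage")
taxonomy.classify("citation-1", "classification")
```

Because each concept stores a single parent and is attached only to an already known parent, the structure stays a tree, which is one simple way to rule out cycles and duplicate concepts.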
Figure 9. Personal taxonomy

3.2.3 Sharing

John’s information space will be structured in a set of binders where he will store citations classified according to his favorite metadata. Moreover, storing and annotating citations will give rise to an evolving personal taxonomy which John can exploit to browse through his personal space. Using the digital library, a user profile will be created in order to keep track of topics of interest. For each binder created by John, one or more corresponding topics of interest will be included in his profile (Figure 10).

Figure 10. Creation of a user profile

John now chooses to share the binder just created, named tagging patterns. Within John’s profile the system looks for one or more topics of interest associated with that binder. Having established the topic of the shared binder, the system looks for other profiles with the same topic, in order to find users who share similar interests with John.

For example, two other users, Michael and Lucia, have in their profiles analogous topics about collaborative tagging dynamics. Michael has in his personal space a shared binder named tagging studies, with the same citation stored by John and two other citations, respectively named “Tagging, communities, vocabulary, evolution” and “HT06, tagging paper, taxonomy, Flickr, academic article, to read”. Figure 11 shows a portion of Michael’s personal taxonomy which describes how Michael has classified citations within his shared binder.

Figure 11. A portion of Michael's personal taxonomy

Lucia has shared a binder named tagging systems analyses where she stored all the citations in Michael’s binder and the citation named “What goes around comes around: an analysis of del.icio.us as social space”. Figure 12 shows the portion of Lucia’s personal taxonomy relative to all the citations in her shared binder.

Figure 12. A portion of Lucia's personal taxonomy

Once John has shared the binder, he gains access to a shared information space concerning a particular topic related to the binder. In this shared space, John can view all users interested in the same topic, all citations relevant to the topic stored by these users, as well as one or more shared taxonomies. Every taxonomy in this shared space is meant to represent a particular perspective on that topic, depicting a common way to classify related citations employed by a group of people with similar interests. One or more shared portions of these taxonomies are recommended to John. He is now allowed to rank suggestions in accordance with his own perspective. As a result, the shared information space will be displayed to John (Figure 13).

Now John can perform any of the following actions:

• browse through users’ personal information spaces, viewing user profiles, taxonomies, shared binders, unless they have been kept as private;
• discover new citations about the topic collaborative tagging and add them to either the shared binder or a new one;
• observe how shared taxonomies have been ranked by other users and express his own grade.

After John has shared his binder, users who have previously contributed to the shared space will be notified about changes. Afterwards, users can check the information space in order to discover new users with similar interests, new citations about the topic, and changes to the shared taxonomies.

John hence contributes to a community perspective for the topic of interest by sharing his personal metadata as well as expressing his preference on the shared taxonomies. On the other hand, he gets feedback for his personal organization while actively taking part in the community.

Figure 13. The resulting shared information space
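Two mechanisms in the sharing scenario lend themselves to a short illustration: finding other users whose profiles contain the topic of a shared binder, and ordering shared taxonomies by the grades users express. The Python sketch below makes simplifying assumptions that the paper does not state. Topics are matched by exact equality (the scenario speaks of "analogous" topics), grades are numeric and aggregated by their mean, and all function names and data shapes are hypothetical.

```python
from statistics import mean


def users_sharing_topic(profiles, user, binder_topics):
    """Return the other users whose profile has at least one of the binder's topics.

    `profiles` maps a user name to a set of topic labels (assumed data shape).
    """
    return sorted(
        other
        for other, topics in profiles.items()
        if other != user and topics & binder_topics
    )


def rank_taxonomies(grades):
    """Order shared taxonomies from highest to lowest mean grade.

    `grades` maps a taxonomy identifier to the list of grades it received.
    Ties are broken alphabetically so that the ordering is deterministic.
    """
    return sorted(grades, key=lambda name: (-mean(grades[name]), name))
```

For instance, with profiles for John, Michael and Lucia that all contain the topic collaborative tagging, `users_sharing_topic(profiles, "John", {"collaborative tagging"})` would return Michael and Lucia. A real system would need a looser notion of topic similarity than set intersection.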
4. RELATED WORK

While our approach aims to apply collaborative tagging concepts to the problem of knowledge evolution, much research work assumes the opposite perspective: discovering semantic relations among tags to enhance how current collaborative tagging systems work.

Mika [11] extends the traditional bipartite model of ontologies with the social dimension, leading to a tripartite model of ontologies with three different classes of nodes, namely persons, concepts, and instances, and hyperedges representing the commitment of a person in terms of classifying an instance as belonging to a certain concept. This model is exploited by generating two kinds of association networks: the network of concepts and instances and the network of people and concepts. From the association network of concepts and instances, a classification hierarchy is extracted. From the network of people and concepts, the author generates a hierarchy based on sub-community relationships.

Hotho et al. [9] propose an adaptation of a data mining approach to detect emergent semantics within a collaborative tagging system. The adaptation lies in reducing the three-dimensional folksonomy to a two-dimensional formal context in order to apply association rule mining techniques. Discovered association rules can then be exploited in a recommender system which supports the user in choosing useful tags. The obtained rules can also be seen as subsumption relations, in order to learn a taxonomic structure.

In [8] the authors present an algorithm that tries to address the basic level variation issue by converting a large corpus of tags into a navigable hierarchical taxonomy. Tags are grouped using vectors according to the number of times each tag has been used for every annotated resource. Then, the algorithm defines a function to calculate similarity between vectors and a threshold to prune irrelevant values. Finally, for a given dataset a tag similarity graph is created exploiting the social network notion of graph centrality. Starting from the similarity graph and according to three fundamental hypotheses, namely hierarchy representation, noise and general-general assumptions, a latent hierarchical taxonomy is extracted.

Wu et al. [18] exploit a probabilistic generative model to represent the user's annotation behavior in a social bookmarking system and to automatically derive the emergent semantics of the tags. Starting from the assumption that tags heavily used by users with similar interests are semantically related, the authors apply statistical techniques to discover semantic relationships from the different frequencies of co-occurrences among users, resources and tags. The resulting emergent semantics of user interests, tags and web resources is then exploited to develop an intelligent semantic search system with the purpose of searching and discovering semantically-related web resources.

5. CONCLUSION

This paper provides a community-driven approach to knowledge evolution. Although we have depicted scenarios for a research community, the proposal applies to other online communities.

As in collaborative tagging systems, the main idea is to shift the creation of metadata from a restricted to a collective activity, while still maintaining the expressiveness an ontology can provide for classification.

Knowledge engineers struggle to capture all the variety taking place within a lively community. We hypothesize that augmenting users’ participation in the process of annotating and classifying shared items reflects the community knowledge more effectively than relying on prescribed knowledge structures, maintained by a central authority. A collaborative approach to knowledge evolution can split costs over a wide group of people who have special interests in specific knowledge domains.
The scenarios presented in this paper point out how challenging it is to directly involve users in the knowledge evolution process. We need to provide tool support to allow community members to easily organize their personal information spaces, and contribute with minimal overhead. We intend to develop a software agent which is able to monitor users’ interactions with the system and learn about users’ interests. The agent will gain access to metadata in users’ personal information spaces to discover topics of interest. In order to enable software agents to better handle metadata, users’ tags will be rendered as RDF statements rather than simple keywords expressed in natural language.

The approach presented here is a first step toward a collaborative knowledge evolution system with the aim of providing an enhanced infrastructure supporting the ever-evolving community knowledge through the active participation of its members.

6. REFERENCES

[1] Berners-Lee, T., Hendler, J., Lassila, O. The Semantic Web. Scientific American (2001).

[2] Davies, J., Fensel, D., and van Harmelen, F. Towards the Semantic Web: Ontology-driven Knowledge Management. John Wiley & Sons, 2003.

[3] Fowler, M. UML Distilled: A Brief Guide to the Standard Object Modeling Language. Addison-Wesley Professional, 2004.

[4] Furnas, G., Landauer, T., Gomez, L., and Dumais, S. The vocabulary problem in human-system communication. Communications of the ACM, 30, 11 (1987), 964-971.

[5] Golder, S. and Huberman, B. Usage patterns of collaborative tagging systems. Journal of Information Science, 32, 2 (2006), 198-208.

[6] Gruber, T. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal of Human-Computer Studies 43 (1993), 907-928.

[7] Haase, P., Völker, J., and Sure, Y. Management of dynamic knowledge. Journal of Knowledge Management, 9, 5 (2005), 97-107.

[8] Heymann, P., Garcia-Molina, H. Collaborative Creation of Communal Hierarchical Taxonomies in Social Tagging Systems. Technical Report InfoLab 2006-10, Department of Computer Science, Stanford University, Stanford, CA, USA (2006).

[9] Hotho, A., Jäschke, R., Schmitz, C., Stumme, G. Emergent Semantics in BibSonomy. Proc. Workshop on Applications of Semantic Technologies, Informatik 2006, Dresden, 2006.

[10] Mathes, A. Folksonomies-Cooperative Classification and Communication Through Shared Metadata. Technical Report, LIS590CMC, Computer Mediated Communication, Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign, 2004.

[11] Mika, P. Ontologies are us: A unified model of social networks and semantics. Proceedings of the 4th International Semantic Web Conference (ISWC 2005), LNCS 3729, Springer-Verlag, 2005.

[12] Mika, P. Social Networks and the Semantic Web: The Next Challenge. IEEE Intelligent Systems 20 (2005).

[13] Shadbolt, N., Berners-Lee, T., and Hall, W. The Semantic Web Revisited. IEEE Intelligent Systems, 21, 3 (2006), 96-101.

[14] Stahl, G. 2000. Collaborative information environments to support knowledge construction by communities. AI Soc. 14, 1 (Apr. 2000), 71-97.

[15] Suthers, D. D. 2005. Collaborative Knowledge Construction through Shared Representations. In Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05).

[16] Thomas, J. C., Kellogg, W. A., Erickson, T. The knowledge management puzzle: Human and social factors in knowledge management. IBM Systems Journal 40, 4 (2001), 863-884.

[17] Vander Wal, T. Folksonomy Definition and Wikipedia. 2005.

[18] Wu, X., Zhang, L., Yu, Y. Exploring social annotations for the semantic web. Proc. of the 15th International Conference on World Wide Web (2006), 417-426.

[19] Zhdanova, A. V., Krummenacher, R., Henke, J., and Fensel, D. 2005. Community-Driven Ontology Management: DERI Case Study. In Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2005), IEEE Computer Society Press.