=Paper=
{{Paper
|id=Vol-201/paper-27
|storemode=property
|title=FaceTag: Integrating Bottom-up and Top-down Classification
|pdfUrl=https://ceur-ws.org/Vol-201/05.pdf
|volume=Vol-201
|dblpUrl=https://dblp.org/rec/conf/swap/RosatiRQ06
}}
==FaceTag: Integrating Bottom-up and Top-down Classification==
FaceTag: Integrating Bottom-up and Top-down Classification in a Social Tagging System

Quintarelli, E. - Resmini, A. - Rosati, L.

Abstract – Facetag is a working prototype of a semantic collaborative tagging tool conceived for bookmarking information architecture resources. It aims to show how the widespread homogeneous and flat keyword space of tags can be effectively mixed with a richer faceted classification scheme to improve the “information scent” and “berrypicking” capabilities of the system. The additional semantic structure is aggregated both implicitly, by observing user behaviour, and explicitly, by introducing a compelling user experience that lets end-users create relationships between tags directly. Facetag's current implementation is written in PHP / SQL and includes an open API which allows querying and integration from other applications.

Index Terms – Social classification, folksonomy, tagging, faceted classification, information architecture.

I. INTRODUCTION ∗

∗ This paper is the result of a collaborative effort. Nonetheless, Emanuele Quintarelli specifically wrote paragraphs I-II, Andrea Resmini wrote paragraphs V-VI and Luca Rosati paragraphs III-IV.

Collaborative tagging systems have been widely adopted by end-users as useful and powerful tools to organize, browse and publicly share personal collections of resources on the World Wide Web through the introduction of simple metadata.

The aggregation of user metadata is often referred to as a folksonomy: a user-generated classification that emerges through bottom-up consensus as users assign free-form keywords to online resources for personal or social benefit. Del.icio.us, Flickr, 43things, Furl and Technorati are web-based collaborative systems for building shared databases of items, enriched by a flat metadata vocabulary that can be used to perform metadata-driven queries, to monitor change in areas of interest or to discover emerging trends, such as the hottest / most popular topics in the system [Quintarelli 2005].

In the past, folksonomies have often been seen as orthogonal to taxonomies and controlled vocabularies: the latter rigid, hierarchical and organically hand-crafted a priori by professionals; the former flat, inclusive and emerging from bottom-up user consensus [Quintarelli 2005]. In a flat tagging system each document can be retrieved through a simple set of keywords, collaboratively introduced by users to describe and categorize the document, very much as in a keyword-based search process in which descriptive terms are used to get a set of applicable items.

Despite their low cognitive cost, their ability to match users’ real needs and language, and their great value in serendipitous search tasks, folksonomies suffer from a lack of precision, a very low findability quotient (especially in a known-item approach) and limited scalability owing to the intrinsic variability of language [Quintarelli 2005].

As a result of the inherently inconsistent, evolving and highly variable process of associating words and meanings, tagging systems are also plagued by a number of issues, including polysemy, homonymy, plurals, synonymy, ego-oriented tags and basic level variation, which do not appear easy to solve [Golder & Huberman 2005]. Any of these problems can dramatically reduce the effectiveness of the application, undermining the benefits brought by the use of tagging systems.

In addition, tags have recently started to be used by bloggers as reading aids to help users identify articles and posts of interest, providing a complementary structure over a purely chronological list of text pieces. This approach marks a major shift, in that tagging also becomes a tool to maximize findability and browsability without limiting the reader to the most popular or recent tags, as in common tag clouds [Feinstein & Smadja 2006].

Tag clouds are widely used visual interfaces for information retrieval that provide a global contextual view of the tags assigned to resources in the system. In such a structure, the most popular tags are usually displayed in an alphabetically ordered list, with the font size increasing with the tag's relevance. Users browse the cloud, scanning hyperlinks to recognize information of interest [Hassan-Montero & Herrero-Solana 2006].

Flat tag clouds are, however, not sufficient to provide a semantic, rich and multidimensional browsing experience over large tagging spaces:
• Choosing tags by frequency of use inevitably causes a high semantic density, with very few well-known and stable topics dominating the scene (as seen on RawSugar);
• Providing only an alphabetical criterion to sort tags heavily limits the ability to quickly navigate, scan and extract, and hence build a coherent mental model out of tags;
• A flat tag cloud cannot visually support semantic relationships between tags. We suggest that these
relationships are needed to improve the user experience and general usefulness of the system;
• Current tag clouds often fail to provide complex logical operations over tags. Simply clicking on a tag is not enough to enable smooth and powerful exploration or refinement.

Even if Facetag doesn’t promise to address all of these issues, we believe our approach can limit the impact of polysemy, homonymy and basic level variation while introducing an innovative, multidimensional and more semantic paradigm for organizing, navigating and searching large information spaces through tags.

To reach this goal, FaceTag mixes three contributions to social tagging systems:
• The use of (optional) tag hierarchies. Users can organize their resources by means of parent-child relationships;
• Tag hierarchies are semantically assigned to editorially established facets that can later be leveraged to flexibly navigate the resource domain;
• Tagging and searching can be mixed to maximize findability, browsability and user discovery.

II. OVERVIEW OF FACETAG

Until today, one of the main limitations of hierarchical faceted categories has been the lack of a good automated process for both creating the categories and associating items with the hierarchy of labels under each facet [Hearst 2006a].

We decided to avoid the issue entirely and use no algorithmic round-ups: Facetag is built around the notion that the users provide the structure. In particular, it aims to investigate how a hierarchical and faceted metadata structure can be added to user-generated content by making use of the tags provided by end users in collaborative systems, while limiting the effort and toil required through careful user interface design.

III. FACETED ANALYSIS: THE FACETED SCHEME CONSTRUCTION

Although facet and faceted have become very common terms in the information architecture field, their application often falls far from their original meaning. The attribute faceted, indeed, is used with a large variety of meanings, and often refers loosely to the availability of means to search by different keys [La Barre 2004]. The full theory of faceted classification, as developed by Ranganathan and the Classification Research Group (CRG), which includes rules for citation order and notation, is less widespread as a backend for website organization; remarkable exceptions are offered by projects staffed by librarians, such as FATKS [Slavic 2002].

So, we decided to apply faceted classification to the IA field itself, respecting the original library theory in full, in order to leverage its potential and obtain maximum benefits. In this perspective, our design was inspired by these projects: the Flamenco project; Facetious; Etsy 1.

1 Both Facetious and Etsy mix proper facets and metadata (formal properties of an item).

The choice of facets is based on the CRG theory [Vickery 1960]. Indeed, an aspect often underestimated on the World Wide Web is that both Ranganathan and the CRG described a generic schema for faceted classification, which every actual schema can refer to. Thus, in a faceted classification project one does not have to rebuild the schema from scratch every time, but may follow a constant guideline while building one's main categories (i.e. facets). The CRG postulates 11-13 general categories. The table below shows the matching between the CRG standard categories and the IA-related categories that were used to define our facets.

{| class="wikitable"
|+ TABLE 1: FACETAG FACETS DEFINITION BY CRG STANDARD CATEGORIES.
! CRG !! FaceTag
|-
| Thing || [Documents, resources]
|-
| Type || Resource Types (e.g. online report, case study...)
|-
| Part || --
|-
| Property || Language
|-
| Material || [Format]
|-
| Process || --
|-
| Operation || Activities/Subjects (e.g. competitive analysis, faceted classification ...)
|-
| Product || [Deliverables]
|-
| Byproduct || --
|-
| Patient || Usage (e.g. Industry, Health ...)
|-
| Agent || People
|-
| Space || [Country]
|-
| Time || Date
|}

A preliminary analysis of a corpus of IA resources from the Information Architecture Institute Library allowed us to define six facets which appeared to be suitable for the classification of IA resources.
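As an illustration only, Table 1 can be written down as a small lookup structure from which the six facets are derived. The sketch below is in Python rather than the PHP / SQL of the actual prototype, and the names `CRG_TO_FACETAG` and `active_facets` are hypothetical. The three status labels are this sketch's reading of the table's typography (plain entry, entry in square brackets, "--"), not terminology taken from the CRG theory or from Facetag itself.

```python
# Table 1 as a lookup: CRG standard category -> (FaceTag label, status).
# Assumption of this sketch: "facet" marks plain entries, "bracketed" marks
# entries printed in square brackets, "none" marks "--" (no counterpart).
CRG_TO_FACETAG = {
    "Thing": ("Documents, resources", "bracketed"),
    "Type": ("Resource Types", "facet"),
    "Part": (None, "none"),
    "Property": ("Language", "facet"),
    "Material": ("Format", "bracketed"),
    "Process": (None, "none"),
    "Operation": ("Activities/Subjects", "facet"),
    "Product": ("Deliverables", "bracketed"),
    "Byproduct": (None, "none"),
    "Patient": ("Usage", "facet"),
    "Agent": ("People", "facet"),
    "Space": ("Country", "bracketed"),
    "Time": ("Date", "facet"),
}


def active_facets(mapping):
    """Return the labels marked as facets, in the row order of Table 1."""
    return [label for label, status in mapping.values() if status == "facet"]


facets = active_facets(CRG_TO_FACETAG)
print(facets)
```

Keeping the unmatched and bracketed rows in the structure, rather than listing only the six facets, preserves the link back to the generic CRG schema that the scheme was derived from.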
{| class="wikitable"
|+ TABLE 2: FACETAG FACETS AND EXAMPLES OF FOCI
! Facet !! Examples
|-
| Resource Types || white paper, case study etc.
|-
| Language || predefined values (based on ISO 639-2)
|-
| Activities/Subjects || discovery>competitive analysis, classification>facets
|-
| Usage || industry, public administration, health etc.
|-
| People || dion hinchcliffe, morville
|-
| Date || automatically added by the software
|}

The foci listed next to some of the facets serve the sole purpose of making the facets self-explanatory. In the actual implementation, since tags are our foci, foci will be user-generated, with the only exceptions of the language facet, which will use a predefined list of languages in ISO 639-2 notation, and the date facet, which will receive a software-generated timestamp upon resource creation.

IV. BERRYPICKING, INFORMATION SCENT AND THE TWO AXES OF INFORMATION ARCHITECTURE

As a matter of fact, facets constitute an adaptive classification system capable, by its very nature, of representing:
• knowledge in movement, like that observable in a social collaborative context;
• several mental models at the same time, such as those playing their role in this context.

Furthermore, facets are particularly suitable for classifying a homogeneous collection of items, i.e. a set of resources belonging to a specific disciplinary area.

Besides enforcing order on the flat space of keywords, the blend of tags and facets is able to strengthen the “information scent” [Chi et al. 2001] and the “berrypicking” [Bates 1989] capabilities of the system. Every information architecture project refers to two different information axes:
• a vertical (or paradigmatic) axis, i.e. the hierarchical relationship that each item of a system has with the others;
• a horizontal (or syntagmatic) axis, i.e. the semantic contiguity relationship that each item has with the others.

In our case, the combination of tags and facets allows for better management of both these axes:
• from the vertical or paradigmatic point of view, when a user is about to associate a keyword with a facet (in order to tag a resource), the system suggests similar tags or hierarchies of tags pertaining to the same facet;
• from the horizontal or syntagmatic point of view, at the same time, the system allows the user to see all the other tags belonging to the same facet(s).

V. FACETED HIERARCHICAL TAGGING

Facetag deals with users, resources, tags and facets in two quite distinct ways: since it is a social tagging application, it offers both a browsing/searching mode and an administrative/editing mode. These are two different activities, to which the user interface adapts by providing different aiding tools (navigation, resource management) and different behaviours (zooming, tag suggestions) respectively.

When a user first accesses the application, Facetag responds in browsing mode and she is presented with a page which lists the most recent additions to the system in the main body. Other relevant parts of the user interface are a search box and a sidebar. The sidebar lists facets and their first-level tags with query previews, i.e. the number of resources associated with each tag, automatically generated from the schema and data stored in the database.

Inside Facetag, a user can look for content a) by entering keywords or b) by choosing first-level tags from a specific facet list.

If the user enters a keyword, Facetag returns the paginated result set of all the resources which contain that keyword in their tags, title, description or notes. The sidebar facet display is adjusted to show only those facets and pertaining first-level tags which are related to the result set.

In case the keyword happens to be an nth-level tag, the corresponding facet will show all (n+1)th-level tags and add any broader tag in the hierarchy, up to the (n-1)th-level tag, to the facet title as clickable items which allow zooming out. If there is no (n+1)th-level tag, the facet is not displayed.

If the user clicks on a tag from the facet sidebar, Facetag returns the paginated result set of all the resources which have been tagged with that tag. A breadcrumb path is displayed which lists the active facet (the one the tag is a focus for) and the position of the tag in any tag hierarchy it may belong to.

The sidebar facet display is adjusted accordingly. The active facet shows, alongside the facet title, all broader tags from the hierarchy the selected tag may be part of, and all pertaining narrower tags. Inactive facets show first-level tags which relate to the resources in the result set.

Upon subsequent zooming in and refining of the query, when there are no narrower tags, the breadcrumb display is maintained to allow zooming out or what we call disengaging (resetting the search), while the active facet display is removed from the sidebar.

A user may of course start by searching for a keyword and then adjust her result set using facets, combining the two approaches in any way she prefers until she reaches a satisfactory answer, or proceed vice versa and zoom in and out by using tags. Similarly, tags pertaining to different facets can be used together during a single search to narrow down a result set quickly and efficiently. If there is no disengagement, all subsequent operations are performed on the intermediate result set.

If a user logs in, access to the administrative interface is granted and adding, editing and deleting resources and tags becomes possible.

Upon entering new resources, a user is provided with a
simple form with entry fields for every facet. These tag fields are optional, and can be left empty at will: there is no mandatory facet. But if a user starts to enter a tag, the completion tool suggests similar tags from the pertaining facet only. Moreover, since users can optionally identify two or more tags as a hierarchy through a simple syntax (using the ‘>’ character), the completion tool can suggest, again facet by facet, not just similar tags, but similar tags as parts of a hierarchy 2 of tags, hence effectively suggesting an entire hierarchy.

2 Note that hierarchies are not taxonomies but simply forests of shallow trees.

Gradually, with use, these hierarchies acquire complexity and become globally significant in the system.

Editing or modifying can be done seamlessly from the browsing interface, by clicking icons which appear next to one's own resources. Notably, the same happens if a user tries to add a resource she has already added (based on URI identification): Facetag simply supplies the editing interface, preloading the original data.

VI. CONCLUSION

By providing the user with facets to which hierarchical sets of tags relate and pertain, and a usable interface which adapts to the ongoing query, Facetag may solve, through contextualization and user-added semantic value, most of the basic issues connected with polysemy, homonymy and basic level variation.

While further testing and usability studies are needed to verify to what extent users are motivated to use our prototype and to introduce structure in addition to flat tags, preliminary user evaluations show how the addition of hierarchies and facets can improve and disambiguate the meaning of tags, giving them a stronger context and a more [...] users can make better sense of the meaning of a tag, discover related tags at different levels of specificity and exclude homonymies or find a large number of other tags that can be of interest. This approach also tends to improve the scalability of the system when addressing the enormous domains presented today by the most popular social applications.

Improving on current features, Facetag aims to provide an advanced tagging experience through other innovative tools or widgets, such as a Firefox plugin to seamlessly add new bookmarks while browsing, a WYSIWYG editor to offer drag-and-drop inclusion of texts and pictures from the web page the user is bookmarking, and a history of all the times a bookmark has been tagged.

Future work includes testing the application on a real user base and verifying the outcomes, both in terms of internal logic and through usability tests, to demonstrate more broadly the benefits of a semantic tagging application.

VII. REFERENCES

Bar-Ilan, J., Shoham, S., Idan, A., Miller, Y., Shachak, A. (2006) Structured vs. unstructured tagging – A case study, WWW2006, Edinburgh.
Broughton, V. (2001) Klasifikacija za 21. stoljece: nacela i struktura Blissove bibliografske klasifikacije [= A classification for the 21st century: principles and structure of the Bliss bibliographic classification], Vjesnik bibliotekara Hrvatske, 44, 1-4, p. 38-51; Italian translation: Una classificazione per il 21' secolo: principî e struttura della Classificazione bibliografica Bliss, AIB-WEB. Contributi.
Campbell, G.D., Fast, K.V. (2006) From Pace Layering to Resilience Theory: The Complex Implications of Tagging for Information Architecture, Proceedings of IA Summit 2006 (Vancouver, March 23-27, 2006), ASIS&T.
Chi, E.H., Pirolli, P., Chen, K., Pitkow, J. (2001) Using Information Scent to Model User Information Needs and Actions on the Web, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Seattle, Washington, 2001), ACM Press.
English, J., Hearst, M., Sinha, R., Swearingen, K., Yee, P. (2002a) Hierarchical Faceted Metadata in Site Search Interfaces, CHI 2002 Conference Companion.
-- (2002b) Flexible search and browsing using faceted metadata, Unpublished Manuscript.
Feinstein, D., Smadja, F. (2006) Hierarchical Tags and Faceted Search. The RawSugar Approach, Proceedings of SIGIR 2006 (August 6-11, 2006, Seattle, Washington).
Flamenco Group (2002) How to Build a Flamenco instance.
Gnoli, C., Marino, V., Rosati, L. (2006) Organizzare la conoscenza. Dalle biblioteche all'architettura dell'informazione per il Web [= Organizing Knowledge. From Libraries to Information Architecture for the Web], Tecniche Nuove.
Golder, S.A., Huberman, B.A. (2005) The Structure of Collaborative Tagging Systems, Information Dynamics Lab.
Hassan-Montero, Y., Herrero-Solana, V. (2006) Improving Tag-Clouds as Visual Information Retrieval Interfaces, International Conference on Multidisciplinary Information Sciences and Technologies, InSciT2006.
Hearst, M.A. (2006a) Clustering versus faceted categories for information exploration, Communications of the ACM, April, Vol. 49, No. 4.
-- (2006b) Design Recommendations for Hierarchical Faceted Search Interfaces, ACM SIGIR Workshop on Faceted Search.
-- The Flamenco Search Interface Project.
Heymann, P., Garcia-Molina, H. (2006) Collaborative Creation of Communal Hierarchical Taxonomies in Social Tagging Systems, Technical Report, InfoLab.
Kome, S.H. (2006) Hierarchical Subject Relationships in Folksonomies.
La Barre, K. (2006) The Use of Faceted Analytico-Synthetic Theory as Revealed in the Practice of Website Construction and Design.
Morville, P. (2005) Ambient Findability, O’Reilly.
Quintarelli, E. (2005) Folksonomies: Power to the People, Proceedings of the 1st ISKO Italy-UniMIB meeting (Milan, 24 June 2005).
Slavic, A. (2002) FATKS: Facet Analytical Theory in managing Knowledge Structures for humanities.
Travis, W. (2006) The strict faceted classification model.
Yee, K.P., Swearingen, K., Li, K., Hearst, M. (2003) Faceted Metadata for image searching and browsing, Proceedings of CHI 2003.
VIII. SCREENSHOT
Figure 1: The system interface.
Figure 2: A zooming sample, choosing Resource type > blog + Subjects > Information architecture.
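
The selection named in the caption of Figure 2 can be read as a conjunction of one tag per facet. The following Python sketch is only an illustration of that reading, together with the query previews described in section V; it is not the prototype's PHP / SQL code. The functions `parse_selection`, `zoom` and `query_previews`, the in-memory `RESOURCES` list and its titles are all hypothetical, and treating the caption's `Facet > tag + Facet > tag` notation as parseable input is an assumption of this sketch.

```python
from collections import Counter


def parse_selection(text):
    """Split 'Facet > tag + Facet > tag' into a {facet: tag} dict (lowercased)."""
    selection = {}
    for part in text.split("+"):
        facet, tag = (piece.strip().lower() for piece in part.split(">", 1))
        selection[facet] = tag
    return selection


# Hypothetical bookmarks: titles and tag assignments are invented for the example.
RESOURCES = [
    {"title": "IA weblog",
     "tags": {"resource type": {"blog"}, "subjects": {"information architecture"}}},
    {"title": "Facets case study",
     "tags": {"resource type": {"case study"}, "subjects": {"information architecture"}}},
    {"title": "Tagging weblog",
     "tags": {"resource type": {"blog"}, "subjects": {"tagging"}}},
]


def zoom(resources, selection):
    """Keep only the resources that carry every selected tag in its facet."""
    return [
        r for r in resources
        if all(tag in r["tags"].get(facet, set()) for facet, tag in selection.items())
    ]


def query_previews(resources):
    """Count, for each facet, how many of the given resources carry each tag."""
    previews = {}
    for r in resources:
        for facet, tags in r["tags"].items():
            previews.setdefault(facet, Counter()).update(tags)
    return previews


selection = parse_selection("Resource type > blog + Subjects > Information architecture")
results = zoom(RESOURCES, selection)
print([r["title"] for r in results])
print(query_previews(RESOURCES))
```

With the invented data above, only the bookmark tagged with both `blog` and `information architecture` survives the zoom. Recomputing `query_previews` on the intermediate result set, rather than on the whole collection, would correspond to the adjusted sidebar display described in section V.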