=Paper=
{{Paper
|id=Vol-445/paper-10
|storemode=property
|title=Using Microformats to Personalize Web Experience
|pdfUrl=https://ceur-ws.org/Vol-445/02icwe2008ws-iwwost10-mrissa.pdf
|volume=Vol-445
|dblpUrl=https://dblp.org/rec/conf/icwe/MrissaAT08
}}
==Using Microformats to Personalize Web Experience==
ICWE 2008 Workshops, 7th Int. Workshop on Web-Oriented Software Technologies – IWWOST 2008
Using Microformats to Personalize Web Experience
Michael Mrissa (1), Mohanad Al-Jabari (1), Philippe Thiran (2)
(1) PReCISE Research Center, University of Namur, Belgium
{michael.mrissa, mohanad.al-jabari}@fundp.ac.be
(2) PReCISE Research Center, University of Namur and Louvain School of Management, Belgium
philippe.thiran@fundp.ac.be
Abstract

As envisioned by its creator, the World Wide Web gathers billions of users from different communities all over the world. A recent evolution of the Web has been witnessed with microformats, which allow authors to semantically annotate the contents of Web documents (webpages, blog posts, news articles, RSS feeds, etc.), and enable inter-software interactions by exporting this annotated content to external applications (calendars, address books, etc.). However, Web users still originate from different communities, and thus follow their own local semantics (referred to as context in this paper) for data interpretation and representation. Hence, there is a need to transform Web content created according to the author’s context into the different contexts of its readers. We refer to this transformation process as personalization. In this paper, we identify users’ requirements for Web content personalization and we present a solution that takes advantage of microformats in order to enhance users’ experience on the Web with contextualized information. We show how microformats offer a great opportunity to adapt the contents of Web documents to different users’ contexts.

1 Introduction

During the last few years, the emergence of the Web 2.0 has revolutionized the way information is designed and accessed over the Internet. On the client side, manual browsing of websites has given way to automatic aggregation of RSS feeds into client applications. User-friendly interfaces powered by Asynchronous JavaScript and XML (AJAX) facilitate user interactions while reducing bandwidth [11]. On the server side, the content of websites now tends to be better structured, and thus adapts more easily to heterogeneous platforms with the use of XHTML and CSS. On the client side, the user interaction paradigm is switching from passive (i.e. surfing on the Web) to active (i.e. authoring/editing information on the Web) via weblogs, wikis, and user-driven contents in general.

Also, webpages now tend to integrate semantic information coming from the user. Weblogs and user pages, but also official websites, massively introduce semantic information via “tags”, or keywords. A tag is associated with a particular piece of information (e.g. a post in a blog, an article in a magazine) and provides some insight into the subject this piece of information is about. Web 2.0 sites such as del.icio.us or Flickr take advantage of such user tags to propose sets of tag-related links as answers to users’ queries. Semantic wikis are flourishing [2]. New tools are proposed that link tags to semantic Web applications, thus linking the Web 2.0 to the Semantic Web [10].

1.1 Microformats

Another big change that participates in this Web evolution is the birth of microformats [9], which are tiny pieces of information inserted into the XHTML code of a webpage. Microformats are developed according to a set of open standards called microformat specifications [1, 4, 8]. With the help of microformats, semantic information is directly attached to the contents of webpages. While the objective of microformats is to enhance user experience, microformats are first detected by XML parsers, and provide explicit, non-ambiguous, machine-interpretable semantic information about the content they are attached to.

Some of the most famous Web 2.0 sites, such as Twitter, Flickr, LinkedIn, Upcoming and Yahoo (see http://www.yr-bcn.es/demos/microsearch/), have already adopted microformats. Indeed, the (mostly unexploited) potential benefits offered by microformats are numerous:

• automatic analysis of Web information,

• export of microformatted information to external applications,
• no need for complex ontologies to add information,

• human readability with the help of browser plugins.

Microformats are typically utilized as a tool to enable inter-application interactions. For instance, when an event described in a webpage is annotated with a microformat, a browser plugin can export the event description into the user’s calendar application in one click. Tools have already been developed that export contact information (hCard microformat) and event information (hCalendar microformat) into address book and calendar applications.

1.2 Challenges & motivation

Users typically encounter data interpretation difficulties while browsing the Web. These difficulties are due to several discrepancies between the semantics of the webpage author and those of the webpage reader. Most of these discrepancies originate from these persons’ local contexts, which promote different interpretations of the same contents. A local context is a set of common knowledge (or common cultural conventions) that is shared between a group of community members, like language, measurement units, and date/time formats [5, 6]. Although the common local conventions of a group of members are often implicit and can be viewed from different perspectives, De Troyer and Casteleyn [12] argue that local community members not only share a common language, but also common cultural conventions, such as measurement units, keyboard configurations, character sets and notational standards for writing time, dates, addresses, numbers, currency, etc. In the following, we present an example in which the webpage author and the webpage reader belong to English and French communities, respectively.

Currently, the data authored on the Web are written according to the author’s semantics. For example, a French user browsing an English-language website that follows the US date convention has to translate a date written as mm/dd/yyyy into his or her own format (dd/mm/yyyy) in order to interpret it correctly. While there are some exceptions (dates whose day and month numbers coincide, such as the 6th of June or the 12th of December), most of the time these differences in the semantic organization of data require additional work for correct data interpretation. A similar situation occurs with prices, lengths, weights, units of measurement in general, and probably many other pieces of information related to local semantics.

At first sight, microformats do not offer very much to users in terms of personalization: while the final goal of microformats is to enhance human experience, the semantic information they offer is meant to be read by machines first rather than directly by users. However, because they are machine-interpretable, they allow programs to “understand” them. In this paper, we take advantage of the possibilities offered by microformats to enhance users’ Web experience with a personalized display of information. We propose a Web document personalizer that provides users with a representation of microformatted information in webpages that is adapted to their local contexts.

1.3 Paper organization

This paper is structured as follows. Section 2 explores the needs for personalization of information from a user’s point of view. Section 3 introduces microformats and presents the most advanced propositions. Section 4 discusses the relation between microformats and users’ personalization requirements, before presenting our proposal for Web contents personalization. Section 5 discusses the results obtained and gives some insights for future work.

2 Users’ personalization requirements

In this section, we identify users’ requirements in terms of personalization. We by no means claim to propose an exhaustive list of personalizable concepts, but we try to address the main concerns that arose from our own experience surfing the Web. Hence, we focus on the following personalizable concepts:

• Dates and times are organized in different ways according to the user’s language and country (see http://en.wikipedia.org/wiki/Calendar_date).

• Prices are expressed in different formats (currencies, VAT rate included or not, etc.).

• Addresses are structured differently. Postcode formats differ from country to country, and the street number sometimes comes before the street name (as in France) and sometimes after it (as in Belgium).

• Measurement units also depend on the country (mainly the imperial and metric systems are used).

• Telephone number formats depend on the country too.

According to these notions, we identify a set of user characteristics that currently form our user context. Here also, we do not aim at building an exhaustive list of required context parameters, but we gathered the parameters that are required to answer the personalization needs of the notions listed previously.

One could argue that these personalizable concepts depend on the user’s country, which can be obtained from the IP address from which HTTP requests originate. However, we assume that users connected from a foreign country do not want the webpage information to be personalized according
to the local context of the host country. Furthermore, one country could have several communities, e.g. Belgium.

As a consequence, we establish a combination of language and country as the main parameter for context, together with timezone, optional date style and currency parameters to distinguish users’ local contexts. The language(country) parameter is used to adapt the formatting of the original webpage information, and is combined with a datestyle parameter to format the dates according to the user’s context. The timezone and currency parameters respectively identify the time zone and local currency of the user and enable correct conversion of time and price information displayed on webpages.

3 Microformat specifications

Several microformats have been designed in order to describe the semantics of the most typical elements users can encounter on web documents. The most well-known microformats are hCard, hCalendar and hReview. We detail microformats below according to two categories: accepted standards, which have been validated by the community and thus should be used as described in the specification, and emerging proposals, which are already advanced specification drafts but could be subject to further modifications.

3.1 Accepted standards

hCard. The hCard microformat describes people and organizations. It is identified with vcard as a class name. It requires at least the fn or n* subclass that identifies an individual with a full name or another type of name (given name, family name, etc.). Then, several other classes are optional together with their subclasses (nickname, url, email, tel, adr, org, etc.). This microformat is based on the vCard specification described in RFC 2426 (http://www.ietf.org/rfc/rfc2426.txt).

hCalendar. This microformat describes events and calendar information. It is identified with a vcalendar or vevent class name. Mandatory subclasses are dtstart and summary, which respectively describe the starting time and summary of an event. Optional subclasses are possible based on the iCalendar specification described in RFC 2445 (http://www.ietf.org/rfc/rfc2445.txt).

XHTML Friends Network (XFN). XFN describes relationships between people. It allows one to specify other persons as friends, colleagues, etc. using the rel attribute (more information on http://www.gmpg.org/xfn/11).

3.2 Emerging proposals

hReview. The hReview microformat allows describing online reviews and ratings. It is a composite format that has only one mandatory subclass, itemInfo, which contains either an fn full name (with url or photo subclasses), or an hCard or an hCalendar subclass (events can be reviewed too, like concerts for example). Several optional elements complete the microformat (reviewer (hCard), dtreviewed, rating, description, tags, permalink, license).

hListing. The hListing microformat provides a listings format suitable for embedding in (X)HTML, Atom, RSS, and arbitrary XML. It is identified with an hListing class name. Mandatory subclasses are listingAction, lister (hCard), and description. Several optional subclasses include dtlisted, dtexpired, price, etc.

hAtom. The hAtom microformat is intended to describe web contents that can be syndicated, e.g. weblog postings. It is identified with hentry and optional hfeed class names. Mandatory subclasses are entry-title, updated, and author. They describe the Atom entry title, updated date, and author name, respectively. Optional subclasses like entry-content, entry-summary, published, and bookmark are also possible based on the Atom syndication format described in RFC 4287 (http://www.ietf.org/rfc/rfc4287).

hMeasure. The hMeasure microformat describes physical quantities measured according to specific units. Mandatory subclasses are value and unit, which respectively specify the numeric value and measurement unit of the physical quantity. Optional subclasses include item, type and tolerance to specify which item or product is being measured, the dimension being measured (e.g. height or width for a length quantity), and the error rate (percentage or nested hMeasure).

hMoney. The hMoney microformat describes money information. It is identified with the money class name. It requires at least the amount subclass that specifies the numerical value of money, together with the currency, unit, and date optional subclasses, which respectively specify the ISO 4217 currency code (http://www.iso.org/iso/support/faqs/faqs_widely_used_standards/widely_used_standards_other/currency_codes/currency_codes_list-1.htm), the currency unit (e.g. Euro, cent), and the date associated with the value.

adr. The adr microformat is utilized as an optional subclass in several microformats (e.g. hCard(adr), hCalendar(location(adr)), hListing(item info(adr)), etc.) to specify address information. It is identified by the adr class name, together with the post-office-box, extended-address, street-address, locality, region, postal-code, and country-name subclasses.

geo. The geo microformat is also an optional subclass of several microformats (e.g. hCard(geo), hCalendar(location(geo)), hListing(item info(geo)), etc.) that specifies geographic coordinates. It is identified by the geo class name, together with the latitude and longitude subclass names.
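To make the class names above concrete, the following minimal sketch shows how an XML parser could locate microformatted values in a webpage. It is an illustration only and not the authors' Java implementation; Python is used for brevity. The XHTML fragment, the PERSONALIZABLE set and the extract_personalizable function are hypothetical names introduced here. The sketch also assumes the common microformat practice of carrying a machine-readable value in the title attribute of an abbr element.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment, made up for illustration: one hCalendar event
# and one hMoney amount, using class names from the specifications above.
SAMPLE = """
<div>
  <div class="vevent">
    <span class="summary">Sample workshop</span>
    <abbr class="dtstart" title="2008-07-14T09:00:00">07/14/2008</abbr>
  </div>
  <div class="money">
    <span class="amount">120.50</span>
    <abbr class="currency" title="USD">$</abbr>
  </div>
</div>
"""

# Class names whose values may need adapting to the reader's context.
# The selection is an assumption of this sketch, not a list from the paper.
PERSONALIZABLE = {"dtstart", "dtreviewed", "dtlisted", "dtexpired",
                  "amount", "currency", "tel"}


def extract_personalizable(xhtml):
    """Return (class name, machine value, displayed text) for each match."""
    found = []
    for element in ET.fromstring(xhtml).iter():
        # A class attribute may hold several space-separated names.
        for name in element.get("class", "").split():
            if name in PERSONALIZABLE:
                displayed = "".join(element.itertext()).strip()
                # Prefer the machine-readable title when it is present.
                machine = element.get("title", displayed)
                found.append((name, machine, displayed))
    return found


ITEMS = extract_personalizable(SAMPLE)
for item in ITEMS:
    print(item)
```

The extraction is deliberately independent of the enclosing microformat (vevent, money, etc.): only the class names of the individual elements are inspected.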
{| class="wikitable"
|+ Table 1. Correspondences between users’ personalization requirements and µ-format specifications.
|-
! µ-format !! Date/Time !! Price !! Measurement units !! Address !! Tel. number
|-
| hCard || bday, tz || || geo || adr || tel
|-
| hCalendar || dtstart, dtend, dtstamp, duration, rdate, (via hCard) || || geo, description (hMeasure) || location (adr), (via hCard) || (via hCard)
|-
| hReview || dtreviewed, (via hCard, hCalendar) || price || description (hMeasure) || (via hCard), (via hCalendar) || (via hCard)
|-
| hListing || dtlisted, dtexpired, (via hCard, hCalendar) || price || item info (geo), description (hMeasure), (via hCard, hCalendar) || item info (adr), (via hCard), (via hCalendar) || (via hCard)
|-
| hAtom || published, updated, (via hCard, hCalendar), (via hReview, hListing) || entry-content, (via hReview), (via hListing) || entry-content (hMeasure), (via hCard, hCalendar), (via hReview, hListing) || (via hCard), (via hCalendar), (via hListing) || (via hCard)
|-
| hMoney || date || money || || ||
|}
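As a hedged illustration of how a program might use Table 1, the sketch below encodes only the class names that the table lists directly for hCard, hCalendar, hReview, hListing and hAtom (nested "via" entries are left out) and looks up which personalization requirement a class attribute touches. The REQUIREMENTS mapping and the requirements_for function are hypothetical names introduced for this example.

```python
# Direct (non-nested) class names from Table 1, grouped by requirement.
# This grouping is a simplification made for this sketch.
REQUIREMENTS = {
    "date/time": {"bday", "tz", "dtstart", "dtend", "dtstamp", "duration",
                  "rdate", "dtreviewed", "dtlisted", "dtexpired",
                  "published", "updated"},
    "price": {"price"},
    "measurement units": {"geo"},
    "address": {"adr"},
    "tel number": {"tel"},
}


def requirements_for(class_attribute):
    """Return the sorted requirements matched by a class attribute value."""
    tokens = set(class_attribute.split())
    return sorted(requirement
                  for requirement, names in REQUIREMENTS.items()
                  if tokens & names)


print(requirements_for("dtstart"))
print(requirements_for("adr tel"))
```

A class attribute that matches no entry yields an empty list, which a personalizer could treat as "leave this element untouched".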
There are other microformats that describe licenses (rel-license), tags, keywords, categories (rel-tag), and also lists and outlines (XOXO). For brevity, we do not give details on these microformats in this paper, and we refer the reader to http://microformats.org for additional information. Note that the specifications of microformat proposals could still be subject to major changes as they are not yet accepted as standards.

4 Personalizing Web documents

In this section, we examine to what extent microformats are useful for personalization purposes, before presenting our personalization approach and detailing its implementation and deployment.

4.1 Microformats and users’ personalization requirements

Microformats can be atomic, i.e. self-contained like adr or geo, or they can be composite, like hCard or hCalendar. Table 1 summarizes the correspondences between the main composite microformats and users’ personalization requirements. Each cell of Table 1 describes the particular microformat utilized by the composite microformat in order to represent the semantic information. For brevity, we exclude atomic microformats, which have straightforward correspondences (i.e. adr corresponds to the address requirement, geo and hMeasure correspond to the measurement units requirement).

Table 1 shows that the aforementioned personalizable concepts are present in most existing microformats. Furthermore, it is possible for a webpage author to mix or nest several microformats that contain different pieces of information, as for hReview, which may host hCard and hCalendar microformats. Therefore, personalizable microformat class attributes should be directly extracted from webpages independently of the container microformat and personalized according to the user’s preferences.

4.2 General approach

Our personalization approach focuses on adapting the contents of webpages based on a set of parameters that help set up the user’s context. We devised a personalizer engine, shown in Fig. 1, as the core component of our approach. Our personalizer engine takes a URL-identified web document and user context parameters as inputs and produces a personalized web document that can be viewed according to the user’s context.

Figure 1. Personalizer overview.

The main idea developed in this work consists in parsing the XHTML Web document and identifying elements with class attributes whose values are the names of our personalizable elements (dtstart, dtend, bday, dtreviewed, tel, etc.). Then, the personalized information obtained from the web document is added to the original contents. In order to ensure good user understanding, the original version is kept
as is and the personalized data is put next to it in brackets. Our prototype currently detects dates, currencies and time zones.

4.3 Implementation and deployment

Our personalizer has been developed under the Eclipse™ environment and Java™ platform. In order to make our solution embeddable into the largest number of existing architectures, we developed it and tested its deployment in three different fashions: server-side, client-side and as a library. The deployment of our personalizer on the client side gives users the opportunity to personalize the contents of all web pages directly on the user’s computer. Also, users’ parameters are kept locally, which favors privacy and security. On the other hand, the deployment of our personalizer on the server side in a proxy-like fashion gives control to the Web server and allows exploiting the information entered by users and computing statistics on users’ preferences, number of users, etc. However, this deployment method is less reliable when it comes to security and privacy concerns.

Server-side deployment. Our personalizer is deployed on the server side as a Web servlet that gets the Uniform Resource Locator (URL) of a Web document in addition to the user’s personalization parameters, and returns the same webpage with additional personalized contents (Fig. 2). Our Web interface acts as a proxy that performs on-the-fly personalization of Web contents.

Figure 2. Screenshot of the servlet preference page.

Client-side deployment. Client-side deployment is performed via a Java program that is made accessible via a Firefox extension (Fig. 3). In order to link our Java program to the Firefox extension, XPCOM components are utilized. The Firefox extension integrates seamlessly into the user’s browser and adds personalization capabilities to Firefox. Our extension prototype is available at http://perso.fundp.ac.be/~pthiran/microformats/.

For the purpose of client-side deployment, we integrate our personalizer engine as an extension to the Firefox browser (Fig. 3). In order to embed our Java-based personalizer engine, XUL (XML User Interface Language), JavaScript, and XPCOM technologies are utilized. XUL is used for implementing the user context interface, while JavaScript and XPCOM are used as glue: JavaScript code gets the URL of the webpage and the user’s preferences and sends them to the Java code using XPCOM components.

Figure 3. Screenshot of the Firefox preference extension.

Java library. Our personalizer is also available as a Java library (available at the same URL as the extension prototype), as we believe it could be adapted to many other Java-based applications dealing with microformats: browsers (Firefox/IE plugins), RSS feed readers, email applications, calendar applications, etc.

5 Conclusion

In this paper, we identify users’ needs for personalization of webpage contents and we take advantage of microformat annotations in order to personalize the contents of Web documents. Our proposal relies on a limited set of user parameters in order to enable personalization of webpage contents. We implemented and validated our proposal both on the client side with a Firefox plugin and on the server side with a servlet application.

This work illustrates one of the advantages microformats can bring to the Web. However, as microformats propose a finite set of specifications, they remain rather limited. As future work, we believe it could be interesting to evaluate to what extent our personalization approach could be adapted to emerging semantic annotation proposals such as RDFa [3] or eRDF [7], which do not restrict semantic annotations to a set of specifications.

References

[1] Microformat homepage (wiki). http://www.microformats.org/wiki/ (last accessed: 20 Apr. 2008).
[2] Semantic MediaWiki Homepage (wiki). http://www.semanticweb.org/wiki/Semantic_MediaWiki (last viewed April 29, 2008).
[3] B. Adida and M. Birbeck. RDFa Primer 1.0: Embedding RDF in XHTML. W3C Working Draft, W3C, October 2007.
[4] J. Allsopp. Microformats: Empowering Your Markup for Web 2.0. Apress, 2007.
[5] W. Barber and A. Badre. Culturability: The merging of culture and usability. In The 4th Conference on Human Factors and the Web, 1998.
[6] D. Cyr and H. Trevor-Smith. Localization of web design: An empirical comparison of German, Japanese, and United States web site characteristics. JASIST, 55(13):1199–1208, 2004.
[7] I. Davis. RDF in HTML (eRDF). http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml (last viewed April 29, 2008).
[8] R. Khare. Microformats: The next (small) thing on the semantic web? IEEE Internet Computing, 10(1):68–75, 2006.
[9] R. Khare and T. Çelik. Microformats: a pragmatic path to the semantic web. In L. Carr, D. De Roure, A. Iyengar, C. A. Goble, and M. Dahlin, editors, WWW, pages 865–866. ACM, 2006.
[10] A. Passant. MOAT: Meaning Of A Tag - Project Homepage. http://moat-project.org/ (last viewed April 29, 2008).
[11] K.-U. Schmidt, L. Stojanovic, N. Stojanovic, and S. Thomas. On enriching AJAX with semantics: The web personalization use case. In E. Franconi, M. Kifer, and W. May, editors, ESWC, volume 4519 of Lecture Notes in Computer Science, pages 686–700. Springer, 2007.
[12] O. De Troyer and S. Casteleyn. Designing localized web sites. In X. Zhou, S. Y. W. Su, M. P. Papazoglou, M. E. Orlowska, and K. G. Jeffery, editors, WISE, volume 3306 of Lecture Notes in Computer Science, pages 547–558. Springer, 2004.
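The personalization step of Section 4.2 (keep the original value and add the personalized one next to it in brackets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Java engine: the UserContext fields mirror the language(country), datestyle, timezone and currency parameters of Section 2, the time zone is simplified to a fixed UTC offset, the exchange rate is a made-up placeholder, and all function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass
class UserContext:
    locale: str            # language(country), e.g. "fr(FR)"
    datestyle: str         # strftime pattern for dates, e.g. "%d/%m/%Y"
    utc_offset_hours: int  # simplified stand-in for the timezone parameter
    currency: str          # ISO 4217 code of the user's currency


# Hypothetical exchange rate, for illustration only.
RATES = {("USD", "EUR"): Decimal("0.64")}


def personalize_datetime(displayed, iso_value, context):
    """Append the date and time, converted to the user's context, in brackets."""
    moment = datetime.fromisoformat(iso_value)
    user_zone = timezone(timedelta(hours=context.utc_offset_hours))
    local = moment.astimezone(user_zone)
    return f"{displayed} [{local.strftime(context.datestyle + ' %H:%M')}]"


def personalize_price(displayed, amount, currency, context):
    """Append the amount, converted to the user's currency, in brackets."""
    converted = Decimal(amount) * RATES[(currency, context.currency)]
    return f"{displayed} [{converted:.2f} {context.currency}]"


CONTEXT = UserContext("fr(FR)", "%d/%m/%Y", 2, "EUR")
print(personalize_datetime("07/14/2008", "2008-07-14T09:00:00-04:00", CONTEXT))
print(personalize_price("$100.00", "100.00", "USD", CONTEXT))
```

Keeping the author's original text in front of the bracketed value follows the paper's choice of leaving the source content untouched, so a reader can always check the personalized value against what the author wrote.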