Semantics-based Dynamic Hypermedia Adaptation
                           using the Hidden Markov Model


                             Jayan C Kurian, Payam M. Barnaghi, Michael Ian Hartley
     School of Computer Science and Information Technology, The University of Nottingham (Malaysia Campus),
                          Jalan Broga, 43500 Semenyih, Selangor Darul Ehsan, Malaysia.
                                          eyx5jkc@nottingham.edu.my


Abstract - Information collection, selection, structuring, and presentation design are the core considerations for general hypermedia presentation generation systems. The content collection process can be enhanced by retrieving semantically related information objects, relevant to the topic selected by an author. Once relevant information objects are available, the content selection process suggests semantically related resources for the author’s selection based on data usage history. The information objects are represented by media assets and descriptive documents. The semantic web technology that allows resource interoperability can be used for content description and interpretation of these information objects. By utilizing a semi-automatic approach, authors can be assisted at different stages of the presentation generation process. In this research, adaptation constraints are established independent of the author’s proficiency (i.e. novice, intermediate or expert) by applying the Hidden Markov Model methodology. Semantically related media objects are suggested to authors for selection based on their interactive behaviour and the strength of semantic relations. The paper describes an application of the Hidden Markov Model in the initial authoring phases of semi-automatic hypermedia presentation systems.

Keywords: Hypermedia Presentation Generation, Semantic Web, Semi-Automatic Authoring, Hidden Markov Model, Adaptation.

1. Introduction

Multimedia authoring provides an effective way to communicate the goals of a presentation coherently. Adaptation is one of the features in user modeling that will support customization of semantic search strategies based on the author’s requirements. The approach towards semi-automatic authoring in a semantic environment will facilitate different types of authors in getting through the presentation generation process starting from initial exploration of a domain to the final presentation. For semantic-based authoring, semantic web document representation standards [1],[2] are used to represent media assets in a machine accessible form. In this context, relations between domain conceptual structures are explicitly defined by an ontology that makes contents of multimedia objects accessible through a rich metadata model.

The functionality of hypermedia systems can be enhanced by making them personalized or adaptable [3] to the author’s requirements. In our proposed system, we describe adaptation (i.e. content suggestion) in the content selection context, once the author identifies her/his topic of interest. For adaptation, we employ the Hidden Markov Model [4], a statistical model that can determine hidden states from observable parameters. It is our belief that the proposed adaptation model supports authors in generating adaptable hypermedia presentations.

The paper is organized as follows. The next section describes multimedia systems for presentation generation and section 3 describes knowledge representation and the data model. The system architecture is described in section 4. Section 5 introduces the Hidden Markov Model for semantic content suggestion and section 6 concludes the paper and discusses future work.

2. Multimedia Presentation Generation

We investigate and describe systems that employ adaptation, temporal constraints, and automated presentation design in multimedia presentations. Then, we briefly introduce several authoring systems that use semantic web technology as a means for generating presentation contents with emphasis on the semi-automatic authoring process.

The various interaction styles used for hypermedia navigation and an adaptive web interface that generates
semantically related multimodal output are described by Taib et al. [5]. The multimodalities (i.e. written text, graphics, and speech) are classified by output modality classification methodologies. The users are classified into predefined profiles (e.g. text profile or multimedia profile) depending on their interactive behaviour while progressing through the authoring process. Predefined presentation templates are used for output generation that adapts progressively according to the interaction styles of users. Thus, the paper describes an approach to classify users into interaction style profiles and highlights the need for adaptation in hypermedia systems.

The work of Dalal et al. [6] introduces a knowledge-based system that generates customized temporal multimedia presentations. The order and duration of information objects are represented using temporal constraints and are achieved using a negotiation process at run-time. The adaptation process employed here is tailored for different caregivers in a medical domain. An instance hierarchy creates the knowledge representation structure for each patient using domain and concept hierarchies. The presentation structure, represented as a directed acyclic graph, exchanges the information among various system components and expresses the communicative goals to be adapted to different caregivers. The media coordinator makes a consistent and synchronized presentation by allowing media-specific components to access and update the presentation plan. The inconsistencies in the presentation plan are resolved using a constraint solver. The paper introduces user modeling concepts and signifies the importance of temporal and spatial constraints in the context of synchronized multimedia presentation systems.

Andre et al. [7] describe an approach towards fully automated presentation design in the context of personalized multimedia presentation generation. The paper discusses prototype systems (e.g. WIP and PPP) that produce presentations based on a given set of presentation parameters and by considering temporal coordination of different media items. The structure of coherent media items is described using the generalization of speech act theory, and the rhetorical structure theory, for communication between multiple media parts of heterogeneous media objects. The presentation structure is generated by utilizing the knowledge-base components. The presentation strategies select the relevant content, and structure it for delivering through an appropriate medium for target consumers. The qualitative and quantitative constraints are taken into consideration for building up a temporal constraint network for presentation acts and the temporal coordination that facilitates presentation generation. From the paper, we realize the need for considering the presentation parameters, and the temporal constraints of media objects for an efficient adaptive system.

Topia [8] uses an RDF multimedia repository of the Rijksmuseum collection [9] and creates hypermedia presentations as a result of a query. The SemInf system [10] semi-automatically generates multimedia presentations by combining semantic inferencing with multimedia presentation tools. In this context, Dublin Core (DC) [11] metadata and SMIL [12] presentation formats are used in generating multimedia presentations. The Artequakt [13] project generates artists’ biographies by applying semantic associations between different entities that represent the artist’s personal and professional life. The aim of the DISC [14] system is to build a multimedia presentation about a certain topic by traversing a semantic graph that consists of the domain ontology of classes, instances, and relations between them, together with the media resources related to instances.

A hypermedia presentation generation system in a multi-facet environment is described in SampLe [15]. The system uses semantic web technologies to support authors during the presentation generation process. The process is divided into four phases: topic identification, discourse structure building, media material collection, and production of final-form presentation. SampLe supports authors during every phase of the process, independent of a particular workflow. This is achieved using ontology-based and context oriented information, as well as semantic interrelationships between different types of meta-data.

3. Knowledge Representation and Data Model

For knowledge representation, we employ an ontology that defines a common vocabulary for machine-interpretable definitions of common concepts in a domain and relations among them. The domain ontology gives information related to domain concepts. The media ontology gives media specific information and the discourse ontology narratively structures presentation contents. Protégé [16], an ontology editor tool, is used to develop the knowledge-base for the domain ontology. This is represented in RDF(S) [17]. A simplified version of the Neural Network domain ontology created for the proposed system is illustrated in Fig. 1.

A data model represents the basic guidelines for annotating various media items. The data model adopted from [18] has the components content schema, semantic schema, and media schema that describe multimedia objects proficiently. The content schema is represented by Dublin Core attributes (e.g. title, identifier), semantic schema is represented by Learning Object Metadata (LOM) [19] attributes (e.g. language, level), and media
schema is represented by MPEG-7 [20] attributes (e.g. media type, media URI).

Fig. 1. Neural Network domain ontology

The described data model has been chosen since the representation describes media contents, semantics and the media attributes effectively. The metadata attributes for media resources are generated manually. The data model allows the representation of external media objects already annotated with Dublin Core, Learning Object Metadata, and MPEG-7 attributes. The ontology and the data model make the knowledge representation structure independent from data representation structure. In the data model, the metadata attribute “language” specifies the adaptation component for the presentations to be customized according to language specifications. The metadata attribute “level” specifies the content proficiency to provide customizable presentations. To describe the continuous media objects, the MPEG-7 standard is chosen, which represents the spatial and temporal aspects of multimedia objects. MPEG-7 Multimedia Description Schemes can effectively describe multimedia entities. MPEG-7 Visual Description Tools describe the visual features (e.g. color, motion). MPEG-7 Audio provides the standard for describing the audio contents (e.g. sound recognition). The data model enables us to represent content specific information (e.g. who, what), media specific information (e.g. size, height), and semantic specific information (e.g. when, how) of multimedia objects. Thus, the data model represents media dependent features of multimedia resources efficiently for enhanced data selection.

4. Adaptable Authoring System Architecture

In the proposed system, the domain ontology is represented in RDF/XML, and the media objects are stored in a database or come from heterogeneous resources. Jena [21], an open source Java based RDF repository and reasoning engine, is used to query the RDF/XML knowledge-base.

We adopt the extensive architecture proposed by Bunt et al. [22] that describes multimodal interaction and coordination for information presentation. The architecture identifies the functional and technical requirements for intelligent multimodal systems. The main components of the architecture are multimodal input, multimodal integration, and multimodal output. The multimodal input component caters to a mixture of input modalities (e.g. text, audio, and video). The multimodal integration component interacts with various modeling components for adaptation. The content management component interacts with the multimodal integration component to provide appropriate presentation content. The presentation is delivered by the multimodal output component that interacts with the application interface. The logical diagram of the presentation generation process is illustrated in Fig. 2.

Fig. 2. Logical diagram of the presentation generation process

The content management component has been designed with reference to the standard reference model [23] that describes an implementation-independent view of the processes required for the generation of intelligent multimedia presentations. The conceptual design of the standard reference architecture modularizes the multimedia presentation generation process into five layers: control layer, content layer, design layer, realization layer, and the presentation display layer.

In our methodology, the author selects a theme for the presentation (e.g. Lecture Notes) followed by a title (e.g. Introduction to Neural Network Architecture) that is
supported by the discourse ontology. This is represented by the control layer of the standard reference architecture. The author selects relevant concepts by browsing through the domain ontology. Depending on the author’s selection, related concepts are suggested by the system based on data usage history. Once the contents are selected, corresponding media items are added and ordered. This is represented by the content layer of the reference architecture. The design layer conveys the presentation layout structure for the media objects. The realization layer integrates displayable media objects with the layout information. The presentation display layer converts the realization layer representation into a hypermedia presentation that will be generated in the form of SMIL. The system architecture is illustrated [24] in Fig. 3.

Fig. 3. Logical diagram of the system

5. Semantic Content Suggestion using the Hidden Markov Model

A Markov process [4] is defined as a process which moves from state to state depending on the previous state of the process. Our objective is to suggest semantically related information objects to authors based on data usage history, without explicitly determining the author types. The Hidden Markov Model, which is an extension to the Markov Model, can make probabilistic assumptions of the hidden states based on observable states. Here the hidden states refer to the author types and the observable states refer to the author data usage history. Thus, the Hidden Markov Model can be used to examine and predict semantically related information objects.

In the proposed system, media items are annotated with the domain ontology concepts: classification (CS), pattern recognition (PR), and series prediction (SP). The Markov model takes into account the number of authors browsing from classification to pattern recognition or classification to series prediction or vice-versa. The closest probabilities between the concepts indicate that the annotated media items are strongly related. The farthest probabilities between the concepts indicate that the media items are vaguely related. Thus, the strong and weak semantic relations between media assets can be predicted and suggested to potential authors. The author’s proficiency (i.e. novice, intermediate, or expert) is the hidden state in this model and is not determined explicitly. The author’s proficiency is not determined since average proficiency may vary depending on the cohort of authors, and annotating media resources based on the author’s proficiency limits the knowledge space for potential authors. In this context, the Markov Model holds its significance since the model predicts the probability based on the previous state and the probability for future events can be determined by extending the model to “n” previous states.

The browsing pattern of authors is given in Table 1 and the calculated transition probabilities are given in Table 2. Based on these values, the authors’ browsing behaviour of concepts can be predicted by the Markov Model. If the calculated probabilities of authors accessing the classification and series prediction concepts are close, it can be predicted that the classification and series prediction concepts are semantically related.

Table 1
Browsing pattern of the authors

Table 2
Transition probabilities

Using the Markov Model, authors’ browsing behaviour of concepts is predicted as 0.234 for classification, 0.483 for pattern recognition, and 0.283 for series prediction. The concept pattern recognition has the highest probability of selection followed by the concepts series prediction, and classification. The nearest probability between the concepts
classification and series prediction implies that they have a strong semantic relation. Moreover, the concepts classification and pattern recognition have a weak semantic relation since their probabilities are far apart. The subsequent predictions based on previous probability results give the probability values as 0.245 for classification, 0.477 for pattern recognition, and 0.278 for series prediction. In this case, it is evident that the probability difference between the concepts classification and series prediction has decreased (from 0.049 to 0.033). This signifies that the strength of semantic relation has increased. Thus, the authoring system suggests semantic contents related with the author’s preferred topic. Moreover, based on predicted semantic relations, additional media items annotated with related concepts can be supplied to the multimedia repository for generating resourceful hypermedia presentations. To make the system effective at the initialization phase, semantic search strategies have to be incorporated for supporting the authors in content selection since the system suggests concepts based on data usage history.

6. Conclusion

This paper describes ongoing research to develop a semi-automatic hypermedia authoring system. The research concentrates on the content collection and content selection phases of authoring, and suggests semantically related information objects to potential authors. The presentation generation process describes the adaptation component in the context of user modeling for hypermedia presentations. The Hidden Markov Model is employed for semantic content suggestion based on data usage history. The architecture of the proposed authoring system complies with the multimodal interactive information presentation architecture and the standard reference architecture, which describes extensively the fundamental authoring stages of intelligent multimedia presentations.

Future work concentrates on a prototype implementation to evaluate the Hidden Markov Model methodology based on user trails. Once the adaptation component is realized, an integrated system can generate hypermedia presentations that would adapt dynamically to authors’ proficiencies.

References

[1] The Resource Description Framework (RDF),
    available at: <http://www.w3.org/RDF>
[2] The Web Ontology Language (OWL), available at:
    <http://www.w3.org/2004/OWL/>
[3] P. Brusilovsky, “Methods and techniques of adaptive
    hypermedia”, User Modeling and User-Adapted
    Interaction, 1996.
[4] K. Seymore, A. McCallum, and R. Rosenfeld, “Learning
    Hidden Markov Model Structure for Information
    Extraction”, AAAI 99 Workshop on Machine Learning
    for Information Extraction, 1999.
[5] R. Taib, N. Ruiz, “Multimodal interaction styles for
    hypermedia adaptation”, Proceedings of the 11th
    international conference on Intelligent user interfaces,
    2006.
[6] M. Dalal, S. Feiner, K. McKeown, S. Pan, M. Zhou, T.
    Hollerer, J. Shaw, Y. Feng, and J. Fromer, “Negotiation
    for automated generation of temporal multimedia
    presentations”, ACM Multimedia, 1996.
[7] E. Andre, J. Muller, and T. Rist, “WIP/PPP: Automatic
    generation of personalized multimedia presentations”,
    In Proc. of the 4th ACM Int. Multimedia Conference
    (Multimedia'96), 1996.
[8] L. Rutledge, M. Alberink, R. Brussee, S. Pokraev, W.
    Van Dieten, and M. Veenstra, “Finding the Story —
    Broader Applicability of Semantics and Discourse for
    Hypermedia Generation”, In Proceedings of the 14th
    ACM Conference on Hypertext and Hypermedia, pp.
    67–76, 2003.
[9] Rijksmuseum Amsterdam, a museum of Dutch art and
    history, available at: <http://www.rijksmuseum.nl>
[10] S. Little, J. Geurts, and J. Hunter, “Dynamic Generation
     of Intelligent Multimedia Presentations through
     Semantic Inferencing”, In 6th European Conference on
     Research and Advanced Technology for Digital
     Libraries, pp. 158–189, 2002.
[11] Dublin Core Metadata Element Set Version 1.1 (DC),
     Reference Description, Dublin Core Metadata Initiative,
     July 1999, available at:
     <http://dublincore.org/documents/1999/07/02/dces>
[12] Synchronized Multimedia Integration Language
     (SMIL), available at: <http://www.w3.org/TR/REC-smil>
[13] S. Kim, H. Alani, W. Hall, P.H. Lewis, D.E. Millard, N.
     Shadbolt, and M.J. Weal, “Artequakt: Generating
     Tailored Biographies with Automatically Annotated
     Fragments from the Web”, Workshop on Semantic
     Authoring, Annotation & Knowledge Markup, 15th
     European Conf. on Artificial Intelligence (ECAI),
     pp. 1–6, 2002.
[14] J. Geurts, S. Bocconi, J. van Ossenbruggen, and L.
     Hardman, “Towards Ontology-driven Discourse: From
     Semantic Graphs to Multimedia Presentations”, In
     Second International Semantic Web Conference
     (ISWC2003), pp. 597–612, 2003.
[15] K. Falkovych, and S. Bocconi, “Creating a Semantic-
     based Discourse Model for Hypermedia Presentations:
     (Un)discovered Problems”, In Workshop on
     Narrative, Musical, Cinematic and Gaming
     Hyperstructure, 2005.
[16] Protégé, an ontology editor and knowledge-base
     framework, available at:
     <http://protege.stanford.edu>
[17] Resource Description Framework (RDF) Schema,
     available at: <http://www.w3.org/TR/rdf-schema>
[18] P.M. Barnaghi, and S.A. Kareem, “Ontology-
     Based Multimedia Presentation Generation”, IEEE
     TENCON 2005 Conference, 2005.
[19] Learning Object Metadata (LOM), available at:
     <http://ltsc.ieee.org/wg12/20020612-Final-LOM-
     Draft.html>
[20] B.S. Manjunath, P. Salembier, and T. Sikora,
     “Introduction to MPEG-7: Multimedia Content
     Description Interface”, John Wiley, 2002.
[21] Jena, a semantic web framework for Java, available
     at: <http://jena.sourceforge.net>
[22] H. Bunt, M. Kipp, M. Maybury and W. Wahlster,
     “Fusion and Coordination for Multimodal
     Interactive Information Presentation”, In: O. Stock
     and M. Zancanaro (eds), Multimodal Intelligent
     Information Presentation, Springer, 2005.
[23] M. Bordegoni, G. Faconti, M.T. Maybury, T. Rist,
     S. Ruggieri, P. Trahanias, and M. Wilson, “A
     Standard Reference Model for Intelligent
     Multimedia Presentation Systems”, Proceedings of
     IJCAI-97 Workshop on Intelligent Multimodal
     Systems, International Joint Conferences on
     Artificial Intelligence Inc., 1997.
[24] Web resources, available at: www.w3.org,
     gbn.glenbrook.k12.il.us, www.cdc.gov, www.inns.org,
     www.uni-duisburg.de.
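
Supplementary sketch for Section 5. The concept-suggestion step described there (browsing counts, transition probabilities, successive predictions, and comparison of probability gaps) might be approximated as follows. This is only a minimal illustration of a first-order Markov chain, not the authors' implementation and not a full Hidden Markov Model: the author's proficiency is simply left unmodelled, as in the paper. The browsing counts, the uniform starting distribution, and the function names (transition_matrix, step, closest_pair) are all assumptions introduced for this example; the counts are not the values of Table 1 or Table 2.

```python
"""Sketch of the first-order Markov prediction described in Section 5.

All counts below are hypothetical; they are NOT the values of Table 1 or 2.
"""
from typing import List, Sequence, Tuple

# classification, pattern recognition, series prediction
CONCEPTS = ["CS", "PR", "SP"]


def transition_matrix(counts: Sequence[Sequence[int]]) -> List[List[float]]:
    """Row-normalise browsing counts: entry [i][j] is P(next = j | current = i)."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("every concept needs at least one outgoing transition")
        matrix.append([c / total for c in row])
    return matrix


def step(dist: Sequence[float], matrix: Sequence[Sequence[float]]) -> List[float]:
    """Propagate a probability distribution over concepts by one browsing step."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def closest_pair(dist: Sequence[float], labels: Sequence[str]) -> Tuple[str, str, float]:
    """Return the two concepts whose probabilities are nearest, with their gap."""
    pairs = [
        (abs(dist[i] - dist[j]), labels[i], labels[j])
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
    ]
    gap, a, b = min(pairs)
    return a, b, gap


# Hypothetical number of authors moving from the row concept to the column concept.
counts = [
    [0, 2, 8],  # from CS
    [6, 0, 4],  # from PR
    [5, 5, 0],  # from SP
]
P = transition_matrix(counts)
start = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform starting distribution
first = step(start, P)         # first prediction
second = step(first, P)        # prediction based on the previous result

if __name__ == "__main__":
    for name, dist in (("first", first), ("second", second)):
        a, b, gap = closest_pair(dist, CONCEPTS)
        print(name, [round(p, 3) for p in dist], "closest:", a, b, round(gap, 3))
```

Under these assumed counts, the pair with the smallest probability gap would be treated as the most strongly related, mirroring the reading of the predicted values in Section 5. Repeating step corresponds to the "subsequent predictions based on previous probability results".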