Semantics-based Dynamic Hypermedia Adaptation using the Hidden Markov Model

Jayan C Kurian, Payam M. Barnaghi, Michael Ian Hartley
School of Computer Science and Information Technology, The University of Nottingham (Malaysia Campus), Jalan Broga, 43500 Semenyih, Selangor Darul Ehsan, Malaysia.
eyx5jkc@nottingham.edu.my

Abstract - Information collection, selection, structuring, and presentation design are the core considerations for general hypermedia presentation generation systems. The content collection process can be enhanced by retrieving semantically related information objects, relevant to the topic selected by an author. Once relevant information objects are available, the content selection process suggests semantically related resources for the author's selection based on data usage history. The information objects are represented by media assets and descriptive documents. The semantic web technology that allows resource interoperability can be used for content description and interpretation of these information objects. By utilizing a semi-automatic approach, authors can be assisted at different stages of the presentation generation process. In this research, adaptation constraints are established independent of the author's proficiency (i.e. novice, intermediate or expert) by applying the Hidden Markov Model methodology. Semantically related media objects are suggested to authors for selection based on their interactive behaviour and the strength of semantic relations. The paper describes an application of the Hidden Markov Model in the initial authoring phases of semi-automatic hypermedia presentation systems.

Keywords: Hypermedia Presentation Generation, Semantic Web, Semi-Automatic Authoring, Hidden Markov Model, Adaptation.

1. Introduction

Multimedia authoring provides an effective way to communicate the goals of a presentation coherently. Adaptation is one of the features in user modeling that will support customization of semantic search strategies based on the author requirements. The approach towards semi-automatic authoring in a semantic environment will facilitate different types of authors in getting through the presentation generation process, starting from initial exploration of a domain to the final presentation. For semantic-based authoring, semantic web document representation standards [1],[2] are used to represent media assets in a machine accessible form. In this context, relations between domain conceptual structures are explicitly defined by an ontology that makes contents of multimedia objects accessible through a rich metadata model.

The functionality of hypermedia systems can be enhanced by making them personalized or adaptable [3] to the author's requirement. In our proposed system, we describe adaptation (i.e. content suggestion) in the content selection context, once the author identifies her/his topic of interest. For adaptation, we employ the Hidden Markov Model [4], a statistical model that can determine hidden states from observable parameters. It is our belief that the proposed adaptation model supports authors in generating adaptable hypermedia presentations.

The paper is organized as follows. The next section describes multimedia systems for presentation generation and section 3 describes knowledge representation and the data model. The system architecture is described in section 4. Section 5 introduces the Hidden Markov Model for semantic content suggestion and section 6 concludes the paper and discusses future work.

2. Multimedia Presentation Generation

We investigate and describe systems that employ adaptation, temporal constraints, and automated presentation design in multimedia presentations. Then, we briefly introduce several authoring systems that use semantic web technology as a means for generating presentation contents, with emphasis on the semi-automatic authoring process.

The various interaction styles used for hypermedia navigation and an adaptive web interface that generates semantically related multimodal output are described by Taib et al. [5]. The multimodalities (i.e. written text, graphics, and speech) are classified by output modality classification methodologies. The users are classified into predefined profiles (e.g. text profile or multimedia profile) depending on their interactive behaviour while progressing through the authoring process. Predefined presentation templates are used for output generation that adapts progressively according to the interaction styles of users. Thus, the paper describes an approach to classify users into interaction style profiles and highlights the need for adaptation in hypermedia systems.

The work of Dalal et al. [6] introduces a knowledge-based system that generates customized temporal multimedia presentations. The order and duration of information objects are represented using temporal constraints and are achieved through a negotiation process at run-time. The adaptation process employed here is tailored for different caregivers in a medical domain. An instance hierarchy creates the knowledge representation structure for each patient using domain and concept hierarchies. The presentation structure, represented as a directed acyclic graph, exchanges information among various system components and expresses the communicative goals to be adapted to different caregivers. The media coordinator makes a consistent and synchronized presentation by allowing media-specific components to access and update the presentation plan. The inconsistencies in the presentation plan are resolved using a constraint solver. The paper introduces user modeling concepts and signifies the importance of temporal and spatial constraints in the context of synchronized multimedia presentation systems.

Andre et al. [7] describe an approach towards fully automated presentation design in the context of personalized multimedia presentation generation. The paper discusses prototype systems (e.g. WIP and PPP) that produce presentations based on a given set of presentation parameters and by considering temporal coordination of different media items. The structure of coherent media items is described using the generalization of speech act theory, and the rhetorical structure theory, for communication between multiple media parts of heterogeneous media objects. The presentation structure is generated by utilizing the knowledge-base components. The presentation strategies select the relevant content, and structure it for delivering through an appropriate medium for target consumers. The qualitative and quantitative constraints are taken into consideration for building up a temporal constraint network for presentation acts and the temporal coordination that facilitates presentation generation. From the paper, we realize the need for considering the presentation parameters, and the temporal constraints of media objects, for an efficient adaptive system.

Topia [8] uses an RDF multimedia repository of the Rijksmuseum collection [9] and creates hypermedia presentations as a result of a query. The SemInf system [10] semi-automatically generates multimedia presentations by combining semantic inferencing with multimedia presentation tools. In this context, Dublin Core (DC) [11] metadata and SMIL [12] presentation formats are used in generating multimedia presentations. The Artequakt [13] project generates artists' biographies by applying semantic associations between different entities that represent the artist's personal and professional life. The aim of the DISC [14] system is to build a multimedia presentation about a certain topic by traversing a semantic graph that consists of the domain ontology of classes, instances, and relations between them, together with the media resources related to instances.

A hypermedia presentation generation system in a multi-facet environment is described in SampLe [15]. The system uses semantic web technologies to support authors during the presentation generation process. The process is divided into four phases: topic identification, discourse structure building, media material collection, and production of final-form presentation. SampLe supports authors during every phase of the process, independent of a particular workflow. This is achieved using ontology-based and context oriented information, as well as semantic interrelationships between different types of meta-data.

3. Knowledge Representation and Data Model

For knowledge representation, we employ an ontology that defines a common vocabulary for machine-interpretable definitions of common concepts in a domain and relations among them. The domain ontology gives information related to domain concepts. The media ontology gives media specific information and the discourse ontology narratively structures presentation contents. Protégé [16], an ontology editor tool, is used to develop the knowledge-base for the domain ontology. This is represented in RDF(S) [17]. A simplified version of the Neural Network domain ontology created for the proposed system is illustrated in Fig. 1.

A data model represents the basic guidelines for annotating various media items. The data model adopted from [18] has the components content schema, semantic schema, and media schema, which together describe multimedia objects. The content schema is represented by Dublin Core attributes (e.g. title, identifier), the semantic schema is represented by Learning Object Metadata (LOM) [19] attributes (e.g. language, level), and the media schema is represented by MPEG-7 [20] attributes (e.g. media type, media URI).

Fig.1. Neural Network domain ontology

The described data model has been chosen since the representation describes media contents, semantics and the media attributes effectively. The metadata attributes for media resources are generated manually. The data model allows the representation of external media objects already annotated with Dublin Core, Learning Object Metadata, and MPEG-7 attributes. The ontology and the data model make the knowledge representation structure independent from the data representation structure.

In the data model, the metadata attribute "language" specifies the adaptation component for the presentations to be customized according to language specifications. The metadata attribute "level" specifies the content proficiency to provide customizable presentations. To describe the continuous media objects, the MPEG-7 standard is chosen, which represents the spatial and temporal aspects of multimedia objects. MPEG-7 Multimedia Description Schemes can effectively describe multimedia entities. MPEG-7 Visual Description Tools describe the visual features (e.g. color, motion). MPEG-7 Audio provides the standard for describing the audio contents (e.g. sound recognition). The data model enables us to represent content specific information (e.g. who, what), media specific information (e.g. size, height), and semantic specific information (e.g. when, how) of multimedia objects. Thus, the data model represents media dependent features of multimedia resources efficiently for enhanced data selection.

4. Adaptable Authoring System Architecture

In the proposed system, the domain ontology is represented in RDF/XML, and the media objects are stored in a database or come from heterogeneous resources. Jena [21], an open source Java based RDF repository and reasoning engine, is used to query the RDF/XML knowledge-base. We adopt the extensive architecture proposed by Bunt et al. [22] that describes multimodal interaction and coordination for information presentation. The architecture identifies the functional and technical requirements for intelligent multimodal systems. The main components of the architecture are multimodal input, multimodal integration, and multimodal output. The multimodal input component caters to a mixture of input modalities (e.g. text, audio, and video). The multimodal integration component interacts with various modeling components for adaptation. The content management component interacts with the multimodal integration component to provide appropriate presentation content. The presentation is delivered by the multimodal output component that interacts with the application interface. The logical diagram of the presentation generation process is illustrated in Fig.2.

Fig.2. Logical diagram of the presentation generation process

The content management component has been designed with reference to the standard reference model [23] that describes an implementation-independent view of the processes required for the generation of intelligent multimedia presentations. The conceptual design of the standard reference architecture modularizes the multimedia presentation generation process into five layers: control layer, content layer, design layer, realization layer, and the presentation display layer.

In our methodology, the author selects a theme for the presentation (e.g. Lecture Notes) followed by a title (e.g. Introduction to Neural Network Architecture) that is supported by the discourse ontology. This is represented by the control layer of the standard reference architecture. The author selects relevant concepts by browsing through the domain ontology. Depending on the author's selection, related concepts are suggested by the system based on data usage history. Once the contents are selected, corresponding media items are added and ordered. This is represented by the content layer of the reference architecture. The design layer conveys the presentation layout structure for the media objects. The realization layer integrates displayable media objects with the layout information. The presentation display layer converts the realization layer representation into a hypermedia presentation that will be generated in the form of SMIL. The system architecture is illustrated [24] in Fig. 3.

Fig.3. Logical diagram of the system

5. Semantic Content Suggestion using the Hidden Markov Model

A Markov process [4] is defined as a process which moves from state to state depending on the previous state of the process. Our objective is to suggest semantically related information objects to authors based on data usage history, without explicitly determining the author types. The Hidden Markov Model, which is an extension to the Markov Model, can make probabilistic assumptions of the hidden states based on observable states. Here the hidden states refer to the author types and the observable states refer to the author data usage history. Thus, the Hidden Markov Model can be used to examine and predict semantically related information objects.

In the proposed system, media items are annotated with the domain ontology concepts: classification (CS), pattern recognition (PR), and series prediction (SP). The Markov model takes into account the number of authors browsing from classification to pattern recognition or classification to series prediction or vice-versa. The closest probabilities between the concepts indicate that the annotated media items are strongly related. The farthest probabilities between the concepts indicate that the media items are vaguely related. Thus, the strong and weak semantic relations between media assets can be predicted for suggesting to potential authors. The author's proficiency (i.e. novice, intermediate, or expert) is the hidden state in this model and is not determined explicitly. The author's proficiency is not determined since average proficiency may vary depending on the cohort of authors, and annotating media resources based on the author's proficiency limits the knowledge space for potential authors. In this context, the Markov Model holds its significance since the model predicts the probability based on the previous state, and the probability for future events can be determined by extending the model to "n" previous states.

The browsing pattern of authors is given in Table 1 and the calculated transition probabilities are given in Table 2. Based on these values, the authors' browsing behaviour of concepts can be predicted by the Markov Model. If the calculated probabilities of authors accessing the classification and series prediction concepts are close, it can be predicted that the classification and series prediction concepts are semantically related.

Table 1 Browsing pattern of the authors

Table 2 Transition probabilities

Using the Markov Model, authors' browsing behaviour of concepts is predicted as 0.234 for classification, 0.483 for pattern recognition, and 0.283 for series prediction. The concept pattern recognition has the highest probability of selection, followed by the concepts series prediction and classification. The nearest probability between the concepts classification and series prediction implies that they have a strong semantic relation. Moreover, the concepts classification and pattern recognition have a weak semantic relation since their probabilities are far-off. The subsequent predictions based on previous probability results give the probability values as 0.245 for classification, 0.477 for pattern recognition, and 0.278 for series prediction. In this case, it is evident that the probability difference between the concepts classification and series prediction has reduced. This signifies that the strength of the semantic relation has increased. Thus, the authoring system suggests semantic contents related with the author's preferred topic. Moreover, based on predicted semantic relations, additional media items annotated with related concepts can be supplied to the multimedia repository for generating resourceful hypermedia presentations. To make the system effective at the initialization phase, semantic search strategies have to be incorporated for supporting the authors in content selection, since the system suggests concepts based on data usage history.

6. Conclusion

This paper describes ongoing research to develop a semi-automatic hypermedia authoring system. The research concentrates on the content collection and content selection phases of authoring, and suggests semantically related information objects to potential authors. The presentation generation process describes the adaptation component in the context of user modeling for hypermedia presentations. The Hidden Markov Model is employed for semantic content suggestion based on data usage history. The architecture of the proposed authoring system complies with the multimodal interactive information presentation which describes extensively the fundamental authoring stages of intelligent multimedia presentations.

Future work concentrates on a prototype implementation to evaluate the Hidden Markov Methodology based on user trails. Once the adaptation component is realized, an integrated system can generate hypermedia presentations that would adapt dynamically to authors' proficiencies.

References

[1] The Resource Description Framework (RDF), available at:
[2] The Web Ontology Language (OWL), available at: http://www.w3.org/2004/OWL/
[3] P. Brusilovsky, "Methods and techniques of adaptive hypermedia", User Modeling and User-Adapted Interaction, 1996.
[4] K. Seymore, A. McCallum, and R. Rosenfeld, "Learning Hidden Markov Model Structure for Information Extraction", AAAI 99 Workshop on Machine Learning for Information Extraction, 1999.
[5] R. Taib, N. Ruiz, "Multimodal interaction styles for hypermedia adaptation", Proceedings of the 11th international conference on Intelligent user interfaces, 2006.
[6] M. Dalal, S. Feiner, K. McKeown, S. Pan, M. Zhou, T. Hollerer, J. Shaw, Y. Feng, and J. Fromer, "Negotiation for automated generation of temporal multimedia presentations", ACM Multimedia, 1996.
[7] E. Andre, J. Muller, and T. Rist, "WIP/PPP: Automatic generation of personalized multimedia presentations", In Proc. of the 4th ACM Int. Multimedia Conference (Multimedia'96), 1996.
[8] L. Rutledge, M. Alberink, R. Brussee, S. Pokraev, W. Van Dieten, and M. Veenstra, "Finding the Story: Broader Applicability of Semantics and Discourse for Hypermedia Generation", In Proceedings of the 14th ACM Conference on Hypertext and Hypermedia, pp. 67–76, 2003.
[9] Rijksmuseum Amsterdam, a museum of Dutch art and history, available at:
[10] S. Little, J. Geurts, and J. Hunter, "Dynamic Generation of Intelligent Multimedia Presentations through Semantic Inferencing", In 6th European Conference on Research and Advanced Technology for Digital Libraries, pp. 158–189, 2002.
[11] Dublin Core Metadata Element Set Version 1.1 (DC), Reference Description, Dublin Core Metadata Initiative, July 1999, available at:
[12] Synchronized Multimedia Integration Language (SMIL), available at:
[13] S. Kim, H. Alani, W. Hall, P.H. Lewis, D.E. Millard, N. Shadbolt, and M.J. Weal, "Artequakt: Generating Tailored Biographies with Automatically Annotated Fragments from the Web", Workshop on Semantic Authoring, Annotation & Knowledge Markup, 15th European Conf. on Artificial Intelligence (ECAI), pp. 1–6, 2002.
[14] J. Geurts, S. Bocconi, J. van Ossenbruggen, and L. Hardman, "Towards Ontology-driven Discourse: From Semantic Graphs to Multimedia Presentations", In Second International Semantic Web Conference (ISWC2003), pp. 597–612, 2003.
[15] K. Falkovych, and S. Bocconi, "Creating a Semantic-based Discourse Model for Hypermedia Presentations: (Un)discovered Problems", In Workshop on Narrative, Musical, Cinematic and Gaming Hyperstructure, 2005.
[16] Protégé, an ontology editor and knowledge-base framework, available at:
[17] Resource Description Framework (RDF) Schema, available at:
[18] P.M. Barnaghi, and S.A. Kareem, "Ontology-Based Multimedia Presentation Generation", IEEE TENCON 2005 Conference, 2005.
[19] Learning Object Metadata (LOM), available at:
[20] B.S. Manjunath, P. Salembier, and T. Sikora, "Introduction to MPEG-7: Multimedia Content Description Interface", John Wiley, 2002.
[21] Jena, a semantic web framework for Java, available at:
[22] H. Bunt, M. Kipp, M. Maybury and W. Wahlster, "Fusion and Coordination for Multimodal Interactive Information Presentation", In: O. Stock and M. Zancanaro (eds), Multimodal Intelligent Information Presentation, Springer, 2005.
[23] M. Bordegoni, G. Faconti, M.T. Maybury, T. Rist, S. Ruggieri, P. Trahanias, and M. Wilson, "A Standard Reference Model for Intelligent Multimedia Presentation Systems", Proceedings of IJCAI-97 Workshop on Intelligent Multimodal Systems, International Joint Conferences on Artificial Intelligence Inc., 1997.
[24] Web resources, available at: www.w3.org, gbn.glenbrook.k12.il.us, www.cdc.gov, www.inns.org, www.uni-duisburg.de.
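
The calculation outlined in Section 5 can be sketched roughly as follows. This is a minimal illustration in Python of a plain first-order Markov chain over the three concepts (CS, PR, SP); it does not model the hidden author-proficiency states. The browsing counts are hypothetical placeholders, since the values of Table 1 and Table 2 are not reproduced here, and the function names (transition_matrix, step, closest_pair) are our own for illustration only.

```python
from itertools import combinations

# Concept labels used in Section 5.
CONCEPTS = ["CS", "PR", "SP"]

# HYPOTHETICAL browsing counts (assumption, not the paper's Table 1):
# HYPOTHETICAL_COUNTS[src][dst] = number of authors moving from src to dst.
HYPOTHETICAL_COUNTS = {
    "CS": {"CS": 2, "PR": 6, "SP": 2},
    "PR": {"CS": 3, "PR": 4, "SP": 3},
    "SP": {"CS": 2, "PR": 5, "SP": 3},
}


def transition_matrix(counts):
    """Row-normalise browsing counts into transition probabilities."""
    matrix = {}
    for src, row in counts.items():
        total = sum(row.values())
        matrix[src] = {dst: n / total for dst, n in row.items()}
    return matrix


def step(dist, matrix):
    """One Markov step: next[dst] = sum over src of dist[src] * P[src][dst]."""
    return {
        dst: sum(dist[src] * matrix[src][dst] for src in dist)
        for dst in CONCEPTS
    }


def closest_pair(dist):
    """Return the pair of concepts whose probabilities differ the least."""
    return min(
        combinations(CONCEPTS, 2),
        key=lambda pair: abs(dist[pair[0]] - dist[pair[1]]),
    )


if __name__ == "__main__":
    matrix = transition_matrix(HYPOTHETICAL_COUNTS)
    # Start from a uniform distribution (assumption) and predict twice,
    # mirroring the "subsequent predictions" described in Section 5.
    dist = {concept: 1 / len(CONCEPTS) for concept in CONCEPTS}
    for round_no in (1, 2):
        dist = step(dist, matrix)
        print(round_no, {c: round(p, 3) for c, p in dist.items()})
        print("closest pair:", closest_pair(dist))
```

Under the reading of Section 5, the pair returned by closest_pair would be the candidate for a strong semantic relation, and the concept with the largest probability would be the most likely next selection. Starting from a uniform distribution is a simplification; a real system would presumably start from the concept the author has just selected.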