=Paper=
{{Paper
|id=None
|storemode=property
|title=An Ontology-based Visual Interface for Browsing and Summarizing Conversations
|pdfUrl=https://ceur-ws.org/Vol-694/paper5.pdf
|volume=Vol-694
}}
==An Ontology-based Visual Interface for Browsing and Summarizing Conversations==
Shama Rashid (University of British Columbia, Vancouver, BC, Canada) shama@cs.ubc.ca
Giuseppe Carenini (University of British Columbia, Vancouver, BC, Canada) carenini@cs.ubc.ca
Workshop on Visual Interfaces to the Social and Semantic Web (VISSW2011), co-located with ACM IUI 2011, Feb 13, 2011, Palo Alto, US. Copyright is held by the author/owner(s).

ABSTRACT
In this paper we present a visual interactive interface for creating focused summaries of human conversations via a mapping to the concepts within an ontology. The ontology includes nodes for the conversation participants, for Dialog Act (DA) properties such as decision, action-item or subjectivity, as well as for entities mentioned in the conversation. The classifiers used to annotate conversation data with DA property and entity tags can be applied to any conversational modality, including face-to-face meetings, emails, blogs and chats. Our interface allows the user to explore these conversations and identify informative sentences by their association with nodes of interest in the tree-structured visual representation of the ontology. The sentences thus selected by the user as potentially important components of the summary can then be used to derive a brief and focused overview of the conversation. The display interface and data parsing components in the initial prototype were all developed with Java frameworks and toolkits.

INTRODUCTION
Multimodal conversations have become an integral part of our everyday communication with others. We email for business and personal purposes, attend meetings in person and remotely, chat online, and participate in blog or forum discussions. It is clear that automatic summarization can be of benefit in dealing with this overwhelming amount of interactional information. Automatic meeting abstracts would allow us to prepare for an upcoming meeting or review the decisions of a previous group. Email summaries would aid corporate memory and provide efficient indices into large mail folders. Summaries of technical blogs could become an important support platform for developers, administrators, and technology enthusiasts in general.

Summarization of human conversations has been addressed in the past for different modes of conversation, including meetings [7], emails [23, 3], telephone conversations [32] and internet relay chats [31]. In all these previous works the dominant approach to summarization has been extractive, which means that the summary is generated by selecting and concatenating the most informative sentences from the source document(s). Extractive summarization has been popular at least in part because it can be framed as a binary classification task that lends itself well to machine learning techniques, and does not require a natural language generation component. Extrinsic evaluations have also shown that, while extractive summaries may be less coherent than human abstracts, users still find them to be valuable tools for browsing documents [9, 15, 19]. However, these same evaluations also indicate that concise abstracts are generally preferred by users and lead to higher objective task scores. The limitation of a cut-and-paste summary is that the end-users do not know why the selected sentences are important; this can often only be discerned by exploring the context in which each sentence originally appeared. One possible improvement is to create structured summaries that represent an increased level of abstraction, where selected sentences are grouped according to the entities they mention as well as to phenomena such as decisions, action items and subjectivity, thereby giving the users more information on why the sentences are being selected. For example, the sentence "Let's go with a simple layout" is about a simple layout and represents both a decision and the expression of a positive subjective statement.

Our first attempt to build an interface for creating visual structured summaries of conversations was presented in [2]. This interface relied on mapping the utterances of the conversation into an ontology, similar to the faceted browsers in [30, 6], which could then be used to search the conversation according to the annotation. Our ontology initially contained only the participants of the conversation and properties of the utterance, such as whether it was expressing a decision, a subjective statement, etc. Our first prototype comprised two panels (see Fig 1). The right panel displayed the ontology, while the left panel displayed the whole conversation, with the sentences temporally ordered. Given the information shown in the two panels, the users could generate visual, structured summaries by selecting nodes in the ontology. As a result, the sentences that were mapped to the selected nodes would be highlighted.

Figure 1. Initial prototype for generating visually structured summaries of human conversations, as presented in [2]

In this paper we present a novel interface (see Fig 2) that addresses several limitations of our initial prototype. First, we have extended the ontology to also include entities mentioned in the conversation.
Searching the conversation using a particular keyword is suitable only when users already have an idea about the content and want additional information on a particular entity. Our assumption is that representing entities in the ontology tree will not only enable the users to perform a more refined search and browsing of the conversation, but that the entities will also provide them with a quick overview of the content of the whole conversation. Secondly, we have provided a satisfactory solution to the problem of highlighting the sentences mapped to the nodes selected by the users in the ontology. Instead of using color (a non-scalable solution that we initially explored), we have added a column to the left of the interface layout, in which the selected mappings of each utterance to the knowledge concepts within the ontology can be displayed. The third extension, the summary view (discussed in detail in the Display Design section), is the most critical one, as it opens the door to a possibly highly beneficial integration of structured visual (extractive) summaries and abstractive focused summaries. Our hypothesis is that the users, after they have inspected the conversation through the mapping to the ontology, may wish to generate summaries covering only some aspects of the conversation (those which are especially relevant to their current information needs). For instance, users may need a summary of all the positive and negative comments that were expressed in the conversation about two particular entities (e.g., new design and interface layout). The new interface allows the users to trigger the generation of such summaries, which are shown in the bottom panel (see Fig 2). Most importantly, these summaries can be generated either by extraction or by abstraction; in the latter case by applying techniques presented in [18].

We have developed our prototype using instances of meeting conversations from the AMI corpus [4]. We are currently extending the interface to Web-based modes of conversation like emails, blogs and chats, working with the BC3 corpus [27]. As we move to asynchronous conversations, one additional complication is that the conversational structure is no longer linear, but can be a tree or a graph. We are currently exploring how the interface can be extended to deal with such more complex conversational structures. In contrast, the mapping of sentences to the ontology can be easily transferred from meetings to other conversational modes, as it relies on very general methods. The
mapping is performed by first identifying all the entities referred to in the conversation (via syntactic parsing), and then by utilizing classifiers relating to a variety of sentence-level phenomena such as decisions and subjective sentences. High classification accuracy is achieved by using a very large feature set integrating conversation structure, lexical patterns, part-of-speech (POS) tags and character n-grams.

In this paper we will first describe related work. Then we will present the process of deriving an ontology and mapping sentences to it. After that, we shall discuss our interface for displaying the conversation to the users for interactive exploration of the data with the help of the ontology.
RELATED WORK
In HCI and NLP different approaches have been proposed to support the browsing and summarization of data/documents with the aid of an interactive interface. Here, we focus on the ones that are most relevant to our current and future work.

The idea of using an ontology to explore data in an orderly manner is not novel. For instance, the Flamenco [30] and Mambo [6] systems make use of hierarchical faceted metadata for browsing through image or music collections. In our approach we adopt similar techniques to support the exploration of conversations. More specifically, in Flamenco [30], while navigating an image collection along conceptual dimensions or facets (e.g. date, theme, artist, media, size, color, material etc.), every facet hyperlink that can be selected to derive a new result set is displayed with a count as an indicator of the number of results to expect, i.e. the count works as a query preview. Similarly, we have included a count beside each node of the ontology to indicate the number of sentences in the conversation that have been mapped to it. Another idea we have borrowed from the Flamenco and Mambo systems is to use summary paths to simplify the user interaction with the ontology. In Flamenco, different paths may lead to a collection of images at a particular time; so Flamenco uses a summary path along the top of the interface to show exactly which path was taken, and uses links along this path to retract to a previous decision along the path. Similarly, the Mambo system provides a breadcrumb-style filter history, which gives an interactive overview of the active facet filters. In our interface, to facilitate the inspection of a possibly large ontology, nodes can be minimized (i.e., their children are hidden). So, it may happen that the set of tags selected by the users is not fully visible. To address this problem, we are working on including a summary of the ontology node selection at the top of our interface, as is done in Flamenco and Mambo.

An extractive approach for generating a decision-focused summary suitable for debriefing tasks has been proposed in [12]. This type of summary includes only the 1-2% of a meeting recording related to decision making. In addition to the transcripts, the interface takes advantage of the audio-video recordings to better understand decision points. While the interface in [12] makes use of only dialog acts for focused summary generation, ours additionally uses speaker and entity information. Furthermore, we are not limited to extractive techniques, as we are also exploring focused summarization by abstraction. The interface proposed in [12] also considers features that are specific to conversations about designing a new product (see AMI corpus [4]), in which there is typically not a single meeting but a series of meetings: the kickoff, the conceptual design, the detailed design, and the evaluation meetings. While we also aim to consider series of related conversations, we intend to do it in a general way, i.e., without being limited to conversations about designing a product.

The Ferret Meeting Browser [28] provides the ability to quickly find and play back a combination of available audio, video, transcript and projected display segments from a meeting, presented side by side for synchronous comparison and inspection, and allows navigation by clicking on a vertical scrollable timeline of the transcript. Users can zoom into particular places of interest by means of a button, and by zooming out they get an overview of the meeting in terms of who talked the most, what the meeting actions were, etc. In the future, we will extend our interface to include an overview of the conversation, integrating ideas from the following projects.

The Meeting Miner [1] aids the browsing of multimodal meetings through recordings of online text and speech collaborative meetings, using timeline navigators of content edits as the main control for browsing. In addition, it can retrieve a set of speech turns spread throughout the conversation focused on particular keywords, which can be selected from a list of automatically generated keywords and topics. The users can also navigate to the audio segments that have been identified as relevant, using the audio timeline for random access of the file. The Meeting Miner [1] automatically identifies a set of potential keywords, and the users can decide to view these in alphabetical order, ranked by term frequency, or simply by time of appearance in the conversation. A similar concept has been discussed in the future work of FacetMap [25], where the authors mention implementing the ability to dynamically order the facets, such as by count, alphabetically by label, by usage, or by some specific facet ordering. The entities in the ontology tree of our interface are equivalent to Meeting Miner's keyword panel entries; we are currently listing the entities in alphabetical order, but a different ordering, e.g. based on the counts, may prove more helpful to the users.

The CALO meeting assistant [26] is used for capturing audio signals and, optionally, handwriting recorded by digital pens for distributed meetings. During the meeting the system automatically transcribes the speech to text, and the participants are fed back a real-time transcript to which annotations can be attached. At the end of the meeting the system performs further semantic analysis on the transcript, such as dialog act segmentation and tagging, topic identification and segmentation, question-answer pair identification, addressee detection, action item recognition, decision extraction and summarization. The result of this analysis is made available via a web-based interface. The off-line meeting browser interface displays the meeting transcript segmented according to dialog acts. Each dialog act is shown alongside its start
time, speaker, and a link for streaming audio feedback for the transcript segment (in case the users want to overcome any speech transcription errors). The CALO browser also provides the users with views of the extractive summary of the meeting and the above-mentioned annotations in separate tabs. A lot of the annotations provided by the CALO system overlap with our segmentation of the transcript and the knowledge concepts represented in the ontology tree, but the CALO browser provides more flexibility by giving the users means to attach their own annotations, which is an interesting direction we could explore in our future prototypes. Our interface differs from CALO by providing a way to focus on the users' particular information need by referring to the ontology, and by providing an option to generate abstractive or extractive summaries.
In iBlogVis [13], the authors use social interaction cues, like comment length, number of comments, regular commenters etc., and content cues, like the topics of a blog, the blogger's posting habits etc., to provide the users with an overview of a blog archive and to support them in deciding which entry to read. The font size of a tag for blog topic representation indicates its popularity, a concept that we shall employ in the future for our textual collage representation of conversation content. iBlogVis uses the idea of read wear [10], a means of graphically portraying a document's readership history, to help users keep track of entries that have been read, have not been read, or the one that is currently being read, using different colors. Similarly, we are currently working to provide users an option to log the current ontology settings so that they can keep track of the combinations tried before.
MostVis [24] uses multiple co-ordinated views for browsing a catalog of multimedia components in a car. Besides the textual label of each node in the catalog's node-link tree representation, there is an additional icon representing the element type (car series, function block, functions, parameters etc.). This is similar to our use of a short string representation or icon beside the ontology tree nodes. MostVis also has a history window with undo and redo buttons, where an entry is logged every time an expansion or minimization of the node-link tree occurs. We are exploring how a similar mechanism could be added to our interface.
MAPPING SENTENCES TO ONTOLOGY
Our summarization method relies on mapping the sentences in a conversation to an ontology written in OWL/RDF (Web Ontology Language/Resource Description Framework), a widely used open standard developed to be compatible with the architecture of the Semantic Web in particular.

Our ontology contains three core upper-level classes: Participant, Dialog Act (DA) types and Entities. When additional information is available about participant roles in a given domain, Participant subclasses can be utilized. For our AMI meeting scenarios the Participant class consists of four subclasses: ProjectManager (PM), IndustrialDesigner (ID), UserInterfaceExpert (UIE) and MarketingExpert (ME). The DA-Type class, on the other hand, contains the subclasses decisions, actions, problems, positive subjective sentences, and negative subjective sentences. (Our classifiers are designed to identify these five subclasses of the DA-Type class, but we could easily include additional classifiers to identify other types of dialog acts according to the information need.) The entities in a conversation are noun phrases with mid-range document frequency. Our ontology is populated with the instance data for the given conversation that the user is attempting to browse and summarize.
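For concreteness, the class hierarchy just described can be assembled with Jena, the Semantic Web framework our prototype is built on (see the Display Design section). The sketch below is illustrative rather than the prototype's actual code: it assumes the current Apache Jena packaging (2011-era releases used the older com.hp.hpl.jena namespace) and a hypothetical namespace URI.

```java
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.rdf.model.ModelFactory;

public class ConversationOntology {

    // Hypothetical namespace; the prototype's actual URIs may differ.
    static final String NS = "http://example.org/conversation#";

    public static OntModel build() {
        OntModel m = ModelFactory.createOntologyModel();

        // Three core upper-level classes.
        OntClass participant = m.createClass(NS + "Participant");
        OntClass daType = m.createClass(NS + "DAType");
        OntClass entity = m.createClass(NS + "Entity"); // populated per conversation

        // AMI participant roles as subclasses of Participant.
        for (String role : new String[] {"ProjectManager", "IndustrialDesigner",
                                         "UserInterfaceExpert", "MarketingExpert"}) {
            participant.addSubClass(m.createClass(NS + role));
        }

        // The five DA-type subclasses targeted by our classifiers.
        for (String da : new String[] {"Decision", "ActionItem", "Problem",
                                       "PositiveSubjective", "NegativeSubjective"}) {
            daType.addSubClass(m.createClass(NS + da));
        }
        return m;
    }
}
```

The instance data for a particular conversation, i.e. its utterances and entities, is then added to this model as individuals of the appropriate classes.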
Our definition of entities is similar to the definition of concept by Xie et al. [29], where n-grams are weighted by tf.idf scores, except that we use noun phrases rather than arbitrary n-grams. Instead of idf we use mid-range document frequency, as in [5], keeping the entities that occur in between 10% and 90% of the documents in the collection. We do not currently attempt coreference resolution for entities; recent work has investigated coreference resolution for multi-party dialogues [16, 8], but the challenge of resolution on such noisy data is highlighted by low accuracy (e.g. an F-measure of 21.21) compared with well-formed text (e.g. monologues).
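Schematically, and with hypothetical names (the actual pipeline first extracts the candidate noun phrases via syntactic parsing), the mid-range document frequency criterion amounts to the following filter:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class EntitySelector {

    /**
     * Keep noun phrases with mid-range document frequency, i.e. phrases
     * occurring in between 10% and 90% of the documents in the collection
     * (the cut-offs used in [5]). docFreq maps each candidate noun phrase
     * to the number of documents it occurs in.
     */
    public static List<String> midRangeEntities(Map<String, Integer> docFreq,
                                                int totalDocs) {
        double lo = 0.10 * totalDocs;
        double hi = 0.90 * totalDocs;
        return docFreq.entrySet().stream()
                .filter(e -> e.getValue() >= lo && e.getValue() <= hi)
                .map(Map.Entry::getKey)
                .sorted() // alphabetical order, as in the current prototype
                .collect(Collectors.toList());
    }
}
```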
Using a number of supervised classifiers trained on labelled decision sentences, action sentences etc., the sentences in the conversation are mapped to our ontology. The classifiers have been evaluated on both meeting and email data (the AMI [4] and BC3 [27] corpora, respectively), as described in [17], and found to perform well on both sets of data. The flexibility of our mapping approach lies in the fact that it relies only on generic conversational features, and can therefore be applied to a multi-modal conversation, for example a conversation that spans both an email thread and a meeting. We use a feature set related to generic conversational structure, which includes sentence length, sentence position in the conversation and in the current turn, pause-style features, lexical cohesion, centroid scores, and features that measure how terms cluster between conversation participants and conversation turns. Despite using generic features, the classifiers achieve results similar to those of [11, 21, 20, 22], which rely on meeting-specific or email-specific features (e.g., prosody for meetings). Details can be found in [2].

For the AMI corpus, a particular conversation for a meeting consists of utterances stored as XML records; each record carries a unique identifier, the speaker's role, its DA-type annotations, and its start and end times (21.66 and 22.74 for the example utterance considered here). This example utterance is a negative subjective statement, i.e., a negative comment made by the ProjectManager at the meeting. The beginning time of an utterance is used to temporally order the whole conversation, and the unique identifier of the Utterance object is used to match the utterance with the actual sentence being said, and thus with any relevant entities.
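The corpus' exact XML schema is not reproduced here; the following sketch, with illustrative field names, lists just the information in an utterance record that our mapping relies on:

```java
import java.util.List;

/**
 * Schematic form of one AMI utterance record. Field names are
 * illustrative; they mirror the information carried by the corpus'
 * XML elements rather than its actual schema.
 */
public class Utterance {
    String id;            // unique identifier; matches the utterance with
                          // the actual sentence said, and thus its entities
    String speaker;       // participant role, e.g. "ProjectManager"
    List<String> daTypes; // DA-type tags, e.g. "NegativeSubjective"
    double startTime;     // e.g. 21.66; used to order the conversation
    double endTime;       // e.g. 22.74
    String text;          // the sentence itself, from the transcript
}
```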
Figure 2. A refinement of the visual interface in Fig 1, intended for both browsing and summarizing conversations and consisting of three integrated views: the Ontology View (right), the Transcript View (middle) and the Summary View (bottom)
DISPLAY DESIGN
Once the ontology is populated with the participants, DA types and entities of a particular conversation, the transcript of the conversation is displayed, ordered temporally. The design of the interface is intended to satisfy two key goals. The first goal is to support the exploration of the conversation through the annotation of the discourse with the ontology. This is achieved by allowing the users to select subclasses from the ontology that seem promising for fulfilling their particular information needs, and by allowing them to inspect the sentences associated with those subclasses in the context of the whole transcript. The second goal is to support the generation of focused summaries that cover only those aspects of the conversation which are especially relevant to the users. This is achieved by allowing the users to select classes of sentences that they find particularly informative and that should be included in the summary (included verbatim for an extractive summary vs. having their content included for an abstractive one).

In this section we discuss in more detail how the achievement of these two key goals is supported by our novel display, shown in Figure 2. Our visual interface consists of three integrated views: the Ontology View (right), the Transcript View (middle) and the Summary View (bottom). Contrast this with the simpler interface presented in [2] (see Fig 1). Our interface does not feature audio-video data streams in addition to transcripts, as Meeting Miner [1] and Ferret [28] do, because we have designed it to explore and summarize multi-modal conversations in general. The prototype was developed using Java Swing components and Jena, an open-source Java framework that provides a programmatic environment for building semantic web applications.

The Ontology View
The ontology view provides a structured way for the users to explore all the relevant concepts in the conversation and their relations. It contains a tree hierarchy with core nodes Speaker, DAType (Dialogue Act Type), and Entity. Conceptually, the top node in the ontology tree represents all the utterances or sentences in the conversation, while any other node represents a subset or subclass of those sentences that satisfies a particular property. For instance, the node ProjectManager (PM) represents all the sentences uttered by the PM, while the node ActionItem represents all the utterances that were classified as containing an action item. The Entities core node, on the other hand, does not represent all the noun phrases detected in the conversation, but only the ones deemed important on the basis of their frequency and the ones that are associated with messages in the conversation.

As shown in Fig. 2, as in [2], the nodes each have a check box and a label. Additionally, we have included a count within parentheses beside each label. For a leaf node, the count indicates how many sentences were mapped to that node, implying its relevance for the summary; for a non-leaf node, the count is the sum of the counts of all its descendant leaf nodes. For Speaker subtree nodes the sets are mutually exclusive, and these counts give a sense of how dominant a role was in this particular meeting. For the Entity and DAType subtrees, the sets can overlap, and the counts at the non-leaf core nodes indicate the extent of overlap. Our initial prototype displays the entities in alphabetical order, but we could use the counts to order the entities in the ontology tree as an indicator of their relevance.
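A minimal sketch of this count computation, assuming a simple node-link tree with sentence indices stored at the leaves (hypothetical names, not the prototype's data structures):

```java
import java.util.List;

public class OntologyNode {
    String label;
    List<OntologyNode> children;
    List<Integer> mappedSentences; // sentence indices; leaf nodes only

    /**
     * The count shown in parentheses beside the node label: for a leaf,
     * the number of sentences mapped to it; for a non-leaf, the sum over
     * its descendant leaves. Because a sentence may map to several DAType
     * or Entity leaves, a non-leaf count can exceed the number of distinct
     * sentences below it; the excess indicates the extent of overlap.
     */
    int count() {
        if (children == null || children.isEmpty()) {
            return mappedSentences == null ? 0 : mappedSentences.size();
        }
        int sum = 0;
        for (OntologyNode c : children) {
            sum += c.count();
        }
        return sum;
    }
}
```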
The Transcript View
The transcript view is designed to allow the users to inspect the whole conversation as well as the mapping of each sentence into the ontology. This view has two columns: Transcript and Tags. The Transcript column displays the whole conversation, one sentence per row, while keywords and icons for the ontology nodes each sentence was mapped to are shown in the corresponding Tags column (to the left of the Transcript column) in the case of selected nodes under the Speaker and DAType core nodes, or highlighted in the Transcript column in the case of nodes under the Entity core node (refer to the Interaction Design section for a user scenario). We have decided to display the entities highlighted in the transcript, instead of mentioning them in the Tags column, so that the users can inspect them in their actual context. Also, adding a number of long noun phrases to the Tags column would have widened that particular column, making it difficult for the users to inspect both the Tags and Transcript columns at the same time.
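The routing of a toggled node to either the Tags column or in-transcript highlighting can be sketched as follows (illustrative names only; the actual prototype implements this on top of Java Swing components):

```java
import java.util.Set;

public class TagColumnUpdater {

    /**
     * Compute the Tags-column content of one transcript row after an
     * ontology node is toggled. Speaker and DAType nodes contribute a
     * keyword or icon to the Tags column; Entity nodes are instead
     * highlighted in the transcript text itself, so the user sees them
     * in their actual context.
     */
    public static String tagsFor(Set<String> selectedNodes, Utterance u) {
        StringBuilder tags = new StringBuilder();
        for (String node : selectedNodes) {
            if (u.daTypes.contains(node) || node.equals(u.speaker)) {
                tags.append(iconFor(node)).append(' ');
            }
            // Entity nodes: handled by highlighting occurrences in the
            // Transcript column, not by adding to the Tags column.
        }
        return tags.toString().trim();
    }

    static String iconFor(String node) {
        switch (node) {
            case "PositiveSubjective": return "+";
            case "NegativeSubjective": return "-";
            case "ProjectManager":     return "PM";
            default:                   return node.length() > 2
                                              ? node.substring(0, 2) : node;
        }
    }
}
```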
The Transcript View is scrollable both vertically and horizontally, which can be used to inspect a sentence in its context, i.e. its position in the conversation. A sentence may convey additional information in conjunction with its surrounding sentences. For example, when users inspect the sentence 'That's it, you just put it on the board.', which mentions the entity 'board', in its context, they may decide to include the entity 'pen' for further investigation, since the 'it' in the sentence refers to the 'pen' that appeared in a preceding sentence.
The Summary View
The summary view is a text area where the candidate summary of the conversation appears for user assessment. The summary is based on sentences selected from the transcript using the criteria set from the ontology tree, and is generated using either extraction or abstraction, as in [18]. The summary view provides an easier way to assess the conversation with respect to a particular information need without scrolling through the whole transcript. When the summary is extractive, to support the users in interpreting the summary in the context of the whole transcript, each sentence in the summary view is prefixed with a keyword indicating the speaker of the sentence. We are currently exploring how to provide similar support for abstractive summaries.
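For the extractive case, this prefixing step is simple; a sketch, reusing the hypothetical Utterance record from the Mapping section:

```java
import java.util.List;
import java.util.stream.Collectors;

public class SummaryRenderer {

    /**
     * Extractive case: each selected sentence is included verbatim,
     * prefixed with a keyword indicating its speaker, so the user can
     * interpret the summary in the context of the whole transcript.
     */
    public static String extractiveSummary(List<Utterance> selected) {
        return selected.stream()
                .map(u -> u.speaker + ": " + u.text)
                .collect(Collectors.joining("\n"));
    }
}
```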
INTERACTION DESIGN
The list of entities gives the users an idea of the conversation content without requiring them to browse the whole transcript. Furthermore, by selecting a few entities from the list, the users can satisfy particular information needs about the direction the conversation took regarding those particular entities. For instance, a user may be interested in all the comments made by the ProjectManager on the 'board' and whether these comments were positive or negative. To achieve this goal, the user would select the node 'board' under the Entity sub-tree, the node 'ProjectManager' under the Speaker core node, and the 'PositiveSubjective' and 'NegativeSubjective' nodes under the DAType core node. As shown in figure 2, this would display the keywords 'PM', '+' and '-' in the Tags column of the sentences that map to each of these nodes, and would also highlight every occurrence of the word 'board', giving the user scope for closer inspection by scrolling through the transcript.

When the transcript view is generated for a conversation, the Tags column is initially empty, and all the nodes in the ontology tree in the ontology view are shown and de-selected. If for a particular conversation the ontology is too large, the users can expand or minimize the nodes they are or are not interested in, as in any standard outline-based interface. Once the users select a node (or de-select an already selected node) in the ontology tree, the keyword or icon associated with that node appears in (or disappears from) the Tags column of all the rows containing sentences that map to that particular node. Once the users have selected the nodes of interest from the ontology tree, they can scroll through the transcript view and select sentences that appear to be promising for generating a focused summary. When they choose a sentence, all sentences that have the exact same associated tag set are detected, and the summary view is updated with the summary generated from these sentences. These sentences are included verbatim for an extractive summary, while an abstractive summary will summarize their content. The generated summary is also interactive: in the extractive case, clicking on a sentence in the summary view highlights it and also re-focuses the vertical scroll bar position of the transcript view to show the context of that particular sentence. For the abstractive case, we are still investigating ways to provide similar functionality, which needs to take into account that there is no one-to-one mapping between the sentences in the summary and the ones in the transcript.
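The selection logic at the core of this interaction can be sketched as follows, assuming each transcript sentence carries the set of currently selected ontology nodes it maps to (names are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class SummaryTrigger {

    /**
     * When the user chooses a sentence in the transcript view, collect all
     * sentences whose tag sets (restricted to the selected ontology nodes)
     * exactly equal the chosen sentence's tag set; these become the input
     * to the extractive or abstractive summarizer. tagSets maps each
     * sentence index to the set of selected nodes it is tagged with.
     */
    public static List<Integer> sameTagSet(int chosen,
                                           Map<Integer, Set<String>> tagSets) {
        Set<String> target = tagSets.get(chosen);
        return tagSets.entrySet().stream()
                .filter(e -> e.getValue().equals(target))
                .map(Map.Entry::getKey)
                .sorted() // keep transcript (temporal) order
                .collect(Collectors.toList());
    }
}
```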
CONCLUSION AND FUTURE WORK
This paper presents a visual interface that not only allows users to explore a conversation using a mapping to an ontology, but also allows them to interactively generate focused summaries of human conversation. The classifiers for the mapping phase do not depend on features specific to any particular mode of conversation, which makes this approach extensible to multi-modal conversations. We are currently working to extend our interface to asynchronous conversations like emails and blogs, which entails the additional complication of handling a non-linear conversational structure (i.e., a tree or a graph). In the future, we plan to investigate coreference resolution for the noun phrases in the conversation, which would also deal with synonymy of entities.

Our interface supports the generation of focused summaries by allowing the users to select classes of sentences that they find particularly informative and that should be included in the summary. If these sentences are included verbatim, we generate extractive summaries, whereas if the content of those sentences is extracted and aggregated, we generate abstract summaries. There is evidence that human abstractors at times use sentences from the source documents nearly verbatim in their own summaries, justifying this approach to some extent [14]. However, other studies also show that users usually prefer concise abstracts and find them more coherent [9, 15, 19]. In our interface, we can generate both types of summaries, so it represents an ideal environment in which to explore the pros and cons of these two methods and the possible benefits of their integration.

To assist users in deciding on the informativeness of a set of sentences chosen according to a criteria set on the ontology tree, in the future we will also provide users with a task history. Depending on the size of the Entities subtree, it may be prohibitively time-consuming for the users to recreate a previously examined criterion by reselecting all of the relevant nodes, since the users would have to recall and find the exact nodes selected before and re-select them. A task history may record such promising criteria so that users can re-assess them later with a single selection from this history view.

Our current prototype only shows a local view of the conversation, with fewer than fifty sentences. We plan to provide a second visualization that shows a global view of the entire conversation, and possibly of the entire corpus of conversations (as in [13]). A possible approach could be to display a textual collage, like tag clouds or word clouds, in an overview window of the list of entities, to give users a sense of the content of the conversation or corpus transcript without browsing. We could also show a representation of speaker participation information as a vertical scrollable timeline, as in [28].

Finally, before engaging in a second redesign exercise, we plan to run a formative user evaluation of the interface using objective task scores.
Document Summarizer. In Proc. of the 18th Annual
REFERENCES International ACM SIGIR Conference on Research and
1. M.-M. Bouamrane and S. Laz. Navigating Multimodal Development in Information Retrieval. Seattle,
Meeting Recordings with the Meeting Miner. In Proc. Washington, USA, pages 68–73, 1995.
of FQAS, Milan, Italy, 2006.
15. K. McKeown, J. Hirschberg, M. Galley, and S. Maskey.
2. G. Carenini and G. Murray. Visual Structured From Text to Speech Summarization. In Proc. of
Summaries of Human Conversations. In Proc. of IVITA ICASSP 2005, Philadelphia, USA, pages 997–1000,
2010, HongKong, China, pages 41–44, 2010. 2005.
3. G. Carenini, R. Ng, and X. Zhou. Summarizing Email 16. C. Muller. Resolving It, This and That in Unrestricted
Conversations with Clue Words. In Proc. of ACM Multi-Party Dialog. In Proc. of ACL 2007, Prague,
WWW 2007, Banff, Canada, 2007. Czech Republic, 2007.
4. J. Carletta, S. Ashby, S. Bourban, M. Flynn, 17. G. Murray and G. Carenini. Interpretation and
M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, Transformation for Abstracting Conversations. In
W. Kraaij, M. Kronenthal, G. Lathoud, M. Lincoln, North American ACL, Los Angeles, CA, USA, 2010.
A. Lisowska, I. McCowan, W. Post, D. Reidsma, and
P. Wellner. The AMI Meeting Corpus: A 18. G. Murray, G. Carenini, and R. Ng. Generating
Pre-Announcement. In Proc. of MLMI 2005, Abstracts of Meeting Conversations: A User Study. In
Edinburgh, UK, pages 28–39, 2005. Proc. of INLG, Dublin, Ireland, 2010.
5. K. Church and W. Gale. Inverse Document Frequency 19. G. Murray, T. Kleinbauer, P. Poller, S. Renals,
IDF: A Measure of Deviation from Poisson. In Proc. of T. Becker, and J. Kilgour. Extrinsic Summarization
the Third Workshop on Very Large Corpora, pages Evaluation: A Decision Audit Task. In Proc. of MLMI
121–130, 1995. 2008, Utrecht, the Netherlands, 2008.
6. R. Dachselt and M. Frisch. Mambo : A Facet-based 20. G. Murray and S. Renals. Detecting Action Items in
Zoomable Music Browser. In Proc. of MUM, Oulu, Meetings. In Proc. of MLMI 2008, Utrecht, the
Finland, 2007. Netherlands, 2008.
21. M. Purver, J. Dowding, J. Niekrasz, P. Ehlen, and
S. Noorbaloochi. Detecting and Summarizing Action
Items in Multi-Party Dialogue. In Proc. of the 9th
SIGdial Workshop on Discourse and Dialogue,
Antwerp, Belgium, 2007.
22. S. Raaijmakers, K. Truong, and T. Wilson. Multimodal
Subjectivity Analysis of Multiparty Conversation. In
Proc. of EMNLP 2008, Honolulu, HI, USA, 2008.
23. O. Rambow, L. Shrestha, J. Chen, and C. Lauridsen.
Summarizing Email Threads. In Proc. of HLT-NAACL
2004, Boston, USA, 2004.
24. M. Sedlmair, C. Bernhold, D. Herrscher, S. Boring, and A. Butz. MostVis: An Interactive Visualization Supporting Automotive Engineers in MOST Catalog Exploration. In Proc. of InfoVis 2009, New Jersey, USA, 2009.
25. G. Smith, M. Czerwinski, B. Meyers, D. Robbins,
G. Robertson, and D. S. Tan. FacetMap: A Scalable
Search and Browse Visualization. In Proc. of InfoVis,
Baltimore, Maryland, USA, 2006.
26. G. Tur, A. Stolcke, L. Voss, S. Peters, D. Hakkani-Tür, J. Dowding, B. Favre, R. Fernández, M. Frampton, M. Frandsen, C. Frederickson, M. Graciarena, D. Kintzing, K. Leveque, S. Mason, J. Niekrasz, M. Purver, K. Riedhammer, E. Shriberg, J. Tien, D. Vergyri, and F. Yang. The CALO Meeting Assistant System. In IEEE Transactions on Audio, Speech and Language Processing, 2010.
27. J. Ulrich, G. Murray, and G. Carenini. A Publicly
Available Annotated Corpus for Supervised Email
Summarization. In Proc. of AAAI EMAIL-2008
Workshop, Chicago, USA, 2008.
28. P. Wellner, M. Flynn, and M. Guillemot. Browsing
Recorded Meetings with Ferret. In Proc. of MLMI
2004, Martigny, Switzerland, pages 12–21, 2004.
29. S. Xie, B. Favre, D. Hakkani-Tür, and Y. Liu.
Leveraging Sentence Weights in a Concept-based
Optimization Framework for Extractive Meeting
Summarization. In Proc. of Interspeech 2009, Brighton,
England, 2009.
30. K.-P. Yee, K. Swearingen, K. Li, and M. Hearst. Faceted Metadata for Image Search and Browsing. In Proc. of CHI, Boston, USA, 2003.
31. L. Zhou and E. Hovy. Digesting Virtual “Geek”
Culture: The Summarization of Technical Internet
Relay Chats. In Proc. of ACL 2005, Ann Arbor, MI,
USA, 2005.
32. X. Zhu and G. Penn. Summarization of Spontaneous
Conversations. In Proc. of Interspeech 2006,
Pittsburgh, USA, pages 1531–1534, 2006.