=Paper=
{{Paper
|id=None
|storemode=property
|title=UWCL at MediaEval 2013: Similar Segments in Social Speech Task
|pdfUrl=https://ceur-ws.org/Vol-1043/mediaeval2013_submission_66.pdf
|volume=Vol-1043
|dblpUrl=https://dblp.org/rec/conf/mediaeval/Levow13
}}
==UWCL at MediaEval 2013: Similar Segments in Social Speech Task==
Gina-Anne Levow, Department of Linguistics, University of Washington, Box 352425, Seattle, WA, USA. levow@uw.edu

===ABSTRACT===
This paper describes the participation of the University of Washington Computational Linguistics Laboratory (UWCL) in the Similar Segments in Social Speech task at MediaEval 2013. Participants in this task develop systems that, given a span of speech from a recorded conversation, aim to identify all and only highly similar regions in other recordings. As this was a new task this year, the goal was to establish a baseline and a framework for future experimentation. The approach aimed to address two particular challenges posed by the task: the lack of prior segmentation of the conversations and the limited material provided by a single brief example segment. To this end, the system employed a query-by-example information retrieval framework, using passage retrieval to identify segments dynamically and query expansion to support robust retrieval. Query expansion provided substantial gains when applied to both manual and automatic transcriptions; results using automatic transcripts were competitive with those using manual ones.

===1. INTRODUCTION===
Recent years have seen dramatic growth in the use of social media, as well as in the sharing of multimedia materials, in venues ranging from Facebook to YouTube. Users increasingly share personal content in these social media settings. However, flexible search and targeted access to this material remain challenging, relying largely on manually assigned metadata, such as titles and tags, to identify content, rather than directly indexing the content of the multimedia materials themselves. Furthermore, not only is extraction of content from multimedia streams more challenging than from text, but skimming or browsing a media stream is also slower and more difficult than skimming text.

The Similar Segments in Social Speech (SSSS) task developed for MediaEval 2013 aims to overcome these limitations on information access. As described in the task overview paper [6], the task requires participating systems to identify similar spans of speech given an exemplar span. The resulting spans can be viewed as jump-in points for listeners searching or browsing a multimedia stream.

In contrast to the significant prior work on spoken document retrieval [3] and topic detection and tracking [7], this task applies a more general and abstract notion of similarity, rather than focusing on retrieval of documents related to particular topics or events. In addition, much of that prior work emphasized retrieval from broadcast news sources; retrieval from less formal audio sources has focused on voicemail [2] and oral history interviews [5].

The remainder of the paper is organized as follows. Section 2 presents key challenges in the task and UWCL's approach to addressing them. Section 3 describes the experimental configuration, official runs, and results, along with discussion. Section 4 concludes with plans for future work.

===2. CHALLENGES AND APPROACH===
The SSSS task posed a wide range of interesting challenges, including:
* Task modeling: would the task be best modelled as retrieval, clustering, ranking, or something else?
* Sources of similarity: how should similarity be assessed, through lexical or acoustic information or some combination?
* Segmentation: how should segments be identified, via fixed segmentation, agglomeration, or other means?
* Generalization: given a single example segment, how can we overcome differences across speakers?
* Transcription: what is the effect of transcription type, manual or automatic, on task effectiveness?

For this challenging new task, UWCL's approach built on and extended existing methodologies. In particular, the approach adopts an information retrieval perspective, using the text transcriptions of the spoken data. From this perspective, system design focused on the latter three issues identified above: segmentation, generalization, and transcription.

'''Segmentation.''' Much prior work on spoken document retrieval has either provided a gold-standard segmentation or assumed its existence. In contrast, the SSSS task provides no segmentation, and one could imagine different segmentations based on different notions of similarity. The strategy therefore aimed to create segments and jump-in points sensitive to the similarity measure and to the exemplar segment. The UWCL system exploits passage retrieval [1] to extract overlapping windowed spans within recordings, of fixed length and step in words, that have high similarity with the example. Overlapping and adjacent retrieved passages are merged and receive the rank of the highest-ranked contributing passage. Based on experimentation on the training corpus, retrieval returned the top 75 passages, 60 terms in length with a 30-term step; this windowing and merging is sketched below.
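The following Python sketch illustrates the windowing and merging just described. The official runs used Indri's built-in passage retrieval, so this is only an illustration under stated assumptions: the function names and data structures are hypothetical, while the 60-term windows, 30-term step, and merge rule follow the description above.

<syntaxhighlight lang="python">
# Illustrative sketch only: the official runs relied on Indri's
# built-in passage retrieval. Window length (60 terms) and step
# (30 terms) match the settings reported above.

def window_passages(tokens, length=60, step=30):
    """Yield (start, end) term offsets of overlapping windows."""
    last_start = max(len(tokens) - length, 0)
    for start in range(0, last_start + 1, step):
        yield start, min(start + length, len(tokens))

def merge_passages(ranked_passages):
    """Merge overlapping or adjacent passages from one recording.

    ranked_passages: list of (rank, start, end); lower rank = better.
    A merged region keeps the rank of its best contributing passage.
    """
    merged = []
    for rank, start, end in sorted(ranked_passages, key=lambda p: p[1]):
        if merged and start <= merged[-1][2]:  # overlaps or is adjacent
            best, first, last = merged[-1]
            merged[-1] = (min(best, rank), first, max(last, end))
        else:
            merged.append((rank, start, end))
    return merged

# Passages at ranks 3 and 7 overlap, so they merge and keep rank 3:
print(merge_passages([(3, 0, 60), (7, 30, 90), (12, 150, 210)]))
# -> [(3, 0, 90), (12, 150, 210)]
</syntaxhighlight>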
'''Generalization.''' Differences in lexical choice between the materials being searched and the searcher's specification of their information need are a well-known issue in information retrieval. The segments in the SSSS task, which average about 50 seconds in the training set and about 30 seconds in the test set, are not particularly short. However, it seems likely that variation between speakers and the broad notion of similarity will make lexical matching highly challenging. To address this issue, the UWCL system investigates the use of query expansion [8]. In pseudo-relevance feedback query expansion, the original query is used in a preliminary search pass. The query is then augmented, and one hopes improved, by adding highly ranked terms from the top-ranked spans, which are presumed to be relevant. The resulting query is used for final retrieval. In the UWCL system, the training set data is used to augment the small test set during expansion. The procedure used the top five passages retrieved to create a relevance model and selected the ten terms with highest likelihood under that model for expansion, as sketched below.
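Again purely as an illustration: Indri performs this expansion internally, but the following sketch shows the logic of the procedure just described, assuming a crude unigram relevance model estimated by relative frequency (the real relevance model is more sophisticated).

<syntaxhighlight lang="python">
from collections import Counter

def expand_query(query_terms, feedback_passages, n_expansion=10):
    """Sketch of pseudo-relevance feedback expansion.

    A simple unigram relevance model (relative term frequency over
    the top-ranked passages) stands in for Indri's model. Five
    feedback passages and ten expansion terms follow the settings
    reported above.
    """
    counts = Counter()
    for passage in feedback_passages[:5]:   # top five retrieved passages
        counts.update(passage.lower().split())
    if not counts:                          # no feedback text available
        return list(query_terms)
    total = sum(counts.values())
    # Rank candidate terms by likelihood under the relevance model.
    candidates = sorted(counts, key=lambda t: counts[t] / total, reverse=True)
    expansion = [t for t in candidates if t not in query_terms][:n_expansion]
    return list(query_terms) + expansion
</syntaxhighlight>

In Indri itself, comparable behavior is configured through its pseudo-relevance feedback parameters (the number of feedback passages and expansion terms) rather than implemented by hand.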
'''Transcription.''' Both manual and automatic transcripts of the spoken data are employed.

===3. EXPERIMENTATION===

====3.1 Experimental Setup====
The UWCL system employed the INDRI/LEMUR information retrieval engine (http://www.lemurproject.org) for indexing and retrieval with default settings [4]. The LEMUR system provides a sophisticated query language, has built-in support for passage retrieval, and supports pseudo-relevance feedback query expansion. We made use of two different transcriptions of the conversations: manual transcripts provided by the task organizers and automatic transcripts generously provided by the University of Edinburgh. Each conversation was converted to a single TREC-format text document for indexing. For query formulation, the system extracted all tokens in any time-aligned span which overlapped the exemplar segment; these terms were then linked through unweighted combination (Indri's #combine operator). Manual transcriptions were aligned by turn; conversion of automatic transcriptions relied on alignments at the word level. Both steps are sketched below.
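A minimal sketch of these two indexing-side steps follows, assuming the standard TREC DOC/DOCNO/TEXT tag layout and hypothetical helper names; only the #combine operator is Indri's own syntax.

<syntaxhighlight lang="python">
def to_trec_doc(conversation_id, transcript_lines):
    """Wrap one conversation as a single TREC-format document for
    indexing (standard DOC/DOCNO/TEXT layout; the exact fields in
    the UWCL system may differ)."""
    body = "\n".join(transcript_lines)
    return (f"<DOC>\n<DOCNO>{conversation_id}</DOCNO>\n"
            f"<TEXT>\n{body}\n</TEXT>\n</DOC>\n")

def exemplar_query(aligned_tokens, seg_start, seg_end):
    """Build an Indri #combine query from all tokens whose
    time-aligned span overlaps the exemplar segment.

    aligned_tokens: iterable of (token, span_start, span_end);
    spans are turns for manual transcripts, words for automatic ones.
    """
    terms = [tok for tok, start, end in aligned_tokens
             if start < seg_end and end > seg_start]  # span overlap test
    return "#combine( " + " ".join(terms) + " )"

# exemplar_query([("skiing", 12.1, 12.6), ("trip", 12.6, 13.0)], 12.0, 40.0)
# -> '#combine( skiing trip )'
</syntaxhighlight>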
====3.2 Experiment Runs and Results====
Five official runs on the test data were submitted and scored. As shown in Table 1, contrasting conditions explored the impact of transcription (manual/automatic), query expansion (yes/no), and expansion corpus (manual/automatic). The official results are also tabulated for the primary metrics, Normalized Searcher Utility Ratio (NSUR) and F-measure, as described in the task overview [6].

{| class="wikitable"
|+ Table 1: Contrastive official run settings and results
! Name !! Trans. !! Exp. !! Exp. set !! NSUR !! F
|-
| uwclman || man || no || n.a. || 0.57 || 0.58
|-
| uwclauto || auto || no || n.a. || 0.57 || 0.58
|-
| uwclmanexp || man || yes || man || 0.82 || 0.81
|-
| uwclautoexp || auto || yes || man || 0.66 || 0.68
|-
| uwclauto2exp || auto || yes || auto || 0.796 || 0.80
|}

====3.3 Discussion====
We find that, although the baseline query formulation achieves modest effectiveness, query expansion using pseudo-relevance feedback based on a matched corpus yielded substantially increased effectiveness. With the mismatched expansion corpus, the divergence between manual and automatic transcription led to a smaller, but still noticeable, improvement. Finally, it is interesting to note that, with suitable query expansion, a configuration based on automatic transcription greatly outperformed one using manual transcripts without query expansion, and was highly competitive with one using manual transcripts with query expansion.

===4. CONCLUSIONS===
UWCL's approach to the MediaEval 2013 SSSS task employed a text-based information retrieval framework, using passage retrieval to create segments dynamically. Automatic query expansion yielded strong improvements for both manual and automatic transcripts. While these approaches showed promise, many avenues for improvement remain. In addition to tuning retrieval factors, such as passage length and retrieval models, I plan to explore the integration of acoustic, especially acoustic-prosodic, evidence into measures of segment similarity, in addition to the lexical evidence already in use. Such measures could be particularly helpful in recognizing segments with similarity based less on topical content than on emotional or attitudinal content.

===Acknowledgments===
Many thanks to the task organizers, and to Steve Renals for providing the high-quality automatic transcriptions.

===5. REFERENCES===
* [1] J. P. Callan. Passage-level evidence in document retrieval. In Proceedings of the 17th Annual International ACM SIGIR Conference, pages 302–310, 1994.
* [2] J. Hirschberg, M. Bacchiani, D. Hindle, P. Isenhour, A. Rosenberg, L. Stark, L. Stead, S. Whittaker, and G. Zamchick. SCANMail: Browsing and searching speech data by content. In Proceedings of EUROSPEECH 2001, 2001.
* [3] M. Larson and G. J. F. Jones. Spoken content retrieval: A survey of techniques and technologies. Foundations and Trends in Information Retrieval, 5(4–5):235–422, 2012.
* [4] D. Metzler, T. Strohman, and W. B. Croft. Indri at TREC 2006: Lessons learned from three terabyte tracks. In Proceedings of TREC 2006, 2006.
* [5] D. W. Oard, J. Wang, G. J. Jones, R. W. White, P. Pecina, D. Soergel, X. Huang, and I. Shafran. Overview of the CLEF-2006 cross-language speech retrieval track. In CLEF 2006, 2006.
* [6] N. G. Ward, S. D. Werner, D. G. Novick, E. E. Shriberg, C. Oertel, L.-P. Morency, and T. Kawahara. The similar segments in social speech task. In Proceedings of MediaEval 2013, Barcelona, Spain, October 18–19, 2013.
* [7] C. Wayne. Multilingual topic detection and tracking: Successful research enabled by corpora and evaluation. In Proceedings of the Language Resources and Evaluation Conference (LREC) 2000, pages 1487–1494, 2000.
* [8] J. Xu and W. B. Croft. Query expansion using local and global document analysis. In Proceedings of the 19th Annual International ACM SIGIR Conference, 1996.