ABSTRACT
In this work, we present a novel method for browsing recorded audio conversations. The method produces scalable summaries of the recognized speech: the amount of text shown increases continuously with the desired level of detail, so that the summary best fills the available space. We present an interface in which a user can view an entire conversation on one screen and can quickly zoom in to see the full transcript; the corresponding audio can also be played easily. The scaling is achieved by combining topic segmentation with informative phrase selection, where the threshold for informativeness decreases as the level of detail increases. Finally, we evaluate our method and interface against a baseline interface in a user study.
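The scaling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the `Phrase` type, the `summarize` function, the scores in [0, 1], and the linear threshold schedule are all assumptions made for illustration, and the abstract does not say how informativeness is computed. The sketch only shows that lowering the threshold as the level of detail rises lets each topic segment reveal more of its phrases.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    score: float  # hypothetical informativeness score in [0, 1]


def summarize(segments, detail):
    """Return, for each topic segment, the phrases that pass the threshold.

    `detail` is in [0, 1]. The threshold falls linearly from 1 to 0 as
    `detail` rises; this schedule is an assumption, not taken from the paper.
    """
    if not 0.0 <= detail <= 1.0:
        raise ValueError("detail must be in [0, 1]")
    threshold = 1.0 - detail
    return [[p.text for p in seg if p.score >= threshold] for seg in segments]


# Toy input: two topic segments with made-up phrases and scores.
segments = [
    [Phrase("flight to Boston", 0.9), Phrase("you know", 0.1), Phrase("next Tuesday", 0.6)],
    [Phrase("um", 0.0), Phrase("new job offer", 0.8)],
]
```

With this toy input, a low level of detail keeps only the highest-scoring phrase in each segment, and a level of detail of 1 returns every phrase, which corresponds to the full transcript. Because the threshold only decreases, raising the level of detail never removes a phrase that was already shown.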