Scalable summaries of spoken conversations
Source
International Conference on Intelligent User Interfaces
Proceedings of the 13th International Conference on Intelligent User Interfaces
Gran Canaria, Spain
SESSION: Speech
Pages 267-275  
Year of Publication: 2008
ISBN: 978-1-59593-987-6
Authors
Sumit Basu  Microsoft Research, Redmond, WA
Surabhi Gupta  Microsoft Research, Redmond, WA and Stanford University, Stanford, CA
Milind Mahajan  Microsoft Research, Redmond, WA
Patrick Nguyen  Microsoft Research, Redmond, WA
John C. Platt  Microsoft Research, Redmond, WA
Sponsors
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGCHI: ACM Special Interest Group on Computer Human Interaction
AAAI: Association for the Advancement of Artificial Intelligence
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1378773.1378809

ABSTRACT

In this work, we present a novel means of browsing recorded audio conversations. The method we develop produces scalable summaries of the recognized speech, in which we can increase the amount of text continuously with the desired level of detail to best fill the available space. We present an interface in which a user can view an entire conversation in one screen, but can also quickly zoom in to see the full transcript; the corresponding audio can be easily played as well. The scaling is achieved via a combination of topic segmentation and informative phrase selection, where the threshold for informativeness decreases with increasing level of detail. Finally, we evaluate our method and interface against a baseline interface with a user study.
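
The scaling idea in the abstract, where the informativeness threshold falls as the requested level of detail rises, can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' method: the `Phrase` type, the scores in [0, 1], the linear mapping in `threshold_for`, and the sample data are all hypothetical, and topic segmentation is taken as already done (each inner list is one topic segment).

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    score: float  # hypothetical informativeness score in [0, 1]


def threshold_for(detail: float) -> float:
    """Map a detail level in [0, 1] to a score cutoff.

    Assumed linear mapping: more detail means a lower cutoff.
    """
    detail = min(max(detail, 0.0), 1.0)
    return 1.0 - detail


def summarize(segments, detail):
    """Keep, per topic segment, the phrases scoring at or above the cutoff."""
    cutoff = threshold_for(detail)
    return [[p.text for p in segment if p.score >= cutoff] for segment in segments]


# Hypothetical, pre-segmented conversation with made-up scores.
segments = [
    [Phrase("flight to Boston", 0.9), Phrase("you know", 0.1), Phrase("next Tuesday", 0.6)],
    [Phrase("um", 0.05), Phrase("new job offer", 0.8)],
]

for level in (0.25, 0.5, 1.0):
    print(level, summarize(segments, level))
```

Because the cutoff only decreases as `detail` grows, each higher-detail summary contains every phrase of the lower-detail one, and at full detail the whole transcript is shown.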


