Time is of the essence: an evaluation of temporal compression algorithms
Source: Conference on Human Factors in Computing Systems
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Montréal, Québec, Canada
SESSION: Managing voice input
Pages: 329 - 338  
Year of Publication: 2006
ISBN:1-59593-372-7
Authors
Simon Tucker  Sheffield University, Sheffield, UK
Steve Whittaker  Sheffield University, Sheffield, UK
Sponsors
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1124772.1124822

ABSTRACT

Although speech is a potentially rich information source, a major barrier to exploiting speech archives is the lack of useful tools for efficiently accessing lengthy speech recordings. This paper develops and evaluates techniques for temporal compression: reducing the time people take to listen to a recording while still extracting critical information. We first describe an exploratory study that identifies novel excision techniques that remove unimportant words or utterances from the recording. We then develop a new method for evaluating how well temporal compression supports users in forming a general understanding of a recording. Applying this method, we demonstrate that excision techniques are generally more effective than standard compression techniques that simply speed up the entire recording.



