Capturing, structuring, and representing ubiquitous audio

Published: 01 October 1993
Abstract

Although talking is an integral part of collaboration, there has been little computer support for acquiring and accessing the contents of conversations. Our approach has focused on ubiquitous audio, or the unobtrusive capture of speech interactions in everyday work environments. Speech recognition technology cannot yet transcribe fluent conversational speech, so the words themselves are not available for organizing the captured interactions. Instead, the structure of an interaction is derived from acoustical information inherent in the stored speech and augmented by user interaction during or after capture. This article describes applications for capturing and structuring audio from office discussions and telephone calls, and mechanisms for later retrieval of these stored interactions. An important aspect of retrieval is choosing an appropriate visual representation, and this article describes the evolution of a family of representations across a range of applications. Finally, this work is placed within the broader context of desktop audio, mobile audio applications, and social implications.

