DOI: 10.1145/2384916.2384920
research-article

Thematic organization of web content for distraction-free text-to-speech narration

Published: 22 October 2012

ABSTRACT

People with visual disabilities, especially those who are blind, have digital content narrated to them by text-to-speech (TTS) engines (e.g., with the help of screen readers). Naively narrating web pages with TTS engines, particularly pages consisting of several diverse pieces (e.g., news summaries, opinion pieces, taxonomy, ads), without first organizing them into thematic segments makes it very difficult for a blind user to mentally separate out and comprehend the essential elements of each segment, and the effort to do so can cause significant cognitive stress. One can alleviate this difficulty by segmenting web pages into thematic pieces and then narrating each piece separately. Extant segmentation methods typically rely on visual and structural cues. Using such cues without taking the semantics of the content into account tends to produce "impure" segments, in which extraneous material is interspersed with the essential elements. In this paper, we describe a new technique for identifying thematic segments by tightly coupling the visual, structural, and linguistic features present in the content. A notable aspect of the technique is that it produces segments with very little irrelevant content. Another interesting aspect is that the clutter-free main content of a web page, as produced by the Readability tool and the "Reader" feature of the Safari browser, emerges as a special case of the thematic segments created by our technique. We provide experimental evidence of the technique's effectiveness in reducing clutter. We also describe a user study with 23 blind subjects that assessed its impact on web accessibility.
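
The idea of coupling visual, structural, and linguistic features can be illustrated with a toy sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes a page has already been split into leaf blocks, and it uses hypothetical names (`Block`, `dom_path`, `font_size`, `similarity`, `segment`), an arbitrary equal weighting of the three cues, and an arbitrary merge threshold of 0.5. Adjacent blocks are merged greedily when their combined similarity is high enough.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Block:
    """A leaf block of a page: its text plus toy structural and visual cues."""
    text: str
    dom_path: str   # hypothetical structural cue, e.g. "body/div/p"
    font_size: int  # hypothetical visual cue, in pixels


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity(x: Block, y: Block) -> float:
    """Equal-weight average of linguistic, structural, and visual agreement.

    The equal weighting is an assumption made for illustration only.
    """
    linguistic = cosine(Counter(x.text.lower().split()),
                        Counter(y.text.lower().split()))
    structural = 1.0 if x.dom_path == y.dom_path else 0.0
    visual = 1.0 if x.font_size == y.font_size else 0.0
    return (linguistic + structural + visual) / 3


def segment(blocks: list[Block], threshold: float = 0.5) -> list[list[Block]]:
    """Append each block to the current segment if it is similar enough to
    the previous block; otherwise start a new segment."""
    segments: list[list[Block]] = []
    for block in blocks:
        if segments and similarity(segments[-1][-1], block) >= threshold:
            segments[-1].append(block)
        else:
            segments.append([block])
    return segments


page = [
    Block("storm hits coast flooding reported", "body/div/p", 14),
    Block("flooding closes coast roads after storm", "body/div/p", 14),
    Block("buy cheap shoes today", "body/aside/a", 11),
]
for seg in segment(page):
    print([b.text for b in seg])
```

In this toy page the two news blocks share vocabulary, DOM path, and font size, so they form one segment, while the advertisement differs on all three cues and is kept apart. A narrator could then read each segment separately.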


Published in

ASSETS '12: Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility
October 2012
321 pages
ISBN: 9781450313216
DOI: 10.1145/2384916

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall acceptance rate: 436 of 1,556 submissions, 28%
