The Story Picturing Engine---a system for automatic text illustration

Published: 01 February 2006

Abstract

We present an unsupervised approach to automated story picturing. Semantic keywords are extracted from the story, and an annotated image database is searched with them. A novel image ranking scheme then automatically determines the importance of each retrieved image. Both lexical annotations and visual content contribute to the ranks: annotations are processed using WordNet, and a mutual reinforcement-based rank is calculated for each image. We have implemented these methods in our Story Picturing Engine (SPE) system. We report experiments on large-scale image databases, together with a user study and a statistical analysis of its results.
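The abstract does not give the ranking formula, so the following is only a minimal sketch of one common way to realize a mutual reinforcement rank: power iteration over a pairwise similarity matrix, so that an image scores highly when it is similar to other high-scoring images. The function name `reinforcement_ranks`, the similarity values, and the idea of folding lexical and visual similarity into a single matrix are assumptions made for illustration, not details taken from the paper.

```python
def reinforcement_ranks(similarity, iterations=100, tol=1e-9):
    """Power iteration on a non-negative similarity matrix.

    Each image's rank becomes the similarity-weighted sum of the other
    images' ranks, renormalized so that all ranks sum to 1.
    (Hypothetical helper; not the paper's exact scheme.)
    """
    n = len(similarity)
    ranks = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        updated = [
            sum(similarity[i][j] * ranks[j] for j in range(n))
            for i in range(n)
        ]
        total = sum(updated)
        if total == 0:  # degenerate matrix: keep the previous ranks
            return ranks
        updated = [value / total for value in updated]
        # Stop once the ranks no longer change noticeably.
        if max(abs(a - b) for a, b in zip(updated, ranks)) < tol:
            return updated
        ranks = updated
    return ranks


# Hypothetical pairwise similarities among three candidate images,
# assumed to already combine annotation-based and visual similarity.
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

ranks = reinforcement_ranks(similarity)
# Image indices ordered from highest to lowest rank.
best_first = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
```

In this toy matrix the two mutually similar images reinforce each other and end up ahead of the outlier, which is the behaviour a reinforcement-based rank is meant to produce.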

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 2, Issue 1
February 2006
89 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/1126004

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 February 2006
Published in TOMM Volume 2, Issue 1

Author Tags

  1. Markov chain
  2. Story picturing
  3. image retrieval
  4. lexical referencing
  5. mutual reinforcement

Qualifiers

  • Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 21
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 14 Feb 2025

Citations

Cited By

  • (2024) Text-to-Image Synthesis With Generative Models: Methods, Datasets, Performance Metrics, Challenges, and Future Direction. IEEE Access 12 (24412-24427). DOI: 10.1109/ACCESS.2024.3365043. Online publication date: 2024
  • (2023) A Pipeline for Story Visualization from Natural Language. Applied Sciences 13:8 (5107). DOI: 10.3390/app13085107. Online publication date: 19-Apr-2023
  • (2023) WorldSmith: Iterative and Expressive Prompting for World Building with a Generative AI. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (1-17). DOI: 10.1145/3586183.3606772. Online publication date: 29-Oct-2023
  • (2023) Open-world story generation with structured knowledge enhancement. Neurocomputing 559:C. DOI: 10.1016/j.neucom.2023.126792. Online publication date: 28-Nov-2023
  • (2022) Automatic and intelligent content visualization system based on deep learning and genetic algorithm. Neural Computing and Applications 34:3 (2473-2493). DOI: 10.1007/s00521-022-06887-1. Online publication date: 1-Feb-2022
  • (2021) TIPS: A Framework for Text Summarising with Illustrative Pictures. Entropy 23:12 (1614). DOI: 10.3390/e23121614. Online publication date: 30-Nov-2021
  • (2020) Illustrate Your Story. Proceedings of the 13th International Conference on Web Search and Data Mining (849-852). DOI: 10.1145/3336191.3371866. Online publication date: 20-Jan-2020
  • (2020) Variational Recurrent Sequence-to-Sequence Retrieval for Stepwise Illustration. Advances in Information Retrieval (50-64). DOI: 10.1007/978-3-030-45439-5_4. Online publication date: 14-Apr-2020
  • (2019) Improving Arabic Text to Image Mapping Using a Robust Machine Learning Technique. IEEE Access 7 (18772-18782). DOI: 10.1109/ACCESS.2019.2896713. Online publication date: 2019
  • (2019) Story-telling maps generated from semantic representations of events. Behaviour & Information Technology (1-23). DOI: 10.1080/0144929X.2019.1569162. Online publication date: 8-Feb-2019
