DOI: 10.1145/1298306.1298309
Article

I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system

Published: 24 October 2007

Abstract

User Generated Content (UGC) is re-shaping the way people watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world's largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. In particular, we study the popularity life-cycle of videos, the intrinsic statistical properties of requests and their relationship with video age, and the level of content aliasing or of illegal content in the system. We also provide insights on the potential for more efficient UGC VoD systems (e.g. utilizing P2P techniques or making better use of caching). Finally, we discuss the opportunities to leverage the latent demand for niche videos that are not reached today due to information filtering effects or other system scarcity distortions. Overall, we believe that the results presented in this paper are crucial in understanding UGC systems and can provide valuable information to ISPs, site administrators, and content owners with major commercial and technical implications.

Published In

IMC '07: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
October 2007
390 pages
ISBN: 9781595939081
DOI: 10.1145/1298306

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. caching
2. content aliasing
3. long tail
4. p2p
5. popularity analysis
6. power-law
7. user generated content
8. vod

Conference

IMC07: Internet Measurement Conference
October 24-26, 2007
San Diego, California, USA

Acceptance Rates

Overall acceptance rate: 277 of 1,083 submissions (26%)
