ABSTRACT
In this demonstration, we showcase Myria, our novel cloud service for big data management and analytics, designed to improve productivity. Myria's goal is self-serve analytics: users simply upload their data, and the system helps them become self-sufficient data science experts on that data. Using a web browser, Myria users can upload data, author efficient queries to process and explore the data, and debug correctness and performance issues. Myria queries are executed on a scalable, parallel cluster that uses both state-of-the-art and novel methods for distributed query processing. Our interactive demonstration will guide visitors through several key Myria features by having them use the live system to analyze big datasets over the web.
Index Terms
- Demonstration of the Myria big data management service