ABSTRACT
Users create large numbers of IoT stream queries over data streams generated by various IoT devices. Current stream processing systems such as Storm and Flink cannot support such large numbers of IoT stream queries efficiently, because their execution models incur many cache misses while processing the queries' events. To solve this problem, we present a new group-aware execution model, which processes the events of IoT stream queries in a way that exploits the locality of data and code references, reducing cache misses and improving system performance. The model leverages the fact that users create groups of queries according to their interests or location contexts, and that queries in the same group can share data and code. We realize the group-aware execution model on MIST, a new stream processing system tailored for processing many IoT stream queries efficiently, to scale up the number of IoT queries that a single machine can process. Our preliminary evaluation shows that the group-aware execution model increases the number of queries that can be processed on a single machine by up to 3.18× compared to the Flink-based execution model.
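The scheduling idea behind a group-aware execution model can be illustrated with a minimal sketch. This is not MIST's implementation: the class `GroupAwareExecutor` and its methods `register`, `submit`, and `run` are hypothetical names introduced here, and the sketch only shows the ordering policy (pending events of one query group are drained back to back before moving to the next group). It does not measure or demonstrate cache behavior.

```python
from collections import defaultdict, deque


class GroupAwareExecutor:
    """Hypothetical sketch: drain events one query group at a time."""

    def __init__(self):
        self.group_of = {}                  # query id -> group id
        self.queries = {}                   # query id -> callable applied to each event
        self.pending = defaultdict(deque)   # group id -> queued (query id, event) pairs
        self.active = deque()               # groups that currently have pending events

    def register(self, query_id, group_id, fn):
        # Assumption: each query belongs to exactly one user-defined group.
        self.group_of[query_id] = group_id
        self.queries[query_id] = fn

    def submit(self, query_id, event):
        group = self.group_of[query_id]
        if not self.pending[group]:
            # First pending event for this group: mark the group as active.
            self.active.append(group)
        self.pending[group].append((query_id, event))

    def run(self):
        # Process all pending events of a group consecutively, so queries that
        # share data and code run close together in time.
        results = []
        while self.active:
            group = self.active.popleft()
            queue = self.pending[group]
            while queue:
                query_id, event = queue.popleft()
                results.append((group, query_id, self.queries[query_id](event)))
        return results
```

With events submitted in interleaved order across groups, `run` returns results clustered by group rather than in arrival order, which is the property the abstract attributes to group-aware execution.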