DOI: 10.1145/1620432.1620448

NexusDS: a flexible and extensible middleware for distributed stream processing

Published: 16 September 2009

Abstract

Techniques for efficient and distributed processing of huge, unbounded data streams have made some impact in the database community. Sensors and data sources, such as position data of moving objects, continuously produce data that is consumed, e.g., by location-aware applications. Depending on the domain of interest, e.g., visualization, the processing of such data often relies on domain-specific functionality. This functionality is specified in terms of dedicated operators that may require specialized hardware, e.g., GPUs. This creates a strong dependency that a data stream processing system must consider when deploying such operators. Many data stream processing systems have been presented so far. However, these systems assume homogeneous computing nodes, do not consider operator deployment constraints, and are not designed to address domain-specific needs.
In this paper, we identify the features that a flexible and extensible middleware for distributed stream processing of context data must provide. We present NexusDS, our approach to meeting these requirements. In NexusDS, data processing is specified by orchestrating data flow graphs, which are modeled as processing pipelines of predefined, general operators as well as custom-built, domain-specific ones. We focus on easy extensibility and on support for domain-specific operators and services that may even utilize specific hardware available on dedicated computing nodes.
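
The idea of deploying operators under hardware constraints can be illustrated with a minimal sketch. It is not the NexusDS API, which the abstract does not describe: the names `Operator`, `Node`, `place`, the capability label `"gpu"`, and the first-fit placement rule are all assumptions made for illustration. The sketch models a linear pipeline whose operators may declare required capabilities, and maps each operator to a node that provides them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A pipeline operator (hypothetical model, not the NexusDS API)."""
    name: str
    requires: frozenset = frozenset()  # capabilities needed, e.g. {"gpu"}


@dataclass(frozen=True)
class Node:
    """A computing node and the capabilities it offers (hypothetical)."""
    name: str
    provides: frozenset = frozenset()


def place(pipeline, nodes):
    """Assign each operator to the first node that covers its requirements.

    First-fit is an assumption chosen for brevity; a real system would
    likely also weigh load, network cost, and other constraints.
    """
    placement = {}
    for op in pipeline:
        candidates = [n for n in nodes if op.requires <= n.provides]
        if not candidates:
            raise ValueError(
                f"no node satisfies {op.name}: needs {sorted(op.requires)}"
            )
        placement[op.name] = candidates[0].name
    return placement


# A three-stage pipeline: two general operators and one that needs a GPU.
pipeline = [
    Operator("position_source"),
    Operator("range_filter"),
    Operator("render", frozenset({"gpu"})),
]

# Heterogeneous nodes: only "viz-1" offers a GPU.
nodes = [Node("edge-1"), Node("viz-1", frozenset({"gpu"}))]

placement = place(pipeline, nodes)
print(placement)
```

Under these assumptions the two general operators land on `edge-1`, while `render` can only be placed on `viz-1`. If no node offered a GPU, `place` would raise an error, which is the kind of deployment dependency the abstract says a system must take into account.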

          Published In

          IDEAS '09: Proceedings of the 2009 International Database Engineering & Applications Symposium
          September 2009
          347 pages
          ISBN:9781605584027
          DOI:10.1145/1620432

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. P2P and networked data management
          2. data stream processing
          3. database services and applications
          4. middleware platforms for data management
          5. stream databases

          Qualifiers

          • Research-article

          Funding Sources

          • Collaborative Research Center Nexus: Spatial World Models for Mobile Context-Aware Applications

          Conference

          IDEAS '09
          Sponsor:
          • ACM
          • Concordia University

          Acceptance Rates

          Overall Acceptance Rate 74 of 210 submissions, 35%

          Cited By

          • (2021) Processing Big Data in Motion: Core Components and System Architectures with Applications to the Maritime Domain. Technologies and Applications for Big Data Value, pages 497-518. DOI: 10.1007/978-3-030-78307-5_22. Online publication date: 1-Jul-2021
          • (2019) Complex event recognition in the Big Data era: a survey. The VLDB Journal. DOI: 10.1007/s00778-019-00557-w. Online publication date: 25-Jul-2019
          • (2014) Challenges for Personal Data Stream Management in Smart Buildings. Creating Personal, Social, and Urban Awareness through Pervasive Computing, pages 201-219. DOI: 10.4018/978-1-4666-4695-7.ch009. Online publication date: 2014
          • (2014) Quality matters. Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems, pages 1-12. DOI: 10.1145/2611286.2611292. Online publication date: 26-May-2014
          • (2013) Compilation of References. Creating Personal, Social, and Urban Awareness through Pervasive Computing. DOI: 10.4018/978-1-4666-4695-7.chcrf. Online publication date: 31-Oct-2013
          • (2013) Towards a model-based approach for context-aware assistance systems in offshore operations. 2013 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pages 55-60. DOI: 10.1109/PerComW.2013.6529456. Online publication date: Mar-2013
          • (2011) Distributed context-aware visualization. 2011 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pages 251-256. DOI: 10.1109/PERCOMW.2011.5766878. Online publication date: Mar-2011
          • (2010) Usability analysis of compression algorithms for position data streams. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 240-249. DOI: 10.1145/1869790.1869825. Online publication date: 2-Nov-2010
          • (2010) NexusVIS: A distributed visualization toolkit for mobile applications. 2010 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops), pages 841-843. DOI: 10.1109/PERCOMW.2010.5470557. Online publication date: Mar-2010
          • (2010) Exploiting constraints to build a flexible and extensible data stream processing middleware. 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), pages 1-8. DOI: 10.1109/IPDPSW.2010.5470847. Online publication date: Apr-2010
