Design and implementation trade-offs for wide-area resource discovery

Published: 06 October 2008

Abstract

We describe the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. In contrast to previous systems, SWORD allows users to describe desired resources as a topology of interconnected groups with required intragroup, intergroup, and per-node characteristics, along with the utility that the application derives from specified ranges of metric values. This design gives users the flexibility to find geographically distributed resources for applications that are sensitive to both node and network characteristics, and allows the system to rank acceptable configurations based on their quality for that application.
Rather than evaluating a single implementation of SWORD, we explore a variety of architectural designs that deliver the required functionality in a scalable and highly available manner. We discuss the trade-offs of using a centralized architecture as compared to a fully decentralized design to perform wide-area resource discovery. To summarize our results, we found that a centralized architecture based on 4-node server cluster sites at network-peering facilities outperforms a decentralized DHT-based resource discovery infrastructure with respect to query latency for all but the smallest number of sites. However, although a centralized architecture shows significant promise in stable environments, we find that our decentralized implementation has acceptable performance and also benefits from the DHT's self-healing properties in more volatile environments. We evaluate the advantages and disadvantages of centralized and distributed resource discovery architectures on 1000 hosts in emulation and on approximately 200 PlanetLab nodes spread across the Internet.
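
The query model described in the abstract (groups of nodes with per-node and intragroup requirements, plus a utility attached to ranges of metric values) can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from SWORD itself: the `UtilityRange` and `GroupSpec` types, the `free_cpu` metric, the sample nodes and latencies, and the exhaustive search in `best_group` are assumptions made for illustration only. Intergroup requirements are omitted; they could be scored the same way over pairs of nodes drawn from different groups.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class UtilityRange:
    """A closed range [lo, hi] of a metric and the utility it yields."""
    lo: float
    hi: float
    utility: float


def utility_of(value: float, ranges: list) -> Optional[float]:
    """Return the utility of the first range containing value, or None
    if the value falls outside every acceptable range."""
    for r in ranges:
        if r.lo <= value <= r.hi:
            return r.utility
    return None


@dataclass
class GroupSpec:
    """Hypothetical description of one requested group of nodes."""
    size: int
    per_node: dict            # metric name -> list of UtilityRange
    intragroup_latency: list  # UtilityRange list over pairwise latency (ms)


def score_group(spec, nodes, node_metrics, latency) -> Optional[float]:
    """Sum per-node and pairwise utilities; None if any requirement fails."""
    total = 0.0
    for n in nodes:
        for metric, ranges in spec.per_node.items():
            u = utility_of(node_metrics[n][metric], ranges)
            if u is None:
                return None
            total += u
    for a, b in combinations(nodes, 2):
        u = utility_of(latency[frozenset((a, b))], spec.intragroup_latency)
        if u is None:
            return None
        total += u
    return total


def best_group(spec, candidates, node_metrics, latency):
    """Exhaustively rank all candidate groups (an assumption for clarity;
    this brute-force search is not the paper's algorithm)."""
    best = None
    for nodes in combinations(sorted(candidates), spec.size):
        s = score_group(spec, nodes, node_metrics, latency)
        if s is not None and (best is None or s > best[0]):
            best = (s, nodes)
    return best


# Made-up sample data: four nodes, one per-node metric, pairwise latencies.
cpu_ranges = [UtilityRange(0.8, 1.0, 2.0), UtilityRange(0.4, 0.8, 1.0)]
lat_ranges = [UtilityRange(0, 50, 1.0), UtilityRange(50, 150, 0.5)]
spec = GroupSpec(size=2, per_node={"free_cpu": cpu_ranges},
                 intragroup_latency=lat_ranges)
node_metrics = {"a": {"free_cpu": 0.9}, "b": {"free_cpu": 0.5},
                "c": {"free_cpu": 0.1}, "d": {"free_cpu": 0.85}}
latency = {frozenset(p): ms for p, ms in [
    (("a", "b"), 20), (("a", "c"), 5), (("a", "d"), 120),
    (("b", "c"), 200), (("b", "d"), 30), (("c", "d"), 10)]}

best = best_group(spec, node_metrics.keys(), node_metrics, latency)
print(best)  # (4.5, ('a', 'd'))
```

In this toy instance node `c` is rejected outright because its `free_cpu` value lies outside every acceptable range, while the remaining configurations are ranked by summed utility, which mirrors the distinction the abstract draws between acceptable configurations and their relative quality.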

Published In

ACM Transactions on Internet Technology, Volume 8, Issue 4
September 2008
216 pages
ISSN:1533-5399
EISSN:1557-6051
DOI:10.1145/1391949

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 October 2008
Accepted: 01 December 2006
Revised: 01 May 2006
Received: 01 November 2005
Published in TOIT Volume 8, Issue 4

Author Tags

  1. PlanetLab
  2. Resource discovery

Qualifiers

  • Research-article
  • Research
  • Refereed
