DOI: 10.1145/1065944.1065971
Article

Trust but verify: monitoring remotely executing programs for progress and correctness

Published: 15 June 2005

Abstract

The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of those resources, and to monitor jobs executing on remote systems. This paper describes the GridCop system, which allows a computation on a remote, and potentially fraudulent, host system to be monitored for progress and execution correctness. A novel feature of our system is that it constructs cooperating submitter and host programs from the original program, and these programs allow both progress and execution correctness to be monitored with negligible overhead while providing protection against common fraudulent behaviors. Experimental results show that the overhead of this monitoring is low on both the submitting and host machines. We describe compiler algorithms that allow the required monitoring code to be automatically generated.
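As a rough illustration of the idea of cooperating submitter and host programs, the sketch below splits a toy loop into a host side that does the work and reports progress, and a submitter side that replays only the loop control to check those reports. It is a minimal sketch under stated assumptions, not GridCop's actual algorithm: the names `host_program`, `expected_beacons`, and `monitor`, the notion of a "beacon", and the fixed reporting interval are all hypothetical choices made for this example.

```python
from typing import Iterable, Iterator


def host_program(n: int, beacon_every: int) -> Iterator[int]:
    """Hypothetical host side: performs the full computation and
    periodically reports which loop iteration it has reached."""
    total = 0
    for i in range(n):
        total += i * i  # stand-in for the expensive work
        if i % beacon_every == 0:
            yield i  # progress report ("beacon", an assumed term)


def expected_beacons(n: int, beacon_every: int) -> Iterator[int]:
    """Hypothetical submitter side: replays only the loop control,
    none of the expensive work, to predict the reports."""
    return iter(range(0, n, beacon_every))


def monitor(received: Iterable[int], n: int, beacon_every: int) -> bool:
    """Accept the run only if the host's reports match the predicted
    sequence exactly (same order, none missing, none extra)."""
    return list(received) == list(expected_beacons(n, beacon_every))
```

In this toy version, a host that stops early or reports out of order fails the check, while the submitter's own cost stays proportional to the number of reports rather than to the work itself.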



Published In

PPoPP '05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
June 2005
310 pages
ISBN:1595930809
DOI:10.1145/1065944
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. correctness verification
  2. cycle-sharing
  3. grid computing
  4. progress monitoring
  5. security
  6. trustworthiness

Qualifiers

  • Article

Conference

PPoPP05

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

