Availability management of distributed programs and services
Source IBM Centre for Advanced Studies Conference archive
Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative research table of contents
Toronto, Ontario, Canada
Page: 9  
Year of Publication: 1996
Author
Markus Endler  Departamento de Ciência da Computação, IME-Universidade de São Paulo, São Paulo, Brazil
Sponsors
CRSNG : Natural Sciences and Engineering Research Council of Canada
IBM Canada : IBM Canada
NRC : National Research Council - Canada
Publisher
IBM Press 

ABSTRACT

Modern distributed applications place increasing demands on their software systems for high availability, automatic management, and dynamic configuration. This paper presents the architecture of Sampa, a System for Availability Management of Process-based Applications, which aims to fulfil these requirements. The system is designed to support the management of fault-tolerant DCE-based distributed programs according to user-provided, application-specific availability specifications. It is intended to detect and automatically react to faults such as node crashes, network partitions, process crashes, and hang-ups. In this paper, we focus on the design of three of its services (the monitoring, checkpointing, and configuration management facilities) and show how they can be used to manage a generic fault-tolerant service.

