The notification based approach to implementing failure detectors in distributed systems
Source: ACM International Conference Proceeding Series; Vol. 152
Proceedings of the 1st International Conference on Scalable Information Systems
Hong Kong
Article No. 14  
Year of Publication: 2006
ISBN:1-59593-428-6
Authors
Jin Yang, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Jiannong Cao, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Weigang Wu, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Corentin Travers, IRISA, Université de Rennes, France
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1146847.1146861

ABSTRACT

The failure detector (FD) is a fundamental component of fault-tolerant computer systems. In recent years, much research has addressed the QoS and implementation of FDs for distributed computing environments, and almost all of it is based on the heartbeat approach (HBFD). In this paper, we propose a general model for implementing FDs that separates the processes to be monitored from the underlying running environment. We identify potential problems of the HBFD approach and propose an alternative approach to implementing FDs, called the notification-based FD (NTFD). Instead of having the process periodically send heartbeat messages to show that it is still alive, in NTFD the underlying watchdog mechanism sends a failure notification message only when the failure of a monitored process is detected locally. Compared with an HBFD implementation under our model, NTFD is more efficient and scalable, and it can guarantee the strong accuracy property. We analyze the trade-offs in achieving FD QoS, and the results show that NTFD has a much higher probability of achieving a better balance between completeness and accuracy, while providing a much lower probability of false reports and a lower system cost. Based on this analysis, we propose the design of a hybrid FD that combines the advantages of HBFD and NTFD.
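
The contrast between the two approaches can be illustrated with a toy, discrete-time simulation. This is only a sketch for intuition and not the authors' model or implementation: the function names, the tick-based clock, and the parameters `period`, `timeout`, and `local_check` are all hypothetical, and it assumes zero network delay, no message loss, and a local watchdog that never fails itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    messages_sent: int          # monitoring messages that crossed the network
    detected_at: Optional[int]  # tick at which the monitor suspected the process


def simulate_heartbeat(crash_tick: int, horizon: int, period: int, timeout: int) -> Result:
    """Heartbeat style (assumed model): the process sends a heartbeat every
    `period` ticks while alive; the remote monitor suspects it once more than
    `timeout` ticks pass without hearing one."""
    messages = 0
    last_heard = 0
    for tick in range(1, horizon + 1):
        alive = tick < crash_tick
        if alive and tick % period == 0:
            messages += 1
            last_heard = tick
        if tick - last_heard > timeout:
            return Result(messages, tick)
    return Result(messages, None)


def simulate_notification(crash_tick: int, horizon: int, local_check: int) -> Result:
    """Notification style (assumed model): a local watchdog inspects the
    process every `local_check` ticks and sends one failure notification
    the first time it observes the crash."""
    for tick in range(1, horizon + 1):
        if tick % local_check == 0 and tick >= crash_tick:
            return Result(1, tick)
    return Result(0, None)


if __name__ == "__main__":
    print(simulate_heartbeat(crash_tick=50, horizon=100, period=5, timeout=10))
    print(simulate_notification(crash_tick=50, horizon=100, local_check=5))
```

Under these simplifying assumptions, the heartbeat variant keeps sending messages for as long as the process is alive, whereas the notification variant sends nothing until a failure is observed locally. The sketch says nothing about what happens when the watchdog or its host fails, which is outside what this toy model covers.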

