Fast cluster failover using virtual memory-mapped communication

1 May 1999

proceedings article
Published by Association for Computing Machinery (ACM)

p. 373-382
https://doi.org/10.1145/305138.305215

Abstract

This paper proposes a novel way to use virtual memory-mapped communication (VMMC) to reduce the failover time on clusters. With the VMMC model, applications' virtual address space can be efficiently mirrored on remote memory either automatically or via explicit messages. When a machine fails, its applications can restart from the most recent checkpoints on the failover node with minimal memory copying and disk I/O overhead. This method requires little change to applications' source code. We developed two fast failover protocols: deliberate update failover protocol (DU) and automatic update failover protocol (AU). The first can run on any system that supports VMMC, whereas the other requires special network interface support. We implemented these two protocols on two different clusters that supported VMMC communication. Our results with three transaction-based applications show that both protocols work quite well. The deliberate update protocol imposes 4-21% overhead when taking checkpoints every 2 seconds. If an application can tolerate 20% overhead, this protocol can failover to another machine within 4 milliseconds in the best case and from 0.1 to 3 seconds in the worst case. The failover performance can be further improved by using special network interface hardware. The automatic update protocol is able to take checkpoints every 0.1 seconds with only 3-12% overhead. If 10% overhead is allowed, it can failover applications from 0.01 to 0.4 seconds in the worst case.
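
The checkpoint-and-restart idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: the class `MirroredMemory`, its methods, and the 4096-byte page size are all hypothetical names and values chosen for illustration, and an in-process list stands in for memory on the failover node that a real system would reach over VMMC. It loosely mimics the deliberate-update style, in which modified pages are pushed to the mirror by an explicit call, and shows that a restart sees only the state as of the most recent checkpoint.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class MirroredMemory:
    """Toy model of explicit (deliberate-update style) checkpointing.

    Hypothetical illustration only; not taken from the paper.
    """

    def __init__(self, num_pages):
        self.local = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        # Stand-in for the mirror held in the failover node's memory.
        self.remote = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty = set()  # pages modified since the last checkpoint

    def write(self, page, offset, data):
        """Modify local memory and remember which page changed."""
        self.local[page][offset:offset + len(data)] = data
        self.dirty.add(page)

    def checkpoint(self):
        """Copy only the dirty pages to the mirror; return how many were sent."""
        sent = len(self.dirty)
        for page in self.dirty:
            self.remote[page] = bytes(self.local[page])
        self.dirty.clear()
        return sent

    def failover(self):
        """Rebuild the application state from the most recent checkpoint."""
        restored = MirroredMemory(len(self.remote))
        restored.local = [bytearray(p) for p in self.remote]
        restored.remote = list(self.remote)
        return restored


mem = MirroredMemory(num_pages=4)
mem.write(1, 0, b"committed")
pages_sent = mem.checkpoint()      # page 1 reaches the mirror
mem.write(2, 0, b"lost")           # written after the checkpoint, never mirrored
backup = mem.failover()            # simulated restart on the failover node
```

In this model a shorter checkpoint interval means less work is lost at failover but more pages are copied per unit time, which is the overhead-versus-failover-time trade-off the abstract quantifies.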
