Microsoft Cluster Service (MSCS) extends the Win-dows NT operating system to support high-availability services. The goal is to offer an execution environment where off-the-shelf server applications can continue to operate, even in the presence of node failures. Later ver-sions of MSCS will provide scalability via a node and application management system that allows applications to scale to hundreds of nodes. This paper provides a de-tailed description of the MSCS architecture and the de-sign decisions that have driven the implementation of the service. The paper also describes how some major appli-cations use the MSCS features, and describes features added to make it easier to implement and manage fault-tolerant applications on MSCS.