Robustness in complex systems
- 24 August 2005
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
This paper argues that a common design paradigm for systems is fundamentally flawed, resulting in unstable, unpredictable behavior as the complexity of the system grows. In this flawed paradigm, designers carefully attempt to predict the operating environment and failure modes of the system in order to design its basic operational mechanisms. However, as a system grows in complexity, the diffuse coupling between the components in the system inevitably leads to the butterfly effect, in which small perturbations can result in large changes in behavior. We explore this in the context of distributed data structures, a scalable, cluster-based storage server. We then consider a number of design techniques that help a system to be robust in the face of the unexpected, including overprovisioning, admission control, introspection, and adaptivity through closed control loops. Ultimately, however, all complex systems eventually must contend with the unpredictable. Because of this, we believe systems should be designed to cope with failure gracefully.
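To make two of the techniques named in the abstract concrete, the following is a minimal sketch of admission control driven by a closed control loop. It is not taken from the paper: the `AdmissionController` class, the `simulate` function, the additive-increase/multiplicative-decrease policy, the latency target, and the toy latency curve are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AdmissionController:
    """Hypothetical closed-loop admission controller (illustrative only)."""
    target_latency_ms: float        # latency the operator wants to stay under
    admit_fraction: float = 1.0     # share of incoming requests accepted
    increase_step: float = 0.05     # additive increase while healthy
    decrease_factor: float = 0.5    # multiplicative decrease when overloaded
    min_fraction: float = 0.05      # never reject every request

    def observe(self, measured_latency_ms: float) -> float:
        """Feed back one latency measurement; return the new admit fraction."""
        if measured_latency_ms > self.target_latency_ms:
            # Overloaded: shed load quickly.
            self.admit_fraction = max(
                self.min_fraction, self.admit_fraction * self.decrease_factor
            )
        else:
            # Healthy: cautiously admit more traffic.
            self.admit_fraction = min(
                1.0, self.admit_fraction + self.increase_step
            )
        return self.admit_fraction


def simulate(offered_load: List[float], capacity: float = 100.0) -> List[float]:
    """Toy model (an assumption): latency rises as admitted load nears capacity."""
    controller = AdmissionController(target_latency_ms=50.0)
    fractions = []
    for load in offered_load:
        admitted = load * controller.admit_fraction
        utilization = min(admitted / capacity, 0.99)
        latency_ms = 10.0 / (1.0 - utilization)  # simple queueing-style curve
        fractions.append(controller.observe(latency_ms))
    return fractions
```

Under these assumptions, a sustained offered load of three times capacity drives the admit fraction down until measured latency falls back under the target, after which the controller probes upward again. The controller reacts to measured behavior rather than to a prediction of the operating environment, which is the property the abstract attributes to closed control loops.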
This publication has 9 references indexed in Scilit:
- ISTORE: introspective storage for data-intensive network services. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Self-monitoring and self-adapting operating systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- An adaptive query execution system for data integration. Published by Association for Computing Machinery (ACM), 1999
- Cluster I/O with River. Published by Association for Computing Machinery (ACM), 1999
- Origins of Internet routing instability. Published by Institute of Electrical and Electronics Engineers (IEEE), 1999
- Efficient mid-query re-optimization of sub-optimal query execution plans. Published by Association for Computing Machinery (ACM), 1998
- Lazy receiver processing (LRP). Published by Association for Computing Machinery (ACM), 1996
- The synchronization of periodic routing messages. IEEE/ACM Transactions on Networking, 1994
- The Essence of Chaos. Published by Taylor & Francis, 1993