Triage: performance isolation and differentiation for storage systems
- 13 November 2004
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Ensuring performance isolation and differentiation among workloads that share a storage infrastructure is a basic requirement in consolidated data centers. Existing management tools rely on resource provisioning to meet performance goals; they depend on detailed knowledge of the system characteristics and the workloads. Provisioning is inherently slow to react to system and workload dynamics, and in the general case, it is impossible to provision for the worst case. We propose a software-only solution that ensures predictable performance for storage access. It is applicable to a wide range of storage systems and makes no assumptions about workload characteristics. We use an on-line feedback loop with an adaptive controller that throttles storage access requests to ensure that the available system throughput is shared among workloads according to their performance goals and their relative importance. The controller considers the system as a "black box" and adapts automatically to system and workload changes. The controller is distributed to ensure high availability under overload conditions, and it can be used for both block and file access protocols. The evaluation of Triage, our experimental prototype, demonstrates workload isolation and differentiation in an overloaded cluster file-system where workloads and system components are changing.
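The feedback-and-throttle idea in the abstract can be illustrated with a small sketch. This is not the controller from the paper, whose design the abstract does not detail; it is a minimal illustration under stated assumptions: the system is treated as a black box characterized only by an observed latency, a single capacity estimate is adjusted multiplicatively, and throughput is handed out strictly in order of importance. All names (`update_capacity_estimate`, `allocate_caps`, the `gain` parameter, the workload tuples) are hypothetical.

```python
def update_capacity_estimate(estimate_iops, observed_latency_ms,
                             target_latency_ms, gain=0.2):
    """One feedback step on a black-box system (illustrative assumption).

    If observed latency exceeds the target, the system is presumed
    overloaded and the throughput estimate shrinks; otherwise it grows.
    """
    error = (target_latency_ms - observed_latency_ms) / target_latency_ms
    # Clamp the relative error so one noisy sample cannot swing the estimate.
    error = max(-1.0, min(1.0, error))
    return max(1.0, estimate_iops * (1.0 + gain * error))


def allocate_caps(capacity_iops, workloads):
    """Split the estimated capacity into per-workload throttle caps.

    workloads: list of (name, importance, demand_iops) tuples, where a
    higher importance value is served first (hypothetical policy).
    """
    caps = {}
    remaining = capacity_iops
    for name, _importance, demand in sorted(workloads, key=lambda w: -w[1]):
        caps[name] = min(demand, remaining)
        remaining -= caps[name]
    return caps


# Example: latency is twice the target, so the estimate is reduced and
# the less important workload absorbs the shortfall.
capacity = update_capacity_estimate(1000.0, observed_latency_ms=20.0,
                                    target_latency_ms=10.0)
caps = allocate_caps(capacity, [("batch", 1, 600.0), ("oltp", 2, 500.0)])
```

In this sketch the more important workload keeps its full demand while the other is throttled, which is one simple way to express differentiation under overload; a real controller would repeat the two steps every sampling interval.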
This publication has 13 references indexed in Scilit:
- Triage: performance isolation and differentiation for storage systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Performance virtualization for large-scale storage systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Analysis and design of admission control in Web-server systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- End-to-end utilization control in distributed real-time systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- CacheCOW. Published by Association for Computing Machinery (ACM), 2003
- Managing Web server performance with AutoTune agents. IBM Systems Journal, 2003
- Performance guarantees for Web server end-systems: a control-theoretical approach. IEEE Transactions on Parallel and Distributed Systems, 2002
- Using Control Theory to Achieve Service Level Objectives In Performance Management. Real-Time Systems, 2002
- Using fuzzy control to maximize profits in service level management. IBM Systems Journal, 2002
- NFS Version 3 Protocol Specification. Published by RFC Editor, 1995