Abstract
In the design and operation of service systems, it is important to determine an appropriate level of server utilization (the proportion of time each server should be working). In a multi-server queue with unlimited waiting space, the appropriate server utilization typically increases as the number of servers (and the arrival rate) increases. We explain this economy of scale and give a rough quantitative characterization. We also show how increased variability in the arrival and service processes tends to reduce server utilization with a given grade of service. As part of this analysis, we develop simple approximations for the mean steady-state waiting time and the full steady-state waiting-time distribution. These approximations exploit an infinite-server approximation for the probability of delay and a single-server approximation for the conditional waiting-time distribution given that waiting occurs. The emphasis is on simple formulas that directly convey understanding.
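The economy of scale described above can be illustrated numerically in the simplest special case. The following is a minimal sketch, assuming an M/M/s queue (Poisson arrivals, exponential service times), for which the exact Erlang C formula is available. It is not the paper's infinite-server or single-server approximation, and the function names and the 20% delay-probability target are hypothetical choices made only for illustration. In this special case the mean wait factors exactly into the probability of delay times the mean conditional wait given that waiting occurs, which mirrors the decomposition the abstract describes.

```python
import math


def erlang_c(s: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/s queue (Erlang C).

    offered_load = arrival rate * mean service time; stability needs offered_load < s.
    """
    if offered_load >= s:
        return 1.0
    # Erlang B blocking probability via the standard numerically stable recursion.
    b = 1.0
    for k in range(1, s + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / s  # server utilization
    # Convert Erlang B to Erlang C.
    return b / (1.0 - rho * (1.0 - b))


def mean_wait(s: int, offered_load: float, mean_service_time: float = 1.0) -> float:
    """Mean steady-state wait in queue for M/M/s: P(delay) * E[wait | delay]."""
    rho = offered_load / s
    conditional_mean = mean_service_time / (s * (1.0 - rho))
    return erlang_c(s, offered_load) * conditional_mean


def max_utilization(s: int, target_delay_prob: float) -> float:
    """Largest utilization whose delay probability stays at or below the target.

    Uses bisection, relying on Erlang C being increasing in the utilization.
    """
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if erlang_c(s, mid * s) <= target_delay_prob:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Hypothetical grade of service: at most 20% of arrivals are delayed.
    for servers in (1, 10, 100, 1000):
        rho = max_utilization(servers, 0.2)
        print(f"s={servers:5d}  max utilization={rho:.3f}")
```

Running the script prints the highest utilization that meets the same delay-probability target for each system size. Under these assumptions the value rises toward 1 as the number of servers grows, which is the qualitative effect the abstract refers to.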
