Computation-at-risk: employing the grid for computational risk management
- 6 April 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
This work expands on our earlier work on the concept of computation-at-risk (CaR). CaR refers to the risk that certain computations may not be completed in a timely manner. We examine a number of CaR distributions on several large clusters. The main contribution of this work is to show that CaR-reducing strategies exist and that, by employing them, a facility can significantly reduce the risk of inefficient resource utilization. Grids are shown to be one means of employing a CaR-reducing strategy. For example, we show that a CaR-reducing strategy applied to a common queue can have a dramatic effect on the wait times for jobs on a grid of clusters. In particular, we define a CaR Sharpe rule, a decision rule for determining the best machine in a grid on which to place a new job.
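The abstract does not give the formula behind the CaR Sharpe rule, so the following is only an illustrative sketch, not the authors' method. It assumes, by analogy with the financial Sharpe ratio, that each machine is scored as (baseline wait minus mean observed wait) divided by the standard deviation of observed waits, and that a new job goes to the machine with the highest score. The names `wait_time_quantile`, `sharpe_like_score`, `pick_machine`, the `baseline` parameter, and the sample wait times are all hypothetical.

```python
import math
from statistics import mean, stdev


def wait_time_quantile(wait_times, level=0.95):
    """Empirical quantile of observed wait times (nearest-rank method).

    Assumption: a value-at-risk style summary of a wait-time sample.
    """
    ordered = sorted(wait_times)
    # Nearest rank: smallest value with at least `level` of samples at or below it.
    rank = max(1, math.ceil(level * len(ordered)))
    return ordered[rank - 1]


def sharpe_like_score(wait_times, baseline):
    """Assumed Sharpe-style score: (baseline - mean wait) / stdev of waits.

    Higher is better: short waits on average, with little variability.
    """
    spread = stdev(wait_times)
    if spread == 0:
        raise ValueError("wait times have zero variance; score is undefined")
    return (baseline - mean(wait_times)) / spread


def pick_machine(history, baseline):
    """Return the machine name whose wait-time history scores highest."""
    return max(history, key=lambda name: sharpe_like_score(history[name], baseline))


# Hypothetical wait-time samples (minutes) for three clusters in a grid.
history = {
    "cluster_a": [10, 12, 11, 13, 14],  # short, steady waits
    "cluster_b": [5, 30, 2, 40, 8],     # sometimes fast, highly variable
    "cluster_c": [20, 21, 19, 22, 18],  # steady but slow
}

best = pick_machine(history, baseline=25.0)
print(best)
print(wait_time_quantile(history["cluster_b"], 0.95))
```

Under these assumptions the steady, short-wait cluster wins even though the variable cluster sometimes starts jobs sooner, which is the kind of risk-adjusted trade-off a Sharpe-style rule is meant to capture.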
This publication has 9 references indexed in Scilit:
- Computation-at-risk: assessing job portfolio management risk on clusters. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Quelling queue storms. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Price-at-Risk: A methodology for pricing utility computing services. IBM Systems Journal, 2004
- Hierarchical Dynamics, Interarrival Times, and Performance. Published by Association for Computing Machinery (ACM), 2003
- Fair share on high performance computing systems: what does fair really mean? Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Adaptive Selection of Partition Size for Supercomputer Requests. Published by Springer Nature, 2000
- The EASY — LoadLeveler API project. Published by Springer Nature, 1996
- A fair share scheduler. Communications of the ACM, 1988
- A Fast Fractional Gaussian Noise Generator. Water Resources Research, 1971