Estimating the progress of MapReduce pipelines
- 1 January 2010
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636382, p. 681-684
- https://doi.org/10.1109/icde.2010.5447919
Abstract
In parallel query-processing environments, accurate, time-oriented progress indicators could provide much utility given that inter- and intra-query execution times can have high variance. However, none of the techniques used by existing tools or available in the literature provide non-trivial progress estimation for parallel queries. In this paper, we introduce Parallax, the first such indicator. While several parallel data processing systems exist, the work in this paper targets environments where queries consist of a series of MapReduce jobs. Parallax builds on recently-developed techniques for estimating the progress of single-site SQL queries, but focuses on the challenges related to parallelism and variable execution speeds. We have implemented our estimator in the Pig system and demonstrate its performance through experiments with the PigMix benchmark and other queries running in a real, small-scale cluster.
This publication has 9 references indexed in Scilit:
- Pig latin. Published by Association for Computing Machinery (ACM), 2008
- ConEx. Published by Association for Computing Machinery (ACM), 2007
- A Lightweight Online Framework For Query Progress Indicators. Published by Institute of Electrical and Electronics Engineers (IEEE), 2007
- Dryad. Published by Association for Computing Machinery (ACM), 2007
- When can we trust progress estimators for SQL queries? Published by Association for Computing Machinery (ACM), 2005
- Estimating progress of execution for SQL queries. Published by Association for Computing Machinery (ACM), 2004
- Toward a progress indicator for database queries. Published by Association for Computing Machinery (ACM), 2004
- Online aggregation. Published by Association for Computing Machinery (ACM), 1997
- The importance of percent-done progress indicators for computer-human interfaces. Published by Association for Computing Machinery (ACM), 1985