Flux: an adaptive partitioning operator for continuous query systems
- 6 May 2004
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
The long-running nature of continuous queries coupled with their high scalability requirements poses new challanges for dataflow processing. CQ systems execute pipelined dataflows that are shared across multiple queries and whose scalability is limited by their constituent, stateful operators -- e.g. a windowed groupby-aggregate. To scale such operators, a natural solution is to partition them across a shared-nothing platform. But in the CQ context, traditional, static techniques for partitioned parallelism can exhibit detrimental imbalances as workload and runtime conditions evolve. Long-running CQ dataflows must continue to function robustly in the face of these imbalances. To address this challenge, we introduce a dataflow operator called Flux that encapsulates adaptive state partitioning and dataflow routing. Flux is placed between producer-consumer stages in a dataflow pipeline to repartition stateful operators while the pipeline is still executing. We present the Flux architecture, along with repartitioning policies that can be used for CQ operators under shifting processing and memory loads. We show that the Flux mechanism and these policies can provide several factors improvement in throughput, and orders of magnitude improvement in average latency over the static case.Keywords
This publication has 18 references indexed in Scilit:
- Dynamic and load-balanced task-oriented database query processing in parallel systemsPublished by Springer Nature ,2005
- Dynamic query re-optimizationPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2003
- Parallel sorting on a shared-nothing architecture using probabilistic splittingPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Continuously adaptive continuous queries over streamsPublished by Association for Computing Machinery (ACM) ,2002
- EddiesPublished by Association for Computing Machinery (ACM) ,2000
- Cluster I/O with RiverPublished by Association for Computing Machinery (ACM) ,1999
- AlphaSortACM SIGMOD Record, 1994
- Query evaluation techniques for large databasesACM Computing Surveys, 1993
- Encapsulation of parallelism in the Volcano query processing systemPublished by Association for Computing Machinery (ACM) ,1990
- A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environmentACM SIGMOD Record, 1989