Chain
- 9 June 2003
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 23, 253-264
- https://doi.org/10.1145/872757.872789
Abstract
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment must be prepared to deal gracefully with bursts in data arrival without compromising system performance. We discuss one strategy for processing bursty streams: adaptive, load-aware scheduling of query operators to minimize resource consumption during times of peak load. We show that the choice of an operator scheduling strategy can have significant impact on the run-time system memory usage. We then present Chain scheduling, an operator scheduling strategy for data stream systems that is near-optimal in minimizing run-time memory usage for any collection of single-stream queries involving selections, projections, and foreign-key joins with stored relations. Chain scheduling also performs well for queries with sliding-window joins over multiple streams, and multiple queries of the above types. A thorough experimental evaluation is provided where we demonstrate the potential benefits of Chain scheduling, compare it with competing scheduling strategies, and validate our analytical conclusions.
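The abstract's claim that the scheduling strategy affects run-time memory can be illustrated with a toy simulation. The sketch below is not the Chain algorithm itself. It assumes a hypothetical two-operator pipeline in which each operator needs one time unit per tuple, the first operator shrinks a unit-size tuple to 0.2, the second removes it from the system, and one tuple arrives per time unit during a burst. The function name `simulate`, the selectivity value, and the arrival pattern are all illustrative assumptions rather than details taken from the abstract.

```python
def simulate(prefer_first_operator, n_arrivals=5, selectivity=0.2):
    """Return the total queued tuple size observed at each arrival instant.

    Assumed toy model: two pipelined operators, one time unit of work per
    tuple, unit-size input tuples, and a burst of one arrival per time unit.
    """
    q1 = 0  # unit-size tuples waiting for the first operator
    q2 = 0  # shrunken tuples (size = selectivity) waiting for the second
    memory = []
    for _ in range(n_arrivals):
        q1 += 1  # a new tuple arrives at the start of the time unit
        memory.append(round(q1 * 1.0 + q2 * selectivity, 2))
        # Spend this time unit on exactly one operator.
        # prefer_first_operator=True: always favor the operator that
        #   frees the most memory per unit of time (the first one here).
        # prefer_first_operator=False: finish the oldest tuple first,
        #   which is FIFO order in this two-operator model.
        run_first = q1 > 0 and (prefer_first_operator or q2 == 0)
        if run_first:
            q1 -= 1
            q2 += 1
        elif q2 > 0:
            q2 -= 1
    return memory


fifo = simulate(prefer_first_operator=False)
greedy = simulate(prefer_first_operator=True)
print("FIFO  :", fifo)    # [1.0, 1.2, 2.0, 2.2, 3.0]
print("Greedy:", greedy)  # [1.0, 1.2, 1.4, 1.6, 1.8]
```

Under these assumptions both policies do the same amount of work per time unit, yet the policy that favors the memory-reducing operator holds noticeably less data in its queues by the end of the burst.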