Practical lineage tracing in data warehouses
- 7 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 367-378
- https://doi.org/10.1109/icde.2000.839437
Abstract
We consider the view data lineage problem in a warehousing environment: for a given data item in a materialized warehouse view, we want to identify the set of source data items that produced the view item. We formalize the problem and we present a lineage tracing algorithm for relational views with aggregation. Based on our tracing algorithm, we propose a number of schemes for storing auxiliary views that enable consistent and efficient lineage tracing in a multi-source data warehouse. We report on a performance study of the various schemes, identifying which schemes perform best in which settings. Based on our results, we have implemented a lineage tracing package in the WHIPS data warehousing system prototype at Stanford. With this package, users can select view tuples of interest, then efficiently "drill through" to examine the exact source tuples that produced the view tuples of interest.
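To make the problem statement concrete, here is a minimal Python sketch of lineage for a single select-and-aggregate view. It is not the paper's tracing algorithm and does not model auxiliary views or multiple sources; the table `SALES`, the functions `build_view` and `trace_lineage`, and the `min_amount` filter are all hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical source table of (store, item, amount) rows; not taken from the paper.
SALES = [
    ("north", "pen", 5),
    ("north", "ink", 7),
    ("south", "pen", 3),
    ("south", "pad", 0),
    ("north", "pad", 2),
]


def build_view(rows, min_amount=1):
    """Materialize the view:
    SELECT store, SUM(amount) FROM rows WHERE amount >= min_amount GROUP BY store
    """
    totals = defaultdict(int)
    for store, _item, amount in rows:
        if amount >= min_amount:
            totals[store] += amount
    return sorted(totals.items())


def trace_lineage(view_tuple, rows, min_amount=1):
    """Return the source rows that produced one view tuple.

    For this simple view, those are the rows that pass the selection
    and fall into the view tuple's group.
    """
    store, _total = view_tuple
    return [r for r in rows if r[0] == store and r[2] >= min_amount]


view = build_view(SALES)
lineage = trace_lineage(("north", 14), SALES)
```

Here the view tuple for `north` is traced back to the three `north` rows that pass the filter, while the `south` row with amount 0 belongs to no view tuple's lineage. In this toy setting the source table is always available for re-querying; the storage schemes the abstract mentions address the harder case where sources are remote or may have changed.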