Abstract
A document type definition (DTD) D defines the structure of elements permitted in any web document valid with respect to D. From a given DTD D we show how to derive a number of simple structural constraints which are implied by D. Using a relational abstraction of web databases, we consider a class of conjunctive queries which retrieve elements from web documents stored in a database. For simplicity, we assume that all documents in the database are valid with respect to the same DTD D. The main contribution of the paper is the use of the constraints derived from D to optimise conjunctive queries on the database by removing redundant conjuncts. The relational abstraction allows us to show that the constraints derived from a DTD are equivalent to tuple-generating and equality-generating dependencies which hold on the database. Having done so, we can use the chase algorithm to show equivalence between a query and its reduced form.
