An algebra for structured office documents
- 1 April 1989
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 7 (2), 123-157
- https://doi.org/10.1145/65935.65939
Abstract
We describe a data model for structured office information objects, which we generically call “documents,” and a practically useful algebraic language for the retrieval and manipulation of such objects. Documents are viewed as hierarchical structures; their layout (presentation) aspect is to be treated separately. The syntax and semantics of the language are defined precisely in terms of the formal model, an extended relational algebra.
The proposed approach has several new features, some of which are particularly useful for the management of office information. The data model is based on nested sequences of tuples rather than nested relations. Therefore, sorting and sequence operations and the explicit handling of duplicates can be described by the model. Furthermore, this is the first model based on a many-sorted instead of a one-sorted algebra, which means that atomic data values as well as nested structures are objects of the algebra. As a consequence, arithmetic operations, aggregate functions, and so forth can be treated inside the model and need not be introduced as query language extensions to the model. Many-sorted algebra also allows arbitrary algebra expressions (with Boolean result) to be admitted as selection or join conditions and the results of arbitrary expressions to be embedded into tuples. In contrast to other formal models, this algebra can be used directly as a rich query language for office documents with precisely defined semantics.
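The ideas in the abstract can be illustrated with a small sketch. It is not the paper's algebra or notation: the function names (`select`, `sort_by`, `extend`, `total`) and the sample order data are hypothetical, chosen only to show nested sequences of tuples, order and duplicate preservation, an aggregate used inside a selection condition, and an expression result embedded into a tuple.

```python
from typing import Any, Callable

# Assumption: a tuple is modelled as a dict and a sequence of tuples as a list.
# Attribute values may be atomic or may themselves be sequences (nesting).
Tuple_ = dict[str, Any]
Seq = list[Tuple_]

def select(seq: Seq, cond: Callable[[Tuple_], bool]) -> Seq:
    """Keep the tuples satisfying cond; order and duplicates are preserved."""
    return [t for t in seq if cond(t)]

def sort_by(seq: Seq, key: str) -> Seq:
    """Stable sort of a sequence on one attribute."""
    return sorted(seq, key=lambda t: t[key])

def extend(seq: Seq, name: str, expr: Callable[[Tuple_], Any]) -> Seq:
    """Embed the result of an arbitrary expression into each tuple."""
    return [{**t, name: expr(t)} for t in seq]

def total(seq: Seq, key: str) -> float:
    """Aggregate function: sum of one attribute over a sequence."""
    return sum(t[key] for t in seq)

# Hypothetical office documents: orders with a nested sequence of line items.
# The second order contains a duplicate line item on purpose.
orders: Seq = [
    {"id": 2, "customer": "Baker", "items": [
        {"part": "pen", "qty": 10, "price": 1.5},
        {"part": "ink", "qty": 2, "price": 4.0},
    ]},
    {"id": 1, "customer": "Adams", "items": [
        {"part": "pen", "qty": 3, "price": 1.5},
        {"part": "pen", "qty": 3, "price": 1.5},
    ]},
]

def order_value(order: Tuple_) -> float:
    # Arithmetic inside the model: compute an amount per line, then aggregate.
    lines = extend(order["items"], "amount", lambda i: i["qty"] * i["price"])
    return total(lines, "amount")

# Embed a computed value into each order tuple.
with_value = extend(orders, "value", order_value)

# Selection whose condition is itself an aggregate expression with Boolean result.
large = select(with_value, lambda o: total(o["items"], "qty") > 10)

# Sequence operation: reorder the documents by id.
result = sort_by(with_value, "id")
```

Because sequences are plain lists here, the duplicate line item in the second order is counted twice by `total`, which a set-based nested relation would not do.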