Data patterns in multiple botanical descriptions: Implications for automatic processing of legacy data
- 1 August 2003
- journal article
- Published by Taylor & Francis in Systematics and Biodiversity
- Vol. 1 (2), 151-157
- https://doi.org/10.1017/s1477200003001129
Abstract
An analysis of conventional paper‐based botanical descriptions from Floras was undertaken as part of the development of MultiFlora, a system for the automatic production of a queryable database from such legacy data. The descriptions of five species of Ranunculus L. (buttercups and crowfoots) in six different English language Floras, from Europe and North America, show a surprising lack of uniformity in the suite of properties described. There is also considerable variation in the way property states are recorded. These findings have implications for the automatic production of taxonomic databases. This study is a proof of concept exercise, in which the taxa used are of negligible importance in themselves.