Data diving with cross‐validation: an investigation of broad‐scale gradients in Swedish weed communities
- 24 December 1999
- journal article
- research article
- Published by Wiley in Journal of Ecology
- Vol. 87 (6) , 1037-1051
- https://doi.org/10.1046/j.1365-2745.1999.00413.x
Abstract
Summary:
1 Multivariate analysis of complex data sets is plagued by problems of subjectivity and of finding statistically valid ways to test a large number of plausible hypotheses. We show how patterns in the data can be identified (data diving) as well as rigorously tested statistically by subdividing the data set.
2 We analysed data on weed biomass and environmental variables from more than 2000 plots in cereal and oil‐seed crops in Sweden during 1970–94. Half the data set was used in an exploratory phase while the other half was used in a subsequent confirmatory phase.
3 The exploratory analyses included multivariate statistics [detrended correspondence analysis (DCA) and canonical correspondence analysis (CCA)] with various options and combinations of variables, and led to the formation of hypotheses that were then tested.
4 We tested the hypotheses in a sequential analysis with CCA and Monte Carlo permutation tests: after establishing the influence of one set of environmental variables, this set was covaried out in subsequent analyses. In this way the influence of (i) season of sowing of the crop; (ii) geographical region; (iii) soil type; (iv) crop species; and (v) temporal trends was tested. The four latter were tested separately for spring‐ and autumn‐sown crops.
5 The sowing season of the crop had an overwhelming influence on the weed flora, and many weed species, both annual and perennial, showed strong associations with either autumn or spring. There were significant differences in weed flora composition between the geographical regions and soil types as well as between crop species. There were significant temporal trends only in the weed flora of autumn‐sown crops.
6 This study provides a protocol that combines exploratory ‘data diving’ with strict hypothesis testing using direct gradient analysis methods such as CCA. Such two‐phase analysis should improve the way complex data are analysed and patterns are interpreted.
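The two‐phase idea in the summary (explore one random half of the plots, then test the resulting hypotheses on the untouched half with a Monte Carlo permutation test) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the study used CCA, whereas the sketch substitutes a much simpler univariate statistic (between‐group sum of squares of a single response), and it does not implement the step of covarying out earlier variable sets. The function names, the toy biomass values and the group labels are all hypothetical.

```python
import random


def split_half(plots, seed=0):
    """Randomly divide plots into an exploratory and a confirmatory half."""
    rng = random.Random(seed)
    shuffled = list(plots)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def between_group_ss(values, labels):
    """Between-group sum of squares (a simple stand-in for a CCA test statistic)."""
    grand_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in groups.values()
    )


def permutation_p_value(values, labels, n_perm=999, seed=0):
    """Monte Carlo permutation test: shuffle labels, recompute the statistic."""
    rng = random.Random(seed)
    observed = between_group_ss(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_ss(values, shuffled) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: (biomass, sowing season) for 40 plots.
toy_plots = [(1.0 + 0.1 * i, "spring") for i in range(20)] + [
    (5.0 + 0.1 * i, "autumn") for i in range(20)
]
exploratory, confirmatory = split_half(toy_plots, seed=1)

# Hypotheses are formed on `exploratory` only; the test uses `confirmatory` only.
biomass = [b for b, _ in confirmatory]
season = [s for _, s in confirmatory]
p_value = permutation_p_value(biomass, season)
```

Keeping the confirmatory half out of the exploratory phase is what allows the permutation p-value to be read as a genuine test rather than a description of a pattern already seen in the same data.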