Question: The heterogeneous origin of the data in large phytosociological databases may seriously influence the results of their analysis. We therefore propose strategies for stratified resampling of such databases, which may improve the representativeness of the data. We also explore the effects of different resampling options on vegetation classification.
Methods: We used 6050 plot samples (relevés) of mesic grasslands from the Czech Republic. We stratified this database using (1) geographical stratification in a grid; (2) habitat stratification created by an overlay of digital maps in GIS; (3) habitat stratification with strata defined by traditional phytosociological associations; (4) habitat stratification by numerical classification; and (5) habitat stratification by Ellenberg indicator values. For each stratification, we resampled the database by taking an equal number of relevés per stratum. We then carried out cluster analyses of the resampled data sets and compared the resulting classifications using a newly developed procedure.
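The resampling step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the grid cell size of 0.25 degrees, the record fields (`id`, `lon`, `lat`) and the rule of taking all relevés from a stratum that holds fewer than the requested number are assumptions made for illustration only, since the abstract does not state them.

```python
import math
import random
from collections import defaultdict


def grid_stratum(lon, lat, cell_deg=0.25):
    """Hypothetical geographical stratum: index of the grid cell containing a point.

    The cell size is an assumed value, not one taken from the study.
    """
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def stratified_resample(releves, stratum_of, n_per_stratum, seed=0):
    """Draw up to n_per_stratum relevés at random from every stratum.

    Assumption: a stratum with fewer than n_per_stratum relevés
    contributes all of its relevés.
    """
    rng = random.Random(seed)  # seeded so that the draw is reproducible

    # Group the relevés by the stratum each one falls into.
    strata = defaultdict(list)
    for releve in releves:
        strata[stratum_of(releve)].append(releve)

    # Sample without replacement within each stratum.
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data: three relevés in one grid cell, one relevé in another.
toy_releves = [
    {"id": 1, "lon": 14.10, "lat": 50.05},
    {"id": 2, "lon": 14.12, "lat": 50.06},
    {"id": 3, "lon": 14.20, "lat": 50.10},
    {"id": 4, "lon": 16.60, "lat": 49.20},
]


def by_cell(releve):
    return grid_stratum(releve["lon"], releve["lat"])


resampled = stratified_resample(toy_releves, by_cell, n_per_stratum=2)
```

The same function covers the habitat stratifications if `stratum_of` returns a habitat label (for example an association name or a class of Ellenberg indicator values) instead of a grid cell, which is why the stratum definition is passed in as a parameter rather than built into the resampling routine.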
Results: Random resampling of the initial data set and geographically stratified resampling resulted in similar classifications. By contrast, classifications of the resampled data sets that were based on habitat stratifications (2–5) differed from each other and from that of the initial data set. Stratification 2 resulted in classifications that strongly reflected environmental factors with a coarse grain of spatial heterogeneity (e.g. macroclimate), whereas stratification 5 resulted in classifications emphasizing fine-grained factors (e.g. soil nutrient status). Stratification 3 led to the most divergent results, possibly due to the subjective nature of the traditional phytosociological classifications.
Conclusions: Stratified resampling may increase the representativeness of phytosociological data sets, but different types of stratification may result in different classifications. No single resampling strategy is optimal: the appropriate stratification method must be selected according to the objectives of the specific study.
Abbreviations: ASS = Phytosociological association; ELL = Ellenberg indicator values; GEO = Geographical stratification; GIS = Geographical information system; NUM = Numerical classification; RAN = Random resampling.