Species response curves in JUICE
David Zelený & Lubomír Tichý
Institute of Botany and Zoology, Masaryk University Brno
Warning: This function in JUICE is no longer maintained. Please refer instead to the analogous JUICE-R function for calculation of species response curves.
Determining species response along a studied gradient is one of the basic tasks in ecology. A response curve allows estimation of the species optimum and niche width (tolerance), and thus identification of a species as a generalist or a specialist. Most widely used statistical methods assume that species response along a gradient has the symmetrical bell shape of a Gaussian curve, even though a number of studies have shown that this type of response occurs quite rarely in real data. Several methods for modeling asymmetric species response curves were discussed by Oksanen & Minchin (2002a); this study, together with the detailed technical description in Oksanen & Minchin (2002b), served as the basis for the routine built into JUICE, with calculations run externally in R.
Methods for modeling response curves available in JUICE include:
1) Bell-shaped (GAUSSian) curve with traditionally symmetrical shape,
2) Generalized linear models (GLM) with polynomials of 1st to 3rd degree,
3) Generalized additive models (GAM) with optional degrees of freedom 3 to 5,
4) Huisman-Olff-Fresco models (HOF) - hierarchical set of five models with increasing complexity.
The bell-shaped curve is included mainly for reference, to show what the response shape would look like under a Gaussian model. The other three options allow quite flexible expression of different response curve shapes, each with its own constraints, advantages and disadvantages. GLMs offer curves that are completely described by an equation with a given number of parameters; however, the shape of the resulting response curve is quite often inappropriate or unrealistic. GAMs are more flexible in terms of curve shape, but they are non-parametric, so the curve cannot be easily expressed as an equation. Perhaps the best option from an ecological point of view is to use HOF models, which are designed around the demands one would expect of species response curves. For example, they express only a unimodal response shape, which is usually the most appropriate when searching for the species optimum on a gradient (GLM and GAM with a low polynomial degree or low number of degrees of freedom, respectively, cannot really produce a bimodal response shape, but can easily produce a 'semi-bimodal' shape with unclear interpretation - see Fig. 3).
Species data are reduced to presence-absence only, even though all methods can also handle percentage or otherwise transformed data. There are two reasons for this. First, curves fitted to presence-absence data are 'prettier', with shapes that are more straightforward to interpret. Second, removing information about dominance from the species data makes the interpretation of the resulting response curve clearer. Species abundance (or cover) is affected by complex factors, including competitive relations, species morphology and other biotic aspects, which together are not necessarily easy to interpret; these factors are at least partly removed when the presence-absence transformation is used (Austin, 2002).
Like any model, response curves are just a simplification of reality, and their shape depends strongly on the available dataset. One of the main assumptions, which need not match reality, is the unimodal type of species response, meaning that the species has only one optimum along the tested gradient. Even though a bimodal response could have an interesting and meaningful interpretation, it makes determination of the optimum and tolerance tricky and needs to be evaluated individually for each species. Therefore, only unimodal or monotone species response is considered here.
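As a minimal illustration of the presence-absence reduction described above, the following Python sketch converts cover values to 0/1. The function name `to_presence_absence` and the `threshold` parameter are hypothetical names chosen for this example and are not part of JUICE.

```python
def to_presence_absence(cover_values, threshold=0.0):
    """Return 1 where cover exceeds the threshold, otherwise 0.

    `threshold` is an illustrative assumption; with the default of 0.0,
    any positive cover counts as presence.
    """
    return [1 if cover > threshold else 0 for cover in cover_values]


# Hypothetical percentage covers of one species in five plots.
cover = [0.0, 2.5, 40.0, 0.0, 0.5]
presence = to_presence_absence(cover)
```

After the transformation, a plot with 0.5 % cover and a plot with 40 % cover carry the same information, which is the intended removal of dominance effects.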
Technical notes on particular modeling strategies:
1) The bell-shaped response curve is not based on the classical Gaussian equation, but on a simplified equivalent polynomial model (ter Braak & Looman, 1986; Oksanen & Minchin, 2002a), which can be easily fitted using generalized linear models (with the logit link function for binomial presence-absence data) and gives results close to a real Gaussian curve.
2) Generalized linear models (GLM) are included in the standard R distribution. Available models are linear, quadratic and cubic (polynomial degree 1, 2 and 3, respectively), as a higher polynomial degree would result in a response shape with more than one mode. The logit link function is used; the model can be selected manually or automatically using the AIC criterion, which selects the model with the lowest AIC value (deviance penalized by the number of parameters).
3) Generalized additive models (GAM) are included in the 'mgcv' library, also available for R. Models with fixed degrees of freedom 3, 4 and 5 are available; automatic selection is also based on the AIC criterion.
4) Huisman-Olff-Fresco models (HOF) include a hierarchical set of five models: model I - flat with no response, II - monotone increasing, III - monotone increasing with a 'plateau', IV - symmetric unimodal and V - asymmetric unimodal response. Four parameters are to be estimated; this is done using the non-linear maximum likelihood estimation procedure described in Oksanen & Minchin (2002b) and further developed by Jari Oksanen in the 'gravy' package. However, the 'gravy' package in some cases gives ecologically unrealistic responses, and we therefore applied some corrections, described in more detail in 'Selfish' HOF response curves.
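Two of the notes above can be illustrated with a short Python sketch. It is not the code used by JUICE or by the 'gravy' package. The first function uses the standard relation between a quadratic logit model, logit(p) = b0 + b1*x + b2*x^2, and the optimum, tolerance and peak probability of the equivalent Gaussian-type curve. The second function evaluates the five HOF curves using the parameterization commonly given for them (parameters a, b, c, d, a maximum M, and a gradient x rescaled to the interval 0 to 1); this parameterization is an assumption here, since the text above does not spell it out, and all function and argument names are hypothetical.

```python
import math


def gaussian_logit_params(b0, b1, b2):
    """Optimum, tolerance and peak probability of logit(p) = b0 + b1*x + b2*x**2."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a bell-shaped (unimodal) curve")
    optimum = -b1 / (2.0 * b2)              # gradient value at the peak
    tolerance = 1.0 / math.sqrt(-2.0 * b2)  # width parameter of the bell
    peak_logit = b0 - b1 ** 2 / (4.0 * b2)  # linear predictor at the optimum
    peak = 1.0 / (1.0 + math.exp(-peak_logit))
    return optimum, tolerance, peak


def hof(x, model, a=0.0, b=0.0, c=0.0, d=0.0, M=1.0):
    """Expected response of HOF model I-V at gradient value x (assumed scaled to 0-1)."""
    if model == "I":    # flat, no response
        return M / (1.0 + math.exp(a))
    if model == "II":   # monotone
        return M / (1.0 + math.exp(a + b * x))
    if model == "III":  # monotone with plateau
        return M / ((1.0 + math.exp(a + b * x)) * (1.0 + math.exp(c)))
    if model == "IV":   # symmetric unimodal (both slopes share b)
        return M / ((1.0 + math.exp(a + b * x)) * (1.0 + math.exp(c - b * x)))
    if model == "V":    # asymmetric unimodal (separate slopes b and d)
        return M / ((1.0 + math.exp(a + b * x)) * (1.0 + math.exp(c - d * x)))
    raise ValueError("model must be one of 'I', 'II', 'III', 'IV', 'V'")


# Illustrative coefficients only, not fitted to any data.
optimum, tolerance, peak = gaussian_logit_params(-4.0, 8.0, -4.0)
curve_iv = [hof(i / 10.0, "IV", a=-2.0, b=10.0, c=8.0) for i in range(11)]
```

The hierarchy is visible in the code: each simpler model is obtained from model V by constraining or dropping parameters, which is what makes selection among the five models straightforward.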
How species optimum and tolerance are calculated
The optimum is simply the gradient value at which the species has the highest probability of occurrence according to the particular model. If the response curve is monotone, the species optimum is identical with the lowest or highest value of the gradient. In the case of HOF models, where flat segments of the curve can occur (models I and III), the optimum is located in the middle of the flat segment (in the case of model I, the optimum is simply the midpoint of the available gradient values).
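The rule for locating the optimum can be sketched in Python as follows, assuming the fitted curve has been evaluated on a grid of gradient values. The function name `find_optimum` and the tolerance `eps` used to decide which points count as part of a flat segment are assumptions made for this example, not details taken from JUICE.

```python
def find_optimum(gradient, predicted, eps=1e-9):
    """Return the gradient value with the highest predicted probability.

    All grid points whose prediction lies within `eps` of the maximum are
    treated as one flat segment, and the midpoint of that segment is returned.
    A unimodal or monotone curve is assumed, so these points are contiguous.
    """
    peak = max(predicted)
    at_peak = [x for x, p in zip(gradient, predicted) if peak - p <= eps]
    return (min(at_peak) + max(at_peak)) / 2.0


grid = [0.0, 1.0, 2.0, 3.0, 4.0]
flat_opt = find_optimum(grid, [0.3, 0.3, 0.3, 0.3, 0.3])      # flat curve (model I)
monotone_opt = find_optimum(grid, [0.1, 0.2, 0.3, 0.4, 0.5])  # monotone curve
plateau_opt = find_optimum(grid, [0.1, 0.3, 0.6, 0.6, 0.6])   # curve with plateau
```

For the flat curve the result is the midpoint of the grid, for the monotone curve it is the end of the gradient, and for the plateau it is the middle of the flat segment.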
Tolerance is determined similarly to the method used in Schröder et al. (2005): it is represented by that part of the gradient where the predicted probability of species occurrence is higher than half of the maximum predicted probability.
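A minimal Python sketch of this half-maximum rule, again assuming predictions on a grid of gradient values, is given below. The function name `tolerance_range` is hypothetical, and the precision of the returned limits is bounded by the grid step; this is an illustration of the rule, not the routine used in JUICE.

```python
def tolerance_range(gradient, predicted):
    """Return (lower, upper) gradient limits where prediction > half of the maximum.

    Returns None if the maximum predicted probability is not positive.
    """
    half_max = max(predicted) / 2.0
    above = [x for x, p in zip(gradient, predicted) if p > half_max]
    if not above:
        return None
    return min(above), max(above)


# Hypothetical predicted probabilities along a gradient sampled at 9 points.
grid = [0, 1, 2, 3, 4, 5, 6, 7, 8]
probs = [0.0, 0.1, 0.3, 0.6, 0.8, 0.6, 0.3, 0.1, 0.0]
limits = tolerance_range(grid, probs)
```

Here the maximum is 0.8, so the tolerance covers the grid points with predicted probability above 0.4.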
How to install
Response curves are calculated and drawn in the R software environment (R Development Core Team, 2005), which runs under the complete control of JUICE (Tichý, 2002). Therefore, in addition to the newest version of JUICE, you also need to install R on your computer.
NOTE: This function is under development and suffers from several bugs - if you have any problems or questions, don't hesitate to send an email! (zelenysci.muni.cz, tichysci.muni.cz)
Examples of resulting graphs
Fig.1 - Response curve of Picea abies along the altitude gradient (based on data from 10,000 forest relevés from the Czech national phytosociological database). All available models are selected, with automatic selection of model parameters.
Fig.2 - Distribution of response curves of several of the most common tree species along altitude, calculated by HOF models (the same dataset as in Fig.1).
Fig.3 - Comparison of different models of the response curve of Aneura pinguis along the pH gradient. The data have an obviously bimodal character, reproduced by both GLM and GAM models. However, interpretation of the optimum and tolerance of these curves is somewhat complicated: while GLM and GAM models put the optimum at the end of the gradient (pH 8.2), the HOF model puts the optimum at pH 5.6 and the Gaussian curve somewhere in between, at pH 6.9. Here, different models give substantially different results. (Analysis based on data of Hájek et al., ined.)
References
Austin M.P., 2002. Spatial prediction of species distribution: an interface between ecological theory and statistical modelling. Ecological Modelling 157: 101-118.
Oksanen J., Minchin P.R., 2002a. Continuum theory revisited: what shape are species responses along ecological gradients? Ecological Modelling 157: 119-129.
Oksanen J., Minchin P.R., 2002b. Non-linear maximum likelihood estimation of Beta and HOF response models. URL: http://cc.oulu.fi/~jarioksa/softhelp/hof3.pdf
R Development Core Team, 2005. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL: http://www.R-project.org
Schröder H.K., Andersen H.E., Kiehl K., 2005. Rejecting the mean: Estimating the response of fen plant species to environmental factors by non-linear quantile regression. Journal of Vegetation Science 16: 373-382.
ter Braak C.J.F., Looman C.W.N., 1986. Weighted averaging, logistic regression and the Gaussian response model. Vegetatio 65: 3-11.
Tichý L., 2002. JUICE, software for vegetation classification. Journal of Vegetation Science 13: 451-453.