Aquatic Invertebrates

2006-12-01 16:00

Objective

In this lab, we will use a variety of approaches to visualize and analyze patterns of species composition and diversity between sites. These approaches include ordination, rank abundance diagrams, and diversity indices.

Methods

First, you will need to load and inspect the data file from the combined Tuesday and Wednesday labs.

data <- read.csv("insect.data.csv")
head(data)
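
Before going further, it can help to confirm that the columns used later in this lab (Site, Substrate, Habitat, Order, and Family) are present. The sketch below is one way to do that. The helper missing.columns() and the toy data frame are made up for illustration and are not part of the lab data.

```r
## Columns that the analyses in this lab refer to
required <- c("Site", "Substrate", "Habitat", "Order", "Family")

## Hypothetical helper: names of required columns absent from a data frame
missing.columns <- function(df, needed = required) {
  setdiff(needed, names(df))
}

## Toy data frame standing in for the real file (values are invented)
toy <- data.frame(Site = c("HMF", "MasonHill"),
                  Substrate = c("Rock", "Leaf"),
                  Family = c("Baetidae", "Perlidae"))

missing.columns(toy)  # lists the columns that toy lacks
```

With the real data, missing.columns(data) should return an empty result; str(data) and summary(data) give a fuller picture of what was read in.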

Next, load the vegan package, which provides the functions for the analyses below (you may need to install it first with install.packages("vegan")).

library(vegan)

Rank-abundance diagrams

Rank-abundance curves give you a visual picture of diversity by plotting each taxon's abundance on the y-axis against its abundance rank on the x-axis (rank 1 is the most abundant taxon). Your text has a discussion of rank-abundance diagrams that you can consult for further details.

In order to make a rank-abundance diagram (which we will do at the family level), we first need to determine the total number of individuals sampled for each family. We do this by cross tabulation using the function table() in R:

## Create a "crosstabulation table" of Site and Family
families.site <- table(data$Site,data$Family)
## Extract Family labels from two sites for plotting later
HMF.labels <- names(sort(families.site["HMF",], decreasing=TRUE))
MasonHill.labels <- names(sort(families.site["MasonHill",], decreasing=TRUE))

Be sure to enter families.site at the prompt to view the summarized data.
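
To see what table() and sort() are doing here, the following sketch repeats the same steps on a tiny data set. The site names and records are invented for illustration; only the family names are real taxa.

```r
## Toy records: one row per individual collected (invented values)
toy.records <- data.frame(
  Site   = c("A", "A", "A", "A", "B", "B", "B"),
  Family = c("Baetidae", "Baetidae", "Baetidae", "Perlidae",
             "Perlidae", "Perlidae", "Baetidae")
)

## Rows are sites, columns are families, cells are counts of individuals
toy.table <- table(toy.records$Site, toy.records$Family)
toy.table

## Families at site A ordered from most to least abundant
ranked.A <- sort(toy.table["A", ], decreasing = TRUE)
ranked.A
names(ranked.A)  # labels in rank order, as used for the x-axis
```

Indexing the table by a site name pulls out one row of counts, which is why families.site["HMF",] can be sorted directly.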

Next, we can use the function rad.lognormal() to generate the rank-abundance data as well as a best-fit curve based on a lognormal distribution.

## Rank abundance diagram (rad) analysis
result.rad <- apply(families.site,1,rad.lognormal)
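
The ranked data behind such a diagram can be built by hand in base R. This sketch does not reproduce what rad.lognormal() does internally and fits no curve; it only shows, for invented counts, the rank and abundance values that a rank-abundance diagram displays.

```r
## Invented counts for one site; zeros are families absent from that site
counts <- c(Baetidae = 12, Perlidae = 0, Heptageniidae = 5, Chironomidae = 30)

## Keep families that were actually found and order them by abundance
present <- sort(counts[counts > 0], decreasing = TRUE)

## Rank 1 is the most abundant family
rad.toy <- data.frame(rank = seq_along(present),
                      family = names(present),
                      abundance = as.numeric(present))
rad.toy

## Abundance is usually drawn on a log scale in these diagrams
plot(rad.toy$rank, rad.toy$abundance, log = "y", type = "b",
     xlab = "Rank", ylab = "Abundance")
```

A steep drop from rank to rank suggests a community dominated by a few families; a shallow slope suggests a more even one.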

Finally we can create our rank-abundance diagrams.

# RAD for Mason Hill
par(mai=c(1.75,1,.25,.25)) # Increase margin size for labels
plot(result.rad$MasonHill, xaxt="n", xlab="")
axis(side=1, at=c(1:length(MasonHill.labels)), labels=MasonHill.labels, las=3) # Add labels

# RAD for HMF
par(mai=c(1.75,1,.25,.25)) # Increase margin size for labels
plot(result.rad$HMF, xaxt="n", xlab="")
axis(side=1, at=c(1:length(HMF.labels)), labels=HMF.labels, las=3) # Add labels

# RAD plots for both sites combined
par(mai=c(1,1,.25,.25)) # Smaller bottom margin (no rotated labels here)
plot(result.rad$MasonHill, pch=16, cex.lab=1.5) # MasonHill site
points(result.rad$HMF, pch=16, col="blue") # Add HMF data
lines(result.rad$HMF, col="blue") # Add HMF prediction

Ordination

We can also use ordination to link differences in community composition to differences in environmental variables.

Some definitions of ordination (from: http://ordination.okstate.edu/glossary.htm)

  1. "Ordination is the collective term for multivariate techniques that arrange sites along axes on the basis of data on species composition" (ter Braak 1987)
  2. "The term 'ordination' derives from early attempts to order a group of objects, for example in time or along an environmental gradient. Nowadays the term is used more generally and refers to an 'ordering' in any number of dimensions (preferably few) that approximates some pattern of response of the set of objects. The usual objective of ordination is to help generate hypotheses about the relationship between the species composition at a site and the underlying environmental gradients" (Digby and Kempton 1987)

The technique that we will use, canonical correspondence analysis (CCA), is widely used by community ecologists. For this analysis, we will work at the level of taxonomic order. As above, we use cross tabulation to summarize the raw data prior to analysis:

## Create a "cross tabulation table" of Site, Substrate, Habitat, and Order
orders <- table(paste(data$Site,data$Substrate,data$Habitat),data$Order)

You can enter orders at the prompt to view these summarized data.

Finally we do the analysis:

## Extract variables from above
vars      <- matrix(unlist(strsplit(row.names(orders), split=" ")),
                    ncol=3, byrow=TRUE)
## Define substrate and habitat variables
substrate <- factor(vars[,2])
habitat   <- factor(vars[,3])

## Plot of Canonical Correspondence Analysis
plot(cca(orders~substrate+habitat))
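
The step that recovers substrate and habitat from the row names relies on paste() and strsplit() being inverse operations. The sketch below walks through that round trip on invented labels (the substrate and habitat values are assumptions, not taken from the lab data) and adds a check for the one case where it breaks.

```r
## Invented row labels in the same "Site Substrate Habitat" form
labels.toy <- paste(c("HMF", "HMF", "MasonHill"),
                    c("Rock", "Leaf", "Rock"),
                    c("Riffle", "Pool", "Riffle"))

## Split each label on spaces and stack the pieces into one row per label
parts <- matrix(unlist(strsplit(labels.toy, split = " ")),
                ncol = 3, byrow = TRUE)
parts

## Column 2 holds the substrate and column 3 the habitat
factor(parts[, 2])
factor(parts[, 3])

## The approach assumes exactly three pieces per label; this check fails
## if any site, substrate, or habitat name itself contains a space
stopifnot(all(lengths(strsplit(labels.toy, split = " ")) == 3))
```

If the check fails on your own data, pasting with a separator that never appears in the names (for example sep="|") and splitting on it with fixed=TRUE is one way around the problem.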

Diversity indices

Finally, you may want to compare the overall diversity between sites. Diversity can be decomposed into richness (the number of species in a community) and evenness (how equally individuals are distributed among those species). A number of diversity indices have been developed that integrate these two components. We will use the Shannon diversity index (also called the Shannon-Wiener or Shannon-Weaver index) to compare diversity between substrates and habitats. As above, the analysis begins with cross tabulation of the raw data:

## Create a "crosstabulation table" of Substrate, Habitat, and Family
families.env <- table(paste(data$Substrate,data$Habitat),data$Family)

## Number of families in each substrate-habitat combination
Number <- specnumber(families.env)
## Shannon diversity index for each substrate-habitat combination
Diversity <- diversity(families.env, index="shannon")
## Table of results
rbind(Number, Diversity)
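
To make the link between evenness and the Shannon index concrete, the index can be computed by hand. This is a minimal sketch on invented counts; shannon.by.hand() is a hypothetical helper, and it uses natural logarithms, which is also the default in diversity().

```r
## Shannon index H = -sum(p * log(p)), with p the proportion of
## individuals in each family and log the natural logarithm
shannon.by.hand <- function(counts) {
  counts <- counts[counts > 0]      # absent families contribute nothing
  p <- counts / sum(counts)
  -sum(p * log(p))
}

even   <- c(25, 25, 25, 25)  # invented: four equally common families
uneven <- c(97, 1, 1, 1)     # invented: one family dominates

shannon.by.hand(even)    # equals log(4), the maximum for four families
shannon.by.hand(uneven)  # much lower, although richness is also 4
```

Both toy communities have the same richness, so the difference in the index comes entirely from evenness.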

One issue with comparing a single index of diversity between sites is that it does not account for differences in sampling effort (or effectiveness). In particular, the number of species found tends to increase with the number of individuals sampled, although not in direct proportion (this is analogous to the idea of a species-area curve that will be covered in class). To account for this, we "rarefy" our samples to the smallest number of individuals collected at any site. In other words, if site A had 8 species in 100 individuals and site B had 4 species in 50 individuals, we generate the expected number of species in site A for a random subsample of 50 individuals (with a standard error). This approach allows us to compare richness between sites on an equal footing and facilitates a statistical comparison:

## Smallest total number of individuals collected in any substrate-habitat combination
min.samples <- min(apply(families.env, 1, sum))
## Expected number of families at that common sample size
rarefy.result <- rarefy(families.env, sample=min.samples, se=TRUE)
rarefy.mean <- rarefy.result[1,] # Extract mean
rarefy.se <- rarefy.result[2,] # Extract SE
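
The idea behind rarefaction can be sketched in base R using the standard expectation for sampling individuals without replacement. This is an illustration under that assumption, not a copy of the code inside rarefy(); rarefy.by.hand() is a hypothetical helper, the counts are invented to match the 8-versus-4 example in the text, and no standard error is computed.

```r
## Expected number of families in a random subsample of n individuals
## drawn without replacement from a site with the given counts
rarefy.by.hand <- function(counts, n) {
  counts <- counts[counts > 0]
  N <- sum(counts)
  stopifnot(n <= N)
  ## Probability that family i is missing from the subsample
  p.missing <- exp(lchoose(N - counts, n) - lchoose(N, n))
  sum(1 - p.missing)
}

site.A <- c(40, 30, 10, 10, 4, 3, 2, 1)  # invented: 8 families, 100 individuals
site.B <- c(20, 15, 10, 5)               # invented: 4 families, 50 individuals

rarefy.by.hand(site.A, 50)  # families expected at site A in 50 individuals
rarefy.by.hand(site.B, 50)  # full sample, so this is just 4
```

The rarefied value for site A falls below 8 because the rarest families are often missed in a subsample of 50, and it is this value, not the raw 8, that should be compared with site B.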

Finally, we can use the function grasshopper.plot from the Grasshopper Lab to visualize the results. Note that you will need to copy and paste the function prior to using it.

grasshopper.plot(rarefy.mean,rarefy.se) # Plot results

Questions

Use the analyses above to address the following questions:

  1. Is there a difference among the habitats in the kind of insects found?
    • What is the effect of substrate?
    • What is the effect of water speed?
    • What is the effect of watershed?
  2. Which community (or communities) appears most diverse? Least diverse? Explain your answer.
  3. What environmental factor seems most important in supporting high diversity of aquatic insects?

Attachment: insect.data.csv (11.92 KB)