**Bio793 notes, Feb 2**

**A few more basics from last time**

Let's input the tree dataset again:

> dat = read.csv(file.choose())

We can explore properties of individual variables graphically. For example, the distribution of elevation (elev) can be assessed with a histogram:

> hist(dat$elev)

with control over the number of bins via the *breaks*
argument:

> hist(dat$elev,breaks=10)

We can compare this to an estimate of the continuous probability density by overlaying the density on a histogram scaled to proportions rather than frequency:

> hist(dat$elev,breaks=10,freq=F)

> lines(density(dat$elev),lwd=2,col="red")

Some elevation values seem to be over-represented, leading to a choppy distribution. Perhaps this is because some plots were sampled multiple times? To get a single elevation value per plot, we can take out plot duplicates using the duplicated function:

> dat2 = dat[duplicated(dat$plotID)==F,]

> plot(density(dat2$elev),lwd=2,col="salmon")

This is clearly better, although there is still a hump around 1400 m. Note that duplicated returns FALSE for the first occurrence of each value and TRUE for subsequent repeats, so subsetting with "==F" keeps exactly one row per plot.
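To see how this works, here's a minimal sketch with a toy vector of made-up plot IDs:

```r
# duplicated() returns FALSE for the first occurrence of each value
# and TRUE for later repeats (toy data, not from our dataset)
ids = c("p1", "p1", "p2", "p3", "p3")
duplicated(ids)            # FALSE TRUE FALSE FALSE TRUE
ids[duplicated(ids) == F]  # "p1" "p2" "p3" -- one copy of each
```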

Is elevation normally distributed?

> qqnorm(dat2$elev)

> qqline(dat2$elev,col="purple",lwd=2,lty=2)

The 'qq' plot (quantile-quantile plot against a normal distribution) should lie along the qqline we plotted if the data are normal, but here the tails are way off. We can test this with a Shapiro-Wilk normality test:

> shapiro.test(dat2$elev)

Shapiro-Wilk normality test

data: dat2$elev
W = 0.962, p-value = 7.174e-15

The infinitesimal p value suggests our elevation distribution is anything but normal. Here's what perfect normality looks like:

> curve(dnorm, from=-3, to=3)

This **curve** function works for any function or expression of x in R. For example:

> curve(x+x^2+x^3)

Two other handy plotting functions are the **pairs** function, which plots all variables of a data frame against one another in multi-panel matrix format, and **coplot**, which shows bivariate relationships conditioned on a third variable.

**Writing functions**

Much of the power of R involves writing your own functions to perform complex calculations and simplify repetitive tasks. Writing a function just requires assigning a name to the **function** function, along with desired function parameter names (and default values, if desired) and a specification of what the function is supposed to do. To call the function, just type its name with the necessary parameter values in parentheses. Here's an example for calculating the standard error (se) of a vector:

> se = function(x) {sqrt(var(x)/length(x))}

Now that the function 'se' is defined, we can use it to calculate the standard error of a variable like elevation:

> se(dat2$elev)

[1] 12.78305

Note we defined the function with a single argument (x); we don't have to specify that dat2$elev is x because it is the only argument to se. You also don't have to name arguments if they are supplied in the same order as when you created the function.
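As a sketch of how argument matching works (a hypothetical function, not part of our analysis):

```r
# arguments match by position, or by name in any order
pow = function(x, p = 2) x^p  # p has a default value
pow(3)             # x = 3, p defaults to 2: returns 9
pow(3, 3)          # positional matching: returns 27
pow(p = 3, x = 3)  # named arguments can come in any order: returns 27
```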

You can pass arguments to other (nested) functions. For example, we can write our own function for variance within the se function:

> se2 = function(x) {

variance = function(X=x) {sum((X-mean(X))^2)/(length(X)-1)}

return( sqrt(variance(x)/length(x)) )

}

> se2(dat2$elev)

[1] 12.78305

Here our vector input x is passed on to the 'variance' function with X=x. It is generally recommended to put new statements on new lines within functions. R knows the function is still open because the opening bracket '{' has not yet been closed with '}'. The use of *return* tells the function which value to end with if there are many; otherwise it returns the last one evaluated.
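A quick illustration of this behavior with two toy functions:

```r
# without return(), the last evaluated value is returned;
# return() ends the function immediately
f1 = function(x) { y = x + 1; y^2 }      # returns y^2
f2 = function(x) { return(x + 1); x^2 }  # x^2 is never reached
f1(2)  # 9
f2(2)  # 3
```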

Here's another example of writing a function to return a bootstrapped confidence interval for our vector x, which demonstrates the handy function **sample**.

> CI.boot = function(x,nperm,lower,upper) {

list1 = vector("list",nperm)

vec1 = sapply(list1,function(y)mean(sample(x,replace=T)))

return ( quantile(vec1,c(lower,upper)) )

}

> CI.boot(c(1:100),10000,.025,.975)

2.5% 97.5%

44.75 56.20

Here we created the function 'CI.boot' with 4 arguments: the vector for which the CI is to be calculated, the number of desired permutations, and the desired lower and upper confidence quantiles (e.g., 2.5%, 97.5%). The next two lines are tricky: the first creates a list object with 'nperm' empty elements; the second then populates each element of the list with a resampled (with replacement) version of our initial vector x, takes the mean of each permutation, and then returns these mean values as a simple vector. So we end up with a population of sampled means (vec1), and we then ask for the desired lower and upper quantiles of this population (via the **quantile** function, good for summarizing vectors generally). Note that with **sample** you can specify different probabilities of selecting each element, and can sample with or without replacement.
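A few self-contained examples of what **sample** can do:

```r
set.seed(1)  # for reproducibility
sample(1:5)                      # a random permutation (no replacement)
sample(1:5, 10, replace = TRUE)  # 10 draws with replacement
# weighted sampling: 5 is five times as likely to be drawn as 1
sample(1:5, 3, replace = TRUE, prob = c(1, 2, 3, 4, 5))
```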

**Using tapply and other apply functions**

The CI.boot function also demonstrated another bit of magic in R: vector-based calculations. In one line of code we created 10,000 different randomized versions of the vector x, without using a for loop, and all in a few seconds. This is usually accomplished with one of the **apply** family of functions, of which **tapply** will be used most commonly. To understand what apply (or sapply) does, note that we first created an empty object (in this case a list of 'nperm' elements), and then performed the same function on each element. Rather than do this sequentially with a for loop, the **sapply** function applied the 'mean(sample())' function to all elements at once. It may be easiest to see this with the apply function on a matrix.

> matrix1 = matrix(1:100,nrow=10,ncol=10)

> apply(matrix1,2,mean)

[1]  5.5 15.5 25.5 35.5 45.5 55.5 65.5 75.5 85.5 95.5

Here we created a 10x10 matrix populated with the numbers 1 to 100, filled down columns first so that each subsequent column gets higher numbers (check it by calling 'matrix1'). Then we used apply to calculate the mean of each column, which was returned in vector form. The second argument to **apply** says to summarize by columns (the 2nd dimension) rather than rows, and the final argument is the function we wish to apply to each column. This last part is where we can use **apply** for complicated calculations. For example, we can use our own se function instead of the mean:

> apply(matrix1,2,function(x)se(x))

[1] 0.9574271 0.9574271 0.9574271 0.9574271 0.9574271 0.9574271 0.9574271 0.9574271 0.9574271 0.9574271

The standard errors of each column are identical. The 'function(x)' argument means that whatever comes next is a function that uses 'x', where 'x' stands in for each successive column. You can put anything here that will return a value, including nested functions, calls to other functions, many lines of code (in brackets), etc.
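For example, a multi-line anonymous function (in brackets) applied to each column of a toy matrix:

```r
# any code block that returns a value can serve as the function
m = matrix(1:6, nrow = 2)  # columns: (1,2), (3,4), (5,6)
apply(m, 2, function(x) {
  r = max(x) - min(x)  # range of the column
  r / mean(x)          # range scaled by the column mean
})
```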

Finally, we can use **tapply** to apply functions to vectors in a data frame, separated by other grouping variables. For example, let's look at the mean elevation for vegetation plots in each of the disturbance categories:

> dist.mean = tapply(dat2$elev,dat2$disturb,mean)

> dist.mean

CORPLOG    LT-SEL    SETTLE    VIRGIN
1109.0728  839.9964  672.8473  1355.6132

And the standard errors:

> dist.se = tapply(dat2$elev,dat2$disturb,function(x)se(x))

> dist.se

CORPLOG   LT-SEL    SETTLE    VIRGIN
24.14951  16.37239  17.20755  19.94275

And finally a bar graph with error bars:

> bar1 = barplot(dist.mean,names.arg=names(dist.mean),ylim=c(0,1500),ylab="Elevation")

> arrows(bar1,dist.mean,bar1,dist.mean+dist.se,angle=90,lwd=2)

Functions **sapply** and **lapply** are similar to tapply, but use elements of a list (or vector) rather than grouped variables or matrix dimensions.
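A quick contrast between the two, using a toy list:

```r
# lapply always returns a list; sapply simplifies to a vector when it can
lst = list(a = 1:3, b = 4:6)
lapply(lst, mean)  # returns a list: $a = 2, $b = 5
sapply(lst, mean)  # returns a named vector: a = 2, b = 5
```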

**Creating models: where to from here**

- basic strategy for the way I'm presenting modeling techniques:

- everything before spring break (basically) concerns 'single-level' models: that is, only one type of variability

- as single-level models get complicated, we're modeling how the mean changes: the error distribution (residuals) will not vary among samples

- in multi-level models, we model changes in the distribution of variance (error) in addition to the mean

- these get analytically intractable, so we use likelihood techniques with optimization

- for now, we're only concerned with modeling the correct error distribution and specifying appropriate process models

- later, we'll see how all model components can themselves be modeled with their own stochastic distributions

**General linear models (least squares) in R**

Bolker (Chap. 9) provides a nice summary of *general* linear models as opposed to more modern techniques. Essentially, general linear models (NOT general*ized* linear models) are the old-school models assuming normal residual distributions, independent observations, homoscedasticity, and (assumed) lack of any observation error. In R this includes everything that the **lm** function does (simple and multiple least-squares regression), ANOVA, and ANCOVA. The only difference between these is whether the model includes only continuous variables (regression), only factor variables (ANOVA), or both (ANCOVA).

All modeling approaches in R use the same basic structure of "response ~ explanatory variable(s)". Here's an example of a regression model of the abundance (cover) of beech on elevation:

> lm1 = lm(cover~elev,data=dat,subset=species=="Fagus grandifolia")

This simple linear regression fits both an intercept and a slope. A summary of the fitted model object is available with the **summary** function:
> summary(lm1)

Call:
lm(formula = cover ~ elev, data = dat, subset = species == "Fagus grandifolia")

Residuals:
    Min      1Q  Median      3Q     Max
-3.9054 -1.8398 -0.1664  1.6433  5.4512

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.9620373  0.3437750   5.707 2.21e-08 ***
elev        0.0019596  0.0002971   6.596 1.31e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.1 on 409 degrees of freedom
Multiple R-squared: 0.09613, Adjusted R-squared: 0.09393
F-statistic: 43.5 on 1 and 409 DF, p-value: 1.313e-10

All fitted model objects can be explored with **summary**. For **lm**, you get a restatement of the model, the range and quantile estimates of the model residuals, a table of fitted coefficients along with their precision (SE) and significance (t tests), the residual SE (square root of the error variance), the R2 or coefficient of determination (fraction of variance explained), the adjusted-R2 (applies a penalty for more fitted parameters), and the associated F statistic evaluating the significance of the full model. There are many attributes of lm1, most of which are not shown. For example, you can retrieve a vector of fitted (predicted) values with lm1$fitted (or **fitted**(lm1)), a vector of residuals with lm1$resid (or **resid**(lm1)), and return the coefficients with **coef**. Alternatively, note that this summary is itself an object, with components that can be accessed with indexing. For example:

> summary(lm1)[[4]]

               Estimate   Std. Error  t value     Pr(>|t|)
(Intercept) 1.962037256 0.3437749892 5.707330 2.206867e-08
elev        0.001959647 0.0002971168 6.595544 1.312590e-10

returns the 4th element of summary(lm1), which is stored as a list. Here this element is the coefficient table. We could also show these results as an ANOVA-type table:

> summary.aov(lm1)

           Df  Sum Sq Mean Sq F value    Pr(>F)
elev        1  191.76  191.76  43.501 1.313e-10 ***
Residuals 409 1802.91    4.41

In fact, all **lm** fitted objects can be returned in both summary forms (**summary.lm** and **summary.aov**). Other helpful related items are a report of the confidence intervals of each parameter:

> confint(lm1)

                 2.5 %      97.5 %
(Intercept) 1.28625089 2.637823622
elev        0.00137558 0.002543713

and associated AIC:

> AIC(lm1)

[1] 1780.058

Don't forget that AIC values are particular to a given response vector: any change to the values of the response (e.g., transformation, removing an outlier, etc.) makes AIC values no longer comparable.
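A sketch of the trap, on made-up data:

```r
# AIC is only comparable across models fit to the SAME response vector
set.seed(3)
x = runif(50)
y = exp(1 + 2 * x + rnorm(50, sd = 0.3))
m1 = lm(y ~ x)       # response is y
m2 = lm(log(y) ~ x)  # response is log(y): a different vector,
AIC(m1)              # so these two AIC values must not be compared
AIC(m2)
```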

Is our model a good model? One easy way to examine model results is with the plot function:

> plot(lm1)

Here you get four graphs (click to go from one panel to the next): 1) residuals vs. fitted values (a good model will show no pattern); 2) the qqnorm plot we saw above (values should be on the dashed line); 3) a scale-location graph, indicating heteroscedasticity; and 4) standardized residuals vs. leverage and Cook's distance, which is handy for identifying outliers. The first thing we notice in our data is that the y data are not continuous (thus the cross-hatched patterns in the residual graphs) and that the residuals are probably not normally distributed (try **shapiro.test**(lm1$resid) to confirm). Would square root transformation help? We could compare the fit of a new model (lm2) with a transformed cover variable, by either calling lm again with new arguments or using the update function:

> lm2 = update(lm1,formula=sqrt(cover)~elev)

> plot(lm2)

The residuals don't look any better and the R2 has gone down. To do better we'll have to move to a generalized linear model (**glm**).

Now we'll try an ANOVA. Let's ask whether beech is differentially sensitive to disturbance history:

> aov1 = lm(cover~disturb,data=dat,subset=species=="Fagus grandifolia")

> summary(aov1)

Call:
lm(formula = cover ~ disturb, data = dat, subset = species == "Fagus grandifolia")

Residuals:
   Min     1Q Median     3Q    Max
-3.333 -2.053 -0.229  1.947  5.771

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)     4.2290     0.1920  22.030   <2e-16 ***
disturbLT-SEL   0.1043     0.2759   0.378   0.7055
disturbSETTLE  -0.8199     0.3828  -2.142   0.0328 *
disturbVIRGIN  -0.1759     0.2821  -0.624   0.5332
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.197 on 407 degrees of freedom
Multiple R-squared: 0.01499, Adjusted R-squared: 0.007725
F-statistic: 2.064 on 3 and 407 DF, p-value: 0.1044

This output is more confusing than regression output; for one thing, coefficients are only shown for three of the four levels of disturbance, plus the overall intercept. This is because the default contrast structure for ANOVA in R is 'treatment contrasts'. The factor level that is alphabetically first (here, CORPLOG) is used as the overall intercept, so this estimate tests whether the overall mean differs from zero (here it clearly does, because there are no zero cover values in our dataset). Each remaining factor level is then tested against this first level; i.e., the second parameter estimate (next to "disturbLT-SEL") is the difference in beech cover between LT-SEL and CORPLOG plots. So what this output shows is that SETTLE plots have significantly lower abundances of beech than CORPLOG plots (by about 0.8 cover units). The first-row coefficient is an actual sample mean (mean beech cover of all CORPLOG plots), while the remaining coefficients are differences in mean cover, not means themselves. These are of course only a few of all possible contrasts; you can specify a particular contrast structure with the function **contrasts**.
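For instance, you can inspect and change a factor's contrast coding directly (a toy factor; contr.sum gives sum-to-zero contrasts):

```r
# default is treatment contrasts, with the first level as baseline
f = factor(c("A", "A", "B", "C"))
contrasts(f)                 # treatment contrasts (baseline A)
contrasts(f) = contr.sum(3)  # switch to sum-to-zero contrasts
contrasts(f)                 # each column now sums to zero
```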

We could optionally choose to have R not fit an intercept at all, but instead determine whether the mean cover of each group is significantly different from zero. In our case this is a bit silly, but it is easily accomplished using '-1' on the right side of the equation:

> aov2 = lm(cover~disturb-1,data=dat,subset=species=="Fagus grandifolia")

A call to summary(aov2) will confirm that indeed cover is greater than zero for all groups. Not fitting an intercept is usually more useful in regressions when we want the line to go through the origin.
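A sketch on synthetic data with a true zero intercept:

```r
set.seed(2)
x = 1:20
y = 0.5 * x + rnorm(20, sd = 0.2)  # true line passes through the origin
coef(lm(y ~ x))      # two coefficients: intercept and slope
coef(lm(y ~ x - 1))  # '-1' drops the intercept: a single slope coefficient
```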

Finally, note that the **lm** output for our model is different than that produced by **aov**:

> summary.aov(aov1)

           Df  Sum Sq Mean Sq F value Pr(>F)
disturb     3   29.89  9.9635  2.0639 0.1044
Residuals 407 1964.78  4.8275

You can see why from the summary table: the **lm** output produces a set of treatment contrasts, each tested with pairwise t tests, whereas **aov** produces the overall ANOVA result, which is a simple comparison of the variance of the cover response between and within the grouping variable; comparisons of two variances are done via the F test. The fact that these tests give us different results suggests that in fact the difference between groups is an issue of variance as well as the mean:

> plot(dat$disturb[dat$species=="Fagus grandifolia"],dat$cover[dat$species=="Fagus grandifolia"],notch=T)

The problem is evident here: a general linear model is not a good choice because of heteroscedasticity (SETTLE variance is much lower). This would be a good occasion to use a bootstrapped model (like a permutational ANOVA) rather than something based on classical assumptions.

Multiple comparisons can be accomplished with various other tests in R, such as the popular Tukey honest significant difference (HSD) test. The argument for the TukeyHSD function must be an **aov** object rather than an **lm** one:

> TukeyHSD(aov(aov1))

Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = aov1)

$disturb
                     diff        lwr        upr     p adj
LT-SEL-CORPLOG  0.1043257 -0.6073034 0.81595478 0.9815546
SETTLE-CORPLOG -0.8199167 -1.8075190 0.16768552 0.1417827
VIRGIN-CORPLOG -0.1759103 -0.9035981 0.55177748 0.9244785
SETTLE-LT-SEL  -0.9242424 -1.9198871 0.07140225 0.0796909
VIRGIN-LT-SEL  -0.2802360 -1.0188019 0.45832992 0.7616165
VIRGIN-SETTLE   0.6440064 -0.3631787 1.65119155 0.3521145

Unlike the t tests produced by the **lm** summary, the more conservative Tukey HSD tests suggest that beech cover does not vary between any two disturbance classes.

We have already noted that plots of different disturbance classes have very different elevation distributions. We can determine whether elevation influences the above ANOVA results (or equivalently, whether the elevation effect varies by disturbance class) with an ANCOVA:

> ancova1 = lm(cover~disturb*elev,data=dat,subset=species=="Fagus grandifolia")

> summary(ancova1)

Call:
lm(formula = cover ~ disturb * elev, data = dat, subset = species == "Fagus grandifolia")

Residuals:
    Min      1Q  Median      3Q     Max
-4.3054 -1.5585 -0.1539  1.7687  5.3592

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         1.517e+00  6.635e-01   2.287   0.0227 *
disturbLT-SEL       3.248e-01  8.653e-01   0.375   0.7076
disturbSETTLE       1.688e+00  1.546e+00   1.092   0.2754
disturbVIRGIN       2.092e-01  1.338e+00   0.156   0.8758
elev                2.340e-03  5.503e-04   4.252 2.64e-05 ***
disturbLT-SEL:elev  2.112e-05  7.402e-04   0.029   0.9772
disturbSETTLE:elev -2.041e-03  2.073e-03  -0.984   0.3255
disturbVIRGIN:elev -4.867e-04  1.065e-03  -0.457   0.6478
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.094 on 403 degrees of freedom
Multiple R-squared: 0.1139, Adjusted R-squared: 0.09848
F-statistic: 7.398 on 7 and 403 DF, p-value: 2.249e-08

This again is the default contrast form of model results, where we see that elevation is really driving the relationship, with no support for including disturbance history or its interaction with elevation. The significance test associated with the main effect of disturbance (rather than its treatment contrasts) is found in the **anova**-style output:

> anova(ancova1)

Analysis of Variance Table

Response: cover
              Df  Sum Sq Mean Sq F value    Pr(>F)
disturb        3   29.89   9.964  2.2717   0.07972 .
elev           1  191.92 191.924 43.7589 1.184e-10 ***
disturb:elev   3    5.32   1.775  0.4047   0.74971
Residuals    403 1767.53   4.386
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This again shows no support for disturbance or the interaction. We can test the fit of models with and without the disturbance variable by defining a second ANCOVA model with elevation only (the period below is translated as 'the full model', the minus signs as 'minus these terms', and the colon signifies the interaction):

> ancova2 = update(ancova1, ~.-disturb-disturb:elev)

and then using the useful **anova** function to test whether the explained variance is significantly lower in the simpler model:

> anova(ancova1,ancova2)

Analysis of Variance Table

Model 1: cover ~ disturb * elev
Model 2: cover ~ elev
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    403 1767.5
2    409 1802.9 -6   -35.381 1.3445 0.2361

The models are not significantly different in explanatory power (p=0.2361), so the simpler one with only elevation (model 2) is better. AIC agrees:

> AIC(ancova1,ancova2)

        df      AIC
ancova1  9 1783.912
ancova2  3 1780.058

There is an even easier (but not always correct) way to perform model simplification (backwards elimination from a maximal model) in R using the **step** function:

> step(ancova1)

Start: AIC=615.54
cover ~ disturb * elev

               Df Sum of Sq    RSS    AIC
- disturb:elev  3    5.3248 1772.9 610.78
<none>                      1767.5 615.54

Step: AIC=610.78
cover ~ disturb + elev

          Df Sum of Sq    RSS    AIC
<none>                 1772.9 610.78
- disturb  3    30.057 1802.9 611.69
- elev     1   191.924 1964.8 651.03

Call:
lm(formula = cover ~ disturb + elev, data = dat, subset = species == "Fagus grandifolia")

Coefficients:
  (Intercept)  disturbLT-SEL  disturbSETTLE  disturbVIRGIN           elev
     1.649949       0.335265       0.244698      -0.390918       0.002225

The output is not as complicated as it seems. Starting from the maximal model (~disturb*elev), the function compares the AIC values that result from removing the most complicated (last) term in the model, which here is the interaction between disturb and elev. With a substantially better AIC, the function keeps this simpler model and then removes terms from the next tier (here, both main effects). The model with disturb removed is essentially of equal fit to having it in (AIC goes up 1 unit), but taking elev out results in a massive loss of explanatory power (up 41 units). So the best model includes only elev. (The AIC values reported by step do not match those from AIC because step uses extractAIC, which drops an additive constant; differences between models are unaffected.) Note that **step** works for forward selection as well as backward, and you can choose to do both simultaneously. (It works with **glm**, too.) Try using **step** with a more complicated ANCOVA, such as including the tci variable along with disturb and elev, plus all interactions. Beware that the saved object associated with step will be the best-fitting model according to raw AIC (lower AIC wins, no matter whether the actual difference is less than two). For example:

> step.ancova = step(ancova1)

> summary(step.ancova)

Call:
lm(formula = cover ~ disturb + elev, data = dat, subset = species == "Fagus grandifolia")

Residuals:
     Min       1Q   Median       3Q      Max
-4.24965 -1.67849 -0.09873  1.81710  5.37934

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.6499485  0.4297313   3.839 0.000143 ***
disturbLT-SEL  0.3352653  0.2646654   1.267 0.205970
disturbSETTLE  0.2446982  0.3979475   0.615 0.538965
disturbVIRGIN -0.3909178  0.2702366  -1.447 0.148787
elev           0.0022252  0.0003357   6.630 1.07e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.09 on 406 degrees of freedom
Multiple R-squared: 0.1112, Adjusted R-squared: 0.1024
F-statistic: 12.7 on 4 and 406 DF, p-value: 9.535e-10

So disturbance is retained here, even though there is not sufficient AIC support.
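Forward (or both-direction) selection can be sketched like this on synthetic data (the variable names are illustrative, not from our dataset):

```r
set.seed(4)
d = data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y = 1 + 2 * d$x1 + rnorm(100)  # only x1 truly matters
m0 = lm(y ~ 1, data = d)         # start from the intercept-only model
# scope gives the largest model considered; direction = "both" allows
# terms to be added or dropped at each step
mbest = step(m0, scope = ~ x1 + x2, direction = "both", trace = 0)
formula(mbest)  # should retain x1
```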

**Nonlinear Least Squares (nls)**

Let's limit the focus of our analysis to one species (say, sweet birch) using the **subset** function:

> dat2 = subset(dat,dat$species=="Betula lenta")

This gives us a data frame of 795 rows (use **str** to confirm). One relationship that may be nonlinear is that between elevation and tci (topographic convergence index):

> with(dat2,plot(elev,tci))

Clearly there is more to the relationship than a simple linear function, but nonetheless let's try fitting an asymptotic model to these data. The function for this in R using least squares is **nls**. Unlike **lm**, minimization is done via an optimization algorithm, which requires extra input on your part in the form of initial values for each parameter. This also means **nls** can sometimes be frustrating, due to lack of convergence or slow performance with very large datasets. One two-parameter asymptotic model is the Michaelis-Menten (MM) function, used in **nls** here:

> model1 = nls(tci ~ a*elev/(b+elev), start=list(a=500,b=5), data=dat2)

The *start* argument is a list of each fitted parameter and its starting value. Here a is the asymptote, and b is the x value at which y=a/2. Note that the MM function is usually used for monotonically increasing data, whereas here the trend is downward, so our b value will be negative. Access parameter estimates with the **summary** function:

> summary(model1)

Formula: tci ~ a * elev/(b + elev)

Parameters:
   Estimate Std. Error t value Pr(>|t|)
a    4.9742     0.1564  31.814  < 2e-16 ***
b -109.3603    16.9142  -6.466 1.76e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.255 on 793 degrees of freedom

Number of iterations to convergence: 8
Achieved convergence tolerance: 1.309e-06

The model converged on an asymptote of about 5 (a = 4.97), with b fit to about -109. The t and associated p values indicate whether the parameter values are significantly different from zero. (This is different from deciding whether this is the best model, or even whether these terms both belong in it.) You also get an indication of the model fit and df, plus information about the optimization result. Let's see what the regression line looks like:

> x = seq(0,2000)

> lines(x,predict(model1,list(elev=x)),col="red")

Here we use **predict** to generate new (predicted) values based on our fitted model, using the new vector x. The **predict** function is generic and should work with any modeling function.

R provides an easier way to implement nls for many common nonlinear functions via 'self-starting functions' that do not require specification of initial parameter values. (It is also possible to write your own functions for this; see **selfStart**.) The one for the MM model looks like this:

> model2 = nls(tci ~ SSmicmen(elev,a,b), data=dat2)

Fortunately we get basically the same model fit as before:

> summary(model2)

Formula: tci ~ SSmicmen(elev, a, b)

Parameters:
   Estimate Std. Error t value Pr(>|t|)
a    4.9742     0.1564  31.814  < 2e-16 ***
b -109.3588    16.9144  -6.465 1.76e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.255 on 793 degrees of freedom

Number of iterations to convergence: 4
Achieved convergence tolerance: 2.352e-06

Maybe a power function fits the data better?

> model3 = nls(tci ~ c*elev^z, start=list(c=20,z=1), data=dat2)

> lines(x,predict(model3,list(elev=x)),col="blue")

The fit is similar. If the functions were nested (i.e., setting a parameter to zero yields the other function), we could use a likelihood ratio test to compare models with the anova function. Here they're not, so we can appeal to AIC:

> AIC(model2,model3)

        df      AIC
model2   3 3552.757
model3   3 3546.405

There is substantially more support for the power model (lower AIC). We'll talk more about this later. For more on useful deterministic nonlinear functions, see **gnls** in the nlme library. Finally, note that for relatively simple nonlinear relationships it is often more productive to simply transform the variables rather than worry about nonlinearity:

- log y vs x for exponential relationship

- log y vs log x for power

- exp(y) vs. x for logarithmic

- 1/y vs 1/x for asymptotic

- log(p/(1-p)) vs x for proportion data

- sqrt(y) for count data for variance homogenization

- arcsin(y) for percentage data for variance homogenization
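For example, the power case can be sketched on synthetic data: fitting lm on logged variables recovers the power-law parameters (the values here are made up):

```r
# y = c * x^z with multiplicative error is linear on log-log axes:
# log(y) = log(c) + z * log(x)
set.seed(5)
x = runif(200, 1, 100)
y = 3 * x^0.7 * exp(rnorm(200, sd = 0.1))
fit = lm(log(y) ~ log(x))
exp(coef(fit)[1])  # estimate of c (should be near 3)
coef(fit)[2]       # estimate of z (should be near 0.7)
```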