See also Experimental Design, Cochran and Cox, Chapter 4

This is one of the simplist classification designs; quite robust, easy to use and analyse.

Consider that we have 6 diets to feed to dairy cows and we wish to determine which one provides the most milk. We have a group of 30 cows, assumed to be similar, but not necessarily identical, and to be a representative random sample of dairy cows, typical of this region. We assign the cows to the 6 treatments/diets at random. Each cow is in her own tie-stall, so that they cannot poach (steal) from oneanother, and fed individually. This is important; if we had the 5 cows on Diet 1 in a single group and fed as a group then the experimental unit would be the group, and not the individual cow; this is true even if we milk the cows individually - the experimental unit is the group, the cows are the sampling unit. We record the cumulative milk yield for each cow for a month and record the average yield for each.

Unfortunately, for reasons quite beyond our control and unrelated to the diet, 1 cow from each of the Diets 2 and 3 are removed from the trial (perhaps a reproductive biologist 'appropriates' these cows for his, or her, own trial).

Diet 1 | Diet 2 | Diet 3 | Diet 4 | Diet 5 | Diet 6 |
---|---|---|---|---|---|

19.4 | 17.7 | 17.0 | 20.7 | 14.3 | 17.3 |

32.6 | 24.8 | 19.4 | 21.0 | 14.4 | 19.4 |

27.0 | 27.9 | 9.1 | 20.5 | 11.8 | 19.1 |

32.1 | 25.2 | 11.9 | 18.8 | 11.6 | 16.9 |

33.0 | . | . | 18.6 | 14.2 | 20.8 |

Consider a similar example involving plants. We have 6 treatments (inoculants) to apply to red clover plants. There are thirty (30) individual plots available to us to use; we randomly assign 5 plots to each of the 6 treatments. N.B. this random assignment is VERY important; if we simply assigned the five plots in row 1 to be assigned to treatment 1, and then the 5 plots in row 2 to be assinged to another treatment (perhaps treatment 6), and then the 5 plots in the third row to another treatment (perhaps treatment 5), etc then this would ABSOLUTELY NOT count as a random assignment at we would not have a valid analysis, only a bad example! Assuming a random assignment, at the end of the growing season we measure the nitrogen content of the plants. We started with 30 plants, randomly assigning 5 of the plants to each of the 6 treatments.

Unfortunately, for reasons unrelated to the inoculants we are missing 2 observations; perhaps during the laboratory analysis for the nitrogen content the machine broke down and these two samples were lost.

3DOk1 | 3DOk5 | 3DOk4 | 3DOk7 | 3DOk13 | composite |
---|---|---|---|---|---|

19.4 | 17.7 | 17.0 | 20.7 | 14.3 | 17.3 |

32.6 | 24.8 | 19.4 | 21.0 | 14.4 | 19.4 |

27.0 | 27.9 | 9.1 | 20.5 | 11.8 | 19.1 |

32.1 | 25.2 | 11.9 | 18.8 | 11.6 | 16.9 |

33.0 | . | . | 18.6 | 14.2 | 20.8 |

In this experimental design, and hence analysis, we consider that cows (plants) are a random sample from their respective populations and are assigned to treatments at random. Thus animals are a random effect, as is the measurement made on the cow (plant). Therefore a suitable linear model could be :

Y_{ij} = µ + trt_{i} + animal_{ij} +
e_{ij}

as before for our regression problems, with only 1 observation per animal
(plant) we cannot seperate animal_{ij} and e_{ij}
effects, they are 'confounded' or combined into one error term,
e_{ij}.

Thus our linear model now becomes :

See Course Notes, Section 1-Way Classification for more details about the Normal Equations, a generalised inverse and a solution vector.

Data file with recorded observations

R code to read in the data and carry out the CRD analysis

If we look at the ANOVA table below we can see that there are 5 degrees of freedom for the Model after µ ; there are 6 treatments so why are there only 5 degrees of freedom for treatment (the Model over and above the Mean)? Our null hypothesis with respect to treatments is that they are all the same, i.e.

This we can rewrite as a series of single comparisons:

Which if we re-write so that the null hypothesis for each is = Zero, then we have:

This gives us 5 comparisons, and hence the 5 degrees of freedom for our 6 treatments.

The output from R [anova(lm1)] gives us:

Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) trt 5 812.67 162.535 12.72 6.965e-06 *** Residuals 22 281.12 12.778 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Assuming that we know the overall mean value of Y, then we can compute the Sums of Squares for the Correction Factor for the Mean, and hence the Sums of Squares for the Model and hence the total Sums of Squares.

This therefore gives us our usual Analysis of Variance

Sum Sq | Df | F value | Pr(>F) | |
---|---|---|---|---|

trt | 812.67 | 5 | 12.72 | 0.0000 |

Residuals | 281.12 | 22 |

thus

Any linear combination of fitted values must be estimable. So is estimable; it is a linear function (the sum) of 2 fitted values, .

The above is the formulation that SAS uses, based on the complete X matrix, although, as we know, the number of columns is 7, but only 6 are linearly independent, ie r(X) = 6. If we actually look at the solution vector from SAS proc glm we can see that SAS has set the last level of the treatment effects to Zero, so that we are effectively left with a full-rank subset, for the purposes of obtaining a solution. Other software may use other subsets; a not uncommon option is to set the first level of each classification effect to Zero. This is the approach that R uses, except that rather than just set the first level to Zero, R goes one step further by completely eliminating that first level column from the X matrix, so that it has only 6 columns; for the 'Intercept' and treatments 2 to 6.

If we look at the output from the lm1 object we obtain:

Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 28.820 1.599 18.028 1.15e-14 *** trt2 -4.920 2.398 -2.052 0.052284 . trt3 -14.470 2.398 -6.034 4.50e-06 *** trt4 -8.900 2.261 -3.937 0.000704 *** trt5 -15.560 2.261 -6.883 6.52e-07 *** trt6 -10.120 2.261 -4.476 0.000188 *** ---

What this means is the the (Intercept) is in fact (µ + trt_{1}).

We could write out a corresponding k' matrix for R as :
( 1 0 0 0 0 0 )

The 'Estimate' for trt2 is in fact (trt_{2} - trt_{1}).

We could write out a corresponding k' matrix for R as :
( 0 1 0 0 0 0 )

The 'Estimate' for trt3 is in fact (trt_{3} - trt_{1}).

We could write out a corresponding k' matrix for R as :
( 0 0 1 0 0 0 )

etc

If we then want the estimate of (µ + trt_{2})
this will be (µ + trt_{1}) + (trt_{2} - trt_{1}),

i.e. ( 1 1 0 0 0 0 )

If we then want the estimate of (µ + trt_{3})
this will be (µ + trt_{1}) + (trt_{3} - trt_{1}),

i.e. ( 1 0 1 0 0 0 )

etc.

It is, IMHO, essential that you fully understand this aspect of
estimability and that you do not take any short cuts, otherwise it is
all too easy
to make mistakes with the k' matrices in R, and as indicated before, R
will quite happily allow you to calculate something which is in fact
non-estimable, *i.e.* shoot yourself in the head!

Let us return to the above example of wanting to estimate a linear function of two fitted values :

We saw that

(µ + trt_{1}) has a corresponding k' matrix :
( 1 0 0 0 0 0 )

and (µ + trt_{2}) has a corresponding k' matrix : ( 1 1 0 0 0 0 )

Thus , which is (µ + trt_{1})
+ (µ + trt_{2}) has a corresponding k' matrix which
is simply the sum of the two corresponding k' matrices, i.e.
( 2 1 0 0 0 0 )

One might ask why would we want to take the sum of these two effects. Well
we can then take the average of them, since it is still an estimable function.
Perhaps these two treatments come from the same company and we want to know
what their average effect is! A suitable k' matrix (for R) would be

( 1 0.5 0 0 0 0 )

The k' matrix is used to compute fitted values and estimable values (k'b) and also to provide the sampling variances (and standard errors) of such estimates:

For calculating these estimates (and also contrasts) we can use the esticon function from the doBy package.

# model 1, fixed effects lm1 <-lm(y~trt,data=ds) ## use esticon to estimate things, to show how to use # this estimates (mu + trt 1), lambda = k' matrix lambda1 <- c(1,0,0,0,0,0) esticon(lm1,lambda1) # this estimates (mu + trt 2) lambda1 <- c(1,1,0,0,0,0) esticon(lm1,lambda1) # this estimates (mu + trt 3) lambda1 <- c(1,0,1,0,0,0) esticon(lm1,lambda1) # this estimates (mu + trt 4) lambda1 <- c(1,0,0,1,0,0) esticon(lm1,lambda1) # this estimates (mu + trt 5) lambda1 <- c(1,0,0,0,1,0) esticon(lm1,lambda1) # this estimates (mu + trt 6) lambda1 <- c(1,0,0,0,0,1) esticon(lm1,lambda1) # this estimates (mu + (trt 1 + trt 2)/2) lambda1 <- c(1,0.5,0,0,0,0) esticon(lm1,lambda1)

Treatment | Estimate | Standard Error |
---|---|---|

µ + trt_{1} |
28.82 | 1.60 |

µ + trt_{2} |
23.90 | 1.79 |

µ + trt_{3} |
14.35 | 1.79 |

µ + trt_{4} |
19.92 | 1.60 |

µ + trt_{5} |
13.26 | 1.60 |

µ + trt_{6} |
18.70 | 1.60 |

Note the larger standard errors for treatments 2 and 3, which had fewer observations.

Construct a k' vector to compute a fitted value for an observation from treatment 1. Repeat this for each treatment.

Using the k' vectors you have computed verify that, when multiplied by the solution vector, you obtain the same fitted values and standard errors as in the table above.

Building on the k' vectors you have computed, construct a k' vector to estimate the difference (treatment 1 - treatment 2). Now compute this estimate and its standard error.

Construct a k' vector for each of the contrasts :

treatment 1 - treatment 2 |
---|

treatment 1 - treatment 3 |

treatment 1 - treatment 4 |

treatment 1 - treatment 5 |

treatment 1 - treatment 6 |

treatment 2 - treatment 3 |

treatment 2 - treatment 4 |

treatment 2 - treatment 5 |

treatment 2 - treatment 6 |

treatment 3 - treatment 4 |

treatment 3 - treatment 5 |

treatment 3 - treatment 6 |

We can use the esticon function, as indicated above, to estimate each of these pairwise differences. However, it is such a commonly required operation that the lsmeans package will provide this for us:

library(lsmeans) lsmeans(lm1,pairwise~trt) lsmeans(lm1,pairwise~trt,adjust="Scheffe")

We load the package lsmeans, using the library command, and then the lsmeans function will provide us with the lsmeans for each level of your fixed effects, AND also the pairwise differences. If we specify nothing about the differences, then the default is to use Tukey's adjustment for the multiple comparison issue; see the first lsmeans specification. If we wish we can specify an alternative multiple comparison adjustment, e.g. Scheffe, or Bonferroni; see the second specification for the adjustment using the Scheffe option.

Results:

> lsmeans(lm1,pairwise~trt) $lsmeans trt lsmean SE df lower.CL upper.CL 1 28.82 1.598630 22 25.504644 32.13536 2 23.90 1.787323 22 20.193319 27.60668 3 14.35 1.787323 22 10.643319 18.05668 4 19.92 1.598630 22 16.604644 23.23536 5 13.26 1.598630 22 9.944644 16.57536 6 18.70 1.598630 22 15.384644 22.01536 Confidence level used: 0.95 $contrasts contrast estimate SE df t.ratio p.value 1 - 2 4.92 2.397945 22 2.052 0.3470 1 - 3 14.47 2.397945 22 6.034 0.0001 1 - 4 8.90 2.260804 22 3.937 0.0081 1 - 5 15.56 2.260804 22 6.883 <.0001 1 - 6 10.12 2.260804 22 4.476 0.0023 2 - 3 9.55 2.527656 22 3.778 0.0116 2 - 4 3.98 2.397945 22 1.660 0.5705 2 - 5 10.64 2.397945 22 4.437 0.0025 2 - 6 5.20 2.397945 22 2.169 0.2912 3 - 4 -5.57 2.397945 22 -2.323 0.2271 3 - 5 1.09 2.397945 22 0.455 0.9972 3 - 6 -4.35 2.397945 22 -1.814 0.4776 4 - 5 6.66 2.260804 22 2.946 0.0710 4 - 6 1.22 2.260804 22 0.540 0.9938 5 - 6 -5.44 2.260804 22 -2.406 0.1971 P value adjustment: Tukey method for comparing a family of 6 estimates

And from the second statment, using Scheffe's multiple comparison adjustment

> lsmeans(lm1,pairwise~trt,adjust="Scheffe") $lsmeans trt lsmean SE df lower.CL upper.CL 1 28.82 1.598630 22 25.504644 32.13536 2 23.90 1.787323 22 20.193319 27.60668 3 14.35 1.787323 22 10.643319 18.05668 4 19.92 1.598630 22 16.604644 23.23536 5 13.26 1.598630 22 9.944644 16.57536 6 18.70 1.598630 22 15.384644 22.01536 Confidence level used: 0.95 $contrasts contrast estimate SE df t.ratio p.value 1 - 2 4.92 2.397945 22 2.052 0.5345 1 - 3 14.47 2.397945 22 6.034 0.0004 1 - 4 8.90 2.260804 22 3.937 0.0288 1 - 5 15.56 2.260804 22 6.883 0.0001 1 - 6 10.12 2.260804 22 4.476 0.0098 2 - 3 9.55 2.527656 22 3.778 0.0391 2 - 4 3.98 2.397945 22 1.660 0.7360 2 - 5 10.64 2.397945 22 4.437 0.0106 2 - 6 5.20 2.397945 22 2.169 0.4745 3 - 4 -5.57 2.397945 22 -2.323 0.3990 3 - 5 1.09 2.397945 22 0.455 0.9989 3 - 6 -4.35 2.397945 22 -1.814 0.6587 4 - 5 6.66 2.260804 22 2.946 0.1683 4 - 6 1.22 2.260804 22 0.540 0.9975 5 - 6 -5.44 2.260804 22 -2.406 0.3607 P value adjustment: Scheffe method with dimensionality 5

We can see that Scheffe is more conservative than Tukey's test; unless we have a very large number of levels of our fixed effect then it is probably more efficient to use Tukey's option rather that Scheffe. However, if we wish to make a comparison other than simple pairwise (e.g. comparing the average of trts 1, 2, 3, 4, and 5 vs trt 6), then Scheffe's adjustment will be appropriate. For this we can use the esticon function to estimate the particular contrast and its standard error, and then the qf() function to estimate the tabulated F value and hence the Scheffe Critical Difference. Below we show the R code, and then the results:

# this estimates (mu + (trt 1 + trt 2 + trt 3 + trt 4 + trt 5)/5) - (mu + trt 6) # ie the average of trt's 1 to 5 minus trt 6 # N.B. unlike SAS, the contrast coefficients do NOT sum to Zero # you have to be very careful about ensuring that your lambda1 is something # that is estimable. In SAS, if you specify a contrast which is non-estimable # SAS will tell you so, and refuse to 'estimate' something that is non-estimable # R is more accommodating!, it will calculate the number and it is your # responsibility if you have shot yourself in the head! # lambda1 <- c(0, .2, .2, .2, .2, -1) diff1 <- esticon(lm1,lambda1) diff1 diff1$Std.Error # since this is not a comparison of means, then we need # some suitable test, Scheffé's multiple comparison method # will be suitable and appropriate. # # we need the tabulated F value for the ndf and ddf Ftab <- qf(.95, df1=5, df2=22) Ftab # then Critical Difference (CD) = se * sqrt( ndf * F_tab) ndf <- 5 CD <- diff1$Std.Error * sqrt(ndf * Ftab) CD # If |Estimate| > CD then s.s., otherwise NSS # NB Estimate = 1.35, CD = 1.7657, therefore NSS

Results:

> # this estimates (mu + (trt 1 + trt 2 + trt 3 + trt 4 + trt 5)/5) - (mu + trt 6) > # ie the average of trt's 1 to 5 minus trt 6 > # N.B. unlike SAS, the contrast coefficients do NOT sum to Zero > # you have to be very careful about ensuring that your lambda1 is something > # that is estimable. In SAS, if you specify a contrast which is non-estimable > # SAS will tell you so, and refuse to 'estimate' something that is non-estimable > # R is more accommodating!, it will calculate the number and it is your > # responsibility if you have shot yourself in the head! > # > lambda1 <- c(0, .2, .2, .2, .2, -1) > diff1 <- esticon(lm1,lambda1) > diff1 beta0 Estimate Std.Error t.value DF Pr(>|t|) Lower Upper 1 0 1.35 1.765745 0.76455 22 0.4526629 -2.31193 5.01193 > > diff1$Std.Error [1] 1.765745 > > # since this is not a comparison of means, then we need > # some suitable test, Scheffé's multiple comparison method > # will be suitable and appropriate. > # > # we need the tabulated F value for the ndf and ddf > > Ftab <- qf(.95, df1=5, df2=22) > Ftab [1] 2.661274 > > # then Critical Difference (CD) = se * sqrt( ndf * F_tab) > ndf <- 5 > CD <- diff1$Std.Error * sqrt(ndf * Ftab) > CD [1] 6.441065 > > # If |Estimate| > CD then s.s., otherwise NSS > # NB Estimate = 1.35, CD = 1.7657, therefore NSS >

R.I. Cue ©

Department of Animal Science, McGill University

last updated : 2017 April 18