Factorial experiments arise when we have 2 or more factors with multiple measurements for each combination of factors. See Steel, Torrie and Dickey, Chapter 15.

Example with animals

Consider a study of the effects of various diets on the growth of rats. We have 2 sexes (1 = Male and 2 = Female), factor A, and 3 diets (1, 2 and 3), factor B. We think that the effects of the 3 diets may not be the same in males and in females, i.e. that there is an interaction between Diet and Sex. This is a 2 x 3 factorial design, similar to the example given below for plants. For each of the 6 combinations we decide to use 4 experimental units (i.e. 4 animals, each in a separate cage). Therefore we require 24 cages (12 males and 12 females; 1 rat per cage). The 12 males must be a random sample from the available males and must be assigned to the 3 diets (1, 2 and 3) at random; similarly for the females.

Note that in the table below the first subscript refers to Sex (1 = Male, 2 = Female) and the second to Diet (1, 2 or 3); the third indicates whether it is the first, second, third or fourth experimental unit for the given combination. Thus the 3-character subscript serves to designate any particular observation.

Y_{112}, Y=104 | Y_{223}, Y=103 | Y_{123}, Y=91 | Y_{131}, Y=94 | Y_{133}, Y=92 | Y_{222}, Y=104 |

Y_{111}, Y=104 | Y_{134}, Y=93 | Y_{233}, Y=92 | Y_{213}, Y=101 | Y_{224}, Y=104 | Y_{234}, Y=89 |

Y_{214}, Y=105 | Y_{113}, Y=107 | Y_{124}, Y=93 | Y_{121}, Y=99 | Y_{212}, Y=102 | Y_{221}, Y=98 |

Y_{232}, Y=91 | Y_{132}, Y=86 | Y_{211}, Y=101 | Y_{122}, Y=97 | Y_{114}, Y=106 | Y_{231}, Y=86 |

Example with crops/plants

An example with crops will serve to illustrate the basic concept. Suppose that we are interested in 2 factors (Phosphorus and Nitrogen) and their effect on maize yield. We are going to use 2 'levels' of Phosphorus (1 = High = 150 kg, and 2 = Low = 50 kg), factor A, and 3 'levels' of Nitrogen (1 = High = 200 kg, 2 = Medium = 150 kg, and 3 = Low = 100 kg), factor B. This is a 2 x 3 factorial. For each of the 6 combinations we decide to use 4 experimental plots, for a total of 24 (2 x 3 x 4) plots. The plots are the experimental units and must be assigned to the treatment combinations at random.

Note that the first subscript refers to Phosphorus (1 = High, 2 = Low) and the second to Nitrogen (1 = High, 2 = Medium, 3 = Low); the third indicates whether it is the first, second, third or fourth experimental unit for the given combination. Thus the 3-character subscript serves to designate any particular observation.

Y_{112}, Y=104 | Y_{223}, Y=103 | Y_{123}, Y=91 | Y_{131}, Y=94 | Y_{133}, Y=92 | Y_{222}, Y=104 |

Y_{111}, Y=104 | Y_{134}, Y=93 | Y_{233}, Y=92 | Y_{213}, Y=101 | Y_{224}, Y=104 | Y_{234}, Y=89 |

Y_{214}, Y=105 | Y_{113}, Y=107 | Y_{124}, Y=93 | Y_{121}, Y=99 | Y_{212}, Y=102 | Y_{221}, Y=98 |

Y_{232}, Y=91 | Y_{132}, Y=86 | Y_{211}, Y=101 | Y_{122}, Y=97 | Y_{114}, Y=106 | Y_{231}, Y=86 |

Then the linear model will be

Y_{ijk} = µ + A_{i} + B_{j} + AB_{ij} + e_{ijk}

This is a fixed effects model; we are interested in the fixed, specific levels
of the factors A and B that we chose to consider. These results
cannot be extrapolated to other 'levels' of A or B, or even to intermediate
amounts, *e.g.* 105 kg Nitrogen.

Then the Expected Mean Squares from the Analysis of Variance are :
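For reference, here is a sketch of the standard textbook Expected Mean Squares for a balanced fixed effects two-factor model, in the notation of the model above. It assumes the usual sum-to-zero restrictions on the effects, with a = 2 levels of A, b = 3 levels of B and n = 4 observations per subclass. The symbol \sigma^2_e for the residual variance is introduced here and does not come from the text.

```latex
\begin{aligned}
E(MS_A) &= \sigma^2_e + \frac{nb}{a-1}\sum_{i=1}^{a} A_i^2
         = \sigma^2_e + 12\sum_{i=1}^{2} A_i^2 \\
E(MS_B) &= \sigma^2_e + \frac{na}{b-1}\sum_{j=1}^{b} B_j^2
         = \sigma^2_e + 4\sum_{j=1}^{3} B_j^2 \\
E(MS_{AB}) &= \sigma^2_e + \frac{n}{(a-1)(b-1)}\sum_{i=1}^{a}\sum_{j=1}^{b} AB_{ij}^2
         = \sigma^2_e + 2\sum_{i=1}^{2}\sum_{j=1}^{3} AB_{ij}^2 \\
E(MS_{Residual}) &= \sigma^2_e
\end{aligned}
```

Under the null hypothesis for any effect its sum vanishes, so each mean square is tested against the Residual mean square.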

Run this model and examine the output. Verify the Normal Equations, the fitted values and the contrast statements (specifically the implicit k' matrix printed by the /e option).

The fitted value for A_{1}, B_{1} is :

With k' matrix

Thus the fitted values for factor A and their k' matrices are :

Fitted value = µ + a_{1} + b_{1} + ab_{11}

k' =

Fitted value = µ + a_{1} + b_{2} + ab_{12}

k' =

Fitted value = µ + a_{1} + b_{3} + ab_{13}

k' =

Fitted value = µ + a_{2} + b_{1} + ab_{21}

k' =

Fitted value = µ + a_{2} + b_{2} + ab_{22}

k' =

Fitted value = µ + a_{2} + b_{3} + ab_{23}

k' =
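Each k' row simply marks which parameters enter a fitted value. The following is a minimal Python sketch rather than SAS output. It assumes the parameters are ordered µ, a1, a2, b1, b2, b3, ab11, ab12, ab13, ab21, ab22, ab23, which is a common convention but an assumption here, so check it against the order printed in your own output. The names `params` and `cell_row` are hypothetical.

```python
# Assumed parameter order (an assumption, not taken from the source).
params = ["mu", "a1", "a2", "b1", "b2", "b3",
          "ab11", "ab12", "ab13", "ab21", "ab22", "ab23"]


def cell_row(i, j):
    """Return the k' row that picks out mu + a_i + b_j + ab_ij."""
    wanted = {"mu", f"a{i}", f"b{j}", f"ab{i}{j}"}
    return [1 if p in wanted else 0 for p in params]


# One row per subclass, in the same order as the fitted values listed above.
for i in (1, 2):
    for j in (1, 2, 3):
        print(f"A{i} B{j}: k' =", cell_row(i, j))
```

Every row has exactly four 1s: one each for µ, the A level, the B level and the AB subclass.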

Thus the contrast between factor A_{1} and A_{2} is

With k' matrix =
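One way to arrive at such a contrast, sketched in Python under stated assumptions, is to average the three subclass rows for each level of A and subtract. It uses the same assumed parameter order (µ, a1, a2, b1, b2, b3, ab11, ..., ab23) and takes the subclass means from the data, with 4 observations per subclass. The helper names are hypothetical.

```python
from fractions import Fraction

# Assumed parameter order (an assumption; check it against your own output).
params = ["mu", "a1", "a2", "b1", "b2", "b3",
          "ab11", "ab12", "ab13", "ab21", "ab22", "ab23"]


def cell_row(i, j):
    """k' row for the fitted value mu + a_i + b_j + ab_ij."""
    wanted = {"mu", f"a{i}", f"b{j}", f"ab{i}{j}"}
    return [1 if p in wanted else 0 for p in params]


def level_row(i):
    """Average the three subclass rows for level i of factor A."""
    rows = [cell_row(i, j) for j in (1, 2, 3)]
    return [Fraction(sum(col), 3) for col in zip(*rows)]


# Contrast A1 - A2: mu and the b_j cancel, the ab_ij do not.
k_a = [x - y for x, y in zip(level_row(1), level_row(2))]
print("k' =", [str(v) for v in k_a])

# Subclass means computed from the data table; n observations per subclass.
means = {(1, 1): Fraction("105.25"), (1, 2): Fraction("95.00"),
         (1, 3): Fraction("91.25"), (2, 1): Fraction("102.25"),
         (2, 2): Fraction("102.25"), (2, 3): Fraction("89.50")}
n = 4

# The coefficient on ab_ij is the weight given to subclass mean (i, j).
weights = {c: k_a[params.index(f"ab{c[0]}{c[1]}")] for c in means}
estimate = sum(weights[c] * means[c] for c in means)
ss_a = estimate ** 2 / sum(w ** 2 / n for w in weights.values())
print("estimate =", float(estimate), " SS =", round(float(ss_a), 3))
```

The resulting sum of squares agrees with the 4.167 given for A in the ANOVA table, and the non-zero ab coefficients show why the contrast is not free of interaction effects.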

Verify these algebraic computations and the numerical results. Compare them to the SAS output and check that with the k' matrix you obtain the Type III, Marginal, Sums of Squares for factor A (Phosphorus or Sex).

Note particularly that the Sums of Squares for factor A are thus adjusted for
factor B, and for the (un)equal frequency of observations in each AB subclass,
but do include interaction effects, i.e. SS_{A} = R( A | µ, B)

Repeat this same exercise for factor B (Nitrogen or Diet) fitted values and differences to ensure that again the k' matrix generated gives the Type III, Marginal, Sums of Squares for the effect of factor B (Nitrogen or Diets).

Again, note that the Sums of Squares for factor B = R( B | µ, A).

Note that although we are computing Sums of Squares, as well as linear
functions of fitted values, for factor A adjusted for factor B, and
*vice versa*, we are **not** able to adjust out completely the
interaction components! We can account for unequal frequencies but we
are unable to completely remove the interaction effects, as noted by their
presence in the k' matrices.

For the Sums of Squares for the Interaction we have to remember that the Interaction measures whether the differences between factor A levels are the same at each level of factor B; i.e. the differences of the differences.

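(µ + a_{1} + b_{1} + ab_{11}) - (µ + a_{2} + b_{1} + ab_{21}) = a_{1} - a_{2} + ab_{11} - ab_{21} measures the difference between levels 1 and 2 of factor A at level 1 of factor B.

Similarly, (µ + a_{1} + b_{2} + ab_{12}) - (µ + a_{2} + b_{2} + ab_{22}) = a_{1} - a_{2} + ab_{12} - ab_{22} measures the difference between levels 1 and 2 of factor A at level 2 of factor B.

Thus the difference between these 2 differences measures whether the difference between levels 1 and 2 of factor A is the same at level 1 and at level 2 of factor B :

(a_{1} - a_{2} + ab_{11} - ab_{21}) - (a_{1} - a_{2} + ab_{12} - ab_{22}) = ab_{11} - ab_{12} - ab_{21} + ab_{22}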

A suitable k' matrix would be :

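Similarly, (µ + a_{1} + b_{3} + ab_{13}) - (µ + a_{2} + b_{3} + ab_{23}) = a_{1} - a_{2} + ab_{13} - ab_{23} measures the difference between levels 1 and 2 of factor A at level 3 of factor B.

Thus the difference between the difference at level 1 of factor B and the difference at level 3 of factor B measures whether the difference between levels 1 and 2 of factor A is the same at level 1 and at level 3 of factor B :

(a_{1} - a_{2} + ab_{11} - ab_{21}) - (a_{1} - a_{2} + ab_{13} - ab_{23}) = ab_{11} - ab_{13} - ab_{21} + ab_{23}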

A suitable k' matrix would be :

Therefore we have 2 contrasts,

ab_{11} - ab_{12} - ab_{21} + ab_{22}

ab_{11} - ab_{13} - ab_{21} + ab_{23}

This measures the Interaction component. Note that the 2 contrasts involve only interaction components, thus the interaction component is free of main effects, i.e. it is R(AB | µ, A, B).

Construct a suitable k' matrix (of 2 rows) involving these 2 contrasts that will enable us to compute the Sums of Squares for the interaction effect.
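As a cross-check outside SAS, the sum of squares for a 2-row contrast can be computed directly from the subclass means. This Python sketch assumes the parameter order µ, a1, a2, b1, b2, b3, ab11, ab12, ab13, ab21, ab22, ab23 and 4 observations per subclass. The function name `contrast_ss` is hypothetical, and the quadratic form L' V^{-1} L it evaluates is the standard one for jointly testing contrasts among independent means.

```python
from fractions import Fraction

# Subclass means keyed by (level of A, level of B); 4 observations each.
cells = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
values = ["105.25", "95.00", "91.25", "102.25", "102.25", "89.50"]
means = dict(zip(cells, map(Fraction, values)))
n = 4

# Coefficients on ab11, ab12, ab13, ab21, ab22, ab23 for the two contrasts.
row1 = dict(zip(cells, [1, -1, 0, -1, 1, 0]))   # ab11 - ab12 - ab21 + ab22
row2 = dict(zip(cells, [1, 0, -1, -1, 0, 1]))   # ab11 - ab13 - ab21 + ab23


def contrast_ss(r1, r2):
    """Sum of squares (2 d.f.) for two contrasts among subclass means.

    Evaluates L' V^{-1} L, where L holds the two contrast estimates and
    V = C D C' with D = diag(1/n). V is 2 x 2, so it is inverted by hand.
    """
    l1 = sum(r1[c] * means[c] for c in cells)
    l2 = sum(r2[c] * means[c] for c in cells)
    v11 = Fraction(sum(r1[c] ** 2 for c in cells), n)
    v22 = Fraction(sum(r2[c] ** 2 for c in cells), n)
    v12 = Fraction(sum(r1[c] * r2[c] for c in cells), n)
    det = v11 * v22 - v12 ** 2
    return (v22 * l1 ** 2 - 2 * v12 * l1 * l2 + v11 * l2 ** 2) / det


# Full k' rows under the assumed order: six zeros for mu, a1, a2, b1, b2, b3.
for row in (row1, row2):
    print("k' =", [0] * 6 + [row[c] for c in cells])

ss_ab = contrast_ss(row1, row2)
print("SS(A * B) =", round(float(ss_ab), 3))
```

The result matches the 125.083 given for A * B in the ANOVA table. Because the two rows are not orthogonal, their separate sums of squares cannot simply be added, which is why the joint quadratic form is used.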

Use this contrast in PROC GLM to compute the Sums of Squares and verify that they equal those in the following ANOVA table.

Source of variation | d.f. | Sums of Squares | Mean Squares | F-ratio | Pr |
---|---|---|---|---|---|
SSR_{m} | 5 | 857.833 | 171.567 | 21.75 | .0001 |
A | 1 | 4.167 | 4.167 | 0.53 | .47 |
B | 2 | 728.583 | 364.292 | 46.18 | .0001 |
A * B | 2 | 125.083 | 62.542 | 7.93 | .0034 |
Residual | 18 | 142.00 | 7.89 | | |
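The sums of squares can also be checked by hand from the raw data. The following Python sketch is not the SAS code of the course. It uses the classical totals-based formulas, which reproduce the adjusted sums of squares only because every subclass has the same number of observations. The function name `marginal_ss` is hypothetical.

```python
# Observations keyed by (level of A, level of B), copied from the data table.
data = {
    (1, 1): [104, 104, 107, 106],
    (1, 2): [99, 97, 91, 93],
    (1, 3): [94, 86, 92, 93],
    (2, 1): [101, 102, 101, 105],
    (2, 2): [98, 104, 103, 104],
    (2, 3): [86, 91, 92, 89],
}

n_obs = sum(len(ys) for ys in data.values())
cf = sum(sum(ys) for ys in data.values()) ** 2 / n_obs   # correction factor


def marginal_ss(position):
    """SS from the totals of one factor (position 0 = A, 1 = B).

    Valid as an adjusted SS only for equal subclass numbers.
    """
    totals, counts = {}, {}
    for key, ys in data.items():
        level = key[position]
        totals[level] = totals.get(level, 0) + sum(ys)
        counts[level] = counts.get(level, 0) + len(ys)
    return sum(totals[lv] ** 2 / counts[lv] for lv in totals) - cf


ss_cells = sum(sum(ys) ** 2 / len(ys) for ys in data.values())
ss_model = ss_cells - cf
ss_a, ss_b = marginal_ss(0), marginal_ss(1)
ss_ab = ss_model - ss_a - ss_b
ss_res = sum(y * y for ys in data.values() for y in ys) - ss_cells
df_res = n_obs - len(data)
ms_res = ss_res / df_res

for name, ss, df in [("Model", ss_model, 5), ("A", ss_a, 1),
                     ("B", ss_b, 2), ("A * B", ss_ab, 2)]:
    print(f"{name:8s} {df:3d} {ss:9.3f} {ss / df:9.3f} {ss / df / ms_res:7.2f}")
print(f"{'Residual':8s} {df_res:3d} {ss_res:9.3f} {ms_res:9.3f}")
```

Each mean square is the sum of squares divided by its d.f., and each F-ratio is that mean square divided by the Residual mean square.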

If the interaction term is statistically significant what does this mean?

It means that the interaction term AB is presumed to be real and to exist; that the model cannot be simply explained in terms of µ, A and B (see Ch 15.1 and Ch 15.2 of Steel, Torrie and Dickey). Thus our fitted values are :

Looking at these estimable functions we can see that there is no way to obtain
only differences between levels of a main effect completely free of other
effects; i.e. (A_{1} - A_{2}) is **not** estimable.

Why? As an exercise, write down algebraically each and every fitted value (estimated value).

We must consider what are sometimes called 'Simple Effects'.

Simple effects are 'simply' (*sic*) the fitted values, i.e.

There are 6 fitted values in this example for the 'simple effects'.

Simple effects | Estimate ± s.e. |
---|---|
µ + a_{1} + b_{1} + ab_{11} | 105.25 ± 1.40 |
µ + a_{1} + b_{2} + ab_{12} | 95.00 ± 1.40 |
µ + a_{1} + b_{3} + ab_{13} | 91.25 ± 1.40 |
µ + a_{2} + b_{1} + ab_{21} | 102.25 ± 1.40 |
µ + a_{2} + b_{2} + ab_{22} | 102.25 ± 1.40 |
µ + a_{2} + b_{3} + ab_{23} | 89.50 ± 1.40 |
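These estimates are the subclass means, and each standard error is the square root of the Residual mean square divided by the number of observations in the subclass. A short Python sketch of that calculation, using the raw data and hypothetical variable names:

```python
from math import sqrt

# Observations keyed by (level of A, level of B), copied from the data table.
data = {
    (1, 1): [104, 104, 107, 106],
    (1, 2): [99, 97, 91, 93],
    (1, 3): [94, 86, 92, 93],
    (2, 1): [101, 102, 101, 105],
    (2, 2): [98, 104, 103, 104],
    (2, 3): [86, 91, 92, 89],
}

# Pooled Residual mean square: within-subclass SS over within-subclass d.f.
means, ss_res, df_res = {}, 0.0, 0
for key, ys in data.items():
    means[key] = sum(ys) / len(ys)
    ss_res += sum((y - means[key]) ** 2 for y in ys)
    df_res += len(ys) - 1
ms_res = ss_res / df_res

for (i, j), ys in data.items():
    se = sqrt(ms_res / len(ys))   # s.e. of a mean of len(ys) observations
    print(f"mu + a{i} + b{j} + ab{i}{j}: {means[(i, j)]:.2f} ± {se:.2f}")
```

With equal subclass numbers every simple effect has the same standard error. With unequal numbers the divisor, and so the standard error, would differ from subclass to subclass.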

An exercise. Plot the fitted values (on the Y-axis) against diet (as 1, 2 and 3 on the X-axis) for both males and females. Estimate (± s.e.) the differences between males and females on each of the 3 diets.
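For the second part of the exercise, a minimal Python sketch follows. It takes the subclass means from the simple effects table and the Residual sum of squares and d.f. (142.00 and 18) from the ANOVA table, and assumes the two means in each difference are independent with 4 animals each.

```python
from math import sqrt

# Subclass means from the simple effects table, keyed by (sex, diet).
means = {(1, 1): 105.25, (1, 2): 95.00, (1, 3): 91.25,
         (2, 1): 102.25, (2, 2): 102.25, (2, 3): 89.50}
ms_res = 142.0 / 18   # Residual mean square from the ANOVA table
n = 4                 # animals per subclass

# Each difference involves two independent means of n animals each.
se_diff = sqrt(ms_res * (1 / n + 1 / n))
diffs = {diet: means[(1, diet)] - means[(2, diet)] for diet in (1, 2, 3)}
for diet, d in diffs.items():
    print(f"Diet {diet}: male - female = {d:+.2f} ± {se_diff:.2f}")
```

The differences change sign across diets, which is the pattern the interaction term picks up.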

What if we think that the variability differs between the groups, i.e. that the variances in the various subclasses (Factor A * Factor B subgroups) are not homogeneous? Check back in the section Normality and Homogeneity of Variance. Does this give you any ideas how to tackle and answer this question? Jump to this section for more details on the actual SAS code for this problem, answer and discussion.
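One informal first look, which is not necessarily the approach taken in the section referred to, is to compute the variance within each subclass and compare the largest with the smallest. The Python sketch below does only that; it is a descriptive check and not a formal test of homogeneity.

```python
from statistics import variance

# Observations keyed by (level of A, level of B), copied from the data table.
data = {
    (1, 1): [104, 104, 107, 106],
    (1, 2): [99, 97, 91, 93],
    (1, 3): [94, 86, 92, 93],
    (2, 1): [101, 102, 101, 105],
    (2, 2): [98, 104, 103, 104],
    (2, 3): [86, 91, 92, 89],
}

# Sample variance (divisor n - 1) within each subclass.
cell_var = {key: variance(ys) for key, ys in data.items()}
for (i, j), v in cell_var.items():
    print(f"A{i} B{j}: s^2 = {v:.2f}")

# With equal subclass numbers the pooled variance is the plain average,
# and it equals the Residual mean square of the ANOVA.
pooled = sum(cell_var.values()) / len(cell_var)
ratio = max(cell_var.values()) / min(cell_var.values())
print(f"pooled s^2 = {pooled:.2f}, largest / smallest = {ratio:.2f}")
```

With only 4 observations per subclass the individual variances are poorly estimated, so a large ratio on its own is weak evidence of heterogeneity.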

Here is another example, with unequal numbers of observations. Repeat the above exercises and construct the Analysis of Variance table and compute the fitted values.

SAS code, PROC GLM, example with unequal numbers of observations.

R.I. Cue ©

Department of Animal Science, McGill University

last updated : 2010 May 6