Here are the results from running the Randomised Complete Block analysis, using SAS PROC MIXED.

The MIXED Procedure

Class Level Information

Class | Levels | Values |
---|---|---|
DIET | 4 | 1 2 3 4 |
MONTH | 12 | 1 2 3 4 5 6 7 8 9 10 11 12 |

REML Estimation Iteration History

Iteration | Evaluations | Objective | Criterion |
---|---|---|---|
0 | 1 | 173.52884464 | |
1 | 1 | 82.12223052 | 0.00000000 |

Convergence criteria met.

Covariance Parameter Estimates (REML)

Cov Parm | Estimate |
---|---|
MONTH | 14.54488636 |
Residual | 0.60414773 |

What does the above tell us?

- In this data set that we are analysing, using PROC MIXED, we have a factor DIET which has been fitted as a CLASS variable and which has 4 levels (1, 2, 3, 4).
- We also have a factor MONTH which is fitted as a CLASS variable with 12 levels (1 to 12).
- The iterative Restricted Maximum Likelihood (REML) procedure was used to estimate the 2 variance components (MONTH and Residual), and the convergence criteria were met at Iteration 1 (Iteration 0 being the starting point).
- The estimates of the (co)variance components were: month variance = 14.544886 and residual variance = 0.6041477.

Model Fitting Information for GAIN

Description | Value |
---|---|
Observations | 48.0000 |
Res Log Likelihood | -81.4944 |
Akaike's Information Criterion | -83.4944 |
Schwarz's Bayesian Criterion | -85.2786 |
-2 Res Log Likelihood | 162.9888 |

Tests of Fixed Effects

Source | NDF | DDF | Type III F | Pr > F |
---|---|---|---|---|
DIET | 3 | 33 | 49.78 | 0.0001 |

ESTIMATE Statement Results

Parameter | Estimate | Std Error | DF | t | Pr > \|t\| |
---|---|---|---|---|---|
mean + diet 1 | 51.15833333 | 1.12357443 | 33 | 45.53 | 0.0001 |
mean + diet 2 | 51.97500000 | 1.12357443 | 33 | 46.26 | 0.0001 |
mean + diet 3 | 53.25833333 | 1.12357443 | 33 | 47.40 | 0.0001 |
mean + diet 4 | 54.78333333 | 1.12357443 | 33 | 48.76 | 0.0001 |
diet 1 - 2 | -0.81666667 | 0.31731891 | 33 | -2.57 | 0.0147 |
diet 1 - 3 | -2.10000000 | 0.31731891 | 33 | -6.62 | 0.0001 |

We can see the information that SAS uses in fitting the model, specifically the (Restricted) Log Likelihood (-81.4944), as well as Akaike's Information Criterion (AIC) and Schwarz's Bayesian Criterion (SBC). These relate to the fit of the model, but are usually of no particular value in themselves; they are only relative values for comparing 1 model with another.
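The AIC and SBC values in this listing look like the "larger is better" form, in which a penalty is subtracted from the Res Log Likelihood. The Python sketch below reproduces the printed values under that assumption. The penalty formulas (q for AIC and (q/2) ln(N - p) for SBC, with q covariance parameters and p the rank of the fixed-effects part of X) are inferred from the numbers, not stated in these notes, and the function name is purely illustrative.

```python
import math

def info_criteria(res_loglik, n_cov_parms, n_obs, rank_x):
    """AIC and SBC in the 'larger is better' form (an assumed convention)."""
    aic = res_loglik - n_cov_parms
    sbc = res_loglik - 0.5 * n_cov_parms * math.log(n_obs - rank_x)
    return aic, sbc

# Model with random MONTH: q = 2 (MONTH and Residual), N = 48,
# p = 4 (the mean plus 3 independent diet effects).
aic_month, sbc_month = info_criteria(-81.4944, 2, 48, 4)

# Model without MONTH: q = 1 (Residual only), same N and p.
aic_no_month, sbc_no_month = info_criteria(-127.198, 1, 48, 4)

print(round(aic_month, 4), round(sbc_month, 4))
print(round(aic_no_month, 3), round(sbc_no_month, 3))
```

Under this convention the model with the larger (less negative) AIC or SBC would be preferred, which here is the model that includes Month.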

-2 (Res) Log Likelihood is an interesting and useful number, because if
we are comparing 2 models, for example if we had also run this data with
a model without Month, the difference in -2 Log Likelihoods between the
two models has (approximately) a chi-squared distribution
(χ^{2}), with degrees of freedom equal to the number of extra
(co)variance parameters; so we could test whether the
effect of Month (σ^{2}_{month})
was in fact statistically significant.

The tests of the fixed effects provide us with a synopsis of what would
be an Analysis of Variance table. For the fixed effect of Diet (the only
fixed effect in this model) we have 3 degrees of freedom for the
numerator (NDF, 4 levels - 1 = 3) and 33 degrees of freedom for the
denominator (DDF), which in this case is simply the residual degrees of
freedom (DDF = N - r(X) = 48 - 15 = 33, where the 15 is made up of 1 for
µ, 3 for Diets and 11 for Month). The Type III F is the F-ratio: the
Marginal SS for Diet, as we compute using our k' matrix, divided by the
degrees of freedom for Diet to give the Mean Square for Diet, which is
then divided by the appropriate Error Mean Square, in this case the
Residual, or MSE. Finally Pr > F indicates the probability of obtaining
such a large F-ratio simply by random chance when there is no effect of
Diet (our Null Hypothesis, H_{0}). We can see that this is very
unlikely, less than 1 in 10000, so I shall reject the H_{0} and instead
accept that there are indeed statistically significant differences
between Diets.
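The arithmetic behind the denominator degrees of freedom and the F-ratio can be checked with a few lines of Python. This is only a sketch: the Marginal SS for Diet (90.2306250) does not appear in the PROC MIXED listing, so it is taken here from the PROC GLM Type III table further down the page, and the variable names are illustrative.

```python
# Denominator degrees of freedom: N - r(X)
n_obs = 48
rank_x = 1 + (4 - 1) + (12 - 1)   # mu, diets, months
ddf = n_obs - rank_x

# F-ratio for Diet: Mean Square for Diet divided by the residual Mean Square
ss_diet = 90.2306250    # Marginal (Type III) SS for DIET, from the PROC GLM listing
ndf = 4 - 1             # numerator degrees of freedom
mse = 0.60414773        # residual variance estimated by PROC MIXED

ms_diet = ss_diet / ndf
f_ratio = ms_diet / mse

print(ddf, round(ms_diet, 4), round(f_ratio, 2))
```

The result agrees with the DDF of 33 and the Type III F of 49.78 shown in the Tests of Fixed Effects.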

Least Squares Means

Effect | DIET | LSMEAN | Std Error | DF | t | Pr > \|t\| |
---|---|---|---|---|---|---|
DIET | 1 | 51.15833333 | 1.12357443 | 33 | 45.53 | 0.0001 |
DIET | 2 | 51.97500000 | 1.12357443 | 33 | 46.26 | 0.0001 |
DIET | 3 | 53.25833333 | 1.12357443 | 33 | 47.40 | 0.0001 |
DIET | 4 | 54.78333333 | 1.12357443 | 33 | 48.76 | 0.0001 |

The LSMEANS are Least Squares Means (n.b. most journals and supervisors love LSMeans!). The LSMean for Diet 1 is simply the average of the 12 fitted values for Diet 1, i.e.

( µ + diet_{1}+ month_{1}+ µ + diet_{1}+ month_{2}+ µ + diet_{1}+ month_{3}+ µ + diet_{1}+ month_{4}+ µ + diet_{1}+ month_{5}+ µ + diet_{1}+ month_{6}+ µ + diet_{1}+ month_{7}+ µ + diet_{1}+ month_{8}+ µ + diet_{1}+ month_{9}+ µ + diet_{1}+ month_{10}+ µ + diet_{1}+ month_{11}+ µ + diet_{1}+ month_{12}) / 12

You should note that this LSMean is a linear function of fitted values (and hence estimable). A suitable k' matrix would be

µ | Diets | Months |
---|---|---|
( 1 | 1 0 0 0 | 1/12 1/12 1/12 1/12 1/12 1/12 1/12 1/12 1/12 1/12 1/12 1/12 ) |

The Sampling Variance of our LSMean estimate is simply k'(X'X)^{-}k
× phenotypic variance. The phenotypic variance = residual variance +
month variance.

*i.e.* σ_{p}^{2} = σ_{e}^{2} + σ_{m}^{2}

variance_{p} = 0.60414773 + 14.54488636 = 15.149034

Because this is a simple, completely balanced experiment with each diet
having 12 observations (1 per month), k'(X'X)^{-}k reduces to 1/12, so
the Sampling Variance = variance_{p}/12 = 15.149 / 12 = 1.2624.

The standard error of our estimate is the square root of this, which gives us the value of 1.124.
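As a check on this calculation, here is a short Python sketch that starts from the two REML variance component estimates and arrives at the standard error printed for each LSMean. The variable names are illustrative, and the division by 12 relies on the balanced design described above.

```python
import math

var_month = 14.54488636   # REML estimate of the month variance
var_resid = 0.60414773    # REML estimate of the residual variance
n_per_diet = 12           # observations per diet (1 per month)

var_p = var_month + var_resid        # phenotypic variance
sampling_var = var_p / n_per_diet    # sampling variance of an LSMean
std_error = math.sqrt(sampling_var)  # standard error of an LSMean

print(round(var_p, 6), round(sampling_var, 4), round(std_error, 5))
```

The standard error agrees with the 1.12357443 reported by PROC MIXED for all four diets.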

We can test the statistical significance of the random Month effect by re-running the analysis, but dropping out the effect of MONTH.

This gives the following results:

The MIXED Procedure

Class Level Information

Class | Levels | Values |
---|---|---|
DIET | 4 | 1 2 3 4 |

Covariance Parameter Estimates (REML)

Cov Parm | Estimate |
---|---|
Residual | 15.14903409 |

Model Fitting Information for GAIN

Description | Value |
---|---|
Observations | 48.0000 |
Res Log Likelihood | -127.198 |
Akaike's Information Criterion | -128.198 |
Schwarz's Bayesian Criterion | -129.090 |
-2 Res Log Likelihood | 254.3954 |

Tests of Fixed Effects

Source | NDF | DDF | Type III F | Pr > F |
---|---|---|---|---|
DIET | 3 | 44 | 1.99 | 0.1300 |

What we are interested in is the -2 (Res) Log Likelihood number, which is 254.3954. What does this mean and what does it tell us?

Model | -2 LnL |
---|---|
( µ + Diet_{fixed}) | 254.3954 |
( µ + Diet_{fixed} + Month_{random}) | 162.9888 |
Difference | 91.4066 |

Thus we have a χ^{2} of 91.4 for 1 degree
of freedom (for our 1 parameter
[σ^{2}_{month}]).
The critical tabulated value for a χ^{2}
with 1 d.f. and a Pr of 5% is 3.84. Thus we can conclude that the effect of
Month is highly significant and should be retained in the model; therefore
the first analysis is the one that we should use.
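The likelihood ratio test can be reproduced numerically. The Python sketch below uses only the standard library: for 1 degree of freedom the upper-tail probability of a chi-squared variable equals erfc(sqrt(x / 2)), because such a variable is the square of a standard normal deviate. The function and variable names are illustrative.

```python
import math

def chi2_upper_tail_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))

m2_lnl_without_month = 254.3954   # -2 Res Log Likelihood, mu + Diet
m2_lnl_with_month = 162.9888      # -2 Res Log Likelihood, mu + Diet + Month

lrt = m2_lnl_without_month - m2_lnl_with_month
p_value = chi2_upper_tail_1df(lrt)

print(round(lrt, 4))                          # test statistic
print(p_value)                                # far below 0.05
print(round(chi2_upper_tail_1df(3.84), 3))    # tail area at the tabulated 5% point
```

The tail probability for 91.4 is many orders of magnitude smaller than 5%, which supports keeping Month in the model.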

Here we look at the results that we would have got if we had mistakenly used a Fixed Effects model.

General Linear Models Procedure

Class Level Information

Class | Levels | Values |
---|---|---|
DIET | 4 | 1 2 3 4 |
MONTH | 12 | 1 2 3 4 5 6 7 8 9 10 11 12 |

Number of observations in data set = 48

Much as for PROC MIXED, PROC GLM tells us that DIET and MONTH were class variables with 4 and 12 levels respectively.

General Linear Models Procedure

Dependent Variable: GAIN

Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Model | 14 | 736.8512500 | 52.6322321 | 87.12 | 0.0001 |
Error | 33 | 19.9368750 | 0.6041477 | | |
Corrected Total | 47 | 756.7881250 | | | |

R-Square | C.V. | Root MSE | GAIN Mean |
---|---|---|---|
0.973656 | 1.472275 | 0.777269 | 52.79375 |

Source | DF | Type I SS | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
DIET | 3 | 90.2306250 | 30.0768750 | 49.78 | 0.0001 |
MONTH | 11 | 646.6206250 | 58.7836932 | 97.30 | 0.0001 |

Source | DF | Type III SS | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
DIET | 3 | 90.2306250 | 30.0768750 | 49.78 | 0.0001 |
MONTH | 11 | 646.6206250 | 58.7836932 | 97.30 | 0.0001 |

The GLM procedure gives an Analysis of Variance in which the Sources of Variation are the Model and the Error (Residual).

The Model is in fact the Model corrected for the Mean, or the Model over and above the Mean, R(Diet, Month | µ).

The Type I Sums of Squares are the Sequential Sums of Squares (due to fitting the effects in the order specified in the SAS model statement). The Type III Sums of Squares are the Marginal Sums of Squares and are therefore independent of the order in which they are fitted.
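The summary statistics in the PROC GLM listing all follow from the Sums of Squares and degrees of freedom. The Python sketch below recomputes them. The formulas used are the usual textbook definitions (R-Square = Model SS / Corrected Total SS, C.V. = 100 × Root MSE / mean), which is an assumption about how SAS arrives at these values, although the numbers agree.

```python
import math

ss_model, df_model = 736.8512500, 14
ss_error, df_error = 19.9368750, 33
ss_total = 756.7881250
gain_mean = 52.79375

ms_model = ss_model / df_model     # Mean Square for the Model
mse = ss_error / df_error          # Error Mean Square
f_model = ms_model / mse           # F Value for the Model
r_square = ss_model / ss_total     # proportion of variation explained
root_mse = math.sqrt(mse)          # residual standard deviation
cv = 100.0 * root_mse / gain_mean  # coefficient of variation, in percent

# In this balanced design the Diet and Month SS add up to the Model SS,
# which is why the Type I and Type III tables are identical.
ss_diet, ss_month = 90.2306250, 646.6206250
ss_sum = ss_diet + ss_month

print(round(f_model, 2), round(r_square, 6), round(root_mse, 6), round(cv, 6))
```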

General Linear Models Procedure

Source | Type III Expected Mean Square |
---|---|
DIET | Var(Error) + Q(DIET) |
MONTH | Var(Error) + 4 Var(MONTH) |

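Having declared MONTH as a Random effect we obtain a table of the Expectations of the various Mean Squares in the ANOVA, E(MS). This is in spite of the fact that PROC GLM is a fixed effects model and fits all effects as Fixed Effects. Anyway, if we equate the observed Mean Squares to their E(MS) we can estimate the variance due to Month: Var(MONTH) = (MS_{MONTH} - MS_{Error}) / 4 = (58.7836932 - 0.6041477) / 4 = 14.5448864, the same value as the REML estimate obtained with PROC MIXED.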

General Linear Models Procedure

Least Squares Means

DIET | GAIN LSMEAN | Std Err LSMEAN | Pr > \|T\| H0:LSMEAN=0 |
---|---|---|---|
1 | 51.1583333 | 0.2243783 | 0.0001 |
2 | 51.9750000 | 0.2243783 | 0.0001 |
3 | 53.2583333 | 0.2243783 | 0.0001 |
4 | 54.7833333 | 0.2243783 | 0.0001 |

Just as with PROC MIXED, the LSMean for Diet 1 is the average of the 12 fitted values for Diet 1, i.e.

( µ + diet_{1}+ month_{1}+ µ + diet_{1}+ month_{2}+ µ + diet_{1}+ month_{3}+ µ + diet_{1}+ month_{4}+ µ + diet_{1}+ month_{5}+ µ + diet_{1}+ month_{6}+ µ + diet_{1}+ month_{7}+ µ + diet_{1}+ month_{8}+ µ + diet_{1}+ month_{9}+ µ + diet_{1}+ month_{10}+ µ + diet_{1}+ month_{11}+ µ + diet_{1}+ month_{12}) / 12

In fact, because the design is balanced, we get the same estimate for
our LSMean for Diet 1 as we had obtained with PROC MIXED.
**BUT**, the standard error is quite different. The standard
error is computed by PROC GLM using only the residual variance (because in
a purely fixed effects analysis that is the only variance) in our
(*by now traditional*) formula:

standard error = sqrt( k'(X'X)^{-}k × σ_{e}^{2} ) = sqrt( 0.6041477 / 12 ) = 0.2244

Note that this gives the standard error, as computed by PROC GLM, of 0.224,
whereas the correct standard error, computed by PROC MIXED, is 1.12;
**5 times larger!**
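The two standard errors can be put side by side with a small Python sketch. It assumes, as in the balanced design above, that each LSMean is an average over 12 observations; the variable names are illustrative.

```python
import math

var_resid = 0.60414773    # residual variance
var_month = 14.54488636   # month variance (ignored by the fixed effects analysis)
n_per_diet = 12

# PROC GLM: residual variance only
se_glm = math.sqrt(var_resid / n_per_diet)

# PROC MIXED: residual variance plus month variance
se_mixed = math.sqrt((var_resid + var_month) / n_per_diet)

ratio = se_mixed / se_glm
print(round(se_glm, 7), round(se_mixed, 7), round(ratio, 2))
```

The ratio is simply sqrt(variance_{p} / variance_{e}), so the more variable the months are relative to the residual, the more the fixed effects analysis understates the standard error of an LSMean.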

General Linear Models Procedure

Dependent Variable: GAIN

Parameter | Estimate | T for H0: Parameter=0 | Pr > \|T\| | Std Error of Estimate |
---|---|---|---|---|
mean diet 1 | 51.1583333 | 228.00 | 0.0001 | 0.22437835 |
mean diet 2 | 51.9750000 | 231.64 | 0.0001 | 0.22437835 |
d1 - d2 | -0.8166667 | -2.57 | 0.0147 | 0.31731891 |
d1 - d3 | -2.1000000 | -6.62 | 0.0001 | 0.31731891 |
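Notice that the differences between diets have the same standard error (0.31731891) in both the PROC GLM and the PROC MIXED results, even though the standard errors of the means differ. A likely explanation, sketched below in Python, is that every diet is observed in every month, so the month effects cancel out of a difference between two diet means and only the residual variance remains. The formula 2 × residual variance / 12 is the usual one for a difference between two means in a balanced block design, and is taken here as an assumption since the notes do not state it.

```python
import math

var_resid = 0.60414773   # residual variance
n_per_diet = 12          # observations per diet (1 per month)

# Standard error of a difference between two diet means:
# month effects cancel, leaving only residual variation.
se_diff = math.sqrt(2.0 * var_resid / n_per_diet)

# t statistics for the two contrasts shown in the listings
t_d1_d2 = -0.81666667 / se_diff
t_d1_d3 = -2.10000000 / se_diff

print(round(se_diff, 8), round(t_d1_d2, 2), round(t_d1_d3, 2))
```

So in this design the choice between a fixed and a random Month effect changes the precision attached to the diet means, but not the comparisons among diets.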

R.I. Cue, last updated: 2010 April 27