Analysis of Covariance

I’ve decided to present the statistical model for the Analysis of Covariance design in regression analysis notation. The model shown here is for a case where there is a single covariate and a treated and a control group. We use a dummy variable, Zi, to represent the treatment group. The beta values (βs) are the parameters we are estimating. The value β0 represents the intercept. In this model, it is the predicted posttest value for a control group case with X=0, that is, the intercept of the control group regression line. Why? Because a control group case has Z=0, and since the Z variable is multiplied by β2, that whole term drops out. For a control group case with any other X value, the predicted posttest is β0 + β1X.

The data matrix that is entered into this analysis would consist of three columns and as many rows as you have participants: the posttest data, one column of 0s and 1s indicating whether the participant is in the control group (0) or the treatment group (1), and the covariate score.
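To make the layout concrete, here is a minimal sketch of such a data matrix in Python. The scores are invented purely for illustration, and the names `data`, `posttest`, `group` and `covariate` are my own choices rather than anything required by the analysis.

```python
# Hypothetical data matrix: one row per participant, three columns.
# Columns: posttest score, group dummy (0 = control, 1 = treatment), covariate score.
data = [
    (52.0, 0, 48.0),
    (56.0, 0, 51.0),
    (48.0, 0, 45.0),
    (61.0, 1, 50.0),
    (57.0, 1, 47.0),
    (65.0, 1, 53.0),
]

# Pull each column out into its own list.
posttest = [row[0] for row in data]
group = [row[1] for row in data]
covariate = [row[2] for row in data]

n_rows = len(data)
n_cols = len(data[0])
print(f"{n_rows} participants, {n_cols} columns")
```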

This model assumes that the data in the two groups are well described by straight lines that have the same slope. If this does not appear to be the case, you have to modify the model appropriately, for example by letting each group have its own slope.
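One informal way to look at the equal-slopes assumption is to fit a separate straight line to each group and compare the two slopes. The sketch below does this in Python with invented scores. The helper `simple_slope` is a hypothetical name of my own, and eyeballing two slopes is only a rough check, not a formal statistical test.

```python
def simple_slope(x, y):
    """Least-squares slope of y on x for a single group."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical covariate (x) and posttest (y) scores for each group.
control_x = [45.0, 48.0, 51.0]
control_y = [48.0, 52.0, 56.0]
treated_x = [47.0, 50.0, 53.0]
treated_y = [57.0, 61.0, 64.0]

control_slope = simple_slope(control_x, control_y)
treated_slope = simple_slope(treated_x, treated_y)

# Similar slopes are consistent with the model; very different slopes are a warning sign.
print(f"control slope: {control_slope:.3f}")
print(f"treated slope: {treated_slope:.3f}")
```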

$$ y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i $$


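where:

  • yi is the outcome (posttest) for the ith unit,
  • β0 is the intercept coefficient,
  • β1 is the coefficient (slope) for the covariate,
  • β2 is the treatment effect: the difference in predicted posttest between the treatment and control groups at the same value of X,
  • Xi is the covariate (for example, a pretest score),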
  • Zi is the dummy variable for treatment:
    • 0 for control,
    • 1 for treatment,
  • ei is the residual for the ith unit.
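As a rough illustration of how the three βs could be estimated, the sketch below fits the model by ordinary least squares using only the Python standard library. It assumes least squares is the estimation method; the function names `solve_linear_system` and `fit_ancova` and the data are hypothetical, invented for this example. In practice you would normally use a statistics package instead.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ancova(x, z, y):
    """Least-squares estimates [b0, b1, b2] for y = b0 + b1*x + b2*z + e."""
    rows = [[1.0, xi, zi] for xi, zi in zip(x, z)]  # design columns: 1, X, Z
    k = 3
    # Normal equations: (D'D) b = D'y, where D is the design matrix.
    dtd = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    dty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(dtd, dty)


# Hypothetical data: covariate X, treatment dummy Z, posttest y.
x = [48.0, 51.0, 45.0, 50.0, 47.0, 53.0]
z = [0, 0, 0, 1, 1, 1]
y = [52.0, 56.0, 48.0, 61.0, 57.0, 65.0]

b0, b1, b2 = fit_ancova(x, z, y)
# Residual e_i is the observed outcome minus the model's prediction.
residuals = [yi - (b0 + b1 * xi + b2 * zi) for xi, zi, yi in zip(x, z, y)]

print(f"b0 (intercept)        = {b0:.3f}")
print(f"b1 (covariate slope)  = {b1:.3f}")
print(f"b2 (treatment effect) = {b2:.3f}")
```

Here `b2` is the estimated vertical distance between the two parallel regression lines, which is how the model expresses the treatment effect.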