
# The general linear model: contrast coding and post hoc tests

## Overview

This tutorial is one of a series that accompanies Discovering Statistics Using IBM SPSS Statistics (Field 2017) by me, Andy Field. These tutorials contain abridged sections from the book (so there are some copyright considerations).

• Who is the tutorial aimed at?
• Students enrolled on my Discovering Statistics module at the University of Sussex, or anyone reading my textbook Discovering Statistics Using IBM SPSS Statistics (Field 2017)
• What is covered?
• This tutorial develops the material from the previous tutorial to look at using contrast coding for categorical predictors in the linear model using IBM SPSS Statistics. We will look at contrast coding (a.k.a. planned comparisons) and the linear model as applied to independent experimental designs (i.e., experiments in which different entities participate in different experimental conditions). We will also cover post hoc tests.
• This tutorial does not teach the background theory: it is assumed you have either attended my lecture or read the relevant chapter in my book (or someone else’s)
• The aim of this tutorial is to augment the theory that you already know by guiding you through fitting linear models using IBM SPSS Statistics and asking you questions to test your knowledge along the way.
• The main tutorial follows the example described in detail in Field (2017), so there’s a thorough account in there.
• You can access free lectures and screencasts on my YouTube channel
• There are more statistical resources on my website www.discoveringstatistics.com

## The puppy example continued

The main example in this tutorial is the same puppy therapy example as the previous tutorial. It is taken from Field (2017) and so all the background theory is in there (or track down the video of my lecture). In the previous tutorial I tried to motivate you with a photo of my dog, Ramsey. I don’t know whether it worked, but I’m going to ramp up the cute factor with a video because you can never have too many puppies.

To recap the example, a review of animal-assisted therapy in childhood mental health found that of 24 studies, 8 found positive effects, 10 showed mixed findings, and 6 concluded that there was no effect (Hoagwood et al. 2017). The example is based around an imagined study that, in light of these mixed findings, tested the efficacy of puppies in the therapeutic process. Participants were randomized into three groups: (1) a control group (treatment as usual, no treatment, or some kind of placebo); (2) 15 minutes of puppy therapy (a low-dose group); and (3) 30 minutes of puppy contact (a high-dose group). The dependent variable was a measure of happiness ranging from 0 (as unhappy as I can possibly imagine being) to 10 (as happy as I can possibly imagine being). Remember that we predicted that any form of puppy therapy should be better than the control (i.e. higher happiness scores) and that as exposure time increases happiness will increase too (a dose-response hypothesis).

In the lecture we discovered that we could operationalize these hypotheses as two dummy variables using contrast codes.

Table 1: Contrast codes for the puppy therapy data

| Group      | Dummy_1 | Dummy_2 |
|------------|---------|---------|
| Control    | -2      | 0       |
| 15 Minutes | 1       | -1      |
| 30 Minutes | 1       | 1       |
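
As a quick sanity check, a contrast set like this should satisfy two properties: each contrast's weights sum to zero, and (for orthogonal contrasts) the products of corresponding weights across the pair of contrasts also sum to zero. A minimal Python sketch of those checks (not part of SPSS; the variable names are mine):

```python
# Contrast codes from Table 1, one weight per group: control, 15 mins, 30 mins
contrast1 = [-2, 1, 1]   # control vs. any puppy therapy
contrast2 = [0, -1, 1]   # 15 minutes vs. 30 minutes

# Check 1: each contrast's weights sum to zero
assert sum(contrast1) == 0
assert sum(contrast2) == 0

# Check 2: orthogonality -- the cross-products sum to zero
assert sum(w1 * w2 for w1, w2 in zip(contrast1, contrast2)) == 0
print("Table 1 codes sum to zero and are orthogonal")
```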

In the previous tutorial we fitted this model using dummy variables that coded the three experimental groups in a way that compared each group to a control. In this tutorial we’ll look at how we fit the model using the coding scheme in Table 1, which tests our two hypotheses. The first contrast (Dummy_1 in Table 1) compares the control group to any group that had puppy therapy, whereas the second (Dummy_2 in Table 1) ignores the control group (by coding it with 0) and compares the 15-minute group to the 30-minute group. These contrasts, therefore, test these hypotheses:

1. Does having a dose of puppies (ignoring whether the dose was 15 or 30 minutes) lead to different levels of happiness than having no puppies at all?
2. Does having 15 minutes of puppies lead to different levels of happiness than having 30 minutes of puppies?

The data are in the file Puppies Contrast.sav and are shown in Figure 1.

The data editor has five variables/columns:

• Person: an ID number
• Dose: codes the group to which the individual belonged (I have coded 1 = control, 2 = 15 minutes and 3 = 30 minutes)
• Happiness: the person’s happiness score (0-10)
• dummy1: the first of two dummy variables that uses the contrast codes in Table 1. This variable has codes -2 = control, 1 = everything else. It represents the control group compared to all other groups.
• dummy2: the second of two dummy variables that uses the contrast codes in Table 1. This variable has codes 0 = control, -1 = 15 mins, 1 = 30 minutes. It represents the 15-minute group compared to the 30-minute group
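
If you wanted to create dummy1 and dummy2 yourself rather than typing them in, the recode is a simple lookup from the group code in Dose (in SPSS you would use something like Transform > Recode into Different Variables). A hedged Python sketch of the mapping, using names that mirror the data file:

```python
# Map the Dose codes (1 = control, 2 = 15 mins, 3 = 30 mins)
# onto the contrast codes in Table 1
dose_to_dummy1 = {1: -2, 2: 1, 3: 1}   # control vs. any puppy therapy
dose_to_dummy2 = {1: 0, 2: -1, 3: 1}   # 15 minutes vs. 30 minutes

# Example: a few participants' Dose codes (illustrative, not the real file)
dose = [1, 1, 2, 2, 3, 3]
dummy1 = [dose_to_dummy1[d] for d in dose]
dummy2 = [dose_to_dummy2[d] for d in dose]
print(dummy1)  # [-2, -2, 1, 1, 1, 1]
print(dummy2)  # [0, 0, -1, -1, 1, 1]
```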

## The model

### Contrast coding

We have coded the categorical predictor (Dose) using contrast coding. As we saw in the previous tutorial, SPSS has different menu structures depending on whether the goal of the linear model is to compare means or all of the predictors are continuous. This can create the false impression that the model you fit to compare means is different from the one you fit when predicting an outcome from continuous variables. As in the previous tutorial, we’ll try to debunk this myth.

### The model

As revision from the lecture/chapter, the model we’re fitting is:

$$\text{Happiness}_i = b_0 + b_1\text{Contrast 1}_i + b_2\text{Contrast 2}_i + \varepsilon_i$$

in which the variables Contrast 1 and Contrast 2 are the dummy variables named dummy1 and dummy2 in the data file. The first variable (Contrast 1) represents the control group compared to all other groups, whereas the second variable (Contrast 2) represents the difference between the 15-minute group and the 30-minute group.
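
With equal group sizes and orthogonal contrast codes, the three coefficients can be recovered directly from the three group means (reported later in this tutorial: control = 2.2, 15 minutes = 3.2, 30 minutes = 5.0). A NumPy sketch of that algebra, as an illustration rather than a reproduction of the SPSS output:

```python
import numpy as np

# One row per group: [intercept, Contrast 1, Contrast 2] using the Table 1 codes
X = np.array([[1, -2,  0],    # control
              [1,  1, -1],    # 15 minutes
              [1,  1,  1]])   # 30 minutes
group_means = np.array([2.2, 3.2, 5.0])

# Solve the three equations for the three coefficients
b0, b1, b2 = np.linalg.solve(X, group_means)
print(round(b0, 4))  # 3.4667 -- the grand mean of the group means
print(round(b1, 4))  # 0.6333 -- control vs. puppy therapy, per unit of the code
print(round(b2, 4))  # 0.9    -- half the 15- vs. 30-minute difference
```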

### Fitting the model using the Regression menu (optional)

Let’s first fit the model using the Regression menu. This isn’t what you’d normally do, but it is instructive because it shows that what happens ‘under the hood’ is the same whether you use the Regression menu or the One-Way ANOVA menu. The different menus are simply different ‘wrappers’ or ‘packaging’ for the same statistical procedure. This video shows how to fit the model using the Regression menu:

Quiz

## Fitting the model using SPSS Statistics

### Fitting the model

The following video shows how to fit the model using SPSS Statistics.

### Re-cap of the previous tutorial

Outputs 2 to 4 and Figure 3 are reproduced from the previous tutorial. Just to briefly recap, we can conclude from Output 3 that:

• There was a significant effect of puppy therapy on levels of happiness, F(2, 12) = 5.12, p = 0.025.

From Output 4 (which, I would argue, is the Output that we should look at because it corrects for deviations from the assumption of homogeneity of variance) we would conclude:

• Welch: There was not a significant effect of puppy therapy on levels of happiness, F(2, 7.94) = 4.32, p = 0.054.
• Brown-Forsythe: There was a significant effect of puppy therapy on levels of happiness, F(2, 11.57) = 5.12, p = 0.026.

## Contrast coding

Although the Welch F is technically not significant, because the Brown-Forsythe F is significant it’s worth knowing where the differences between the groups lie. The contrast codes that were explained in the lecture/book test two specific hypotheses about the group means. The first table displays the contrast coefficients that we entered into the Contrasts dialog box when we set up the model (see the video). It is worth checking this table to make sure that the contrasts compare what they are supposed to. We can see that the first contrast compares the control (-2) to the combined mean of the 15- and 30-minute groups (both coded as 1), and the second contrast ignores the control group (by coding with 0), and compares the 15-minute group (-1) to the 30-minute group (1). These are the comparisons we intended.

Quiz

The second table in Output 5 gives the statistics for each contrast in their raw form but also corrected for unequal variances. I recommend routinely looking at the corrected values (rather than, for example, using Levene’s test to decide which values to use in this table). The table tells us the value of the contrast itself, which is the weighted sum of the group means. This value is obtained by taking each group mean, multiplying it by the weight for the contrast of interest, and then adding these values together. For contrast 1, we can say that taking puppy therapy significantly increased happiness compared to the control group (p = 0.031), but contrast 2 tells us that 30 minutes of puppy therapy did not significantly affect happiness compared to 15 minutes (p = 0.086). Contrast 2 is almost significant, which again demonstrates how the NHST process can lead you into all-or-nothing thinking.
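
To make the ‘weighted sum of the group means’ concrete, here is that calculation for both contrasts using the group means from this example (control = 2.2, 15 minutes = 3.2, 30 minutes = 5.0). A hedged Python sketch of the arithmetic, not SPSS output:

```python
means = {"control": 2.2, "15 mins": 3.2, "30 mins": 5.0}

# Weights from Table 1
contrast1 = {"control": -2, "15 mins": 1, "30 mins": 1}
contrast2 = {"control": 0, "15 mins": -1, "30 mins": 1}

# The value of each contrast is the weighted sum of the group means
value1 = sum(contrast1[g] * means[g] for g in means)
value2 = sum(contrast2[g] * means[g] for g in means)
print(round(value1, 2))  # 3.8 -- (-2 x 2.2) + 3.2 + 5.0
print(round(value2, 2))  # 1.8 -- 5.0 - 3.2
```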

## Post hoc tests

### Output

If we had no specific hypotheses about the effect of puppy therapy on happiness then we would have selected post hoc tests to compare all group means to each other (Output 6). If we look at Tukey’s test first we see that for each pair of groups the difference between group means is displayed, the standard error of that difference, the significance level of that difference and a 95% confidence interval. The first row of Output 6 compares the control group to the 15-minute group and reveals a non-significant difference (Sig. of 0.516 is greater than 0.05), and the second row compares the control group to the 30-minute group where there is a significant difference (Sig. of 0.021 is less than 0.05). It might seem odd that the contrast above showed that any dose of puppy therapy produced a significant increase in happiness, yet the post hoc tests indicate that a 15-minute dose does not. Can you explain the contradiction between the planned contrasts and post hoc tests?

### Solution

The first contrast compared the therapy groups to the control group. Specifically, it compares the average of the 15- and 30-minute puppy therapy groups ($$\frac{3.2 + 5.0}{2} = 4.1$$) to the mean of the control group (2.2). So, it assesses whether the difference between these values ($$4.1 − 2.2 = 1.9$$) is significant. The post hoc test that compares the 15-minute group to the control is testing something different: whether the mean of 3.2 (15-minute group) is different to 2.2 (the control group). The difference being tested in the post hoc test is 1, whereas the difference tested by the contrast is 1.9.
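
The arithmetic above can be laid out in a couple of lines. A hedged Python sketch using the group means from the solution (the variable names are mine):

```python
means = {"control": 2.2, "15 mins": 3.2, "30 mins": 5.0}

# What the contrast tests: the average of the two therapy groups vs. control
therapy_avg = (means["15 mins"] + means["30 mins"]) / 2
contrast_diff = therapy_avg - means["control"]

# What the post hoc test tests: the 15-minute group alone vs. control
posthoc_diff = means["15 mins"] - means["control"]

print(round(contrast_diff, 1))  # 1.9
print(round(posthoc_diff, 1))   # 1.0
```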

### Back to the output

The third and fourth rows of Output 6 compare the 15-minute group to both the control group and the 30-minute group. The test involving the 15-minute and 30-minute groups shows that these group means did not differ (because the p of 0.147 is greater than our alpha of 0.05). Rows 5 and 6 repeat comparisons already discussed. The second block of the table describes the Games–Howell test and a quick inspection reveals the same pattern of results: the only groups that differed significantly were the 30-minute and control groups. These results give us confidence in our conclusions from Tukey’s test because even if the population variances are not equal (which seems unlikely given that the sample variances are very similar), the profile of results holds true.

## Unguided example

### There goes my hero … watch him as he goes (to hospital)

We’ll look at the same example as the last tutorial (a smart Alex task from Field (2017)). To recap, we looked at a study showing children reporting to hospital with severe injuries because of trying ‘to initiate flight without having planned for landing strategies’ (Davies et al. 2007) and imagined a study looking at whether the type of superhero costume affected the severity of injury. We used a data file with variables representing the severity of injury (on a scale from 0, no injury, to 100, death) for children reporting to the accident and emergency department at hospitals, and information on which superhero costume they were wearing (hero): Superman (= 1), Spiderman (= 2), the Hulk (= 3) or a Teenage Mutant Ninja Turtle (= 4). In the previous tutorial we fitted a model to test the (overall) hypothesis that different costumes give rise to more severe injuries. In this tutorial we will fit a model to test this specific hypothesis:

• Costumes of ‘flying’ superheroes (i.e., the ones that travel through the air: Superman and Spiderman) will lead to more severe injuries than non-flying ones (the Hulk and Ninja Turtles).

In the last tutorial I got you into the mood for Hulk-related data analysis with a photo of my wife and me on the Hulk roller-coaster in Florida on our honeymoon. In this tutorial I want to ramp up the weird factor with a photo of some complete strangers reading an earlier edition of my SPSS textbook on the Hulk roller-coaster, because nothing says ‘I love your textbook’ like taking it on a stomach-churning high speed ride. I dearly wish that reading my books on roller coasters would become a ‘thing’.

There are detailed answers to this task on the companion website of my SPSS textbook.

### Solution

In the lecture/book, we learned that we need to follow rules to generate contrasts:

1. Choose sensible contrasts. Remember that you want to compare only two chunks of variation and that if a group is singled out in one contrast, that group should be excluded from any subsequent contrasts.
2. Groups coded with positive weights will be compared against groups coded with negative weights. So, assign one chunk of variation positive weights and the opposite chunk negative weights.
3. If you add up the weights for a given contrast the result should be zero.
4. If a group is not involved in a contrast, automatically assign it a weight of zero, which will eliminate it from the contrast.
5. For a given contrast, the weights assigned to the group(s) in one chunk of variation should be equal to the number of groups in the opposite chunk of variation.

Figure 4 shows how we would apply Rule 1 to the Superhero example. We’re told that we want to compare flying superheroes (i.e. Superman and Spiderman) against non-flying ones (the Hulk and Ninja Turtles) in the first instance. That will be contrast 1. However, because each of these chunks is made up of two groups (e.g., the flying superheroes chunk comprises both children wearing Spiderman and those wearing Superman costumes), we need a second and third contrast that breaks each of these chunks down into their constituent parts.

To get the weights (Table 2), we apply rules 2 to 5. Contrast 1 compares flying (Superman, Spiderman) to non-flying (Hulk, Turtle) superheroes. Each chunk contains two groups, so the weights for the opposite chunks are both 2. We assign one chunk positive weights and the other negative weights (in Table 2 I’ve chosen the flying superheroes to have positive weights, but you could do it the other way around). Contrast 2 then compares the two flying superheroes to each other. First we assign both non-flying superheroes a 0 weight to remove them from the contrast. We’re left with two chunks: one containing the Superman group and the other containing the Spiderman group. Each chunk contains one group, so the weights for the opposite chunks are both 1. We assign one chunk positive weights and the other negative weights (in Table 2 I’ve chosen to give Superman the positive weight, but you could do it the other way around).

Finally, Contrast 3 compares the two non-flying superheroes to each other. First we assign both flying superheroes a 0 weight to remove them from the contrast. We’re left with two chunks: one containing the Hulk group and the other containing the Turtle group. Each chunk contains one group, so the weights for the opposite chunks are both 1. We assign one chunk positive weights and the other negative weights (in Table 2 I’ve chosen to give the Hulk the positive weight, but you could do it the other way around).

Table 2: Contrast codes for the superhero data

| Group        | Contrast_1 | Contrast_2 | Contrast_3 |
|--------------|------------|------------|------------|
| Superman     | 2          | 1          | 0          |
| Spiderman    | 2          | -1         | 0          |
| Hulk         | -2         | 0          | 1          |
| Ninja Turtle | -2         | 0          | -1         |
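
The same sanity checks used earlier apply here: every contrast’s weights sum to zero (rule 3), and the three contrasts are mutually orthogonal, so each tests an independent chunk of variation. A quick Python sketch (the contrast names are mine):

```python
from itertools import combinations

# Contrast codes from Table 2; weights are ordered
# Superman, Spiderman, Hulk, Ninja Turtle
contrasts = {
    "flying vs non-flying": [2, 2, -2, -2],
    "Superman vs Spiderman": [1, -1, 0, 0],
    "Hulk vs Turtle": [0, 0, 1, -1],
}

# Rule 3: each contrast's weights sum to zero
for name, weights in contrasts.items():
    assert sum(weights) == 0, name

# Orthogonality: cross-products sum to zero for every pair of contrasts
for (n1, w1), (n2, w2) in combinations(contrasts.items(), 2):
    assert sum(a * b for a, b in zip(w1, w2)) == 0, (n1, n2)
print("Table 2 codes satisfy the rules")
```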

### The model

To recap, in the previous tutorial we saw from the Brown-Forsythe and Welch F-statistics (Output 7) that there was a significant effect of the costume worn on the severity of injuries, because the Sig. values are below the conventional 0.05 threshold. Now fit the same model, but using the contrast codes that we have just derived.

Quiz

Now fit the same model but requesting post hoc tests (Bonferroni and Games-Howell).

Quiz

## Next tutorial

The next tutorial will follow up the puppy therapy example to look at what to do if you want to adjust an effect by another (continuous) variable.

## References

Davies, P., J. Surridge, L. Hole, and L. Munro-Davies. 2007. “Superhero-Related Injuries in Paediatrics: A Case Series.” Archives of Disease in Childhood 92 (3): 242–43. doi:10.1136/adc.2006.109793.

Field, Andy P. 2017. Discovering Statistics Using IBM SPSS Statistics: And Sex and Drugs and Rock ’N’ Roll. 5th ed. London: Sage.

Hoagwood, K. E., M. Acri, M. Morrissey, and R. Peth-Pierce. 2017. “Animal-Assisted Therapies for Youth with or at Risk for Mental Health Problems: A Systematic Review.” Applied Developmental Science 21 (1): 1–13. doi:10.1080/10888691.2015.1134267.