
# The general linear model: contrast coding and post hoc tests

## Overview

This tutorial is one of a series that accompanies Discovering Statistics Using IBM SPSS Statistics (Field 2017) by me, Andy Field. These tutorials contain abridged sections from the book (so there are some copyright considerations).

• Who is the tutorial aimed at?
• Students enrolled on my Discovering Statistics module at the University of Sussex, or anyone reading my textbook Discovering Statistics Using IBM SPSS Statistics (Field 2017)
• What is covered?
• This tutorial develops the material from the previous tutorial to look at using contrast coding for categorical predictors in the linear model using IBM SPSS Statistics. We will look at contrast coding (a.k.a. planned comparisons) and the linear model as applied to independent experimental designs (i.e., experiments in which different entities participate in different experimental conditions). We will also cover post hoc tests.
• This tutorial does not teach the background theory: it is assumed you have either attended my lecture or read the relevant chapter in my book (or someone else’s)
• The aim of this tutorial is to augment the theory that you already know by guiding you through fitting linear models using IBM SPSS Statistics and asking you questions to test your knowledge along the way.
• The main tutorial follows the example described in detail in Field (2017), so there’s a thorough account in there.
• You can access free lectures and screencasts on my YouTube channel
• There are more statistical resources on my website www.discoveringstatistics.com

## The puppy example continued

The main example in this tutorial is the same puppy therapy example as the previous tutorial. It is taken from Field (2017) and so all the background theory is in there (or track down the video of my lecture). In the previous tutorial I tried to motivate you with a photo of my dog, Ramsey. I don’t know whether it worked, but I’m going to ramp up the cute factor with a video because you can never have too many puppies.

To recap the example, a review of animal-assisted therapy in childhood mental health found that of 24 studies, 8 found positive effects, 10 showed mixed findings, and 6 concluded that there was no effect (Hoagwood et al. 2017). The example is based around an imagined study that, in light of these mixed findings, tested the efficacy of puppies in the therapeutic process. Participants were randomized into three groups: (1) a control group (treatment as usual, no treatment, or some kind of placebo); (2) 15 minutes of puppy therapy (a low-dose group); and (3) 30 minutes of puppy contact (a high-dose group). The dependent variable was a measure of happiness ranging from 0 (as unhappy as I can possibly imagine being) to 10 (as happy as I can possibly imagine being). Remember that we predicted that any form of puppy therapy should be better than the control (i.e. higher happiness scores) and that as exposure time increases happiness will increase too (a dose-response hypothesis).

In the lecture we discovered that we could operationalize these hypotheses as two dummy variables using contrast codes.

Table 1: Contrast codes for the puppy therapy data

| Group      | Dummy_1 | Dummy_2 |
|------------|---------|---------|
| Control    | -2      | 0       |
| 15 Minutes | 1       | -1      |
| 30 Minutes | 1       | 1       |
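
As a quick sanity check, a contrast set like this should satisfy two properties: each contrast's weights sum to zero, and (for orthogonal contrasts) the products of corresponding weights across the pair of contrasts also sum to zero. A minimal Python sketch of those checks (not part of SPSS; the variable names are mine):

```python
# Contrast codes from Table 1, one weight per group: control, 15 mins, 30 mins
contrast1 = [-2, 1, 1]   # control vs. any puppy therapy
contrast2 = [0, -1, 1]   # 15 minutes vs. 30 minutes

# Check 1: each contrast's weights sum to zero
assert sum(contrast1) == 0
assert sum(contrast2) == 0

# Check 2: orthogonality -- the cross-products sum to zero
assert sum(w1 * w2 for w1, w2 in zip(contrast1, contrast2)) == 0
print("Table 1 codes sum to zero and are orthogonal")
```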

In the previous tutorial we fitted this model using dummy variables that coded the three experimental groups in a way that compared each group to a control. In this tutorial we’ll look at how we fit the model using the coding scheme in Table 1, which tests our two hypotheses. The first contrast (Dummy_1 in Table 1) compares the control group to any group that had puppy therapy, whereas the second (Dummy_2 in Table 1) ignores the control group (by coding it with 0) and compares the 15-minute group to the 30-minute group. These contrasts, therefore, test these hypotheses:

1. Does having a dose of puppies (ignoring whether the dose was 15 or 30 minutes) lead to different levels of happiness than having no puppies at all?
2. Does having 15 minutes of puppies lead to different levels of happiness than having 30 minutes of puppies?

The data are in the file Puppies Contrast.sav and are shown in Figure 1.

The data editor has five variables/columns:

• Person: an ID number
• Dose: codes the group to which the individual belonged (I have coded 1 = control, 2 = 15 minutes and 3 = 30 minutes)
• Happiness: the person’s happiness score (0-10)
• dummy1: the first of two dummy variables that uses the contrast codes in Table 1. This variable has codes -2 = control, 1 = everything else. It represents the control group compared to all other groups.
• dummy2: the second of two dummy variables that uses the contrast codes in Table 1. This variable has codes 0 = control, -1 = 15 mins, 1 = 30 minutes. It represents the 15-minute group compared to the 30-minute group
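
If you wanted to create dummy1 and dummy2 yourself rather than typing them in, the recode is a simple lookup from the group code in Dose (in SPSS you would use something like Transform > Recode into Different Variables). A hedged Python sketch of the mapping, using names that mirror the data file:

```python
# Map the Dose codes (1 = control, 2 = 15 mins, 3 = 30 mins)
# onto the contrast codes in Table 1
dose_to_dummy1 = {1: -2, 2: 1, 3: 1}   # control vs. any puppy therapy
dose_to_dummy2 = {1: 0, 2: -1, 3: 1}   # 15 minutes vs. 30 minutes

# Example: a few participants' Dose codes (illustrative, not the real file)
dose = [1, 1, 2, 2, 3, 3]
dummy1 = [dose_to_dummy1[d] for d in dose]
dummy2 = [dose_to_dummy2[d] for d in dose]
print(dummy1)  # [-2, -2, 1, 1, 1, 1]
print(dummy2)  # [0, 0, -1, -1, 1, 1]
```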

## The model

### Contrast coding

We have coded the categorical predictor (Dose) using contrast coding. As we saw in the previous tutorial, SPSS has different menu structures depending on whether the goal of the linear model is to compare means or all of the predictors are continuous. This can create the false impression that the model you fit to compare means is different from the one you fit when predicting an outcome from continuous variables. As in the previous tutorial, we’ll try to debunk this myth.

### The model

As revision from the lecture/chapter, the model we’re fitting is:

$$\text{Happiness}_i = b_0 + b_1\text{Contrast 1}_i + b_2\text{Contrast 2}_i + \varepsilon_i$$

in which the variables Contrast 1 and Contrast 2 are the dummy variables named dummy1 and dummy2 in the data file. The first variable (Contrast 1) represents the control group compared to all other groups, whereas the second variable (Contrast 2) represents the difference between the 15-minute group and the 30-minute group.
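
With equal group sizes and orthogonal contrast codes, the three coefficients can be recovered directly from the three group means (reported later in this tutorial: control = 2.2, 15 minutes = 3.2, 30 minutes = 5.0). A NumPy sketch of that algebra, as an illustration rather than a reproduction of the SPSS output:

```python
import numpy as np

# One row per group: [intercept, Contrast 1, Contrast 2] using the Table 1 codes
X = np.array([[1, -2,  0],    # control
              [1,  1, -1],    # 15 minutes
              [1,  1,  1]])   # 30 minutes
group_means = np.array([2.2, 3.2, 5.0])

# Solve the three equations for the three coefficients
b0, b1, b2 = np.linalg.solve(X, group_means)
print(round(b0, 4))  # 3.4667 -- the grand mean of the group means
print(round(b1, 4))  # 0.6333 -- control vs. puppy therapy, per unit of the code
print(round(b2, 4))  # 0.9    -- half the 15- vs. 30-minute difference
```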

### Fitting the model using the Regression menu (optional)

Let’s first fit the model using the Regression menu. This isn’t what you’d normally do, but it is instructive because it shows that what happens ‘under the hood’ is the same whether you use the Regression menu or the One-Way ANOVA menu. The different menus are simply different ‘wrappers’ or ‘packaging’ for the same statistical procedure. This video shows how to fit the model using the Regression menu:

Quiz

## Fitting the model using SPSS Statistics

### Fitting the model

The following video shows how to fit the model using SPSS Statistics.

### Re-cap of the previous tutorial

Outputs 2 to 4 and Figure 3 are reproduced from the previous tutorial. Just to briefly recap, we can conclude from Output 3 that:

• There was a significant effect of puppy therapy on levels of happiness, F(2, 12) = 5.12, p = 0.025.

From Output 4 (which, I would argue, is the Output that we should look at because it corrects for deviations from the assumption of homogeneity of variance) we would conclude:

• Welch: There was not a significant effect of puppy therapy on levels of happiness, F(2, 7.94) = 4.32, p = 0.054.
• Brown-Forsythe: There was a significant effect of puppy therapy on levels of happiness, F(2, 11.57) = 5.12, p = 0.026.

## Contrast coding

Although the Welch F is technically not significant, because the Brown-Forsythe F is significant it’s worth knowing where the differences between the groups lie. The contrast codes that were explained in the lecture/book test two specific hypotheses about the group means. The first table displays the contrast coefficients that we entered into the Contrasts dialog box when we set up the model (see the video). It is worth checking this table to make sure that the contrasts compare what they are supposed to. We can see that the first contrast compares the control (-2) to the combined mean of the 15- and 30-minute groups (both coded as 1), and the second contrast ignores the control group (by coding with 0), and compares the 15-minute group (-1) to the 30-minute group (1). These are the comparisons we intended.

Quiz

The second table in Output 5 gives the statistics for each contrast in their raw form but also corrected for unequal variances. I recommend routinely looking at the corrected values (rather than, for example, using Levene’s test to decide which values to use in this table). The table tells us the value of the contrast itself, which is the weighted sum of the group means. This value is obtained by taking each group mean, multiplying it by the weight for the contrast of interest, and then adding these values together. For contrast 1, we can say that taking puppy therapy significantly increased happiness compared to the control group (p = 0.031), but contrast 2 tells us that 30 minutes of puppy therapy did not significantly affect happiness compared to 15 minutes (p = 0.086). Contrast 2 is almost significant, which again demonstrates how the NHST process can lead you into all-or-nothing thinking.
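
To make the ‘weighted sum of the group means’ concrete, here is that calculation for both contrasts using the group means from this example (control = 2.2, 15 minutes = 3.2, 30 minutes = 5.0). A hedged Python sketch of the arithmetic, not SPSS output:

```python
means = {"control": 2.2, "15 mins": 3.2, "30 mins": 5.0}

# Weights from Table 1
contrast1 = {"control": -2, "15 mins": 1, "30 mins": 1}
contrast2 = {"control": 0, "15 mins": -1, "30 mins": 1}

# The value of each contrast is the weighted sum of the group means
value1 = sum(contrast1[g] * means[g] for g in means)
value2 = sum(contrast2[g] * means[g] for g in means)
print(round(value1, 2))  # 3.8 -- (-2 x 2.2) + 3.2 + 5.0
print(round(value2, 2))  # 1.8 -- 5.0 - 3.2
```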

## Post hoc tests

### Output

If we had no specific hypotheses about the effect of puppy therapy on happiness then we would have selected post hoc tests to compare all group means to each other (Output 6). If we look at Tukey’s test first we see that for each pair of groups the difference between group means is displayed, the standard error of that difference, the significance level of that difference and a 95% confidence interval. The first row of Output 6 compares the control group to the 15-minute group and reveals a non-significant difference (Sig. of 0.516 is greater than 0.05), and the second row compares the control group to the 30-minute group where there is a significant difference (Sig. of 0.021 is less than 0.05). It might seem odd that the contrast above showed that any dose of puppy therapy produced a significant increase in happiness, yet the post hoc tests indicate that a 15-minute dose does not. Can you explain the contradiction between the planned contrasts and post hoc tests?

### Solution

The first contrast compared the therapy groups to the control group. Specifically, it compares the average of the 15- and 30-minute puppy therapy groups ($$\frac{3.2 + 5.0}{2} = 4.1$$) to the mean of the control group (2.2). So, it assesses whether the difference between these values ($$4.1 − 2.2 = 1.9$$) is significant. The post hoc test that compares the 15-minute group to the control is testing something different: whether the mean of 3.2 (15-minute group) is different to 2.2 (the control group). The difference being tested in the post hoc test is 1, whereas the difference tested by the contrast is 1.9.
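
The arithmetic above can be laid out in a couple of lines. A hedged Python sketch using the group means from the solution (the variable names are mine):

```python
means = {"control": 2.2, "15 mins": 3.2, "30 mins": 5.0}

# What the contrast tests: the average of the two therapy groups vs. control
therapy_avg = (means["15 mins"] + means["30 mins"]) / 2
contrast_diff = therapy_avg - means["control"]

# What the post hoc test tests: the 15-minute group alone vs. control
posthoc_diff = means["15 mins"] - means["control"]

print(round(contrast_diff, 1))  # 1.9
print(round(posthoc_diff, 1))   # 1.0
```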

### Back to the output

The third and fourth rows of Output 6 compare the 15-minute group to both the control group and the 30-minute group. The test involving the 15-minute and 30-minute groups shows that these group means did not differ (because the p of 0.147 is greater than our alpha of 0.05). Rows 5 and 6 repeat comparisons already discussed. The second block of the table describes the Games–Howell test and a quick inspection reveals the same pattern of results: the only groups that differed significantly were the 30-minute and control groups. These results give us confidence in our conclusions from Tukey’s test because even if the population variances are not equal (which seems unlikely given that the sample variances are very similar), the profile of results holds true.

## Unguided example

### There goes my hero … watch him as he goes (to hospital)

We’ll look at the same example as the last tutorial (a smart Alex task from Field (2017)). To recap, we looked at a study showing children reporting to hospital with severe injuries because of trying ‘to initiate flight without having planned for landing strategies’ (Davies et al. 2007) and imagined a study looking at whether the type of superhero costume affected the severity of injury. We used a data file with variables representing the severity of injury (on a scale from 0, no injury, to 100, death) for children reporting to the accident and emergency department at hospitals, and information on which superhero costume they were wearing (hero): Superman (= 1), Spiderman (= 2), the Hulk (= 3) or a Teenage Mutant Ninja Turtle (= 4). In the previous tutorial we fitted a model to test the (overall) hypothesis that different costumes give rise to more severe injuries. In this tutorial we will fit a model to test this specific hypothesis:

• Costumes of ‘flying’ superheroes (i.e., the ones that travel through the air: Superman and Spiderman) will lead to more severe injuries than non-flying ones (the Hulk and Ninja Turtles).

In the last tutorial I got you into the mood for Hulk-related data analysis with a photo of my wife and me on the Hulk roller-coaster in Florida on our honeymoon. In this tutorial I want to ramp up the weird factor with a photo of some complete strangers reading an earlier edition of my SPSS textbook on the Hulk roller-coaster, because nothing says ‘I love your textbook’ like taking it on a stomach-churning high speed ride. I dearly wish that reading my books on roller coasters would become a ‘thing’.

There are detailed answers to this task on the companion website of my SPSS textbook.

### Solution

In the lecture/book, we learned that we need to follow rules to generate contrasts:

1. Choose sensible contrasts. Remember that you want to compare only two chunks of variation and that if a group is singled out in one contrast, that group should be excluded from any subsequent contrasts.
2. Groups coded with positive weights will be compared against groups coded with negative weights. So, assign one chunk of variation positive weights and the opposite chunk negative weights.
3. If you add up the weights for a given contrast the result should be zero.
4. If a group is not involved in a contrast, automatically assign it a weight of zero, which will eliminate it from the contrast.
5. For a given contrast, the weights assigned to the group(s) in one chunk of variation should be equal to the number of groups in the opposite chunk of variation.

Figure 4 shows how we would apply Rule 1 to the Superhero example. We’re told that we want to compare flying superheroes (i.e. Superman and Spiderman) against non-flying ones (the Hulk and Ninja Turtles) in the first instance. That will be contrast 1. However, because each of these chunks is made up of two groups (e.g., the flying superheroes chunk comprises both children wearing Spiderman and those wearing Superman costumes), we need a second and third contrast that breaks each of these chunks down into their constituent parts.

To get the weights (Table 2), we apply rules 2 to 5. Contrast 1 compares flying (Superman, Spiderman) to non-flying (Hulk, Turtle) superheroes. Each chunk contains two groups, so the weights for the opposite chunks are both 2. We assign one chunk positive weights and the other negative weights (in Table 2 I’ve chosen the flying superheroes to have positive weights, but you could do it the other way around). Contrast 2 then compares the two flying superheroes to each other. First we assign both non-flying superheroes a 0 weight to remove them from the contrast. We’re left with two chunks: one containing the Superman group and the other containing the Spiderman group. Each chunk contains one group, so the weights for the opposite chunks are both 1. We assign one chunk positive weights and the other negative weights (in Table 2 I’ve chosen to give Superman the positive weight, but you could do it the other way around).

Finally, Contrast 3 compares the two non-flying superheroes to each other. First we assign both flying superheroes a 0 weight to remove them from the contrast. We’re left with two chunks: one containing the Hulk group and the other containing the Turtle group. Each chunk contains one group, so the weights for the opposite chunks are both 1. We assign one chunk positive weights and the other negative weights (in Table 2 I’ve chosen to give the Hulk the positive weight, but you could do it the other way around).

Table 2: Contrast codes for the superhero data

| Group        | Contrast_1 | Contrast_2 | Contrast_3 |
|--------------|------------|------------|------------|
| Superman     | 2          | 1          | 0          |
| Spiderman    | 2          | -1         | 0          |
| Hulk         | -2         | 0          | 1          |
| Ninja Turtle | -2         | 0          | -1         |
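
The same sanity checks used earlier apply here: every contrast’s weights sum to zero (rule 3), and the three contrasts are mutually orthogonal, so each tests an independent chunk of variation. A quick Python sketch (the contrast names are mine):

```python
from itertools import combinations

# Contrast codes from Table 2; weights are ordered
# Superman, Spiderman, Hulk, Ninja Turtle
contrasts = {
    "flying vs non-flying": [2, 2, -2, -2],
    "Superman vs Spiderman": [1, -1, 0, 0],
    "Hulk vs Turtle": [0, 0, 1, -1],
}

# Rule 3: each contrast's weights sum to zero
for name, weights in contrasts.items():
    assert sum(weights) == 0, name

# Orthogonality: cross-products sum to zero for every pair of contrasts
for (n1, w1), (n2, w2) in combinations(contrasts.items(), 2):
    assert sum(a * b for a, b in zip(w1, w2)) == 0, (n1, n2)
print("Table 2 codes satisfy the rules")
```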

### The model

To recap, in the previous tutorial we saw from the Brown-Forsythe and Welch F-statistics (Output 7) that there was a significant effect of the costume worn on the severity of injuries, because the Sig. values are below the conventional 0.05 threshold. Now fit the same model, but using the contrast codes that we have just derived.

Quiz

Now fit the same model but requesting post hoc tests (Bonferroni and Games-Howell).

Quiz

## Next tutorial

The next tutorial will follow up the puppy therapy example to look at what to do if you want to adjust an effect by another (continuous) variable.

## References

Davies, P., J. Surridge, L. Hole, and L. Munro-Davies. 2007. “Superhero-Related Injuries in Paediatrics: A Case Series.” Archives of Disease in Childhood 92 (3): 242–43. doi:10.1136/adc.2006.109793.

Field, Andy P. 2017. Discovering Statistics Using IBM SPSS Statistics: And Sex and Drugs and Rock ’N’ Roll. 5th ed. London: Sage.

Hoagwood, K. E., M. Acri, M. Morrissey, and R. Peth-Pierce. 2017. “Animal-Assisted Therapies for Youth with or at Risk for Mental Health Problems: A Systematic Review.” Applied Developmental Science 21 (1): 1–13. doi:10.1080/10888691.2015.1134267.