GLM 9: mixed designs

The general linear model: mixed designs

Overview

This tutorial is one of a series that accompanies Discovering Statistics Using IBM SPSS Statistics (Field 2017) by me, Andy Field. These tutorials contain abridged sections from the book (so there are some copyright considerations).1

  • Who is the tutorial aimed at?
    • Students enrolled on my Discovering Statistics module at the University of Sussex, or anyone reading my textbook Discovering Statistics Using IBM SPSS Statistics (Field 2017)
  • What is covered?
    • This tutorial develops the material from the previous tutorial to look at comparing means using IBM SPSS Statistics when the research design uses a mixture of repeated measures and independent measures. We also explore a three-way design.
    • This tutorial does not teach the background theory: it is assumed you have either attended my lecture or read the relevant chapter in my book (or someone else’s)
    • The aim of this tutorial is to augment the theory that you already know by guiding you through fitting linear models using IBM SPSS Statistics and asking you questions to test your knowledge along the way.
  • Want more information?
    • The main tutorial follows the example described in detail in Field (2017), so there’s a thorough account in there.
    • You can access free lectures and screencasts on my YouTube channel
    • There are more statistical resources on my website www.discoveringstatistics.com

Speed dating

This tutorial follows the example in Field (2017), which is about dating. A big discussion in magazines seems to be the factors that get you a relationship, for example, the relative importance of looks, personality, and dating strategies (whether you should ‘treat them mean to keep them keen’ and all that stuff). Scientists have looked at these issues too. For example, the top three most highly rated attributes of a partner in teenagers are reliability, honesty and kindness (Ha, Overbeek, and Engels 2010). Beyond that, in the same study boys tended to rate attractiveness slightly higher than girls, and girls tended to rate a sense of humour more highly than boys (although both are ranked in the top 10 by both sexes). With regard to dating strategies, Dai, Dong, and Jia (2014) suggest that if someone is committed to pursuing a relationship with a person who plays hard to get, they will find that person more desirable but less likeable.

Imagine a scientist designed a study to look at the interplay between looks, personality and dating strategies on evaluations of a date. She set up a speed-dating night with nine tables, at each of which sat a ‘date’. All the dates were stooges selected to vary in their attractiveness (attractive, average, and unattractive), their charisma (high charisma, average charisma, writes statistics books), and also the strategy they were told to employ during the conversation (normal or playing hard to get). The dates were trained before the study to act charismatically to varying degrees, and also how to act in a way that made them seem unobtainable (hard to get) or not. As such, across the nine dates/stooges there were three attractive people, one of whom acted charismatically, one who acted normally (average) and another who acted like a dullard, and likewise for the three average-looking dates and the three unattractive dates. Therefore, each participant attending a speed-dating night would be exposed to all combinations of attractiveness and charisma (these are repeated measures). (There was a set of nine male stooges and nine female stooges so that those attending could meet ‘dates’ of whichever sex interested them.) Upon arrival participants were randomly assigned a blue or red sticker. For the participants with the red sticker the stooges played hard to get (unobtainable) and for those with a blue sticker they acted normally. Over the course of a few nights 20 people attended, spent 5 minutes with each of the nine ‘dates’ and then rated how much they’d like to have a proper date with each one as a percentage (100% = ‘I’d pay a large sum of money for their phone number’, 0% = ‘I’d pay a large sum of money for a plane ticket to get me as far away from them as possible’). The data are in LooksOrPersonality.sav and are shown in Figure 1. The 20 rows show the different participants, and the columns represent the following:

  • Strategy: Whether the participant was assigned to the ‘hard to get’ or ‘normal’ condition (i.e., were they given a red or blue sticker)
  • att_high: the participant’s rating of the date who was highly attractive and highly charismatic
  • av_high: the participant’s rating of the date who was averagely attractive and highly charismatic
  • ug_high: the participant’s rating of the date who was ugly and highly charismatic
  • att_some: the participant’s rating of the date who was highly attractive and averagely charismatic
  • av_some: the participant’s rating of the date who was averagely attractive and averagely charismatic
  • ug_some: the participant’s rating of the date who was ugly and averagely charismatic
  • att_none: the participant’s rating of the date who was highly attractive and a dullard
  • av_none: the participant’s rating of the date who was averagely attractive and a dullard
  • ug_none: the participant’s rating of the date who was ugly and a dullard
Figure 1: LooksOrPersonality.sav
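
If it helps to see how the nine rating columns map onto the two repeated-measures factors, here is a minimal Python sketch. It is not part of the SPSS workflow: the helper names (parse_column, to_long) are hypothetical, and the example row uses a made-up rating of 50 throughout rather than anything from LooksOrPersonality.sav.

```python
# Codes used in the column names, mapped to the factor levels they stand for.
LOOKS = {"att": "attractive", "av": "average", "ug": "unattractive"}
CHARISMA = {"high": "high charisma", "some": "some charisma", "none": "dullard"}

# Rebuild the nine column names in the order they appear in the data file.
columns = [f"{looks}_{charisma}" for charisma in CHARISMA for looks in LOOKS]


def parse_column(name):
    """Split a name such as 'att_high' into its (Looks, Charisma) levels."""
    looks_code, charisma_code = name.split("_")
    return LOOKS[looks_code], CHARISMA[charisma_code]


def to_long(row):
    """Turn one participant's wide row (a dict) into nine long-format records."""
    records = []
    for col in columns:
        looks, charisma = parse_column(col)
        records.append({"strategy": row["Strategy"], "looks": looks,
                        "charisma": charisma, "rating": row[col]})
    return records


# Hypothetical participant: every rating is a placeholder value of 50.
example_row = {"Strategy": "normal", **{col: 50 for col in columns}}
long_records = to_long(example_row)
print(long_records[0])
```

Each participant contributes nine records in the long layout, one per combination of Looks and Charisma, with Strategy repeated on every record because it is a between-group variable.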


Fitting the model

To fit the model we use the Analyze > General Linear Model > Repeated Measures … menu. The following video shows how.

Interpreting the output

Output 1


The F-statistics

Output 1 provided information about sphericity for each of the three repeated-measures effects in the model. If you have more enthusiasm for Mauchly’s test than I do you might have noted that all the values in the column labelled Sig. are above 0.05, indicating no significant departures from sphericity. However, in my book I advise correcting for sphericity by default, so that’s what I have done in Output 2, which shows only the Greenhouse-Geisser corrected degrees of freedom and associated p-values. (The version of this table that you will see will look a lot more hideous, but I have hidden the values that I don’t want to see - the chapter in the book shows how to do this.)

Working down from the top of the table, we find significant effects (the value in the column Sig. is less than 0.05) of Looks, the Looks × Strategy interaction, Charisma, the Charisma × Strategy interaction, the Looks × Charisma interaction and the Looks × Charisma × Strategy interaction. Everything, basically. You wouldn’t normally be interested in main effects when there are significant interactions, but for completeness we’ll interpret each effect in turn, starting with the main effect of Strategy.
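
To see where the fractional degrees of freedom in the corrected F-statistics come from, here is a rough Python sketch of the Greenhouse-Geisser adjustment, which multiplies both the effect and error degrees of freedom by the sphericity estimate (epsilon). One loud assumption: the epsilon values are not read from the SPSS output; I have backed them out from the corrected error degrees of freedom that are reported for the interactions later in this tutorial, so this only checks that the reported values are internally consistent.

```python
def gg_corrected_df(epsilon, df_effect, df_error):
    """Greenhouse-Geisser correction: scale both df by the sphericity estimate."""
    return epsilon * df_effect, epsilon * df_error


n_participants, n_groups, n_levels = 20, 2, 3

# Uncorrected df for an effect involving one three-level repeated factor.
df_one_factor = (n_levels - 1, (n_levels - 1) * (n_participants - n_groups))  # (2, 36)
# Uncorrected df for an effect involving both repeated factors.
df_two_factors = ((n_levels - 1) ** 2,
                  (n_levels - 1) ** 2 * (n_participants - n_groups))  # (4, 72)

# ASSUMPTION: epsilon is back-calculated from the corrected error df reported
# in this tutorial (34.62, 33.62 and 57.55), not taken from Mauchly's table.
reported_error_df = {
    "Looks x Strategy": (34.62, df_one_factor),
    "Charisma x Strategy": (33.62, df_one_factor),
    "Looks x Charisma": (57.55, df_two_factors),
}

corrected = {}
for effect, (error_df, (df_effect, df_error)) in reported_error_df.items():
    epsilon = error_df / df_error
    corrected[effect] = gg_corrected_df(epsilon, df_effect, df_error)
    print(effect, round(epsilon, 3), [round(df, 2) for df in corrected[effect]])
```

The closer epsilon is to 1, the smaller the correction, which is why the Looks × Strategy degrees of freedom barely move from 2 and 36.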

Output 2


The main effect of strategy

Output 3


The main effect of looks

Output 4


Output 4 shows the Estimated Marginal Means. The levels of Looks are labelled as 1, 2 and 3, and it’s down to you to remember how you entered the variables. If you assigned variables as I did then level 1 is attractive, level 2 is unattractive and level 3 is average. From this table you can see that as attractiveness falls, the mean rating falls too. Contrasts will help us to understand exactly what’s going on.

Output 5 shows the contrasts that we requested. Look at the row labelled Looks. We have a contrast comparing level 1 to level 3, and then comparing level 2 to level 3; because of the order in which we entered the variables, these contrasts represent attractive compared to average (level 1 vs. level 3) and unattractive compared to average (level 2 vs. level 3). The values of F for each contrast, and their related significance values tell us that the main effect of Looks represented the fact that attractive dates were rated significantly higher than average dates, F(1, 18) = 226.99, p < 0.001, and average dates were rated significantly higher than unattractive ones, F(1, 18) = 160.07, p < 0.001.
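
The ‘each level against the last level’ logic can be written down as contrast weights. The sketch below is only an illustration of that idea under the level ordering described above; the function name simple_contrasts is hypothetical and the weights are one conventional way of writing these comparisons, not something copied from SPSS.

```python
def simple_contrasts(levels):
    """Compare every level except the last with the last (reference) level."""
    reference = levels[-1]
    contrasts = []
    for index, level in enumerate(levels[:-1]):
        weights = [0] * len(levels)
        weights[index] = 1   # the level being compared
        weights[-1] = -1     # the reference category
        contrasts.append((f"{level} vs. {reference}", weights))
    return contrasts


# Order in which the levels of Looks were entered (levels 1, 2 and 3).
looks_levels = ["attractive", "unattractive", "average"]

looks_contrasts = simple_contrasts(looks_levels)
for label, weights in looks_contrasts:
    print(label, weights)
```

Changing the order in which the variables are entered changes which category ends up as the reference, which is why it matters that you remember how you assigned them.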

Output 5


The main effect of charisma

Output 6


Output 6 shows the estimated marginal means in which levels of Charisma are labelled as 1, 2 and 3. If you followed what I did in the video then level 1 is high charisma, level 2 is dullard and level 3 is some charisma. As charisma declines, the mean rating of the date falls too.

Because of the order that we entered variables, the row labelled Charisma in Output 5 tells us that the main effect of Charisma is that highly charismatic dates were rated significantly higher than dates with some charisma, F(1, 18) = 109.94, p < 0.001, and dates with some charisma were rated significantly higher than dullards, F(1, 18) = 227.94, p < 0.001.

The interaction between strategy and looks

Strategy significantly interacted with the attractiveness of the date, F(1.92, 34.62) = 80.43, p < 0.001 (Output 2). This effect tells us that the profile of ratings across dates of different attractiveness was different depending on whether or not they played hard to get. The means in Output 7 are plotted in the interaction graph.

Output 7


The first contrast for the interaction term (Output 5) looks at level 1 of Looks (attractive) compared to level 3 (average), comparing playing hard to get to normal. This contrast is highly significant, F(1, 18) = 43.26, p < 0.001, suggesting that the increased interest in attractive dates compared to average-looking dates found when dates played hard to get is significantly more than when they acted normally. The second contrast, which compares playing hard to get to normal at level 2 of looks (unattractive) relative to level 3 (average) is also highly significant, F(1, 18) = 30.23, p < 0.001. This contrast tells us that the decreased interest in unattractive dates compared to average-looking dates found when dates played hard to get is significantly more than when they did not.
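
These interaction contrasts are ‘differences of differences’, which a small Python sketch can make concrete. Please note that the cell means here are invented purely for illustration (they are NOT the values in Output 7), and the function names are hypothetical.

```python
# Hypothetical cell means for illustration only; NOT the values in Output 7.
means = {
    "hard to get": {"attractive": 90.0, "average": 70.0, "unattractive": 40.0},
    "normal": {"attractive": 80.0, "average": 70.0, "unattractive": 60.0},
}


def difference_from_average(strategy, looks):
    """Mean rating at one level of Looks minus the mean for average-looking dates."""
    return means[strategy][looks] - means[strategy]["average"]


def interaction_contrast(looks):
    """Difference of differences: hard to get minus normal."""
    return (difference_from_average("hard to get", looks)
            - difference_from_average("normal", looks))


for looks in ("attractive", "unattractive"):
    print(looks, interaction_contrast(looks))
```

With these made-up numbers the boost for attractive dates is 10 points larger, and the penalty for unattractive dates 20 points larger, when dates play hard to get. The F-statistics in the contrast table test whether differences of this kind are reliably different from zero.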

The interaction between strategy and charisma

Strategy significantly interacted with how charismatic the date was, F(1.87, 33.62) = 62.45, p < 0.001. This effect means that the profile of ratings across dates of different levels of charisma was influenced by the dating strategy employed. The means in Output 8 are plotted in the interaction graph.

Output 8


We can break this interaction down using the contrasts in Output 5. The first one, which looks at level 1 of Charisma (high charisma) compared to level 3 (some charisma), for playing hard to get relative to normal, is highly significant, F(1, 18) = 27.20, p < 0.001. This result tells us that the increased interest in highly charismatic dates compared to averagely charismatic dates found when dates acted normally is significantly more than when they played hard to get. The second contrast looks at level 2 of Charisma (dullard) compared to level 3 (some charisma), for playing hard to get relative to normal. This contrast is highly significant, F(1, 18) = 33.69, p < 0.001, suggesting that the decreased interest in dull dates compared to averagely charismatic dates found is significantly less when dates play hard to get than when they act normally.

The interaction between looks and charisma

There was a significant Looks × Charisma interaction, F(3.20, 57.55) = 36.63, p < 0.001 (Output 2). This effect tells us that the profile of ratings across dates of different levels of charisma was different for attractive, average and unattractive dates. The means in Output 9 are plotted in the interaction graph.

Output 9


The contrasts in Output 5 help to pick apart this interaction. The first contrast for the Looks × Charisma interaction investigates level 1 of looks (attractive) compared to level 3 (average-looking), for level 1 of Charisma (high charisma) relative to level 3 (some charisma). This is like asking ‘is the difference between high charisma and some charisma the same for attractive people and average-looking people?’ The best way to understand this contrast is to focus on the relevant bit of the interaction graph. Interest (as indicated by high ratings) in attractive dates was the same regardless of whether they had high or some charisma; however, for average-looking dates, there was more interest when that person had high charisma rather than some. The contrast is highly significant, F(1, 18) = 21.94, p < 0.001, and tells us that as attractiveness is reduced there is a significantly greater decline in interest when charisma is average compared to when it is high.

The second contrast asks the question ‘is the difference between no charisma and some charisma the same for attractive people and average-looking people?’ It explores level 1 of Looks (attractive) compared to level 3 (average-looking), for level 2 of Charisma (dullard) relative to level 3 (some charisma). We can again focus on the relevant part of the interaction graph. This graph shows that interest in attractive dates was higher when they had some charisma (blue) than when they were a dullard (green); the same is also true for average-looking dates. The two lines are fairly parallel, which is reflected in the non-significant contrast, F(1, 18) = 4.09, p = 0.058. It seems that as the attractiveness of dates is reduced there is a decline in interest both when charisma is average and when the date is dull.

The third contrast investigates level 2 of Looks (unattractive) relative to level 3 (average-looking), comparing level 1 of Charisma (high charisma) to level 3 (some charisma). This contrast asks ‘is the difference between high charisma and some charisma the same for unattractive people and average-looking people?’ Interest in dating decreases from average-looking dates to unattractive ones for dates with both high and some charisma; however, this fall is significantly greater for dates with some charisma (the blue line is slightly steeper than the orange), F(1, 18) = 6.23, p = 0.022. As dates’ attractiveness is reduced there is a significantly greater decline in interest when dates have some charisma compared to when they have a lot.

The final contrast addresses the question ‘is the difference between no charisma and some charisma the same for unattractive people and average-looking people?’ It compares level 2 of Looks (unattractive) to level 3 (average-looking), in level 2 of Charisma (dullard) relative to level 3 (some charisma). For average-looking dates, ratings were higher when they had some charisma than when they were a dullard, but for unattractive dates the ratings were roughly the same regardless of the level of charisma. This contrast is highly significant, F(1, 18) = 88.60, p < 0.001.
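
The four contrasts just described come from crossing the two Looks comparisons with the two Charisma comparisons. Here is a hedged Python sketch of that crossing, expressed as weights on the rating columns; the dictionaries and the function interaction_weights are my own hypothetical names, and the weights are an illustration of the logic rather than SPSS’s internal coding.

```python
from itertools import product

# Codes used in the column names of the data file.
LOOKS_CODE = {"attractive": "att", "average": "av", "unattractive": "ug"}
CHARISMA_CODE = {"high": "high", "some": "some", "dullard": "none"}

# Each comparison is against the reference category (average looks, some charisma).
looks_contrasts = [
    {"attractive": 1, "average": -1},
    {"unattractive": 1, "average": -1},
]
charisma_contrasts = [
    {"high": 1, "some": -1},
    {"dullard": 1, "some": -1},
]


def interaction_weights(looks_contrast, charisma_contrast):
    """Multiply the weights of one Looks and one Charisma comparison cell by cell."""
    return {
        f"{LOOKS_CODE[looks]}_{CHARISMA_CODE[charisma]}": looks_w * charisma_w
        for (looks, looks_w), (charisma, charisma_w)
        in product(looks_contrast.items(), charisma_contrast.items())
    }


# Same order as the four contrasts in the text.
all_contrasts = [interaction_weights(looks_c, charisma_c)
                 for looks_c, charisma_c in product(looks_contrasts, charisma_contrasts)]

for number, weights in enumerate(all_contrasts, start=1):
    print(number, weights)
```

Every contrast involves exactly four of the nine cells, which is why it helps to focus on the relevant bit of the interaction graph when interpreting each one.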

The interaction between looks, charisma and strategy

The significant Looks × Charisma × Strategy interaction, F(3.20, 57.55) = 24.12, p < 0.001 (in Output 2), tells us that the Looks × Charisma interaction described above differed depending on whether dates played hard to get or acted normally. The means in Output 10 are plotted in the interaction graph.

Output 10


Again, we can use contrasts to further break this interaction down (Output 5). These contrasts are similar to those for the Looks × Charisma interaction, but they now also take into account the effect of dating strategy. The first contrast for the Looks × Charisma × Strategy interaction explores level 1 of Looks (attractive) relative to level 3 (average-looking), when level 1 of Charisma (high charisma) is compared to level 3 (some charisma), when dates played hard to get relative to when they didn’t, F(1, 18) = 0.93, p = 0.348. It seems that interest in dating (as indicated by high ratings) attractive dates was the same regardless of whether they had high or average charisma (the blue and orange dots are in a similar place). However, for average-looking dates, there was more interest when that person had high charisma rather than some charisma (the blue dot is lower than the orange dot). The non-significance of this contrast indicates that this pattern of results is very similar when dates played hard to get and when they didn’t.

The second contrast explores level 1 of Looks (attractive) relative to level 3 (average-looking), when level 2 of Charisma (dullard) is compared to level 3 (some charisma), when dates played hard to get relative to when they didn’t. The contrast is significant, F(1, 18) = 60.67, p < 0.001, which reflects the fact that the pattern of means is different when dates played hard to get compared to when they didn’t. First, if we look at average-looking dates, more interest was expressed when the date has some charisma than when they have none, and this is true whether or not dates played hard to get (the distance between the blue and green lines is about the same in the two dating strategy groups). So, the difference created by playing hard to get doesn’t appear to be here. Now look at attractive dates. When dates played hard to get the interest in the date is high regardless of their charisma (the lines meet). However, when dates acted normally interest in dating an attractive person is much lower if they are a dullard (the green dot is much lower than the blue). Another way to look at it is that for dates with some charisma, the reduction in interest as attractiveness goes down is about the same regardless of whether they played hard to get (the blue lines have the same slope). However, for dates who are dullards, the decrease in interest if these dates are average-looking rather than attractive is much more dramatic if they play hard to get (the green line is steeper in the hard to get group).

The third contrast was also significant, F(1, 18) = 11.70, p = 0.003. This contrast compares level 2 of Looks (unattractive) to level 3 (average-looking), in level 1 of Charisma (high charisma) relative to level 3 (some charisma), when dates played hard to get relative to when they didn’t. First, let’s look at when dates played hard to get. As attractiveness goes down, so does interest when the date has high charisma and when they have some charisma (the slopes of the orange and blue lines are similar). So, regardless of charisma, there is a similar reduction in interest as attractiveness declines. Now let’s look at when the dates acted normally. The picture is quite different: when charisma is high, there is no decline in interest as attractiveness falls (the orange line is flat); however, when charisma is ‘some’, interest is lower in an unattractive date than in an average-looking date (the blue line slopes down). Another way to look at it is that for dates with some charisma, the reduction in interest as attractiveness goes down is about the same regardless of whether dates play hard to get (the blue lines have similar slopes). However, for dates who have high charisma, the decrease in interest if these dates are unattractive rather than average-looking is much more dramatic when dates played hard to get than when they didn’t (the orange line is steeper for dates that played hard to get).

The final contrast was not significant, F(1, 18) = 1.33, p = 0.263. This contrast looks at the effect of Strategy when comparing level 2 of Looks (unattractive) to level 3 (average-looking), in level 2 of Charisma (dullard) relative to level 3 (some charisma). Interest in unattractive dates was the same regardless of whether they had some charisma or were a dullard (the blue and green dots are in the same place). Interest in average-looking dates was greatest when they had some charisma rather than when they were a dullard (the blue dot is higher than the green). Importantly, this pattern of results is very similar when dates played hard to get and when they didn’t.
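
One way to think about these three-way contrasts is that each participant gets a single Looks × Charisma contrast score, and the contrast asks whether the average score differs between the two strategy groups. The Python sketch below illustrates that idea for the first contrast only. The four participants and their ratings are entirely made up (they are NOT from LooksOrPersonality.sav), and the names are hypothetical.

```python
from statistics import mean

# Weights for (attractive vs. average) crossed with (high vs. some charisma).
WEIGHTS = {"att_high": 1, "att_some": -1, "av_high": -1, "av_some": 1}

# Made-up ratings for four hypothetical participants (only the columns needed).
participants = [
    {"Strategy": "hard to get", "att_high": 90, "att_some": 88, "av_high": 75, "av_some": 60},
    {"Strategy": "hard to get", "att_high": 85, "att_some": 85, "av_high": 70, "av_some": 58},
    {"Strategy": "normal", "att_high": 80, "att_some": 78, "av_high": 72, "av_some": 60},
    {"Strategy": "normal", "att_high": 82, "att_some": 80, "av_high": 74, "av_some": 59},
]


def contrast_score(person):
    """Weighted sum of one participant's ratings for this contrast."""
    return sum(weight * person[column] for column, weight in WEIGHTS.items())


def group_mean(strategy):
    """Average contrast score within one strategy group."""
    return mean(contrast_score(p) for p in participants if p["Strategy"] == strategy)


three_way_contrast = group_mean("hard to get") - group_mean("normal")
print(group_mean("hard to get"), group_mean("normal"), three_way_contrast)
```

In this toy example the two groups have very similar average scores, which is the kind of pattern a non-significant three-way contrast reflects; a large gap between the groups would correspond to a significant one.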

Unguided example

Let’s look at a second example from Field (2017). In the previous tutorial we looked at an example in which participants viewed videos of different drink products in the context of positive, negative or neutral imagery. Just to remind you:

As part of an initiative to stop binge drinking in teenagers, the government funded scientists to look at whether negative imagery could be used to make teenagers’ attitudes towards alcohol more negative. The scientists compared the effects of negative imagery against positive and neutral imagery for different types of drinks. Participants viewed a total of nine videos over three sessions. In one session, they saw three videos: (1) a brand of beer (Strange Brew) presented alongside negative imagery (a bunch of inanimate dead bodies in a trendy bar with the slogan ‘Strange Brew: who needs a liver?’); (2) a brand of wine (Liquid Fire) presented within positive imagery (a bunch of sexy hipster types in a trendy bar with the slogan ‘Liquid Fire: your life would be so much better if you were a sexy hipster type’); and (3) a brand of water (Backwater) presented with neutral imagery (some completely average people in a trendy bar accompanied by the slogan ‘Backwater: it will make no difference to your life one way or another’). In a second session (a week later), the participants saw the same three brands, but this time Strange Brew was accompanied by the positive imagery, Liquid Fire by the neutral image and Backwater by the negative. In a third session, the participants saw Strange Brew accompanied by the neutral image, Liquid Fire by the negative image and Backwater by the positive. After each advert participants rated the drinks from −100 (dislike very much) through 0 (neutral) to 100 (like very much). The order of adverts was randomized, as was the order in which people participated in the three sessions. This design is quite complex. There are two predictor/independent variables: the type of drink (beer, wine or water) and the type of imagery used (positive, negative or neutral). These two variables completely cross over, producing nine experimental conditions represented by 9 columns in the data editor.

The data are in the file MixedAttitude.sav, which contains the variables:

  • sex: whether the participant was male (1) or female (2)
  • beerpos: ratings of products when the product was beer and the advert contained positive imagery
  • beerneg: ratings of products when the product was beer and the advert contained negative imagery
  • beerneut: ratings of products when the product was beer and the advert contained neutral imagery
  • winepos: ratings of products when the product was wine and the advert contained positive imagery
  • wineneg: ratings of products when the product was wine and the advert contained negative imagery
  • wineneut: ratings of products when the product was wine and the advert contained neutral imagery
  • waterpos: ratings of products when the product was water and the advert contained positive imagery
  • waterneg: ratings of products when the product was water and the advert contained negative imagery
  • waterneut: ratings of products when the product was water and the advert contained neutral imagery

Men and women might respond differently to the products so repeat the analysis from the previous tutorial but take sex (a between-group variable) into account. To remind you of how we fit the model without including sex as a predictor you can re-watch the video:

Remember to also include sex as a between-group factor.
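
As a checklist for what to expect in the output, here is a short Python sketch that enumerates the effects in a three-way design like this one. The factor labels Drink and Imagery are my assumption about how you might name the repeated-measures factors; use whatever names you typed into SPSS.

```python
from itertools import combinations

# ASSUMPTION: factor labels are illustrative; Sex is the between-group factor.
factors = ["Drink", "Imagery", "Sex"]

# Every main effect and interaction: all non-empty combinations of the factors.
effects = [
    " × ".join(combo)
    for size in range(1, len(factors) + 1)
    for combo in combinations(factors, size)
]

for effect in effects:
    print(effect)
```

That gives three main effects, three two-way interactions and one three-way interaction, the same structure as the speed-dating example.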


Next tutorial

The next tutorial will look at using the general linear model to predict a categorical outcome variable.

References

Dai, X. C., P. Dong, and J. S. Jia. 2014. “When Does Playing Hard to Get Increase Romantic Attraction?” Journal of Experimental Psychology: General 143 (2): 521–26. doi:10.1037/a0032989.

Field, Andy P. 2017. Discovering Statistics Using IBM SPSS Statistics: And Sex and Drugs and Rock ’N’ Roll. 5th ed. London: Sage.

Ha, Thao, Geertjan Overbeek, and Rutger C. M. E. Engels. 2010. “Effects of Attractiveness and Social Status on Dating Desire in Heterosexual Adolescents: An Experimental Study.” Archives of Sexual Behavior 39 (5): 1063–71. doi:10.1007/s10508-009-9561-z.


  1. This tutorial is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License: basically, you can use it for teaching and non-profit activities but not meddle with it.

Andy Field