GLM 1: the linear model

The general linear model: introducing the general linear model (GLM)

Overview

This tutorial is one of a series that accompanies Discovering Statistics Using IBM SPSS Statistics (Field 2017) by me, Andy Field. These tutorials contain abridged sections from the book (so there are some copyright considerations).1

  • Who is the tutorial aimed at?
    • Students enrolled on my Discovering Statistics module at the University of Sussex, or anyone reading my textbook Discovering Statistics Using IBM SPSS Statistics (Field 2017)
  • What is covered?
    • This tutorial covers the very basics of fitting a linear model using IBM SPSS Statistics. It will look at fitting models with one predictor or many, and focusses mainly on interpreting the model fit statistics and parameter estimates of the model. Subsequent tutorials will build on this knowledge to look at linear models in more depth.
    • This tutorial does not teach the background theory: it is assumed you have either attended my lecture or read the relevant chapter in my book (or someone else’s).
    • The aim of this tutorial is to augment the theory that you already know by guiding you through fitting a linear model using IBM SPSS Statistics and asking you some questions to test your knowledge along the way.
  • Want more information?
    • The main tutorial follows the example described in detail in Field (2017), so there’s a thorough account in there.
    • You can access free lectures and screencasts on my YouTube channel
    • There are more statistical resources on my website www.discoveringstatistics.com

One predictor

The main tutorial follows the example from Field (2017) that looks at predicting physical and downloaded album sales (outcome variable) from various predictor variables. The data are in the file Album Sales.sav. This data file has 200 rows, each one representing a different album. There are also several columns, one of which contains the sales (in thousands) of each album in the week after release (Sales) and one containing the amount (in thousands of pounds/dollars/euro/whatever currency you use) spent promoting the album before release (Adverts). The other columns represent how many times songs from the album were played on a prominent national radio station in the week before release (Airplay), and how attractive people found the band’s image out of 10 (Image).

Figure 1: The data in IBM SPSS Statistics

To begin with we will predict sales from advertising alone. The model we’re fitting is described by the following equation:

\[ \begin{aligned} Y_i & = b_0 + b_1X_i + \varepsilon_i\\ \text{Sales}_i & = b_0 + b_1\text{Advertising}_i + \varepsilon_i \end{aligned} \]

The data and the model look like this (note that this figure was created using R, not SPSS Statistics):

It should be clear that a positive relationship exists: the more money spent advertising an album, the more it is likely to sell. Of course there are some albums that sell well regardless of advertising (top left of scatterplot), but there are none that sell badly when advertising levels are high (bottom right of scatterplot). The scatterplot shows the linear model that we are about to fit. The following video illustrates how to get SPSS Statistics to estimate the parameters that describe the model.
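
If you prefer syntax to the dialog boxes, a model like this can also be fitted with the REGRESSION command (the Paste button in the dialog boxes produces equivalent syntax). This is a minimal sketch, assuming the variable names Sales and Adverts from Album Sales.sav:

    * Predict album sales (thousands) from advertising budget (thousands of pounds).
    REGRESSION
      /STATISTICS COEFF OUTS R ANOVA
      /DEPENDENT Sales
      /METHOD=ENTER Adverts.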

The first summary table provides the value of R and \(R^2\) for the model.

Output 1: The model summary

Quiz

The next table reports the various sums of squares associated with the model, their degrees of freedom and the resulting mean squares.
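
As a reminder of how the quantities in this table relate to one another, each mean square is its sum of squares divided by its degrees of freedom, and the F-statistic is the ratio of the model mean square to the residual mean square:

\[ \text{MS} = \frac{\text{SS}}{df}, \qquad F = \frac{\text{MS}_\text{model}}{\text{MS}_\text{residual}} \]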

Output 2: The model fit

Quiz

The final table provides estimates of the model parameters (the b-values) and the significance of these values. The Y intercept (\(b_0\)) is 134.14. This value can be interpreted as meaning that when no money is spent on advertising (when X = 0), the model predicts that 134,140 albums will be sold (remember that our unit of measurement is thousands of albums). The value of \(b_1\) is 0.096. This value represents the change in the outcome associated with a unit change in the predictor. In other words, if our predictor variable is increased by one unit (if the advertising budget is increased by 1), then our model predicts that an extra 0.096 units of albums will be sold. Our units of measurement were thousands of pounds and thousands of albums sold, so we can say that for an increase in advertising of £1000 the model predicts 96 (0.096 × 1000 = 96) extra album sales. This investment is pretty useless for the record company: it invests £1000 and gets only 96 extra sales! Fortunately, as we already know, advertising accounts for only one-third of the variance in album sales.

Output 3: The parameter estimates

If a predictor is having a significant impact on our ability to predict the outcome then its b should be different from 0 (and large relative to its standard error). The t-test and associated p-value tell us whether the b-value is significantly different from 0. The column Sig. contains the exact probability that a value of t at least as big as the one in the table would occur if the value of b in the population were zero. If this probability is less than 0.05, then people interpret that as the predictor being a ‘significant’ predictor of the outcome. For both ts, the probabilities are given as 0.000 (zero to 3 decimal places), and so the probability of these t values (or larger) occurring if the values of b in the population were zero is less than 0.001. In other words, the bs are significantly different from 0. In the case of the b for advertising budget this result means that the advertising budget makes a significant contribution (p < 0.001) to predicting album sales.
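
For reference, the t-value in the table is simply the parameter estimate divided by its standard error, which tests the null hypothesis that the population value of b is zero:

\[ t = \frac{b}{\text{SE}_b} \]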

Using the model

Let’s use the model to make some predictions. First, replace the b-values with the values from the output:

\[ \begin{aligned} \text{Sales}_i & = b_0 + b_1\text{Advertising}_i + \varepsilon_i \\ \text{Sales}_i & = 134.14 + (0.096\times\text{Advertising}_i) + \varepsilon_i \\ \end{aligned} \]

It is now possible to make a prediction about album sales, by replacing the advertising budget with a value of interest. For example, imagine a recording company executive wanted to spend £100,000 on advertising a new album. Remembering that our units are already in thousands of pounds, we can simply replace the advertising budget with 100. He would discover that album sales should be around 144,000 for the first week of sales:

\[ \begin{aligned} \text{Sales}_i & = 134.14 + (0.096\times \text{Advertising}_i) + \varepsilon_i \\ \text{Sales}_i & = 134.14 + (0.096\times \text{100}) + \varepsilon_i \\ & = 143.74 \end{aligned} \]
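
If you want a predicted value for every album in the data file rather than plugging numbers in by hand, a COMPUTE command will do it. A minimal sketch, assuming the variable names above (PredictedSales is just an illustrative name for the new variable):

    * Predicted sales (in thousands of albums) from the one-predictor model.
    COMPUTE PredictedSales = 134.14 + 0.096*Adverts.
    EXECUTE.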

Several predictors

Let’s extend the model to include airplay and the band’s image as additional predictors. The executive has past research indicating that advertising budget is a significant predictor of album sales, and so the new predictors (airplay and image) should be entered into the model after advertising budget. This method is hierarchical (the researcher decides in which order to enter variables into the model based on past research). The model we’re fitting is described by the following equation:

\[ \begin{aligned} Y_i & = b_0 + b_1X_{1i} + b_2X_{2i} + \ldots + b_nX_{ni} + \varepsilon_i\\ \text{Sales}_i & = b_0 + b_1\text{Advertising}_i + b_2\text{Airplay}_i + b_3\text{Image}_i + \varepsilon_i \end{aligned} \]

First let’s produce scatterplots of all of these variables.
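
One way to get a scatterplot matrix without the menus is the GRAPH command. A minimal sketch, assuming the variable names in Album Sales.sav:

    * Scatterplot matrix of the three predictors and the outcome.
    GRAPH
      /SCATTERPLOT(MATRIX)=Adverts Airplay Image Sales
      /MISSING=LISTWISE.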

The resulting scatterplots are below. Although the data are messy, the three predictors have reasonably linear relationships with album sales and there are no obvious outliers (except maybe in the bottom left of the scatterplot with band image).

Figure 2: Scatterplot matrix

To do a hierarchical regression in SPSS we have to enter the variables in blocks (each block representing one step in the hierarchy). The video demonstrates.
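
In syntax, each block in the hierarchy becomes its own METHOD=ENTER subcommand. A minimal sketch, assuming the same variable names as before (the CHANGE keyword adds the \(R^2\) change statistics to the model summary):

    * Hierarchical regression: advertising in block 1; airplay and image added in block 2.
    REGRESSION
      /STATISTICS COEFF OUTS R ANOVA CHANGE
      /DEPENDENT Sales
      /METHOD=ENTER Adverts
      /METHOD=ENTER Airplay Image.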

Note that the output shows two models. Model 1 refers to the first stage in the hierarchy when only advertising budget is used as a predictor. Model 2 refers to when all three predictors are used. Under this table SPSS tells us what the dependent variable (outcome) was and what the predictors were in each of the two models. The column labelled R contains the values of the multiple correlation coefficient between the predictors and the outcome. When only advertising budget is used as a predictor, this is the simple correlation between advertising and album sales (0.578). In fact, all of the statistics for model 1 are the same as for the single-predictor model we fitted earlier. The next column gives us a value of \(R^2\), which is a measure of how much of the variability in the outcome is accounted for by the predictors. For the first model its value is 0.335, which means that advertising budget accounts for 33.5% of the variation in album sales. However, when the other two predictors are included as well (model 2), this value increases to 0.665 or 66.5%. If advertising accounts for 33.5%, then the band’s image and airplay must account for an additional 33%. The inclusion of the two new predictors has explained quite a large additional amount of the variation in album sales.

Output 4: Model summary

The next output contains the F-test of whether the model is significantly better at predicting the outcome than using the mean outcome (i.e., no predictors). The p-value tells us the probability of getting an F at least as large as the one we have if the null hypothesis were true (if we used the outcome mean to predict album sales). The F-statistic is 99.59, p < 0.001 for the initial model and 129.498, p < 0.001 for the second. Both models ‘significantly’ improved our ability to predict album sales compared to not fitting the model.
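
As a rough check, these F-statistics can be recovered from the \(R^2\) values in the model summary. With k predictors and N = 200 albums, \(F = \frac{N-k-1}{k}\times\frac{R^2}{1-R^2}\). Plugging in the rounded values reported above gives approximately the figures in the table (the small discrepancies come from rounding \(R^2\) to three decimal places):

\[ \begin{aligned} F_{\text{model 1}} & = \frac{200-1-1}{1}\times\frac{0.335}{1-0.335} \approx 99.7\\ F_{\text{model 2}} & = \frac{200-3-1}{3}\times\frac{0.665}{1-0.665} \approx 129.7 \end{aligned} \]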

Output 5: Model fit

The next output gives us estimates for the b-values and statistics that indicate the individual contribution of each predictor to the model.

Output 6: Model parameter estimates

The b-values tell us about the relationship between album sales and each predictor. All three predictors have positive b-values indicating positive relationships. So, as advertising budget increases, album sales increase; as plays on the radio increase, so do album sales; and finally more attractive bands will sell more albums. The b-values tell us more than this, though. They tell us to what degree each predictor affects the outcome if the effects of all other predictors are held constant:

Quiz

We’ve looked at the band’s image, but for the other two predictors:

  • Advertising budget: b = 0.085 indicates that as advertising budget increases by one unit, album sales increase by 0.085 units. Both variables were measured in thousands; therefore, for every £1000 more spent on advertising, an extra 0.085 thousand albums (85 albums) are sold. This interpretation is true only if the effects of band image and airplay are held constant.
  • Airplay: b = 3.367 indicates that as the number of plays on radio in the week before release increases by one, album sales increase by 3.367 units. Every additional play of a song on radio (in the week before release) is associated with an extra 3.367 thousand albums (3367 albums) being sold. This interpretation is true only if the effects of the band’s image and advertising budget are held constant.

For the standardized betas, the quiz looked at airplay, so let’s summarize the values for the remaining predictors (a worked conversion for advertising budget follows this list):

  • Advertising budget: Standardized \(\beta\) = 0.511 indicates that as advertising budget increases by one standard deviation (£485,655), album sales increase by 0.511 standard deviations. The standard deviation for album sales is 80,699, so this constitutes a change of 41,240 sales (0.511 × 80,699). Therefore, for every £485,655 more spent on advertising, an extra 41,240 albums are sold. This interpretation is true only if the effects of the band’s image and airplay are held constant.
  • Image: Standardized \(\beta\) = 0.192 indicates that a band rated one standard deviation (1.40 units) higher on the image scale can expect additional album sales of 0.192 standard deviations. This is a change of 15,490 sales (0.192 × 80,699). A band with an image rating 1.40 higher than another band can expect 15,490 additional sales. This interpretation is true only if the effects of airplay and advertising are held constant.
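
As a check on these numbers, a standardized beta is just the unstandardized b rescaled by the ratio of the predictor’s standard deviation to the outcome’s standard deviation. Using the advertising budget values quoted above (b = 0.085, standard deviations expressed in thousands; the result differs slightly from 0.511 only because b is rounded):

\[ \beta_{\text{Adverts}} = b\times\frac{s_{\text{Adverts}}}{s_{\text{Sales}}} = 0.085\times\frac{485.655}{80.699} \approx 0.51 \]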

Unguided example

Lacourse, Claes, and Villeneuve (2001) conducted a study to see whether suicide risk was related to listening to heavy metal music. They devised a scale to measure preference for bands falling into the category of heavy metal. This scale included heavy metal bands (Black Sabbath, Iron Maiden), speed metal bands (Slayer, Metallica), death/black metal bands (Obituary, Burzum) and gothic bands (Marilyn Manson, Sisters of Mercy). They then used this (and other variables) as predictors of suicide risk based on a scale measuring suicidal ideation etc.

The data file HMSuicide.sav contains the data from a replication. There are two variables representing scores on the scales described above: hm (the extent to which the person listens to heavy metal music) and suicide (the extent to which someone has suicidal ideation and so on). Fit a model to predict suicide risk from love of heavy metal.
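
If you get stuck, syntax along these lines will fit the model (a minimal sketch, assuming the variable names hm and suicide described above):

    * Predict suicide risk from preference for heavy metal music.
    REGRESSION
      /STATISTICS COEFF OUTS R ANOVA
      /DEPENDENT suicide
      /METHOD=ENTER hm.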

Quiz

Next tutorial

The next tutorial will continue to look at the linear model but with more focus on how to interpret confidence intervals and p-values.

Useful resources

These are useful resources for understanding some of the concepts in this tutorial. They are not written or hosted by me, so I take no responsibility for whether they work; if they do work, you might find them useful.

  • Click here for an interactive app to see how well you can estimate the correlation between two variables based on the scatterplot.
  • Can you adjust the intercept and slope of a line to find the line of best fit? Click here to try.

References

Field, Andy P. 2017. Discovering Statistics Using IBM SPSS Statistics: And Sex and Drugs and Rock ’n’ Roll. 5th ed. London: Sage.

Lacourse, E., M. Claes, and M. Villeneuve. 2001. “Heavy Metal Music and Adolescent Suicidal Risk.” Journal of Youth and Adolescence 30 (3): 321–32.


  1. This tutorial is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License: basically, you can use it for teaching and non-profit activities, but you can’t meddle with it.

Andy Field