GLM 2: confidence intervals and significance

The general linear model: confidence intervals and significance tests


This tutorial is one of a series that accompanies Discovering Statistics Using IBM SPSS Statistics (Field 2017) by me, Andy Field. These tutorials contain abridged sections from the book (so there are some copyright considerations).1

  • Who is the tutorial aimed at?
    • Students enrolled on my Discovering Statistics module at the University of Sussex, or anyone reading my textbook Discovering Statistics Using IBM SPSS Statistics (Field 2017)
  • What is covered?
    • This tutorial develops the material from the previous tutorial to look again at fitting linear models using IBM SPSS Statistics but with a focus on interpreting p-values and confidence intervals of the model parameters.
    • This tutorial does not teach the background theory: it is assumed you have either attended my lecture or read the relevant chapter in my book (or someone else’s)
    • The aim of this tutorial is to augment the theory that you already know by guiding you through fitting linear models using IBM SPSS Statistics and asking you questions to test your knowledge along the way.
  • Want more information?
    • The main tutorial follows the example described in detail in Field (2017), so there’s a thorough account in there.
    • You can access free lectures and screencasts on my YouTube channel
    • There are more statistical resources on my website

The standard error

This video shows a demonstration that may help you to get a better understanding of what the standard error and sampling distribution of a model parameter b-value represents.

Demonstration of sampling, the standard error and sampling distributions
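The idea behind the demonstration can also be sketched as a small simulation. This is only an illustration in Python, not part of the SPSS workflow: the population (a true b of 0.5, normally distributed predictor and errors), the sample size and the function names are all invented for the example. It draws many samples, estimates b in each, and treats the standard deviation of those estimates as the standard error.

```python
import random
import statistics

def slope(x, y):
    """Least-squares slope (b) for a model with a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def sample_slopes(true_b=0.5, n=50, n_samples=2000, seed=42):
    """Draw many samples from a made-up population and estimate b in each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_b * xi + rng.gauss(0, 1) for xi in x]  # hypothetical population model
        slopes.append(slope(x, y))
    return slopes

slopes = sample_slopes()
# The collection of b-values is the (simulated) sampling distribution of b;
# its standard deviation is the standard error of b.
print(f"mean of the sample b-values: {statistics.fmean(slopes):.3f}")
print(f"standard error of b: {statistics.stdev(slopes):.3f}")
```

The mean of the simulated b-values should sit close to the population value, and increasing `n` should shrink the standard error.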



Fitting the model

The main tutorial follows the example from Field (2017) that looks at predicting physical and downloaded album sales (outcome variable) from various predictor variables. The data are in the file Album Sales.sav. This data file has 200 rows, each one representing a different album. There are also several columns, one of which contains the sales (in thousands) of each album in the week after release (Sales) and one containing the amount (in thousands of pounds/dollars/euro/whatever currency you use) spent promoting the album before release (Adverts). The other columns represent how many times songs from the album were played on a prominent national radio station in the week before release (Airplay), and how attractive people found the band’s image out of 10 (Image). Remember that the data are arranged as follows in the data editor:

Figure 1: The data in IBM SPSS Statistics


In the previous tutorial we fitted this model:

\[ \text{Sales}_i = b_0 + b_1\text{Advertising}_i + b_2\text{Airplay}_i + b_3\text{Image}_i + \varepsilon_i \]

This model predicts album sales (in thousands) from advertising budget (in thousands), the amount of airplay before the album is released and the ratings of the band’s image. We fitted the model hierarchically in two blocks, with advertising budget entered in the first block and the other two predictors entered in a second block. The following video recaps how to fit this model using SPSS Statistics and shows how to ask for confidence intervals for the model parameters.

Change statistics

We saw the first output in the previous tutorial, but this time it includes the change statistics (because we selected this option). We looked at R and \(R^2\) in the previous tutorial, so hopefully you remember how to interpret them.


The adjusted \(R^2\) gives us some idea of how well our model generalizes and ideally we’d like its value to be the same as, or very close to, the value of \(R^2\). In this example the difference for the final model is small (it is 0.665 − 0.660 = 0.005 or about 0.5%). This shrinkage means that if the model were derived from the population rather than a sample it would account for approximately 0.5% less variance in the outcome, indicating that the cross-validity of this model is very good.
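The shrinkage arithmetic can be checked with a short Python sketch. It assumes the commonly used adjustment \(1 - (1 - R^2)(n - 1)/(n - k - 1)\), where n is the number of cases and k the number of predictors; the function name is made up for the illustration, and the inputs are the rounded values quoted in the text.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors fitted to n cases.

    Assumes the common adjustment 1 - (1 - R^2)(n - 1)/(n - k - 1).
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2 = 0.665                          # R-squared of the final model (from the text)
adj = adjusted_r2(r2, n=200, k=3)   # 200 albums, 3 predictors
shrinkage = r2 - adj

print(f"adjusted R^2 = {adj:.3f}")      # about 0.660
print(f"shrinkage    = {shrinkage:.3f}")  # about 0.005
```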

The change statistics tell us whether the change in \(R^2\) is significant (i.e. how much does the model fit improve as more predictors are added?). The change is reported for each block of the hierarchy: for model 1, \(R^2\) changes from 0 to 0.335, and gives rise to an F-statistic of 99.59, which is significant with a probability less than 0.001. In model 2, in which image and airplay have been added as predictors, \(R^2\) increases by 0.330, making the \(R^2\) of the new model 0.665 with a significant (p < 0.001) F-statistic of 96.44.
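As a rough check on these numbers, the F-statistic for a change in \(R^2\) can be computed from the values in the model summary. The sketch below assumes the usual formula \(F_\text{change} = \frac{(N - k - 1)R^2_\text{change}}{k_\text{change}(1 - R^2)}\), where k and \(R^2\) belong to the larger model; the function name is invented. Because it uses the rounded \(R^2\) values from the text, it reproduces the reported F-statistics only approximately.

```python
def f_change(r2_new, r2_change, n, k_new, k_change):
    """F-statistic for the change in R-squared when k_change predictors are added.

    r2_new and k_new describe the larger model; n is the number of cases.
    """
    return (n - k_new - 1) * r2_change / (k_change * (1 - r2_new))

# Model 1: advertising only, R^2 goes from 0 to 0.335
f_model1 = f_change(r2_new=0.335, r2_change=0.335, n=200, k_new=1, k_change=1)

# Model 2: airplay and image added, R^2 rises by 0.330 to 0.665
f_model2 = f_change(r2_new=0.665, r2_change=0.330, n=200, k_new=3, k_change=2)

print(f"F change, model 1: {f_model1:.2f}")  # close to the reported 99.59
print(f"F change, model 2: {f_model2:.2f}")  # close to the reported 96.44
```

The small discrepancies from the output come from rounding \(R^2\) to three decimal places before doing the arithmetic.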

Output 4: Model summary


Confidence intervals

The next output contains information that we saw in the first tutorial, but this time we have also got information about the confidence intervals for each model parameter.

Output 6: Model parameter estimates


A bit of revision. Imagine that we collected 100 samples of data measuring the same variables as our current model. For each sample we estimate the same model that we have in this tutorial, including confidence intervals for the unstandardized b-values. These boundaries are constructed such that in 95% of samples they contain the population value of b. Therefore, we would expect about 95 of our 100 samples to yield confidence intervals for b that contain the population value. The trouble is that we don’t know if our sample is one of the 95% with confidence intervals containing the population values or one of the 5% that misses.

The typical pragmatic solution to this problem is to assume that your sample is one of the 95% that hits the population value. If you assume this, then you can reasonably interpret the confidence interval as providing information about the population value of b. A narrow confidence interval suggests that all samples would yield estimates of b that are fairly close to the population value, whereas wide intervals suggest a lot of uncertainty about what the population value of b might be. If the interval contains zero then it suggests that the population value of b might be zero (in other words, there might be no relationship between that predictor and the outcome), and that it could be positive but might be negative. All of these statements are reasonable if you’re prepared to believe that your sample is one of the 95% for which the intervals contain the population value. Your belief will be wrong 5% of the time, though.
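The thought experiment of collecting many samples can be simulated. The following Python sketch is an illustration under invented assumptions (a population b of 0.5, normal predictor and errors, samples of 100 cases, and an approximate critical t of 1.984 for 98 degrees of freedom); none of these values come from the album sales data. It builds a 95% confidence interval for b in each sample and counts how often the interval contains the population value.

```python
import math
import random
import statistics

T_CRIT = 1.984  # approximate two-tailed 5% critical t for df = 98 (assumed value)

def slope_and_se(x, y):
    """Return the least-squares slope and its standard error (one predictor)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return b1, se

def coverage(true_b=0.5, n=100, n_samples=1000, seed=1):
    """Proportion of samples whose 95% CI for b contains the population value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_b * xi + rng.gauss(0, 1) for xi in x]  # hypothetical population
        b, se = slope_and_se(x, y)
        if b - T_CRIT * se <= true_b <= b + T_CRIT * se:
            hits += 1
    return hits / n_samples

cov = coverage()
print(f"proportion of intervals containing the population b: {cov:.3f}")
```

The proportion should land near 0.95, but for any single sample there is no way to tell from the data alone whether its interval is a hit or a miss.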

The quiz told you about the confidence interval for airplay. For the remaining predictors, assuming that each confidence interval is one of the 95% that contains the population parameter, the confidence intervals tell us that:

  • The true size of the relationship between advertising budget and album sales lies somewhere between 0.071 and 0.099.
  • The true size of the relationship between band image and album sales lies somewhere between 6.28 and 15.89.
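These intervals are built as b ± (critical t × standard error), so the estimate, its standard error and the t-statistic can be recovered approximately from the boundaries alone. The Python sketch below does this for the two intervals above. The critical value of 1.972 (for 200 − 3 − 1 = 196 degrees of freedom) is an assumed approximation, the function name is invented, and the results are only as precise as the rounded boundaries allow.

```python
T_CRIT = 1.972  # approximate two-tailed 5% critical t for df = 196 (assumed value)

# 95% confidence interval boundaries quoted in the text
intervals = {
    "advertising": (0.071, 0.099),
    "image": (6.28, 15.89),
}

def summarize(lower, upper, t_crit=T_CRIT):
    """Recover b, its standard error and t from a symmetric confidence interval."""
    b = (lower + upper) / 2             # the estimate sits at the interval's midpoint
    se = (upper - lower) / 2 / t_crit   # half-width = t_crit * standard error
    return b, se, b / se                # t is the estimate divided by its standard error

results = {name: summarize(*ci) for name, ci in intervals.items()}
for name, (b, se, t) in results.items():
    print(f"{name}: b = {b:.3f}, SE = {se:.4f}, t = {t:.2f}")
```

A narrow interval relative to the size of b corresponds to a small standard error and therefore a large t, which is why the width of the interval and the significance test tell a consistent story.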

The two best predictors (advertising and airplay) have very tight confidence intervals indicating that the estimates for the current model are likely to be representative of the true population values. The interval for the band’s image is wider (but still does not cross zero) indicating that the parameter for this variable is less representative, but nevertheless significant.

Significance tests

The output also contains significance tests for each predictor.

Output 6: Model parameter estimates



Many students and researchers think of p-values in terms of the ‘probability of a chance result’ or ‘the probability of a hypothesis being true’ but they are neither of these things. They are the long-run probability that you would get a test statistic (in this case t) at least as large as the one you have if the null hypothesis were true. In other words, if there really were no relationship between advertising budget and album sales (the null hypothesis) then the population value of b would be zero.

Imagine we sampled from this null population and computed t, and then repeated this process 1000 times. We’d have 1000 values of t from a population in which there was no effect. We could plot these values as a histogram. This would tell us how often certain values of t occur. From it we could work out the probability of getting a particular value of t. If we then took another sample, and computed t (because we’re kind of obsessed with this sort of thing) we would be able to compare this value of t to the distribution of all the previous 1000 samples. Is the t in our current sample large or small compared to the others? Let’s say it was larger than 999 of the previous values. That would be quite an unlikely value of t, whereas if it was larger than 500 of them this would not surprise us. This is what a p-value is: it is the long-run probability of getting a test statistic at least as large as the one you have if the null hypothesis were true. If the value is less than 0.05, people typically take this as supporting the idea that the null hypothesis isn’t true.
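The process of sampling repeatedly from a null population can be simulated directly. The Python sketch below is an illustration with invented settings (samples of 50 cases, a normally distributed predictor and outcome that are unrelated, so b = 0 in the population). It computes t for 1000 null samples and then finds the proportion of those values that are at least as large, in absolute size, as an observed t. Comparing absolute sizes makes this a two-tailed comparison, which is a choice made for the example.

```python
import math
import random
import statistics

def t_for_slope(x, y):
    """t-statistic (b divided by its standard error) for a one-predictor model."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b1 / math.sqrt(sse / (n - 2) / sxx)

def null_t_values(n=50, n_samples=1000, seed=7):
    """t-values from samples of a population in which b really is zero."""
    rng = random.Random(seed)
    ts = []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]  # outcome unrelated to predictor
        ts.append(t_for_slope(x, y))
    return ts

def simulated_p(t_obs, null_ts):
    """Proportion of null t-values at least as large in absolute size as t_obs."""
    return sum(abs(t) >= abs(t_obs) for t in null_ts) / len(null_ts)

null_ts = null_t_values()
p_small_t = simulated_p(0.5, null_ts)  # an unremarkable t
p_large_t = simulated_p(3.5, null_ts)  # a t that is rare under the null

print(f"p for t = 0.5: {p_small_t:.3f}")
print(f"p for t = 3.5: {p_large_t:.3f}")
```

A t of 0.5 is matched or exceeded by a large share of the null samples, whereas a t of 3.5 almost never is, which is the sense in which a small p-value marks the observed t as unlikely if the null hypothesis were true.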

The p-values in the table all tell us the long-run probability that we would get a value of t at least as large as the ones we have if the true relationship between each predictor and album sales was 0 (i.e., b = 0). In all cases the probabilities are less than 0.001, which researchers would generally take to mean that the observed bs are significantly different from zero. Given the bs quantify the relationship between each predictor and album sales, this conclusion implies that each predictor significantly predicts album sales.

Unguided example

In this example we’ll look at data collected from several questionnaires relating to clinical psychology, and we will use these measures to predict social anxiety. Anxiety disorders take on different shapes and forms, and each disorder is believed to be distinct and have unique causes. We can summarize the disorders and some popular theories as follows:

  • Social Anxiety: Social anxiety disorder is a marked and persistent fear of one or more social or performance situations in which the person is exposed to unfamiliar people or possible scrutiny by others. This anxiety leads to avoidance of these situations. People with social phobia are believed to experience elevated feelings of shame.
  • Obsessive Compulsive Disorder (OCD): OCD is characterized by the everyday intrusion into conscious thinking of intense, repetitive, personally abhorrent, absurd and alien thoughts (obsessions), leading to the endless repetition of specific acts or to the rehearsal of bizarre and irrational mental and behavioural rituals (compulsions).

Social anxiety and obsessive compulsive disorder are seen as distinct disorders having different causes. However, there are some similarities. They both involve some kind of attentional bias: attention to bodily sensation in social anxiety and attention to things that could have negative consequences in OCD. They both involve repetitive thinking styles: social phobics ruminate about social encounters after the event (known as post-event processing), and people with OCD have recurring intrusive thoughts and images. They both involve safety behaviours (i.e. trying to avoid the thing that makes you anxious).

This might lead us to think that, rather than being different disorders, they are manifestations of the same core processes (Field and Cartwright-Hatton 2008). One way to research this possibility would be to see whether social anxiety can be predicted from measures of other anxiety disorders. If social anxiety disorder and OCD are distinct we should expect that measures of OCD will not predict social anxiety. However, if there are core processes underlying all anxiety disorders, then measures of OCD should predict social anxiety. The data are in the file SocialAnxietyRegression.sav. This file contains three variables of interest to us:

  • The Social Phobia and Anxiety Inventory (SPAI), which measures levels of social anxiety (Turner, Beidel, and Dancu 1996).
  • Obsessive Beliefs Questionnaire (OBQ), which measures the degree to which people experience obsessive beliefs like those found in OCD (Steketee et al. 2001).
  • The Test of Self-Conscious Affect (TOSCA), which measures shame (Tangney et al. 2000).

Each of 134 people was administered all questionnaires. Fit a hierarchical linear model with two blocks:

  1. Block 1: the first block will contain any predictors that we expect to predict social anxiety. In this example we have only one variable that we expect, theoretically, to predict social anxiety and that is shame (measured by the TOSCA).
  2. Block 2: the second block contains OBQ, the predictor variable that we don’t necessarily expect to predict social anxiety.

Use what you have learned to fit this model and answer these questions.


Next tutorial

The next tutorial will continue to look at bias in the linear model.

Useful resources

These are useful resources for understanding some of the concepts in this tutorial. These are not written or hosted by me, so I take no responsibility for whether they work. If they are working though, you might find them useful.

  • Click here for an interactive app for illustrating why b-values have standard errors


Field, Andy P. 2017. Discovering Statistics Using IBM SPSS Statistics: And Sex and Drugs and Rock ’N’ Roll. 5th ed. London: Sage.

Field, Andy P., and Sam Cartwright-Hatton. 2008. “Shared and Unique Cognitive Factors in Social Anxiety.” International Journal of Cognitive Therapy 1 (3): 206–22. doi:10.1680/ijct.2008.1.3.206.

Steketee, G., R. Frost, N. Amir, M. Bouvard, C. Carmin, D. A. Clark, J. Cottraux, et al. 2001. “Development and Initial Validation of the Obsessive Beliefs Questionnaire and the Interpretation of Intrusions Inventory.” Behaviour Research and Therapy 39 (8): 987–1006.

Tangney, J. P., R. Dearing, P. E. Wagner, and R. Gramzow. 2000. The Test of Self-Conscious Affect-3 (TOSCA-3). Fairfax, VA: George Mason University.

Turner, S. M., D. C. Beidel, and C. V. Dancu. 1996. Social Phobia and Anxiety Inventory: Manual. Toronto: Multi-Health Systems Inc.

  1. This tutorial is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Basically, you can use it for teaching and non-profit activities but not meddle with it.

Andy Field