What is adventr?

R is a free software environment for statistical analysis and graphics. R in its native form comes with a reasonable amount of functionality. However, the beauty of R is that it can be expanded by downloading packages that add specific functionality to the program. Because R is open source, anyone can write a package - even I can.

You can use R as a standalone piece of software but it doesn’t have the most pleasant user interface. This is where RStudio comes in. RStudio is a free integrated development environment (IDE) for R. In plain English, this means that it is a user interface through which to use R. So, you can use R without RStudio but you can’t use RStudio without R. RStudio has functionality that makes working with R easier, more efficient, and generally more pleasant than working in R alone. That’s why I base my teaching within RStudio and I recommend using the adventr package from within it too.

The adventr package contains a series of interactive tutorials that teach R alongside chapters of my 2016 textbook An Adventure in Statistics: the reality enigma. The tutorials are written using a package called learnr. Once a tutorial is running it’s a bit like reading a book but with places where you can practice the R code that you have just been taught. The adventr package is free (as are all things R-related) and offered to support tutors and students using my textbook who want to learn R.

Contents of adventr

The package was written initially to support my own teaching on a module where I base the content around An Adventure in Statistics. One quirk of this is that there are some advanced tutorials on topics that are not covered in the book (but which continue its themes). Another quirk is that - at present - there are some chapters that don’t have associated tutorials (for example, the chapter on probability).

The tutorials are named to correspond (roughly) to the relevant chapter of the book. For example, adventr_03 would be a good tutorial to run alongside teaching related to chapter 3, and so on.

  • adventr_02: Data basics in R and RStudio
  • adventr_03: Summarizing data (introducing ggplot2)
  • adventr_04: Fitting models (central tendency)
  • adventr_05: Presenting data (summarizing groups and more ggplot2)
  • adventr_08: Inferential statistics and robust estimation (covers Chapter 9 too)
  • adventr_11: Hypothesis testing
  • adventr_14: The general linear model
  • adventr_15: Comparing two means
  • adventr_15_rm: Comparing two means (repeated measures)
  • adventr_16: Comparing several means
  • adventr_16_rm: Comparing several means (repeated measures)
  • adventr_17: Factorial designs
  • adventr_mlm: Multilevel models (not covered in the book)
  • adventr_growth: Growth models (not covered in the book)
  • adventr_log: Logistic regression (not covered in the book)

Installing adventr

To use adventr you first need to install R and RStudio. There are detailed instructions on how to do this at the end of this webpage along with some introductory material to get you oriented to R and RStudio. Once you have installed R and RStudio you can install adventr. The package is in development so you have to install it from GitHub. To install the package, execute (in RStudio):

install.packages("devtools") #if you don’t already have it installed
library(devtools)
install_github("profandyfield/adventr")
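
If you would rather not reinstall devtools when it is already on your machine, the same steps can be sketched as below. This is only a sketch, not an official installer: missing_packages() and install_adventr() are hypothetical helper names introduced here, and calling install_adventr() assumes you have an internet connection.

```r
# Hypothetical helper: return the packages in 'pkgs' that are not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install devtools only if needed, then adventr from GitHub
install_adventr <- function() {
  to_install <- missing_packages("devtools")
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  devtools::install_github("profandyfield/adventr")
}

# Nothing is installed until you call the function:
# install_adventr()
```

Wrapping the steps in a function means that nothing is downloaded until you explicitly ask for it.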

To run a particular tutorial execute:

library(adventr)
learnr::run_tutorial("name_of_tutorial", package = "adventr")

and replace "name_of_tutorial" with the name of the tutorial you want to run. For example, to run tutorial 3 (for Chapter 3) execute:

learnr::run_tutorial("adventr_03", package = "adventr")

The name of each tutorial appears before the colon in the list above. Once the command to run the tutorial is executed it will spring to life in a web browser.
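
Because the numbered tutorials follow the pattern adventr_ plus a two-digit chapter number, you could build the name from the chapter number rather than typing it. The sketch below assumes that pattern; tutorial_name() and run_chapter() are hypothetical helpers introduced here, not functions in adventr, and run_chapter() needs adventr and learnr to be installed.

```r
# Hypothetical helper: build a tutorial name from a chapter number
tutorial_name <- function(chapter) {
  sprintf("adventr_%02d", as.integer(chapter))
}

# Hypothetical helper: run the tutorial for a given chapter
run_chapter <- function(chapter) {
  learnr::run_tutorial(tutorial_name(chapter), package = "adventr")
}

tutorial_name(3)   # "adventr_03"
tutorial_name(15)  # "adventr_15"

# Start the tutorial for Chapter 3 (uncomment to run):
# run_chapter(3)
```

This only covers the plainly numbered tutorials that exist in the list above. For names such as adventr_15_rm or adventr_mlm, and for chapters without a tutorial, type the name directly as shown earlier.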

Suggested workflow

The tutorials are self-contained (you practice code in code boxes) so you don’t need to use RStudio at the same time. However, to get the most from them I would recommend that you open two RStudio sessions (i.e. two RStudio windows running simultaneously). Use one RStudio session to run the tutorial. You won’t then be able to use this RStudio window (because its resources are allocated to the tutorial). In the second RStudio session try replicating what you learn in the tutorial. That is, open a new script file, practice everything you do in the tutorial in that file, and save it. This workflow has the advantage of not just teaching you the code that you need to do certain things, but also provides practice in using RStudio itself.

Setting up R and RStudio

About R and RStudio

R is a free software environment for statistical analysis and graphics. R exists as a base package with a reasonable amount of functionality. However, the beauty of R is that it can be expanded by downloading packages that add specific functionality to the program. Anyone with a big enough brain and a bit of time and dedication can write a package for other people to use. These packages, as well as the software itself, are stored in a central location known as the CRAN (Comprehensive R Archive Network). Once a package is stored in the CRAN, anyone with an internet connection can download it from the CRAN and install it to use within their own copy of R. R is basically a big global family of fluffy altruistic people contributing to the goal of producing a versatile data analysis tool that is free for everyone to use. It’s a statistical embodiment of The Beatles’ utopian vision of peace, love and humanity: a sort of ‘give ps a chance’.

The CRAN is central to using R: it is the place from where you download the software and any packages that you want to install. It would be a shame, therefore, if the CRAN were one day to explode or be eaten by cyber-lizards. The statistical world might collapse. Even assuming the cyber-lizards don’t rise up and overthrow the Internet, it is still a busy place. Therefore, rather than have a single CRAN location that everyone accesses, the CRAN is ‘mirrored’ at different places across the globe. ‘Mirrored’ means that there are identical versions of the CRAN scattered across the world. As a resident of the UK, I might access one of the UK CRAN mirrors (it’s likely to be the fastest connection). In general, access a mirror of the CRAN close to you. Figure 1 shows schematically what we have just learnt. At the centre of the diagram is the CRAN: a repository of the base R software and hundreds of packages. People with big brains from all over the world write new packages and upload them into the CRAN for others to use. The CRAN itself is mirrored at different places across the globe (which means there are multiple copies of it). As a user of R you download the software, and install any packages that you want to use via your nearest CRAN.
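
Once R is installed, you can choose a mirror from within R itself. The sketch below sets the mirror for the current session through the repos option. The cloud address is just one possible choice (it is assumed here for illustration, and redirects you to a nearby server); any mirror from the CRAN list would work the same way.

```r
# Use the cloud mirror for this session (an assumed choice; pick any mirror you like)
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Check which repository install.packages() will now use
getOption("repos")

# In an interactive session, chooseCRANmirror() offers a menu of mirrors instead
```

The setting lasts only for the current R session unless you also store it in your startup options.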

Figure 1

You can use R as a standalone piece of software but it doesn’t have the most pleasant user interface. This is where RStudio comes in. RStudio is a free integrated development environment (IDE) for R. In plain English, this means that it is a user interface through which to use R. So, you can use R without RStudio but you can’t use RStudio without R. RStudio has functionality that makes working with R easier, more efficient, and generally more pleasant than working in R alone. That’s why I base my teaching within RStudio.

Installing R and RStudio

Overview

The process for setting up your computer to use R within RStudio is:

  1. Install R
  2. Install RStudio
  3. Set up RStudio to suit your personality!

The first step is to install R. We’ll look at installing on MacOS first so ignore the next section if you use Windows.

Quick R install for Mac OS

  • Click here
  • Click on the link to R-x.x.x.pkg (where x.x.x is the latest version of R) as shown in Figure 2 to download the install file.
  • Double-click the downloaded file and proceed through the install process (which will be similar to other software that you have installed)
Figure 2

Quick R install for Windows

  • Click here
  • Click on the link to Download R x.x.x for Windows (where x.x.x is the latest version of R) as shown in Figure 3 to download the install file.
  • Double-click the downloaded file and proceed through the install process (which will be similar to other software that you have installed)
Figure 3

Slow R install

The previous instructions rely on direct links. In case these fail, this section details a less direct method which is displayed visually in Figure 4. Hopefully the quick install worked and you can ignore this section.

  • Go to the R project website
  • In the left-hand menu click on the link labelled CRAN
  • This takes you to a list of CRAN Mirrors which are locations that host copies of the central R repository (see above). Scroll down and click on a link for a CRAN close to you.
  • On the next page either click the link to Download R for MacOS X or Download R for Windows
  • If you chose MacOS, on the resulting page click on the link to R-x.x.x.pkg (where x.x.x is the latest version of R - see Figure 2). This will download the install file.
  • If you chose Windows, click on base on the resulting page, and then click the link to Download R x.x.x for Windows (where x.x.x is the latest version of R - see Figure 3). This will download the install file.
  • Once you have downloaded the install file, double-click it and proceed through the install process (which will be similar to other software that you have installed).
Figure 4

RStudio install

  • Go to www.rstudio.com
  • Figure 5 shows the screens to get to the install file. On the opening page under RStudio click on the download link. On the page that subsequently opens either click the download button under RStudio Desktop or scroll to the bottom of the screen (clicking the download button jumps you to the bottom of the page). Under the heading Installers for Supported Platforms there are links to the install file for Windows, MacOS and various Linux installs. Click on the link for the operating system you use. The install file will download.
  • Double-click the downloaded file and proceed through the install process (which will be similar to other software that you have installed)
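
After both installs, a quick check from the console can confirm that everything is talking to each other. This is a rough sketch: it relies on RStudio setting an environment variable called RSTUDIO to "1" in its own sessions, and in_rstudio is a variable name made up for this example.

```r
# Which version of R is running?
R.version.string

# RStudio sets the RSTUDIO environment variable to "1" in its own sessions
in_rstudio <- identical(Sys.getenv("RSTUDIO"), "1")
in_rstudio
```

If in_rstudio is FALSE, you are probably running R on its own rather than through RStudio.
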
Figure 5

Working with RStudio

There are three main ways to work in RStudio:

  • Console: This is the worst way (in my view). In the console you can type commands to get R to do things and view the output of those ‘things’. The disadvantage of working in the console is that the instructions that you give to R run in real time after you hit the return key. So, you can’t save what you have done or re-run blocks of commands. You also can’t share what you have done with other people.
  • Script: An alternative to the console is to work on a script file. You type your commands into that file, edit them, play about with them, then when you want to execute them you select the bit you want to execute and click Run or (faster) press Ctrl + ↩︎ (Cmd + ↩︎ on a Mac). The results of the executed code will appear in the console.
    • Advantages: You can save the script file, which means you can reproduce your analyses at a later date, and you can share your files so that others can reproduce your analysis. You will tend to do similar sorts of things with data and having a repository of scripts helps with creating new scripts. For example, imagine you want to produce an error bar chart of some data, and you recall doing a similar graph for a different project, rather than trying to remember how to create the graph from scratch, just find your old script and copy the code. This will be very useful in the early stages of learning R, but after a while certain things that you do frequently will begin to stick in your head.
    • Disadvantages: You can’t output the script file to a document containing both your code and the output.
  • R Markdown: Another option is to create an R Markdown document rather than a script file. R Markdown is a version of the markdown language which enables you to use symbols to specify the formatting of your document (this tutorial was written using R Markdown). I’ll give you a flavour below. An R Markdown document is basically a flashy script file in which you can write text and embed blocks of R code that execute interactively, displaying the outcome within the markdown document itself (rather than the console).
    • Advantages: The main advantage is that you can ‘knit’ the markdown file into a PDF, Word or HTML file. This means you can create an entire report that includes formatted text (like in a Word file), your R code (which you can choose to include or not) and the results of your output. In theory (although it’d take a bit of effort learning the necessary skills) you could write an entire journal article that generates tables and graphs directly from your data ‘on the fly’. I use R Markdown a lot when working on other people’s data to produce reports from them. This means that if something gets added to the report, or some error is spotted in the data, I can re-generate the entire report by simply clicking a button - all the analyses will be recreated automatically using the corrected datafile. Some people also like being able to view their output in the same window as their R code.
    • Disadvantages: You need to learn R Markdown! Large documents can become quite unwieldy because of the integration of code, output and text.
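
To make the script option concrete, here is a sketch of what a very short script file might contain. The file name and the numbers are made up purely for illustration. Selecting these lines and running them sends the results to the console.

```r
# example_script.R (hypothetical file name)

# Some made-up scores, purely for illustration
scores <- c(12, 15, 9, 20, 14)

# Summarize them
mean_score <- mean(scores)
sd_score <- sd(scores)

# Send the results to the console
print(mean_score)  # 14
print(sd_score)
```

Because the commands live in a file, you can save them, re-run them later and share them, which is exactly what the console alone doesn’t let you do.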

Customizing RStudio

Panes

The RStudio environment is shown in Figure 6. By default there are four panes that show different types of information:

  • Source: This pane is where you’ll view and edit script (or R Markdown) files. It is where you write instructions that tell R what you’d like to do (see previous section).
  • Console: This is where you can type commands to get R to do things and view the output of those ‘things’. However, as mentioned in the previous section, I encourage you to use scripts rather than the console so treat this window as where you view the output of your scripts. (If you decide to work with R Markdown this pane becomes much less important)
  • Workspace: This pane displays information related to your workspace. The Environment tab lists all the data files and objects (e.g., variables, functions etc.) that you have created in your workspace. You can use this tab to load external data files by clicking on Import Dataset. The History tab contains a searchable history of all the code you have typed. You can use this to locate code that you want to re-run, selecting it and clicking To Console (to run it) or To Source (to add it to your current script).
  • Files etc.: This pane has several tabs.
    • The Files tab lists the files and directories in your home directory and allows you to navigate them. Click More to move or copy a selected item, or to set a selected directory as the working directory.
    • The Plots tab displays plots generated by your code. Use the arrow buttons to move forwards and backwards through plots created during your session, and Export to save a plot as an image file or PDF, or to copy it to the clipboard.
    • The Packages tab lists the packages installed in your library, with a tick beside those that are currently loaded.
    • The Help tab displays help files. You can search for a topic by typing a phrase into the search box, or use the ? function in the console (more on that later).
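
Much of what these panes show can also be requested with code, which helps to demystify them. The sketch below uses base R functions only. The object names (x, workspace_objects and so on) are invented for this example.

```r
# Create an object so that the workspace is not empty
x <- 1:10

# Environment tab: the objects in your workspace
workspace_objects <- ls()
workspace_objects

# Files tab: the files in the current working directory
files_here <- list.files()

# Packages tab (the ticked ones): packages that are currently loaded
loaded_packages <- (.packages())
loaded_packages

# Help tab: look up the help file for mean(), like typing ?mean in the console
mean_help <- help("mean")
```

In an interactive session, typing mean_help (or ?mean) displays the help file in the Help tab.
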
Figure 6: The RStudio environment

Options

RStudio options are accessible from the Tools > Options menu (Windows) or RStudio > Preferences (MacOS). This opens the dialogue box in Figure 6. There is a full guide to the preferences here, but here’s a summary.

  • General: You can set aspects of workspace behaviour. For example, you can set a default working directory (the directory where RStudio looks for files) but I’d recommend doing that from within your source code anyway.
  • Code: This section is for changing the behaviour of the source pane (i.e. whether line numbers are displayed, how text is wrapped and so on). The default options are all fine.
  • Appearance: Specify aspects of the visual theme for the console and source editor (e.g., font, colours etc.). Again, the defaults are fine. The main thing to change (if you like) is to select a different theme from the list of ‘Editor themes’. This will apply one of the list of pre-defined themes for RStudio. So, for example, if you prefer to work with light text on a dark background (rather than the default of dark text on a light background) you could select one of the dark themes such as Solarized Dark (Figure 7).
  • Pane Layout: Here you can change the layout of the four panes described in the previous section. You’ll see a grid of four blocks each with a drop down menu allowing you to select one of the four options that we’ve already discussed (Source, Console, Workspace/Environment, Files). By default RStudio arranges these panes as source top-left, console bottom-left, workspace top-right, files bottom-right (Figure 8, left). You might like this configuration but personally I prefer source top-left, console top-right, workspace bottom-left, files bottom-right (Figure 8, right). This configuration works for me because I don’t tend to use the workspace pane very much so I can make the source pane (where I’m doing most of my work) large by dragging the pane divider down in the main RStudio window (see Figure 6) and have the workspace pane as a small strip underneath. Then I can have my ‘output’ panes to the right with the top half showing text output and the bottom half showing plots (see Figure 6 for how I typically arrange my RStudio environment). However, that’s just my preference and you might prefer a different configuration.
  • Packages: Here you can set the default CRAN repository and specify options relevant to developing your own packages (which we won’t do so you can leave these options alone).
  • Spelling: these options relate to spell checking (for example whether you ignore upper case words and words containing numbers). The main thing you might want to change here is the default dictionary which is English (United States). I, for example, would change this to English (United Kingdom) (Figure 9).
  • Sweave, Git/SVN and Publishing: these options relate to things that we won’t do (i.e. use GitHub, publish documents or apps online) and the default options are fine anyway.
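
The General options mention the working directory, which can be set from within your code instead. A minimal sketch of doing that is below. It uses a temporary folder as a stand-in for your own project folder, and old_wd and project_dir are names made up for this example.

```r
# Remember where R is currently looking for files
old_wd <- getwd()

# Stand-in for your own project folder (replace with a real path on your machine)
project_dir <- tempdir()

# Point R at that folder, then check that it worked
setwd(project_dir)
getwd()

# Go back to where we started
setwd(old_wd)
```

Setting the directory at the top of a script keeps that choice with the analysis rather than hidden in the RStudio options.
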
Figure 7: The Solarized Dark theme

Figure 8: Customizing the panes

Figure 9: Spelling options

Other resources

Statistics

  • The tutorials typically follow examples described in detail in A. P. Field (2016), so for most of them there’s a thorough account in there. You might also find A. P. Field, Miles, and Field (2012) useful for the R stuff.
  • There are free lectures and screencasts on my YouTube channel
  • There are free statistical resources on my website www.discoveringstatistics.com

R

References

Field, Andy P. 2016. An Adventure in Statistics: The Reality Enigma. London: Sage.

Field, Andy P., Jeremy N. V. Miles, and Zoe C. Field. 2012. Discovering Statistics Using R: And Sex and Drugs and Rock ’N’ Roll. London: Sage.

Copyright © 2000-2018, Professor Andy Field.