The goal of sessioncheck is to provide a simple tool that can be called at the top of a script, and produce warnings or errors if it detects signs that the script is not being executed in a clean R session:

# include this as the first line of a script
# as a safer alternative to using rm(list=ls())
sessioncheck::sessioncheck()

Who is sessioncheck for?

The intended user for sessioncheck is a beginner or intermediate level R user who wants to take reasonable precautions to ensure that their analysis scripts execute reproducibly, but is not looking for a full-featured solution that might require substantial time investment to learn and deploy.

Why is sessioncheck useful?

A common practice when writing R scripts is to include a snippet of code like rm(list = ls()) at the top of the script. People do this for reproducibility: the aim is to ensure that the script runs in the context of a “clean” R session.

Unfortunately, while the goal is a good one, the solution is not.

The problem with the “traditional” approach is that the only thing it does is remove objects from the global environment. If your goal is to ensure that the R session is clean, this isn’t sufficient: the state of an R session is defined by many different things (attached packages, attached environments, and global options, to name a few), and the objects in the global environment form a very small part of that state. Yes, using rm() to clear the global environment will “clean” this specific aspect of the R session state, but it has no effect on any of the other things. What’s worse, the rm() approach can create false confidence: if users rely on rm() as an “automated” method for cleaning the session state, they may end up executing scripts in a profoundly irreproducible way, never noticing that something bad has happened. This is, to put it mildly, not ideal.
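To make the limitation concrete, here is a minimal sketch using only base R. It creates three kinds of session state, runs the traditional cleaning step, and then checks what is left. The names leftover_value, helper and leftover_env are invented for this illustration and have nothing to do with sessioncheck itself.

```r
# Create three kinds of session state
leftover_value <- 1                               # an object in the current environment
options(digits = 3)                               # a changed global option
attach(list(helper = 42), name = "leftover_env")  # an extra entry on the search path

# The traditional "cleaning" step
rm(list = ls())

# What is still there afterwards?
object_survived <- exists("leftover_value", inherits = FALSE)  # FALSE
option_survived <- getOption("digits") == 3                    # TRUE
attach_survived <- "leftover_env" %in% search()                # TRUE

print(c(object = object_survived,
        option = option_survived,
        search_path = attach_survived))

# Tidy up after the demonstration (7 is R's default for "digits")
options(digits = 7)
detach("leftover_env", character.only = TRUE)
```

Only the first piece of state is removed by rm(); the changed option and the attached environment both persist.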

Because of this, a better practice is to restart the R session immediately before running the script. By running the script in a fresh R session, you’re much less likely to encounter these issues. By extension, the reason for including a call to sessioncheck() at the top of a script is not to try to clean the R session (which is very hard to automate). Instead, what it does is prompt the user to take appropriate action if potential issues are detected. For additional background, see the article on why session checking is useful.

What does sessioncheck do?

The main function in sessioncheck is sessioncheck(), which examines the state of the R session and informs the user if potential issues are detected. The behavior of sessioncheck() is customizable, allowing the user to make decisions about what criteria should be used to decide if an R session is “dirty”.

For the purposes of this article we will stick to the default checks. The simplest of these examines the contents of the global environment, very much in line with the traditional method of inserting rm(list=ls()) into the top of a script. At the moment there is nothing in the global environment associated with this document, so it is considered “clean”. When sessioncheck() is called in a clean state, no message is printed:

sessioncheck::sessioncheck()

By default, sessioncheck() adheres to the R convention that variables starting with a period are hidden variables, and does not report any issues if the session contains a variable like .Random.seed or .Last.value. This can be customized, but for the purposes of this article we’ll just look at the default behavior:

visible_1 <- "this will get detected"
visible_2 <- "so will this"
.hidden_1 <- "but this will not"

sessioncheck::sessioncheck()
#> Warning: Session check results:
#> - Objects in global environment: visible_1, visible_2
#> - Attached packages: [no issues]
#> - Attached environments: [no issues]

The first item in this output indicates that sessioncheck() has detected visible_1 and visible_2 in the global environment, and a warning is issued to suggest that the R session may be contaminated. This can be upgraded to an error if so desired, to ensure that the script will refuse to run if the R session is not deemed to be clean:

sessioncheck::sessioncheck(action = "error")
#> Error:
#> ! Session check results:
#> - Objects in global environment: visible_1, visible_2
#> - Attached packages: [no issues]
#> - Attached environments: [no issues]
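
The hidden-variable convention described earlier is the same one that base R's ls() follows. The sketch below shows that convention in a throwaway environment (demo_env is a name chosen here for illustration). It demonstrates the convention only, and is not a description of how sessioncheck() is implemented internally.

```r
# A scratch environment, so the global environment is left untouched
demo_env <- new.env()
assign("visible_1", "listed by default", envir = demo_env)
assign(".hidden_1", "skipped by default", envir = demo_env)

# By default, names starting with a period are not listed
visible_names <- ls(demo_env)

# all.names = TRUE includes the hidden names as well
all_names <- ls(demo_env, all.names = TRUE)

print(visible_names)
print(all_names)
```

Here visible_names contains only "visible_1", while all_names contains both names.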

By default, sessioncheck() runs three checks, and reports the results if any of the checks do not pass. The first one is the global environment check discussed above. The second one checks for packages that have been attached to the search path, usually via library() or require(). The third one checks for other environments that may have been attached, perhaps by inadvertently calling the attach() function. This is illustrated in the following example:

require(knitr) # non-base packages are detected
#> Loading required package: knitr
require(stats) # base R packages are ignored
attach(iris)   # attached data frames are detected

sessioncheck::sessioncheck()
#> Warning: Session check results:
#> - Objects in global environment: visible_1, visible_2
#> - Attached packages: knitr
#> - Attached environments: iris
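
For readers curious about where this kind of information comes from, the following is a rough base R sketch that inspects the search path directly. It is an approximation written for this article, not the package's own code: the helper inspect_search_path() is hypothetical, and treating "base" plus getOption("defaultPackages") as the set of packages to ignore is an assumption about what counts as a base R package.

```r
# Hypothetical helper: split the search path into non-default packages
# and other attached environments
inspect_search_path <- function(
    ignore_pkgs = c("base", getOption("defaultPackages"))  # assumed "base R" set
) {
  path <- search()
  is_pkg <- startsWith(path, "package:")

  # Strip the "package:" prefix to recover package names
  pkgs <- sub("^package:", "", path[is_pkg])

  # Everything else, apart from entries present in every session
  envs <- setdiff(path[!is_pkg], c(".GlobalEnv", "Autoloads"))

  list(
    packages = setdiff(pkgs, ignore_pkgs),
    environments = envs
  )
}

# Usage: attach a small list, inspect, then detach again
attach(list(a = 1), name = "demo_env")
result <- inspect_search_path()
detach("demo_env", character.only = TRUE)

print(result)
```

In this sketch the attached list shows up under environments, and any package attached beyond the default set would show up under packages.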

To an experienced R user it will likely be obvious that these three checks are not sufficient to guarantee that the R session is clean (and indeed this is the reason why the behavior of sessioncheck() can be customized). However, they cover more than rm(list=ls()) does. Moreover, most cases in which a script is executed in a dirty R session arise because the user previously executed code that loads packages or creates variables in the global environment, so the default checks tend to work fairly well in practice.

Further reading

For more information about the logic behind sessioncheck and how its behavior can be modified, see the following articles: