Filling up the home directory with RStudio Workbench / RStudio Server


Problem

Your home directory can fill up when large amounts of session or project data are written to disk. The home directory is either part of your server’s file system or mounted onto it from a network file system. It is essential for managing user configuration files and maintaining a consistent experience across the network. When home directories fill up, the system ceases to function properly.

Causes

RStudio Workbench and RStudio Server use the home directory as the default location for configuration files and project files. There are typically two causes of a full home directory. The first is users writing large project files to home. The second is large sessions timing out, at which point RStudio Workbench or RStudio Server automatically writes them to home.

  • Project files written to home. Most analytic projects require temporary scratch space and ongoing project space. One feature of R is saving your workspace as an .RData file so that you can restore it at a later date. RStudio Server gives you the option to set the default (always/never/ask) for saving and restoring your workspace. By default, new projects are started in home, so .RData files are often written to home.
  • Suspended sessions. RStudio Server saves suspended sessions by default. When a session exceeds the timeout, it is automatically written to the RStudio user state directory.
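
To see how much of the first cause applies on a given server, an administrator could scan a home directory for large .RData files. The following is a minimal sketch, not an RStudio tool: the function name `find_large_rdata` and the 100 MiB default threshold are illustrative assumptions.

```python
import os
from pathlib import Path


def find_large_rdata(root, min_bytes=100 * 1024 * 1024):
    """Return (size_in_bytes, path) pairs for .RData files of at least
    min_bytes under root, largest first. The threshold is an assumption."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Matches both the bare ".RData" workspace file and "name.RData".
            if name.endswith(".RData"):
                path = Path(dirpath) / name
                try:
                    size = path.stat().st_size
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                if size >= min_bytes:
                    hits.append((size, str(path)))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    for size, path in find_large_rdata(Path.home()):
        print(f"{size / 1024**2:8.1f} MiB  {path}")
```

Run as a user, it scans that user's own home directory; an administrator could pass any other root instead.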

The RStudio user state directory

All of the information pertaining to the configuration and management for a specific user is contained in RStudio's user state directory. This directory contains a lot of useful information, such as your open tabs, your code history, etc. It also contains all your session information including the number of sessions and active state.

Most importantly for the purposes of this discussion, however, it also includes suspended data for each session. RStudio Server automatically suspends sessions when they have been inactive for a while, which reduces memory pressure. Suspending a session involves writing all of the data in the Global Environment to disk and then quitting R. When the session is later resumed, R is started up and the data is read back into the Global Environment.

The RStudio user state directory is ~/.local/share/rstudio in RStudio 1.4 and later, and ~/.rstudio in older versions.
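
To check how much space the user state directory takes up, you could total the file sizes under both candidate locations. This is a rough sketch that assumes the default paths named above; the helper names `dir_size_bytes` and `state_dir_usage` are hypothetical.

```python
import os
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all files below path, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file disappeared or is unreadable; ignore it
    return total


def state_dir_usage(home):
    """Map each existing default state directory under home to its size in bytes."""
    candidates = [
        Path(home) / ".local" / "share" / "rstudio",  # RStudio 1.4 and later
        Path(home) / ".rstudio",                      # older versions
    ]
    return {str(p): dir_size_bytes(p) for p in candidates if p.is_dir()}


if __name__ == "__main__":
    for path, size in state_dir_usage(Path.home()).items():
        print(f"{size / 1024**2:8.1f} MiB  {path}")
```

If the state directory has been moved with the XDG environment variables, the default paths checked here will not reflect actual usage.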


Recommended solutions

We recommend minimizing the amount of data written to home and maximizing the amount of home space available.

Turn off the session timeout

You can turn off the session timeout by setting session-timeout-minutes to zero minutes in the /etc/rstudio/rsession.conf file.

session-timeout-minutes=0

Turning off the session timeout prevents RStudio Server from automatically writing session data to the home directory. If you are dealing with large amounts of data or a large number of sessions, this could save a lot of space in your home directory. The trade-off is that idle sessions stay in memory, because suspension is what relieves memory pressure. The session timeout can optionally be set at the user or group level by adding session-timeout-minutes to the /etc/rstudio/profiles file.

Automatically delete unused sessions

As an administrator, you can have sessions suspended to disk automatically after a period of inactivity by specifying the session-timeout-minutes option in /etc/rstudio/rsession.conf. RStudio Workbench (previously RStudio Server Pro) can also kill and delete these sessions entirely after a set number of hours, freeing up valuable system resources. To enable this, add the following line to /etc/rstudio/rsession.conf.

session-timeout-kill-hours=96

This setting kills and deletes any sessions that have been inactive for the specified number of hours. Set a long timeout period so that only sessions users have forgotten about or no longer need are deleted, because a deleted session’s data is lost permanently. For more information, see the RStudio Workbench Administration Guide.

Turn off the default save action

By default, R asks whether to save the workspace to a file when you quit. Admins can turn off the default save action for projects in the /etc/rstudio/rsession.conf file.

session-save-action-default=no

Start new projects outside of home

You may want to start new projects in a location outside of home. Users can do this by specifying the project path in the new project dialog box. Server administrators can set a default location for new projects in the /etc/rstudio/rsession.conf file.

session-default-new-project-dir=/mnt/rprojects 
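
To confirm which of the rsession.conf options discussed so far are actually set on a server, a small script can read the file and report them. The sketch below assumes the file consists of plain key=value lines with optional # comments; the function names and the `HOME_RELATED_KEYS` list are illustrative, not part of RStudio.

```python
# Options from this article that affect how much data lands in home.
HOME_RELATED_KEYS = (
    "session-timeout-minutes",
    "session-timeout-kill-hours",
    "session-save-action-default",
    "session-default-new-project-dir",
)


def parse_rsession_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def home_related_settings(text):
    """Return each option's value, or None when the option is not set."""
    parsed = parse_rsession_conf(text)
    return {key: parsed.get(key) for key in HOME_RELATED_KEYS}


if __name__ == "__main__":
    with open("/etc/rstudio/rsession.conf") as f:
        for key, value in home_related_settings(f.read()).items():
            print(f"{key} = {value if value is not None else '(not set)'}")
```

An option reported as not set means the server is using RStudio's built-in default for it.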

Use the system library [optional]

You might consider putting commonly used packages into the system library to dissuade users from duplicating package installations in home. However, this approach means additional work for admins: updated packages must be reinstalled into the system library every time a new version of R is installed on the server. The extra administrative effort might not justify the amount of space saved.

Increase the available space in home

RStudio Workbench (previously RStudio Server Pro) relies heavily on the home directory for its regular operation. If there are relatively few R users, giving these users a little more space can make a large difference. Getting more space on home is the easiest way to ensure a good user and admin experience on RStudio Workbench.

Change the RStudio user state directory

The default RStudio user state directory is ~/.local/share/rstudio (or ~/.rstudio in versions 1.3 and earlier). Starting with RStudio 1.4, the directory follows the XDG Base Directory standard and can be customized using the standard XDG environment variables. Instructions are in the RStudio administration guide, under "User State Storage".

https://docs.rstudio.com/ide/server-pro/r-sessions.html#user-state-storage

You can use this technique to store RStudio state information, including suspended session data, outside user home directories. 
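
As an illustration of how XDG-style resolution works, the sketch below computes where the state directory would land for a given environment. It assumes that RStudio appends `rstudio` to the XDG data home, which is inferred from the default path above; the function name `rstudio_state_dir` is hypothetical, and any RStudio-specific override variables described in the administration guide are not modeled here.

```python
import os
from pathlib import Path


def rstudio_state_dir(env=None, home=None):
    """Resolve the expected state directory following XDG Base Directory rules.

    Assumption: the directory is "<XDG data home>/rstudio".
    """
    env = os.environ if env is None else env
    home = Path(home) if home is not None else Path.home()
    base = env.get("XDG_DATA_HOME", "")
    # XDG rules: an unset, empty, or relative XDG_DATA_HOME is ignored
    # and the default ~/.local/share is used instead.
    if base and os.path.isabs(base):
        data_home = Path(base)
    else:
        data_home = home / ".local" / "share"
    return data_home / "rstudio"


if __name__ == "__main__":
    print(rstudio_state_dir())
```

For example, with XDG_DATA_HOME set to a path on a larger volume, the function returns a location outside the user's home directory, which is the effect this section aims for.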

 


Other solutions

Using local home

Alternatively, you can use the local home on the server itself (not mounted). There is typically plenty of space on local home for user files. The downside is that this approach doesn't support load balancing or high availability. It also means that user profiles are unique to the individual server. This approach is really only useful for a single server, and even then the user experience is compromised, so we don't recommend this setup.
