Using Python with the RStudio IDE

Follow

RStudio 1.4 brings improved support for the Python programming language to the RStudio IDE. This article will explore how Python can be used together with R and RStudio.

Getting Started

RStudio uses the reticulate R package to interface with Python, and so RStudio's Python integration requires:

  1. An installation of Python (2.7 or newer; 3.5 or newer preferred), and
  2. The reticulate R package (1.20 or newer, as available from CRAN)

Installing Python

First, Python needs to be installed on your machine. If you haven't already installed Python, it can be installed in a couple of ways:

  1. (Recommended) Using reticulate::install_miniconda(), to install a Miniconda distribution of Python using the reticulate package,
  2. (Windows) Install Python via the official Python binaries available from https://www.python.org/downloads/windows/,
  3. (macOS) Install Python via the official Python binaries available from https://www.python.org/downloads/mac-osx/,
  4. (Linux) Install Python either from sources, or via the version of Python provided by your operating system's package manager. See https://docs.python.org/3/using/unix.html for more details. If you are installing Python from sources yourself, prefer installing it into a location like /opt/python/<version>, so that RStudio and reticulate can more easily discover it.

Note that if you are installing Python from sources, it must be configured with --enable-shared; e.g.

./configure --enable-shared
make
make install

reticulate will be unable to bind to a version of Python without an accompanying shared library available.

Installing reticulate

The reticulate package can be installed from CRAN, with:

install.packages("reticulate")

Note that if you are installing reticulate from sources, you will need the requisite build tools available -- see https://support.rstudio.com/hc/en-us/articles/200486498-Package-Development-Prerequisites for more details.

Using Python

The reticulate package makes it possible to load and use Python within the currently running R session. After reticulate has been installed, Python scripts (with the extension .py) can be opened, and the code within executed, similarly to R.

Screen_Shot_2021-04-21_at_3.42.51_PM.png

Note that RStudio uses the reticulate Python REPL to execute code, and automatically switches between R and Python as appropriate for the script being executed.

When the reticulate REPL is active, objects in the R session can be accessed via the r helper object. For example, r["mtcars"] can be used to access the mtcars dataset from R, and convert that to a pandas DataFrame (if available), or a Python dictionary if not.

Screen_Shot_2021-04-21_at_3.41.01_PM.png

Selecting a Default Version of Python

It is possible to configure a default version of Python to be used with RStudio via Tools -> Global Options... -> Python:

Screen_Shot_2021-04-21_at_3.33.02_PM.png

The "Python interpreter:" text box shows the default Python interpreter to be used (if any). If you already know the location of the Python interpreter you wish to use, you can type the location of the interpreter into that text box.

Otherwise, Python interpreters can be discovered on the system by clicking the "Select..." button:

Screen_Shot_2021-04-21_at_3.34.00_PM.png

RStudio will search for Python interpreters via a few different methods:

  • On the PATH;
  • For virtual environments, located within the ~/.virtualenvs folder;
  • For Conda environments, as reported by conda --list,
  • For pyenv Python installations located in ~/.pyenv,
  • For Python installations located in /opt/python.

The RETICULATE_PYTHON environment variable can also be used to configure the default version of Python to be used by reticulate. If set, that environment variable will take precedence over any value configured via RStudio options.

Custom Python Locations

If you have Python installed into a custom location, and you'd like for that location to be visible to RStudio, you can set the following R option:

options(rstudio.python.installationPath = "/path/to/python")

This can be set within an appropriate R startup file; for example, your R installations etc/Rprofile.site.

Environment Pane

When the reticulate Python REPL is active, the RStudio Environment pane will update to show any available Python objects. For example:

Screen_Shot_2021-05-12_at_12.44.13_PM.png

The Environment pane will automatically update to show objects as they are added and removed.

Note that mutated objects may not update in the Environment pane until the Environment pane is explicitly refreshed.

Learning More

For more detailed documentation around how RStudio, R, and reticulate can be used together, please consult the reticulate documentation online at https://rstudio.github.io/reticulate/.

Comments