Using RStudio Workbench / RStudio Server with Microsoft R Server for Cloudera

When you load the RevoScaleR package in Microsoft R Server, certain environment variables must be set for it to work with your Cloudera cluster. You can set these variables at startup by including them in the rsession-profile file of RStudio Workbench (previously RStudio Server Pro).

These instructions assume that the MRS and MRO directories were installed on the Cloudera cluster using the parcels provided by Microsoft, and that RStudio Workbench is installed on one of the cluster's nodes.

The Revo64 command sets the environment variables that RevoScaleR needs in order to work. Set these same environment variables in your rsession-profile by following the steps below.

Steps

1. Start Microsoft R Server from the command line if you have not done so before:

sudo /mrs/bin/Revo64

2. Copy the .RevoHadoopEnvVars.site file from your home directory to the rsession-profile:

sudo cp $HOME/.RevoHadoopEnvVars.site /etc/rstudio/rsession-profile
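The copy in step 2 overwrites any existing rsession-profile. A more cautious variant could be sketched as below. The function name install_profile and the .bak suffix are illustrative assumptions, not part of RStudio Workbench or Microsoft R Server.

```shell
#!/bin/sh
# Hypothetical helper: copy the environment file into place,
# keeping a backup of any profile that is already there.
install_profile() {
    src="$1"    # the .RevoHadoopEnvVars.site file from step 2
    dest="$2"   # the rsession-profile path from step 2

    # Refuse to continue if the source file does not exist yet (see step 1).
    if [ ! -f "$src" ]; then
        echo "missing: $src" >&2
        return 1
    fi

    # Keep the previous profile so the change can be undone.
    if [ -f "$dest" ]; then
        cp "$dest" "$dest.bak" || return 1
    fi

    cp "$src" "$dest"
}
```

The function would be called with the two paths from step 2, from a shell that has write access to /etc/rstudio.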

3. Insert the following lines at the top of the /etc/rstudio/rsession-profile file:

export SCRIPT_DIR=/opt/cloudera/parcels/MRS/bin/
export MRS_PARCEL_PATH=/opt/cloudera/parcels/MRS
export MRO_PARCEL_PATH=/opt/cloudera/parcels/MRO
export MRS_PARCEL_MKL=/opt/cloudera/parcels/MRS/lib64/R/lib
export R_LIBS=/opt/cloudera/parcels/MRS/lib64/R/library
export REVOLIBS=/opt/cloudera/parcels/MRO/lib64/R/library/RevoRsrConnector/rxLibs
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MRS_PARCEL_MKL}:${JAVA_HOME}

4. Create or update the /etc/rstudio/r-versions file with the following lines:

/opt/cloudera/parcels/MRO/lib64/R
/opt/cloudera/parcels/MRS/lib64/R
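Before restarting, it can help to confirm that each directory listed in r-versions really contains an R installation. The sketch below assumes that each listed directory holds an executable at bin/R; that layout and the function name check_r_versions are assumptions, so adjust them if your parcels are laid out differently.

```shell
#!/bin/sh
# Hypothetical helper: report listed directories that lack bin/R.
check_r_versions() {
    versions_file="$1"
    status=0

    # Read one path per line; the extra test handles a missing final newline.
    while IFS= read -r dir || [ -n "$dir" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$dir" in
            ''|'#'*) continue ;;
        esac

        if [ -x "$dir/bin/R" ]; then
            echo "ok: $dir"
        else
            echo "no R executable under: $dir" >&2
            status=1
        fi
    done < "$versions_file"

    return "$status"
}
```

Called with the path of the r-versions file from step 4, the function returns a non-zero status if any listed directory fails the check.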

5. Restart RStudio Workbench:

sudo service rstudio-server restart
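If RevoScaleR still fails to load after the restart, one quick check is whether the profile exports every variable named in step 3. The sketch below only looks for the export lines. It does not validate the values, and the function name check_profile_exports is an illustrative assumption.

```shell
#!/bin/sh
# Hypothetical helper: confirm the profile exports each step 3 variable.
check_profile_exports() {
    profile="$1"
    missing=0

    for name in SCRIPT_DIR MRS_PARCEL_PATH MRO_PARCEL_PATH \
                MRS_PARCEL_MKL R_LIBS REVOLIBS LD_LIBRARY_PATH; do
        # Look for a line of the form "export NAME=" at the start of a line.
        if ! grep -q "^export ${name}=" "$profile"; then
            echo "not exported in $profile: $name" >&2
            missing=1
        fi
    done

    return "$missing"
}
```

Called with the rsession-profile path, the function prints the name of each missing variable and returns a non-zero status if any are absent.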
