When creating web-based software products, authentication and authorization are essential for controlling who has access to your infrastructure.
There are two aspects to that control - who can access the system, and what they can do when they get access.
We call the first aspect Authentication. This is most commonly represented by a login screen, with a request for a username and password. Increasingly there are more complex systems that include a second factor, but in essence, we are asking "do we recognize this user, and do they have enough information to prove who they are?" Another name sometimes used for this process is Host Based Access Control, or HBAC.
The second aspect is called Authorization. A user will generally only be made aware of this aspect when they hit their head against it - an error message saying "you can't go there", or "you can't perform that action":
<username> is not in the sudoers file. This incident will be reported.
Another name for this process is Role Based Access Control, or RBAC - once a user has been Authenticated, what are they allowed to do?
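The split between the two aspects can be sketched in a few lines of Python. This is only an illustration: the user store, the role names and the permission table are all hypothetical, and a real deployment would delegate both steps to PAM, LDAP or an SSO provider rather than to in-memory dictionaries.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    password_hash: str
    roles: set = field(default_factory=set)


def hash_password(password: str) -> str:
    # Illustration only: a real system would use a salted, slow hash.
    return hashlib.sha256(password.encode()).hexdigest()


# Hypothetical user store and role-to-action table.
USERS = {"alice": User("alice", hash_password("s3cret"), {"analyst"})}
PERMISSIONS = {"read": {"analyst", "admin"}, "publish": {"admin"}}


def authenticate(name: str, password: str):
    """Authentication: do we recognize this user, and can they prove who they are?"""
    user = USERS.get(name)
    if user and hmac.compare_digest(user.password_hash, hash_password(password)):
        return user
    return None


def authorize(user: User, action: str) -> bool:
    """Authorization: is this already-authenticated user allowed to do this?"""
    return bool(user.roles & PERMISSIONS.get(action, set()))
```

Note that `authorize` is only ever called with a user object that `authenticate` has already returned, which mirrors the order described above: first prove who you are, then find out what you may do.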
There are four RStudio products that need authentication and authorization. Two of these are https://shinyapps.io and https://rstudio.cloud. I won't be addressing either of these products since, from a user perspective, there is little configuration required. The other two, RStudio Workbench and RStudio Connect, are covered below.
RStudio Workbench
Workbench requires both authentication and authorization.
The nature of the product is that it provides an easy-to-use IDE that is backed by a filesystem - usually a user's home directory, sometimes a shared project directory. Users can log in - this is authentication. Users can also save files, have other files automatically built using packages like knitr, or write images or new data sets to disk - these actions require authorization. Users can't write files just anywhere - they can only write where they have been given permission to do so.
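As a rough illustration of that last point, the operating system itself decides where a session may write. The sketch below, which assumes a Linux host and uses a made-up path for the negative case, asks whether the current OS user could create files in a given directory.

```python
import os
import tempfile


def can_write(directory: str) -> bool:
    """Return True if the current OS user may create files in `directory`."""
    # Creating a file needs both write and search (execute) permission on the directory.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


with tempfile.TemporaryDirectory() as scratch:
    print(can_write(scratch))  # a directory this user has just created

print(can_write("/nonexistent/project"))  # hypothetical path that does not exist
```

A check like this is advisory only; the actual enforcement happens when the write is attempted, using the permissions of the user the session runs as.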
Authentication options
The options for authentication in Workbench:
| Login System | Description | Authentication links |
| --- | --- | --- |
| PAM, Pluggable Authentication Modules | PAM comes by default with most Linux OSes, and all the OSes supported by RStudio. | PAM, default |
| LDAP, Lightweight Directory Access Protocol | LDAP is available for all Linux OSes, and all the OSes supported by RStudio. | LDAP |
| AD, Active Directory | The Microsoft implementation of LDAP with added bells and whistles. | LDAP |
| SAML, Security Assertion Markup Language | Multiple providers - Okta, AzureAD, OneLogin, JustConnect, generics. | Single Sign On, aka SSO |
| OpenID | Multiple providers - Okta, AzureAD, generics. | Single Sign On, aka SSO |
There are two points that are implied in this table, but not explicit.
The first is that PAM, being native to Linux, will automatically create a home directory for a user that has authenticated successfully but doesn't have a home directory yet.
The second is that LDAP and SSO both require an extra process to provide Authorization, or RBAC. This is the step in SSO configuration that is most frequently missed.
LDAP and SSO on their own, as built into RStudio Workbench, will only Authenticate users. But systems administrators also need those users to have a home directory, created if it doesn't already exist, with the relevant permissions set on it. This requires the extra step of joining the Workbench server to the domain in question.
- For configuring RHEL equivalent OSes to use SSSD we have a support article here:
https://support.rstudio.com/hc/en-us/articles/360016587973-Integrating-RStudio-Workbench-RStudio-Server-Pro-with-Active-Directory-using-CentOS-RHEL
- For configuring Ubuntu equivalent OSes to use SSSD we have a support article here:
https://support.rstudio.com/hc/en-us/articles/360024137174-Integrating-Ubuntu-with-Active-Directory-for-RStudio-Workbench-RStudio-Server-Pro
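Once the server has been joined to the domain, a quick sanity check is to ask the operating system whether it can resolve a given user and whether that user's home directory exists. The following is a minimal sketch using only the Python standard library on Linux; the function name and the three status strings are my own, not part of any RStudio tooling.

```python
import os
import pwd


def home_directory_status(username, lookup=pwd.getpwnam, exists=os.path.isdir):
    """Classify a user as 'unknown-user', 'missing-home' or 'ok'."""
    try:
        entry = lookup(username)  # goes through the system name service
    except KeyError:
        # The OS cannot resolve the user at all, for example because the
        # server has not been joined to the domain.
        return "unknown-user"
    return "ok" if exists(entry.pw_dir) else "missing-home"
```

A result of `unknown-user` for an account that can log in through SSO suggests the domain join is incomplete, while `missing-home` points at home directory creation, which the PAM Sessions configuration is meant to handle. The `lookup` and `exists` parameters exist only so the function can be exercised without real accounts.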
Authorization options
PAM Sessions is the only way to configure authorization with RStudio Workbench. As mentioned above, this is the configuration step that is most frequently missed. Unless you have disabled PAM Sessions - which is so rare as to be unseen - you will need to configure PAM Sessions. Newer versions of Workbench have automated this step.
There is a single source of documentation, and you can find it here:
https://docs.rstudio.com/ide/server-pro/r_sessions/pam_sessions.html
There is one further technology that is adjacent to PAM Sessions - occasionally an admin will want the Authentication process to flow on to a third party invisibly to the end-user. The most frequent request is for access to a Microsoft database using the user's Active Directory credentials. In these cases and others like them, we use Kerberos tickets. These are also managed by PAM and SSSD but require extra configuration:
https://docs.rstudio.com/ide/server-pro/r_sessions/kerberos.html
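When troubleshooting this setup, it can help to confirm from inside a session that a ticket was actually issued. The sketch below assumes the standard Kerberos client tool `klist` is installed and on the PATH; its `-s` flag runs silently and reports through the exit status whether a valid credentials cache is present. The function name is my own.

```python
import shutil
import subprocess


def has_valid_kerberos_ticket(klist: str = "klist") -> bool:
    """Return True if `klist -s` reports a valid credentials cache."""
    if shutil.which(klist) is None:
        return False  # Kerberos client tools are not installed
    # `klist -s` prints nothing; exit status 0 means a usable ticket cache.
    return subprocess.run([klist, "-s"], check=False).returncode == 0
```

If this returns False inside a freshly started session, the PAM and SSSD configuration is the first place to look.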
Workbench and SAML, the top-level documentation:
https://docs.rstudio.com/rsw/configuration/authentication/saml/
Authenticating Users Overview:
https://docs.rstudio.com/ide/server-pro/authenticating_users/authenticating_users.html
Workbench and PAM, the admin documentation:
https://docs.rstudio.com/ide/server-pro/authenticating_users/pam_authentication.html
Workbench and SAML, the admin documentation:
https://docs.rstudio.com/ide/server-pro/latest/authenticating_users/saml_sso.html
Workbench and OpenID, the admin documentation:
https://docs.rstudio.com/ide/server-pro/authenticating_users/openid_connect_authentication.html
RStudio Connect
RStudio Connect has a simpler structure by default - most installations just need Authentication. In those cases, uploaded applications run as the unprivileged local user rstudio-connect in a sandbox on the server.
Of course, there are options in Connect as well. Most notably, the unprivileged user that runs the apps can be changed with a few adjustments - this is the Run As functionality.
https://docs.rstudio.com/connect/admin/process-management/#runas
The next most frequent request is to Run As Current User - examples of this include when an audit trail is required, or if each user has different permissions on a database. This has the hard requirement that PAM is used as the Authentication system.
https://docs.rstudio.com/connect/admin/process-management/#runas-current
Run As Current User - no further Authentication is required
By default, the PAM su service is used. If you need to add more functionality, the PAM service can be customised to your needs. In this situation, the important change to configuration is the use of the directive PAM.SessionService:
https://docs.rstudio.com/connect/admin/process-management/#pam-sessions
Run As Current User - further Authentication is required
In the case that Kerberos tickets are required - for example, when per-user database access is needed - there is an example of a PAM session service available. In this situation, the important change to configuration is the use of the directive PAM.AuthenticatedSessionService:
https://docs.rstudio.com/connect/admin/process-management/#pam-credential-caching-kerberos
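To see which of the two directives a given configuration actually sets, something like the following sketch can be used. It assumes the Connect configuration is simple enough for Python's `configparser` to read (a `[PAM]` section with `Key = value` lines), takes the `su` default from the description above, and uses a made-up service name in the sample. Treat it as an illustration rather than a validated parser for the real file format.

```python
import configparser


def pam_session_services(config_text: str) -> dict:
    """Report the PAM session services named in a Connect-style config."""
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(config_text)
    pam = parser["PAM"] if parser.has_section("PAM") else {}
    return {
        # Falls back to the default "su" service when nothing is configured.
        "session": pam.get("SessionService", "su"),
        # None means no authenticated (for example Kerberos) session service is set.
        "authenticated_session": pam.get("AuthenticatedSessionService"),
    }


# Hypothetical configuration fragment; the service name is invented.
SAMPLE = """
[PAM]
AuthenticatedSessionService = rstudio-connect-session
"""

print(pam_session_services(SAMPLE))
```

The two keys map directly onto the two cases above: `session` for Run As Current User without further Authentication, and `authenticated_session` for the case where further Authentication, such as a Kerberos ticket, is required.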