Running Multiple RStudio Products on the Same Machine

It can be appealing to host multiple RStudio products on a single server. Doing so reduces the number of servers you need to manage, and it may seem to make better use of compute capacity, especially in periods of low use when CPU and memory are almost idle. However, we usually recommend against it. Our best-practice architecture gives each product its own server. This microservice-style separation lets each product run as effectively as possible and scale with your organization's future growth.

 

The layers of this architecture include:

  • RStudio Workbench
  • RStudio Connect
  • RStudio Package Manager
  • Proxy
  • Database
  • Shared Storage

 

More information on best practice architectures can be found here:

https://solutions.rstudio.com/sys-admin/architectures/

 

Why should I separate each product?

Scalability

It is difficult to scale a host that runs multiple applications. The server needs a separate port configured for each service so that each one is distinguishable and accessible, which requires additional administration work. Also, if a particular application gains users over time, you will eventually need to vertically scale the host (increase its compute capacity) to meet demand. Vertical scaling is typically more costly and less preferred than horizontal scaling (adding more, smaller nodes and distributing load between them), which is the current best practice for server scaling.
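To illustrate the port administration point, here is a minimal Python sketch that flags services requesting the same port on one host. The service names and port numbers are hypothetical and are not taken from any RStudio configuration; the helper find_port_conflicts is ours, for illustration only.

```python
from collections import defaultdict


def find_port_conflicts(services):
    """Return {port: [service names]} for every port requested by more than one service."""
    by_port = defaultdict(list)
    for name, port in services:
        by_port[port].append(name)
    return {port: names for port, names in by_port.items() if len(names) > 1}


# Hypothetical single-host layout: every product wants the standard HTTPS port.
shared_host = [("workbench", 443), ("connect", 443), ("package-manager", 443)]

# Only one process can listen on a given port of a given address, so each
# conflict reported here means a non-standard port or a proxy rule to maintain.
print(find_port_conflicts(shared_host))
```

With one product per server, each list contains a single entry and the function returns an empty dictionary, so no extra port or proxy configuration is needed.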

 

Compute Resources

In periods of high utilization, the applications compete for the server's compute resources, which can push the server past its capacity. Say you have an NFS share, RStudio Workbench, and RStudio Connect on the same server. If RStudio Workbench is being used by many users and the server is nearing its full compute capacity, your NFS share users and your RStudio Connect users' sessions will also be affected by the lack of resources. The server will start to slow down, and users of the other applications may be kicked out as the server struggles to cope with demand.

 

Hosting multiple products on a single server also inherently raises the server's baseline (idle) CPU and memory usage, which can contribute to resource over-utilization.
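The contention described above comes down to simple arithmetic. The following Python sketch assumes purely hypothetical peak core counts for each service and a hypothetical 16-core host; none of these figures are measurements or sizing guidance.

```python
def cpu_shortfall(host_cores, peak_demand):
    """Cores requested beyond what the host has when all peaks coincide (0 if none)."""
    return max(0, sum(peak_demand.values()) - host_cores)


# Hypothetical peak demand in cores; illustrative numbers only.
peak_demand = {"workbench": 12, "connect": 6, "nfs": 2}

# All three services on one 16-core host: their peaks add up.
shared = cpu_shortfall(16, peak_demand)

# Each service alone on a host of the same size: no peak exceeds the host.
separate = {name: cpu_shortfall(16, {name: cores}) for name, cores in peak_demand.items()}

print(shared)
print(separate)
```

Under these assumed numbers the shared host is short by 4 cores at peak, while no service runs short on its own host. In practice you would size each dedicated server to its product's own peak rather than reuse one size for all.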

 

Single Point of Failure

This architecture introduces a single point of failure for your environment. If the server fails, whether from a fault of its own or because dependent processes clash, users lose access to every application hosted on it.

 

This is not ideal, especially for business-critical workloads. It is best to spread the risk of failure across multiple resources so that, if the worst case happens, your system is still partially usable.

 

DNS Issues

There are a number of places where RStudio products use fully qualified domain names (FQDNs) instead of IP addresses, as well as SSL certificates for secure communication between services and between the user's browser and the server. This can be achieved with multiple services on a single machine, but it adds complexity. Managing certificates, key files, and the domain names associated with them is far less complex when each domain name refers to a different machine.
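As a rough illustration of the bookkeeping involved, the Python sketch below checks which service FQDNs are not covered by a certificate's subject alternative names. The hostnames under example.com are hypothetical, the helpers covers and uncovered are ours, and the matching is deliberately simplified (exact names plus a single leftmost wildcard label); it is not a substitute for real TLS hostname validation.

```python
def covers(san, fqdn):
    """Simplified check: does one certificate name cover this FQDN?"""
    san, fqdn = san.lower(), fqdn.lower()
    if san.startswith("*."):
        # A wildcard stands in for exactly one leftmost label.
        head, _, rest = fqdn.partition(".")
        return bool(head) and rest == san[2:]
    return san == fqdn


def uncovered(fqdns, sans):
    """Return the FQDNs that none of the certificate names cover."""
    return [f for f in fqdns if not any(covers(s, f) for s in sans)]


# Hypothetical services sharing one machine, each reached by its own name.
service_fqdns = ["workbench.example.com", "connect.example.com", "packages.example.com"]

# Hypothetical certificate that was issued for only one of those names.
cert_names = ["workbench.example.com"]

print(uncovered(service_fqdns, cert_names))
```

On a shared machine, every name in the returned list needs either its own certificate and key pair or a reissued certificate that lists it. With one product per server, each certificate only has to match the single name of the machine it lives on.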

 

 

Conclusion

We strongly recommend running each of your RStudio products and supporting applications on its own server or VM. This conforms to our best-practice architecture and gives you a more scalable and reliable setup that your organization can grow into.
