RStudio Professional Products rely on databases to store metadata. Out of the box, they come with a SQLite database. If you are running on a single server, you don't need anything else.
For RStudio Server Pro, the database requirement applies to versions 1.4 and later.
However, if you are running in a load-balanced configuration, you will need to provide an external Postgres installation. The individual products will manage all of the data and tables inside the Postgres installation.
Common Questions:
Can I use an existing Postgres installation for RStudio Professional Products?
Yes! The RStudio product requires read/write access to a dedicated Postgres schema, but the schema can live in a Postgres installation that houses other schemas as well.
Can I use a different database provider like Oracle, MySQL, or SQL Server?
No, not at this time.
Do I need a dedicated DBA for the database(s)?
No. The product manages all of the data inside the database, including data permissions. A DBA can assist with the initial setup and potentially with data backups, but consider this database an application requirement, not a part of your data organization.
What is stored in the database?
The databases store metadata about content, users, packages, and settings. The databases also store metrics including content or package usage.
In particular, neither the RStudio Connect nor the RStudio Server Pro database stores the data used by the applications or reports. For example, if you have a dashboard that shows sales forecasting data, that data is accessed by the application code and references your company data warehouse. The product database would not contain any sales data.
How big are the databases?
The size of each database will depend on the amount of content and activity on the server. A good rule of thumb is to start with 1 GB of storage for a Postgres installation or 1 GB of disk space for the SQLite database (located at /var/lib/rstudio-connect/db, /var/lib/rstudio-pm/db, or /var/lib/rstudio-server/db by default).
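To compare an existing SQLite database against the 1 GB rule of thumb, the files under the database directory can be totalled. The following is a minimal sketch, not part of any RStudio product: the function names and the `ONE_GB` constant are illustrative, and the path is the RStudio Connect default mentioned above.

```python
import os

ONE_GB = 1024 ** 3  # rule-of-thumb starting allocation from the text above


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so files outside the directory are not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def exceeds_rule_of_thumb(path, limit=ONE_GB):
    """Report whether the directory has outgrown the starting allocation."""
    return directory_size_bytes(path) > limit


if __name__ == "__main__":
    db_dir = "/var/lib/rstudio-connect/db"  # default location; adjust per product
    if os.path.isdir(db_dir):
        size = directory_size_bytes(db_dir)
        print(f"{db_dir}: {size / ONE_GB:.2f} GB")
```

Running this periodically gives an early signal that the initial allocation needs to grow as content and activity increase.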
Can I migrate from a single-node server using SQLite to a multi-node configuration with a Postgres installation?
RStudio Connect and RStudio Server Pro support database migrations; see the admin guide chapter on migrations for RStudio Connect or RStudio Server Pro. RStudio Package Manager does not support database migrations at this time.
How should I handle data backups for the product databases?
SQLite: RStudio Connect has built-in support for backing up the SQLite database while the RStudio Connect service is running; see the admin guide. For RStudio Package Manager or RStudio Server Pro, stop the service and make a copy of the SQLite database.
Postgres: Postgres has native support for backups. For example, a cron job can be set to use the pg_dump command to create backups on a schedule.
RStudio Professional Products also rely on disk storage. Back up the database and the disk storage together so that the two stay in sync.
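The two manual backup approaches described above can be sketched as a small script suitable for calling from a cron job. This is an illustrative sketch under stated assumptions, not an official procedure: the function names, the `db-<timestamp>` folder naming, and the database name in the usage note are hypothetical. The SQLite copy assumes the service has already been stopped, and the Postgres helper assumes `pg_dump` is on the PATH with credentials supplied outside the script (for example through a `~/.pgpass` file).

```python
import shutil
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def copy_sqlite_directory(db_dir, backup_root, stamp=None):
    """Copy a stopped service's SQLite directory into a timestamped folder.

    Assumes the caller has already stopped the service, as advised above
    for RStudio Package Manager and RStudio Server Pro.
    """
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_root) / f"db-{stamp}"
    shutil.copytree(db_dir, target)  # fails if the target already exists
    return target


def build_pg_dump_command(dbname, outfile, host="localhost", port=5432,
                          username=None):
    """Build the pg_dump argument list for a custom-format dump."""
    cmd = [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--format=custom",  # compressed archive, restorable with pg_restore
        "--file", str(outfile),
    ]
    if username:
        cmd += ["--username", username]
    cmd.append(dbname)  # the database name is the final positional argument
    return cmd


def run_pg_dump(dbname, outfile, **kwargs):
    """Run pg_dump and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_pg_dump_command(dbname, outfile, **kwargs), check=True)
```

A scheduled job might call `run_pg_dump("rstudio_connect", "/backups/connect.dump")`, where both the database name and the output path are placeholders, and then archive the product's disk storage in the same run so that the two backups correspond to the same point in time.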
What about the file storage requirements?
In addition to the database, RStudio Connect requires file storage that includes the source code deployed to RStudio Connect, rendered reports, log files, and R packages. The admin guide outlines the on-disk storage requirements.
RStudio Server Pro requires each user have a home directory they can access, as well as shared file storage for configuration and other files. The admin guide outlines the purposes for RStudio Server Pro storage.
RStudio Package Manager also relies on storage, and administrators can pick between using shared files or S3. For more details refer to the admin guide.
For a comprehensive overview, please see the admin guide chapter on databases.