Memory Based Load Balancing

Follow

The default load-balancing methods on RStudio Workbench largely consider CPU utilization as the determinant for routing between nodes. There may be instances where administrators may wish to uses memory, as opposed to CPU-utilization, to route requests to nodes that are using less RAM.


To enable this, set the configuration below in your /etc/rstudio/load-balancer configuration file:

# /etc/rstudio/load-balancer
[config]
balancer = custom

From there, we will need to set the following directive in /etc/rstudio/rserver.conf

# /etc/rstudio/rserver.conf 
server-health-check-enabled=1

From there, using your text editor of choice, edit the /usr/lib/rstudio-server/bin/rserver-balancer file with the below:

#!/usr/bin/Rscript

get_nodes <- function(nodes = Sys.getenv("RSTUDIO_NODES")) {
  unlist(strsplit(nodes, ",", fixed = TRUE))
}

get_health_check <- function(node_address) {
  con <- url(sprintf("http://%s/health-check", node_address))
  on.exit(close(con))
  readLines(con)
}

parse_health_check <- function(health_check) {
  fields <- unlist(strsplit(health_check, "\n|,"))

  keys <- sub(
    ":",
    "",
    regmatches(fields, regexpr("^[a-z\\-]+:", fields))
  )

  values <- sub(
    ": ",
    "",
    regmatches(fields, regexpr(":.+$", fields))
  )
  data.frame(keys, values, stringsAsFactors = FALSE)
}

get_value <- function(hc_table, hc_value) {
  as.double(hc_table[hc_table[["keys"]] == hc_value, "values"])
}

main <- function() {
  nodes <- get_nodes()
  named_hc <- setNames(lapply(nodes, get_health_check), nodes)
  parsed_hc <- lapply(named_hc, parse_health_check)
  mem_percent <- lapply(parsed_hc, get_value, "memory-percent")
  least_mem <- names(which.min(unlist(mem_percent)))

  least_mem
}

cat(main())

Restart the Workbench & launcher services for this to take effect:

sudo rstudio-server restart
sudo rstudio-launcher restart

Comments