Last Updated: January 28, 2026

Autoscaling Overview

SaladCloud offers three main methods for autoscaling your containerized workloads:

- Job Queue Autoscaling: Scales your workloads based on the number of jobs in your Salad Job Queue. It is particularly useful for workloads that can be processed in parallel and whose job volume varies significantly over time.
- Scheduled Scaling (Time of Day): Scales replica counts on a fixed schedule. It suits workloads with predictable demand, such as business-hour traffic or nightly batch processing, because you can increase capacity ahead of known peaks and reduce replicas during off-hours. Configuration is available through the SaladCloud Portal or the Salad API. See Scheduled Scaling (Time of Day) for details.
- Programmatic Autoscaling: Scales your workloads via the API based on any custom metric you choose. We have detailed guides on setting up programmatic autoscaling based on:
  - Time of Day: Set replica counts from your own code according to the time of day.
  - Hardware Metrics: Scale your workloads based on hardware metrics such as GPU utilization. This is useful for workloads that require specific hardware resources and whose demand for those resources varies over time.
  - Job Queue Volume: Scale your workloads based on the number of messages in your SQS queue. This is useful for workloads that process messages from a queue and need to scale with the current backlog.
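
As a rough illustration of programmatic autoscaling driven by queue backlog, the sketch below turns a backlog size into a target replica count and builds (but does not send) an API request to apply it. The function names, the `jobs_per_replica` tuning parameter, the `Salad-Api-Key` header, the merge-patch content type, and the `{"replicas": N}` payload are assumptions made for illustration, not taken from this page; check the Salad API reference for the exact endpoint and request format before using anything like this.

```python
import json
import math
import urllib.request


def desired_replicas(backlog: int, jobs_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Return the replica count needed to cover the backlog, clamped to [min, max].

    jobs_per_replica is a hypothetical tuning knob: how many queued jobs
    one replica is expected to work through per scaling interval.
    """
    if jobs_per_replica <= 0:
        raise ValueError("jobs_per_replica must be positive")
    needed = math.ceil(max(backlog, 0) / jobs_per_replica)
    return max(min_replicas, min(max_replicas, needed))


def build_scale_request(container_group_url: str, api_key: str,
                        replicas: int) -> urllib.request.Request:
    """Build (without sending) a PATCH request that sets the replica count.

    Assumption: the container group resource accepts a JSON merge patch with
    a "replicas" field and authenticates via a "Salad-Api-Key" header.
    """
    body = json.dumps({"replicas": replicas}).encode("utf-8")
    return urllib.request.Request(
        container_group_url,
        data=body,
        method="PATCH",
        headers={
            "Salad-Api-Key": api_key,
            "Content-Type": "application/merge-patch+json",
        },
    )


if __name__ == "__main__":
    # 25 queued jobs at 10 jobs per replica needs 3 replicas (within 1..5).
    target = desired_replicas(backlog=25, jobs_per_replica=10,
                              min_replicas=1, max_replicas=5)
    # Placeholder URL; substitute your real container group endpoint.
    request = build_scale_request(
        "https://example.invalid/containers/my-container-group",
        api_key="YOUR_API_KEY",
        replicas=target,
    )
    print(target, request.get_method(), request.data.decode("utf-8"))
    # To apply the change, send it with urllib.request.urlopen(request).
```

Clamping to a minimum and maximum keeps a sudden backlog spike from requesting unbounded capacity and keeps at least a baseline number of replicas running when the queue is empty. The same pattern works for the other custom metrics: replace the backlog calculation with one based on time of day or GPU utilization.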
