Scaling and HA
Velixir scales along two dimensions: vertical (the plan, which sets per-replica CPU, memory, and disk) and horizontal (replica count, optionally autoscaled). An optional high-availability mode also keeps a warm second region ready.
Plans
Plans set per-replica resources. Every app picks one at creation. You can change the plan at any time, and the change does not force a redeploy by itself: the next deploy rolls pods over with the new plan.
Plans range from a small dev tier (shared CPU, 512 MB RAM) up to dedicated multi-vCPU plans for production traffic. The full plan grid is on the Scaling tab of each app and on the public pricing page.
Autoscaling
On the Scaling tab, set:
- Min / max replicas — the bounds the autoscaler operates within.
- CPU threshold — scale up when sustained CPU exceeds this percentage. Required.
- Memory threshold (optional). When both thresholds are set, Kubernetes scales up if either is exceeded.
- Sustain for / cool-down — how long load must exceed the threshold before scaling up, and how long to wait before scaling down. The defaults (180s / 300s) suit most apps.
Autoscaling uses the standard Kubernetes Horizontal Pod Autoscaler (HPA) under the hood. Metrics come from metrics-server inside the cluster, the same data that drives the live charts on the Overview tab.
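As a rough illustration of how these settings interact, the sketch below applies the replica formula documented for the Kubernetes HPA, ceil(currentReplicas * currentMetric / targetMetric), and then clamps the result to the min / max bounds. It assumes the CPU and memory thresholds act as HPA target utilizations. The function and parameter names are hypothetical, and the sketch leaves out the HPA's tolerance band and the sustain / cool-down windows.

```python
import math


def desired_replicas(current_replicas, current_pct, target_pct):
    """Core HPA formula: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_pct / target_pct)


def autoscale(current_replicas, min_replicas, max_replicas,
              cpu_pct, cpu_threshold, mem_pct=None, mem_threshold=None):
    """Return the replica count an HPA-style controller would aim for.

    All names here are illustrative, not part of any Velixir API.
    """
    proposals = [desired_replicas(current_replicas, cpu_pct, cpu_threshold)]
    # The memory threshold is optional; when present, the larger proposal wins,
    # so exceeding either threshold is enough to scale up.
    if mem_pct is not None and mem_threshold is not None:
        proposals.append(desired_replicas(current_replicas, mem_pct, mem_threshold))
    wanted = max(proposals)
    # Never leave the configured min / max bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, four replicas running at 90% CPU against a 60% threshold would be scaled to six, provided the max bound allows it.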
Slots
Every app has a production slot plus optional staging slots. Slots share the app's plan but each has its own replicas and its own deploy history. Use them to:
- Stage releases: deploy to staging, smoke-test at staging-<your-app>-<region>.velixir.run, then promote.
- Run preview environments: spin up a per-feature slot for each PR (and tear it down when the PR merges).
Promoting a slot is near-instant: we swap which slot the edge route points at; the previously-live pods stay warm so flipping back is also fast.
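The swap can be pictured with a small model. This is a minimal sketch rather than Velixir's actual implementation: the `Slot` and `EdgeRoute` classes are hypothetical, and only the staging hostname pattern comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str          # e.g. "production" or "staging"
    release: str       # identifier of the release deployed to this slot
    warm: bool = True  # pods keep running even when the slot is not live


@dataclass
class EdgeRoute:
    live: Slot      # slot currently receiving production traffic
    previous: Slot  # the other slot, kept warm for a quick flip back

    def promote(self):
        """Swap which slot the edge points at; no pods are restarted."""
        self.live, self.previous = self.previous, self.live


def staging_hostname(app, region):
    """Build the smoke-test hostname for an app's staging slot."""
    return f"staging-{app}-{region}.velixir.run"
```

Because promotion only exchanges two references, calling `promote()` a second time restores the earlier release, which mirrors the fast flip back described above.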
High availability
On the Scaling tab, toggle Enable HA. We:
- Pick the region geographically farthest from your primary as the backup.
- Provision a warm-standby set of replicas there.
- Wire the edge so traffic fails over automatically if the primary region goes dark.
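One plausible way to read the first step is as a great-circle distance comparison. The sketch below assumes that interpretation; the text does not say how distance is actually measured. The region names and coordinates are hypothetical placeholders, not a list of real Velixir regions.

```python
import math

# Hypothetical regions with approximate (latitude, longitude) in degrees.
REGIONS = {
    "fra": (50.11, 8.68),
    "iad": (38.95, -77.45),
    "sin": (1.35, 103.82),
    "syd": (-33.87, 151.21),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def pick_backup_region(primary, regions=REGIONS):
    """Return the region farthest from the primary, excluding the primary."""
    origin = regions[primary]
    candidates = [name for name in regions if name != primary]
    return max(candidates, key=lambda name: haversine_km(origin, regions[name]))
```

With these placeholder coordinates, a primary in `fra` would get `syd` as its backup, since that pair is the farthest apart.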
The warm-standby plan is configurable separately: you can run a cheaper plan on the standby if its only job is to absorb traffic during a primary outage. Standby replicas are billed at the normal hourly rate while they're running. Pausing the slot stops billing, but failover then takes longer.
Failover is automatic. You don't need to flip a switch.