Prescaler allows you to proactively scale up your Kubernetes HPAs minutes before anticipated spikes in load or traffic, such as those occurring on the hour. After this initial prescaling, your HPA will manage subsequent scaling, either continuing to scale up or scaling down once the spike has passed.
Imagine you have a Kubernetes deployment of SuperHeavyApp with 10 replicas and an associated Horizontal Pod Autoscaler (HPA).
Every hour, your application experiences a brief, intense traffic spike that necessitates scaling SuperHeavyApp up to 200% (requiring 10 additional pods) with total 20 pods.
Here's how a typical scenario might unfold:
- 8:30:00: You have 10
SuperHeavyApppods ready, with HPA at 70% CPU utilization (based on 2000mCPU across 10 pods). - 8:55:00: Still 10
SuperHeavyApppods ready, HPA remains at 70% CPU utilization. - 9:00:00: The traffic spike begins. You still have 10
SuperHeavyApppods ready, but HPA CPU utilization jumps to 140%. - 9:00:30: HPA triggers a scale-up of 10 additional pods. However, clients experience throttling or disruptions (receiving 503/504 errors) due to the delay.
- 9:02:00: The traffic spike ends. You have 10
SuperHeavyApppods ready, but 10 new pods are stuck in a pending state, waiting for a new node. HPA is still at 140% CPU utilization (among the initial 10 pods). - 9:03:00: A new node is added to the cluster (by Karpenter/Cluster Autoscaler), and the 10 new pods become available. HPA now shows 70% CPU utilization (among 20 pods).
- !?!?!?
- 9:08:00: HPA scales back down to 10
SuperHeavyApppods, with CPU utilization at 70% (among the 10 pods).
- Configure Horizontal Pod Autoscaler (HPA) to scale based on 40% CPU usage with a
stabilizationWindowSecondsof3600. This will keep it upscaled most of the time, leading to wasted resources and money. - Use Prescaler. For the
SuperHeavyAppHPA/Deployment, configure Prescaler's Custom Resource (CR) to prescale to 200% at "56 * * * *". At 8:56:00, Prescaler will:- Change HPA's CPU
averageUtilizationfrom (current CPU averageUtilization if useCurrentCpuUtilization=true) 70% to (70*100/200) 35% and sethpa.Spec.Behavior.ScaleUp.StabilizationWindowSecondsto 0 (for immediate scale-up). - HPA will start scaling up from 10 to 20 pods (2x). Prescaler will then wait for XX seconds.
- Once HPA has scaled to 20 pods, Prescaler will revert HPA's CPU
averageUtilizationandhpa.Spec.Behavior.ScaleUp.StabilizationWindowSecondsto their original values. - HPA will remain upscaled (with real usage at 35% across 20 pods) until
behavior.scaleDown.stabilizationWindowSeconds. It's better to increase this to 10 minutes (the default is 5 minutes). - At the top of the hour (9:00:00), you'll experience a traffic spike, and CPU usage will naturally increase to the expected 70% (across 20 pods).
- If the spike continues, HPA will not scale down; it will even continue to scale up natively.
- If the spike ends, HPA will scale down after it concludes (e.g., if the spike ends at 9:03:00, HPA scales down at 9:08:00, i.e., 9:03:00 + 5 minutes).
- This process repeats for every hour.
- Change HPA's CPU
- Increase
behavior.scaleDown.stabilizationWindowSecondsfrom recommended 10 minutes (for example, to 20 minutes) - Start prescaling earlier (for example, at
"51 * * * *"), or use multiple schedules for different times of day.
Attention!
- When you specify a cron schedule, keep in mind a timezone! Probably controller will run in your Kubernetes cluster in UTC timezone!
- Also check
maxConcurrentReconcilesparameter and logs (in case you have overdue prescaling) - Prescaling may not be accurate/precise due to constant changes in scaling and CPU usage
- Your HPA
scaleUp.policiesshould allow scaling by maximal percent to ever wanted to scale in your schedules! Or by default HPA will scale up only to 100% (10->20 pods!) - Your HPA
maxReplicasshould fit max scaling percent!
spec:
targetHpaName: nginx-project # target HPA to scale, must be in the same namespace as Prescale CR
schedules:
- cron: "55 * * * 6,0"
percent: 165
- cron: "55 0-1 * * 1-5"
percent: 200
- cron: "55 2-12 * * 1-5"
percent: 500
- cron: "55 13-23 * * 1-5"
percent: 200 # To calculate new CPU AverageUtilization value based on formula: original * 100 / percent
useCurrentCpuUtilization: true # Use current (last value) of avg cpu instead of Spec avg value
suspend: false # enable/disable prescaler
revertWaitSeconds: 40 # max wait time before reverting back to original values. It is important, because we must provide Kubernetes time to detect and react on HPA changes (to trigger scaleup desiredReplicas)Verify that your HPA contains ScaleUp Policy like this:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: nginx-project
spec:
behavior:
scaleUp:
policies:
- periodSeconds: 5
type: Percent
value: 500
selectPolicy: MaxWhere 500 is the max percent from your scedules in Prescaler.
Notice how percent parameter works! It is used in formula to calculate new/temporary value for CPU AverageUtilization.
For example:
- you have 30 pods
- HPA periodically calculates averageUtilization and updates its status in
status.currentMetrics, for example, current avg cpu = 50 but spec avg cpu remains static (like 70) -useCurrentCpuUtilizationswitches which value to use for calculation - in Prescaler you set percent = 200
Than Prescaler will do math 50 * 100 / 200 = 25 and set CPU AverageUtilization = 25. HPA will scale up to 60 pods, because it will need 60 pods to maintain avg cpu about 25% (because on 50% it needed 30 pods)
- go version v1.24.0+
- docker version 17.03+.
- kubectl version v1.11.3+.
- Access to a Kubernetes v1.11.3+ cluster.
Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/prescaler:tagNOTE: This image ought to be published in the personal registry you specified. And it is required to have access to pull the image from the working environment. Make sure you have the proper permission to the registry if the above commands don’t work.
Install the CRDs into the cluster:
make installDeploy the Manager to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/prescaler:tagNOTE: If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.
Create instances of your solution You can apply the samples (examples) from the config/sample:
kubectl apply -k config/samples/NOTE: Ensure that the samples has default values to test it out.
Delete the instances (CRs) from the cluster:
kubectl delete -k config/samples/Delete the APIs(CRDs) from the cluster:
make uninstallUnDeploy the controller from the cluster:
make undeployFollowing the options to release and provide this solution to the users.
- Build the installer for the image built and published in the registry:
make build-installer IMG=<some-registry>/prescaler:tagNOTE: The makefile target mentioned above generates an 'install.yaml' file in the dist directory. This file contains all the resources built with Kustomize, which are necessary to install this project without its dependencies.
- Using the installer
Users can just run 'kubectl apply -f ' to install the project, i.e.:
kubectl apply -f https://raw.githubusercontent.com/<org>/prescaler/<tag or branch>/dist/install.yaml- Build the chart using the optional helm plugin
kubebuilder edit --plugins=helm/v1-alpha- See that a chart was generated under 'dist/chart', and users can obtain this solution from there.
NOTE: If you change the project, you need to update the Helm Chart using the same command above to sync the latest changes. Furthermore, if you create webhooks, you need to use the above command with the '--force' flag and manually ensure that any custom configuration previously added to 'dist/chart/values.yaml' or 'dist/chart/manager/manager.yaml' is manually re-applied afterwards.
// TODO(user): Add detailed information on how you would like others to contribute to this project
NOTE: Run make help for more information on all potential make targets
More information can be found via the Kubebuilder Documentation
Copyright 2025.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.