Version: Operator 4.0.1

Horizontal Pod Autoscaling

Aerospike clusters can dynamically adjust, or autoscale, the number of pods based on workload demands. This ensures efficient resource usage and consistent performance as application demands change. Kubernetes provides a Horizontal Pod Autoscaler (HPA) tool for scaling workloads like Deployments, StatefulSets, or custom resources. This article explains how to use this tool in your AKO workflow. For more details, see the HPA official documentation.

Scaling can be triggered by a CPU or memory utilization threshold, or by specific Prometheus metrics exposed by Aerospike. For example, you might want to trigger scaling if CPU utilization reaches a certain percentage, or if the amount of data used in an Aerospike namespace reaches a certain level.

Each of the following sections describes one of these approaches.

Scaling on resource utilization metrics

An Aerospike cluster can be scaled based on the CPU or memory resource usage of its cluster pods.

Prerequisites

  • Aerospike Kubernetes Operator (AKO) 4.0 or later installed.
  • Aerospike Cluster: The pod spec for the Aerospike container should define CPU and memory resource requests. For more details, see the Aerospike Cluster Configuration.
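The exact sizing depends on your workload; the following sketch only illustrates where the requests might go in the AerospikeCluster custom resource. The spec.podSpec.aerospikeContainer.resources path and the values shown are assumptions to adapt from the Aerospike Cluster Configuration documentation.

    Example CPU and memory requests in an AerospikeCluster resource (illustrative values)
    spec:
      podSpec:
        aerospikeContainer:
          resources:
            requests:
              cpu: "2"        # CPU request used in HPA utilization calculations
              memory: 4Gi     # memory request used in HPA utilization calculations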

Deploy HPA

  1. Create a new YAML file that defines your HPA object. In this example, the file is hpa-cpu-scaler.yaml. You can use any name for the file.

  2. Add the HPA configuration parameters to the file. The following example file instructs HPA to scale the Aerospike cluster if the average CPU usage of the aerospike-server container exceeds 60%. The spec.metrics section contains a containerResource subsection where this information is defined. The spec.scaleTargetRef.name field must reference your Aerospike Database cluster.

    Example configuration for hpa-cpu-scaler.yaml
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: example-hpa
      namespace: aerospike
    spec:
      minReplicas: 2
      maxReplicas: 5
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 300
      metrics:
        - type: ContainerResource
          containerResource:
            name: cpu
            container: aerospike-server
            target:
              type: Utilization
              averageUtilization: 60
      scaleTargetRef:
        apiVersion: asdb.aerospike.com/v1
        kind: AerospikeCluster
        name: aerocluster
  3. Run kubectl apply -f FILE_NAME to create an HPA object in the same namespace as the workload you want to scale.

    Applying hpa-cpu-scaler.yaml with kubectl
    kubectl apply -f hpa-cpu-scaler.yaml

After the file is applied, HPA automatically scales the Aerospike Database cluster when the scaling threshold is met.
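
To confirm that the HPA is tracking the cluster, you can inspect its status with standard kubectl commands. This is a minimal check that assumes the example-hpa name and aerospike namespace used above; the reported targets and replica counts depend on your workload.

    Checking HPA status
    kubectl get hpa example-hpa -n aerospike
    kubectl describe hpa example-hpa -n aerospike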

Scaling on Prometheus metrics from Aerospike Database

AKO also supports scaling based on Aerospike Database metrics exposed by Prometheus. This requires the Kubernetes Event-Driven Autoscaler (KEDA) tool, which enables HPA to read and scale based on these metrics. KEDA connects directly to Prometheus as a metrics source and uses PromQL queries to define scaling thresholds.

Prerequisites

  • Aerospike Kubernetes Operator (AKO) 4.0 or later installed.
  • Aerospike Cluster: Deployed and operational, with the Aerospike Prometheus Exporter running as a sidecar container.
  • KEDA Installed: Follow the KEDA installation guide.
  • Prometheus Installed: Should be collecting Aerospike custom metrics.
  • Helm installed.

Deploy the monitoring stack

A monitoring stack must be installed to enable HPA to use Aerospike metrics exposed by the Aerospike Prometheus Exporter.

For detailed instructions on setting up the monitoring stack, see: Aerospike Kubernetes Operator Monitoring.
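
Before wiring KEDA to Prometheus, you can optionally confirm that the Aerospike metrics are being scraped. The following query is a sketch that assumes the Prometheus service address used later in this article; run it from a pod that can reach the service, or port-forward the service first, and adjust the URL to match your monitoring stack.

    Querying Prometheus for an Aerospike metric
    curl -s 'http://aerospike-monitoring-stack-prometheus:9090/api/v1/query?query=aerospike_namespace_data_used_pct'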

Install and configure KEDA

  1. Run the following Helm commands to install KEDA.

    helm repo add kedacore https://kedacore.github.io/charts
    helm repo update
    helm install keda kedacore/keda --namespace keda --create-namespace
  2. Create a KEDA ScaledObject YAML file. In this example, the file is named scaledObject.yaml. You can use any name for the file.

  3. Add the ScaledObject configuration parameters to the file. This example scales the Aerospike cluster based on the aerospike_namespace_data_used_pct metric exposed by the Prometheus server at http://aerospike-monitoring-stack-prometheus:9090.

    Example scaledObject.yaml file
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: aerospike-scale
      namespace: aerospike
    spec:
      advanced:
        horizontalPodAutoscalerConfig:
          name: keda-hpa-aerospike-scale
          behavior:
            scaleUp:
              stabilizationWindowSeconds: 300
      scaleTargetRef:
        apiVersion: asdb.aerospike.com/v1
        kind: AerospikeCluster
        name: aerocluster
      minReplicaCount: 2
      maxReplicaCount: 5
      triggers:
        - type: prometheus
          metricType: Value
          metadata:
            serverAddress: http://aerospike-monitoring-stack-prometheus:9090
            metricName: aerospike_namespace_data_used_pct
            query: |
              avg(aerospike_namespace_data_used_pct{ns="test1"})
            threshold: "50"
  4. Apply this file with kubectl apply -f FILE_NAME to create a KEDA ScaledObject in the same namespace as the Aerospike cluster you want to scale.

    Applying scaledObject.yaml
    kubectl apply -f scaledObject.yaml

    After the file is applied, KEDA automatically creates an HPA instance that scales the Aerospike Database cluster when the scaling threshold is met.

  5. Verify that HPA is automatically created by KEDA.

    Example showing the command to verify HPA creation
    kubectl get hpa -n aerospike

    NAME                       REFERENCE                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
    keda-hpa-aerospike-scale   AerospikeCluster/aerocluster   25/50     2         5         5          22h
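
    You can also inspect the KEDA ScaledObject itself. This optional check assumes the aerospike-scale name used above; the columns shown by kubectl depend on your KEDA version.

    Checking the ScaledObject
    kubectl get scaledobject aerospike-scale -n aerospike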

Key configuration parameters

These are the most common parameters used in an autoscaler deployment.

  • stabilizationWindowSeconds: Prevents rapid scaling up when metric values fluctuate. This is useful to avoid unnecessary scaling during temporary spikes, such as migrations.
  • scaleTargetRef: Specifies the Aerospike cluster that needs to be scaled.
  • minReplicaCount & maxReplicaCount: Define the minimum and maximum number of replicas the HPA can scale between. Ensure that minReplicaCount is at least as large as the replication factor of every namespace in the Aerospike cluster.
  • triggers: Uses a Prometheus-based trigger to scale the Aerospike cluster based on the aerospike_namespace_data_used_pct metric.
  • query: The PromQL query to fetch the metric value. In this example, the query fetches the average value of the aerospike_namespace_data_used_pct metric for the test1 namespace.

HPA also includes other parameters. For more details, see HPA Scaling Behavior.
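
As an illustration, the behavior section can also slow down scale-in. The following sketch uses standard autoscaling/v2 fields with example values only; in an HPA it sits under spec.behavior, and in a KEDA ScaledObject under spec.advanced.horizontalPodAutoscalerConfig.behavior.

    Example behavior block with a conservative scale-down policy (illustrative values)
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 300
      scaleDown:
        stabilizationWindowSeconds: 600   # wait 10 minutes of stable metrics before scaling in
        policies:
          - type: Pods
            value: 1                      # remove at most one pod ...
            periodSeconds: 120            # ... every two minutes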

Example: rack-specific querying

The previous example used the avg(aerospike_namespace_data_used_pct{ns="test1"}) query to get an average value of namespace data used across all racks. You can also scale based on a certain metric for a specific namespace or specific rack in the Aerospike cluster.

Consider a cluster with two racks and four nodes:

  • Rack 1 contains namespace test (Nodes: n1, n2)
  • Rack 2 contains namespaces test and test1 (Nodes: n3, n4)

To query on just one namespace, use the following PromQL expression in the spec.triggers.metadata.query parameter of the scaledObject.yaml file:

aerospike_namespace_data_used_pct{ns="test1"}

This query fetches the data usage percentage for the namespace test1. Since test1 exists only in Rack 2, the result is the average of the data_used_pct values on nodes n3 and n4: (n3_data_used_pct + n4_data_used_pct) / 2.

If the query result crosses the defined threshold, the cluster scales up. Scaling continues until the desired rack, in this case Rack 2, receives new nodes, reducing the query result below the threshold.

Commonly used metrics for scaling

The most effective metrics for autoscaling differ based on workload size and performance goals in different deployments. Monitor your traffic, latency, and throughput regularly, then set thresholds that avoid inefficient scaling.

The following table contains some recommended metrics for scaling an Aerospike Database cluster. For more details, refer to the Aerospike metrics reference.

Metric Name                                   | Description
aerospike_namespace_data_used_pct             | Percentage of used storage capacity for this namespace.
aerospike_namespace_indexes_memory_used_pct   | Percentage of combined RAM indexes' size used.
aerospike_namespace_index_mounts_used_pct     | Percentage of the mount(s) in use for the primary index used by this namespace.
aerospike_namespace_sindex_mounts_used_pct    | Percentage of the mount(s) in use for the secondary indexes used by this namespace.
aerospike_node_stats_process_cpu_pct          | Percentage of CPU usage by the asd process.
aerospike_node_stats_system_kernel_cpu_pct    | Percentage of CPU usage by processes running in kernel mode.
aerospike_node_stats_system_total_cpu_pct     | Percentage of CPU usage by all running processes.
aerospike_node_stats_system_user_cpu_pct      | Percentage of CPU usage by processes running in user mode.
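
Any of these metrics can be substituted into the Prometheus trigger shown earlier by changing the query. The following sketch reuses the server address from the previous example and assumes a 70% CPU threshold chosen only for illustration.

    Example trigger based on asd process CPU usage (illustrative threshold)
    triggers:
      - type: prometheus
        metricType: Value
        metadata:
          serverAddress: http://aerospike-monitoring-stack-prometheus:9090
          metricName: aerospike_node_stats_process_cpu_pct
          query: |
            avg(aerospike_node_stats_process_cpu_pct)
          threshold: "70"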