I see there are 2 separate metrics ApproximateNumberOfMessagesVisible and ApproximateNumberOfMessagesNotVisible.
Using number of messages visible causes processing pods to get triggered for termination immediately after they pick up the message from queue, as they're no longer visible. If I use number of messages not visible, it will not scale up.
I'm trying to scale a kubernetes service using horizontal pod autoscaler and external metric from SQS. Here is template external metric:
apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
name: metric-name
spec:
name: metric-name
queries:
- id: metric_name
metricStat:
metric:
namespace: "AWS/SQS"
metricName: "ApproximateNumberOfMessagesVisible"
dimensions:
- name: QueueName
value: "queue_name"
period: 60
stat: Average
unit: Count
returnData: true
Here is HPA template:
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
name: hpa-name
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: deployment-name
minReplicas: 1
maxReplicas: 50
metrics:
- type: External
external:
metricName: metric-name
targetAverageValue: 1
The problem will be solved if I can define another custom metric that is a sum of these two metrics, how else can I solve this problem?
We used a lambda to fetch two metrics and publish a custom metric that is sum of messages in-flight and waiting, and trigger this lambda using cloudwatch events at whatever frequency you want, https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#rules:action=create
Here is lambda code for reference: