Filebeat CPU throttling in Kubernetes with Django logging

I have a simple Django API served with Gunicorn on a Kubernetes cluster. After running for a long time, the pod seems to get CPU throttled.

So I tried to investigate: I used Locust to swarm my API with requests and see how it handles an unusual amount of traffic.

Here is a recap of normal activity of the API:

  • 60 requests per hour
  • 10 lines of log per request
  • 600 lines of logs per hour

So it's not an intensively requested API. Internally, the API checks the request body, makes a request to a Cosmos DB server to retrieve some data, then formats it and sends it back to the caller. Requests take less than 200 ms and need very little memory.
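
The view logic is roughly of this shape (the names, the query, and the COSMOS_* environment variables below are a simplified sketch, not my actual code):

import json
import logging
import os

from azure.cosmos import CosmosClient
from django.http import JsonResponse

logger = logging.getLogger("app")

# One Cosmos DB client per process, reused across requests.
client = CosmosClient(os.environ["COSMOS_URL"], credential=os.environ["COSMOS_KEY"])
container = client.get_database_client("mydb").get_container_client("items")

def lookup(request):
    payload = json.loads(request.body)  # check the body
    logger.info("lookup received: %s", list(payload))
    # Retrieve the matching documents from Cosmos DB.
    items = list(container.query_items(
        query="SELECT * FROM c WHERE c.id = @id",
        parameters=[{"name": "@id", "value": payload["id"]}],
        enable_cross_partition_query=True,
    ))
    logger.info("cosmos returned %d item(s)", len(items))
    # Format the result and send it back to the caller.
    return JsonResponse({"results": items})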

When swarming with Locust, I see that the CPU throttles, and when using the top command in the pod, I see that Filebeat uses 40-60% of the CPU, while in normal activity it stays at 0.1-0.5%.

I ran kill -9 on the Filebeat PID, did the same swarm, and everything was smooth.

I thought that my log file was too big and that Filebeat had issues reading it.

Here is how my Django logger is defined:

"app_json_file": {
    "level": "INFO",
    "class": "logging.handlers.TimedRotatingFileHandler",
    "filename": APP_LOG_FILE,
    "when": "D",
    "interval": 7,
    "backupCount": 3,
    "formatter": "app_json_formatter",
}

I tried to modify the rotation strategy to rotate more frequently and keep the files smaller. It does help a bit, but later Filebeat's CPU usage ramps up again.
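
For example, one of the more frequent rotation variants I tried looked roughly like this (hourly instead of weekly; the exact values here are illustrative):

"app_json_file": {
    "level": "INFO",
    "class": "logging.handlers.TimedRotatingFileHandler",
    "filename": APP_LOG_FILE,
    "when": "H",       # rotate every hour instead of every 7 days
    "interval": 1,
    "backupCount": 3,
    "formatter": "app_json_formatter",
}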

Here are the resources for the K8s pod:

  • cpu request: 250m
  • cpu limit: 500m
  • memory request: 125Mi
  • memory limit: 250Mi
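
In the container spec, that corresponds to:

resources:
  requests:
    cpu: 250m
    memory: 125Mi
  limits:
    cpu: 500m
    memory: 250Mi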

I cannot use Horizontal Pod Autoscaling (HPA), because the K8s cluster does not have this feature enabled.

Here is the filebeat configuration file:

#================================ Logging =====================================

logging:
  level: error
  json: true
  to_files: true
  to_syslog: false

  files:
    path: ${path.logs}
    keepfiles: 2
    permissions: 0600
  metrics:
    enabled: false
    period: 30s

#================================ Inputs =====================================

filebeat.inputs:
  - type: filestream
    enabled: true
    id: "access-log"
    paths:
      - ${LOG_DIR}/${ACCESS_LOG_FILE_NAME}

I did not find a way to limit Filebeat's CPU usage directly from its configuration. So I don't know how to handle this situation where, when we get more logs, Filebeat eats all the CPU and makes the API suffocate without dying, causing timeouts for all requests.

How could I limit Filebeat's CPU usage under a larger workload, maybe with a queue in Filebeat?
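
For reference, this is roughly the kind of setting I had in mind, although I have not verified that it actually bounds Filebeat's CPU usage:

# Cap the number of CPUs Filebeat's Go runtime may use.
max_procs: 1

# Use a smaller in-memory queue so Filebeat buffers less and flushes in smaller batches.
queue.mem:
  events: 512
  flush.min_events: 128
  flush.timeout: 5s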
