No logs captured in Kafka topic with Filebeat config


I am new to the Filebeat / Logstash / Kafka / Elasticsearch stack used to capture and process logs. Recently I have been working with the Filebeat configuration in the filebeat-daemonset.yaml file on the master node of a cluster.

This is the configuration:

# default filebeat_expire_time is 30 days


---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config.inputs:
      enabled: true
      path: /etc/filebeat.yml
      reload.enabled: true
      reload.period: 10s

    filebeat.registry.flush: 60s

    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      exclude_files: [".*_kube-system_.*", ".*_monitoring_.*", ".*_argo_.*"]
      close_removed: false

    processors:
    - script:
        lang: javascript
        id: js_filter
        source: >
          function process(event) {
              // extract pod metadata
              var path = event.Get("log").file.path;
              var kube_uid = path.substring(20, path.length-4); // /var/log/containers/KUBE_UID.log
              event.Put("kube_uid", kube_uid);
              var segments = kube_uid.split("_");
              event.Put("kube_pod_name", segments[0]);
              event.Put("kube_namespace_name", segments[1]);
              event.Put("container_name", segments[2].substring(0, segments[2].length-65));
              event.Put("container_id", segments[2].substring(segments[2].length-64));

              // drop useless fields
              event.Delete("agent");
              event.Delete("ecs");
              event.Delete("host");
              event.Delete("log");
              event.Delete("input");

              // rename message fields
              var message = event.Get("message");
              event.Put("log", message);
              event.Delete("message");

              // convert timestamp field
              event.Put("time", event.Get("@timestamp"));
              event.Put("time_milli_long", Math.round(event.Get("@timestamp").UnixNano() / 1000000));
              event.Put("time_nano_long", event.Get("@timestamp").UnixNano());
              // add time filter
              if (event.Get("time_milli_long") < Date.now() - 43200 * 60 * 1000 ) {
                event.Cancel();
              }
          }
    - add_fields:
        target: ""
        fields:
          host: ${NODE_NAME}
          es_index: "k8s_log_ai_prod"

    output.kafka:
      # initial brokers for reading cluster metadata
      hosts:
      - "kafka-las-share-01.server.hulu.com:6667"
      - "kafka-las-share-02.server.hulu.com:6667"
      - "kafka-las-share-03.server.hulu.com:6667"
      - "kafka-las-share-04.server.hulu.com:6667"
      - "kafka-las-share-05.server.hulu.com:6667"
      version: "0.10.2"

      # message topic selection + partitioning
      topic: "vai-caposv2-prod-las"

      worker: 8
      required_acks: 1
      compression: none
      max_message_bytes: 1000000
      bulk_max_size: 2048
      channel_buffer_size: 4096
      keep_alive: 60

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: Default
      tolerations:
        - operator: Exists
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:7.11.1
          args: [
            "-c", "/etc/filebeat.yml",
            "-e",
          ]
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            runAsUser: 0
          resources:
            limits:
              cpu: 4
              memory: 2Gi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: config
              mountPath: /etc/filebeat.yml
              readOnly: true
              subPath: filebeat.yml
            - name: data
              mountPath: /usr/share/filebeat/data
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: varlib-mount-dockercontainers
              mountPath: /mnt/volume1/docker/docker/containers
              readOnly: true
            - name: varlib-mount-cricontainers
              mountPath: /mnt/volume1/docker/pods-log
              readOnly: true
      volumes:
        - name: config
          configMap:
            defaultMode: 0640
            name: filebeat-config
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log
        # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
        - name: data
          hostPath:
            # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
            path: /var/lib/filebeat-data
            type: DirectoryOrCreate
        - name: varlib-mount-dockercontainers
          hostPath:
            path: /mnt/volume1/docker/docker/containers
        - name: varlib-mount-cricontainers
          hostPath:
            path: /mnt/volume1/docker/pods-log
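For context, the JavaScript processor in the ConfigMap slices pod metadata out of the container-log path, and its time filter drops events older than 43200 minutes, i.e. 30 days (43200 × 60 × 1000 ms). The slicing offsets can be checked outside Filebeat with plain Node.js; this is only a sketch, and the pod, namespace, and container names below are made up:

```javascript
// Sample container-log path; pod/namespace/container names are hypothetical.
var path = "/var/log/containers/mypod-abc_default_app-" + "0".repeat(64) + ".log";

// Same slicing as the processor: drop "/var/log/containers/" (20 chars) and ".log" (4 chars).
var kube_uid = path.substring(20, path.length - 4);
var segments = kube_uid.split("_");
var pod_name = segments[0];
var namespace = segments[1];
// The last segment is "<container_name>-<64-char container ID>".
var container_name = segments[2].substring(0, segments[2].length - 65);
var container_id = segments[2].substring(segments[2].length - 64);

console.log(pod_name, namespace, container_name, container_id.length);
```

This only works for filenames that actually follow the `POD_NAME_NAMESPACE_CONTAINER-ID.log` convention under `/var/log/containers/`; any file named differently would produce wrong fields.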

This filebeat-daemonset.yaml file already existed on the master node; I just applied it with this command:

kubectl apply -f filebeat-daemonset.yaml

After running this I can see the Filebeat pods running on all the nodes in the cluster:

vai.deploy@caposv2-prod-ai-master01:/opt$ kubectl get daemonset -n kube-system
NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
calico-node                      34        34        33      34           33          <none>          164d
filebeat                         34        34        33      34           33          <none>          23d
nvidia-device-plugin-daemonset   10        10        9       10           9           <none>          176d

I can see logs being captured from one set of nodes, whose names look like 'ai-general-xx', but I cannot see logs from the other nodes, whose names look like 'vai-computing-con-xxx'.

I described the node using kubectl describe node vai-computing-con-012, and I can see a Filebeat pod running on one of the vai-computing-con nodes:

Non-terminated Pods:          (7 in total)
  Namespace                   Name                                              CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                              ------------  ----------  ---------------  -------------  ---
  kube-system                 calico-node-gn6pc                                 250m (0%)     0 (0%)      0 (0%)           0 (0%)         164d
  kube-system                 filebeat-4pft9                                    100m (0%)     4 (3%)      100Mi (0%)       2Gi (0%)       19d
  monitoring                  node-exporter-fzflb                               112m (0%)     270m (0%)   200Mi (0%)       220Mi (0%)     167d
  nimbus                      caposv2-3606523f4cd84c25b392d8a66d12647d-lx2t9    8 (7%)        0 (0%)      64Gi (8%)        64Gi (8%)      6h26m
  nimbus                      caposv2-7864b8706ee74e3fb3fdec4529587196-7jxwt    16 (15%)      0 (0%)      180Gi (23%)      180Gi (23%)    83m
  rook-ceph-admin             csi-cephfsplugin-8xnls                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         167d
  rook-ceph-admin             csi-rbdplugin-p2gxm                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         167d

I can also see log output when I run kubectl logs filebeat-4pft9 -n kube-system, and I see no errors in it. Here is part of the output:

2024-01-09T16:41:41.517Z    INFO    [monitoring]    log/log.go:144  Non-zero metrics in the last 30s    {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":8}},"cpuacct":{"total":{"ns":22051632}},"memory":{"mem":{"usage":{"bytes":-155648}}}},"cpu":{"system":{"ticks":5260,"time":{"ms":2}},"total":{"ticks":14150,"time":{"ms":23},"value":14150},"user":{"ticks":8890,"time":{"ms":21}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"95411b4f-6455-4547-be4d-42d3b86bd48c","uptime":{"ms":33210056}},"memstats":{"gc_next":21179168,"memory_alloc":10662112,"memory_total":930745080,"rss":69300224},"runtime":{"goroutines":23}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"scans":3},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.78,"15":0.74,"5":0.64,"norm":{"1":0.0075,"15":0.0071,"5":0.0062}}}}}}
2024-01-09T16:42:11.517Z    INFO    [monitoring]    log/log.go:144  Non-zero metrics in the last 30s    {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":12}},"cpuacct":{"total":{"ns":8068913}},"memory":{"mem":{"usage":{"bytes":-163840}}}},"cpu":{"system":{"ticks":5260,"time":{"ms":6}},"total":{"ticks":14160,"time":{"ms":8},"value":14160},"user":{"ticks":8900,"time":{"ms":2}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"95411b4f-6455-4547-be4d-42d3b86bd48c","uptime":{"ms":33240056}},"memstats":{"gc_next":21179168,"memory_alloc":11488064,"memory_total":931571032,"rss":69300224},"runtime":{"goroutines":23}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"scans":3},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.86,"15":0.75,"5":0.67,"norm":{"1":0.0083,"15":0.0072,"5":0.0064}}}}}}
2024-01-09T16:42:41.517Z    INFO    [monitoring]    log/log.go:144  Non-zero metrics in the last 30s    {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":9}},"cpuacct":{"total":{"ns":6832351}}},"cpu":{"system":{"ticks":5270,"time":{"ms":6}},"total":{"ticks":14170,"time":{"ms":8},"value":14170},"user":{"ticks":8900,"time":{"ms":2}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"95411b4f-6455-4547-be4d-42d3b86bd48c","uptime":{"ms":33270057}},"memstats":{"gc_next":21179168,"memory_alloc":12444280,"memory_total":932527248,"rss":69300224},"runtime":{"goroutines":23}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"scans":3},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.59,"15":0.73,"5":0.62,"norm":{"1":0.0057,"15":0.007,"5":0.006}}}}}}
2024-01-09T16:43:11.517Z    INFO    [monitoring]    log/log.go:144  Non-zero metrics in the last 30s    {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":10}},"cpuacct":{"total":{"ns":8659954}}},"cpu":{"system":{"ticks":5280,"time":{"ms":6}},"total":{"ticks":14180,"time":{"ms":7},"value":14180},"use^Z
[1]+  Stopped                 kubectl logs filebeat-4pft9 -n kube-system
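One detail in these monitoring lines stands out to me: "filebeat":{"harvester":{"open_files":0,"running":0}} means Filebeat on this node is running but has no log files open at all. Since each line's payload is JSON, the counters can be pulled out programmatically; this is a sketch against a trimmed sample payload (values copied from the logs above):

```javascript
// Trimmed sample of one monitoring line's JSON payload (reduced to the harvester counters).
var sample = '{"monitoring":{"metrics":{"filebeat":{"harvester":{"open_files":0,"running":0}}}}}';
var harvester = JSON.parse(sample).monitoring.metrics.filebeat.harvester;

// open_files === 0 and running === 0 means no log file is being harvested on this node.
console.log("open_files:", harvester.open_files, "running:", harvester.running);
```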

But I could not see logs from this node arriving in the Kafka topic specified in the filebeat-daemonset.yaml file.
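One thing I thought worth double-checking is whether the exclude_files patterns in the ConfigMap match more filenames than intended on the affected nodes. The patterns can be exercised locally with plain Node.js; the sample filenames below are made up:

```javascript
// The exclude_files patterns from the ConfigMap above.
var exclude = [/.*_kube-system_.*/, /.*_monitoring_.*/, /.*_argo_.*/];

// Hypothetical filenames as they would appear under /var/log/containers/.
var files = [
  "myapp-123_default_app-abc.log",
  "coredns-456_kube-system_coredns-def.log"
];

// Keep only files matching none of the exclude patterns, as Filebeat would.
var kept = files.filter(function (f) {
  return !exclude.some(function (re) { return re.test(f); });
});
console.log(kept);
```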

What do I need to look into, and where, to get logs from all the nodes captured in the Kafka topic? What needs to be fixed?
