I am new to the Filebeat, Logstash, Kafka, and Elasticsearch stack used to capture and process logs. Recently I have been working with this Filebeat configuration in the filebeat-daemonset.yaml file on the master node of a cluster.
This is the configuration:
# default filebeat_expire_time is 30 days
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config.inputs:
      enabled: true
      path: /etc/filebeat.yml
      reload.enabled: true
      reload.period: 10s
    filebeat.registry.flush: 60s
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      exclude_files: [".*_kube-system_.*", ".*_monitoring_.*", ".*_argo_.*"]
      close_removed: false
    processors:
      - script:
          lang: javascript
          id: js_filter
          source: >
            function process(event) {
                // extract pod metadata
                var path = event.Get("log").file.path;
                var kube_uid = path.substring(20, path.length-4); // /var/log/containers/KUBE_UID.log
                event.Put("kube_uid", kube_uid);
                var segments = kube_uid.split("_");
                event.Put("kube_pod_name", segments[0]);
                event.Put("kube_namespace_name", segments[1]);
                event.Put("container_name", segments[2].substring(0, segments[2].length-65));
                event.Put("container_id", segments[2].substring(segments[2].length-64));
                // drop useless fields
                event.Delete("agent");
                event.Delete("ecs");
                event.Delete("host");
                event.Delete("log");
                event.Delete("input");
                // rename message fields
                var message = event.Get("message");
                event.Put("log", message);
                event.Delete("message");
                // convert timestamp field
                event.Put("time", event.Get("@timestamp"));
                event.Put("time_milli_long", Math.round(event.Get("@timestamp").UnixNano() / 1000000));
                event.Put("time_nano_long", event.Get("@timestamp").UnixNano());
                // add time filter
                if (event.Get("time_milli_long") < Date.now() - 43200 * 60 * 1000) {
                    event.Cancel();
                }
            }
      - add_fields:
          target: ""
          fields:
            host: ${NODE_NAME}
            es_index: "k8s_log_ai_prod"
    output.kafka:
      # initial brokers for reading cluster metadata
      hosts:
        - "kafka-las-share-01.server.hulu.com:6667"
        - "kafka-las-share-02.server.hulu.com:6667"
        - "kafka-las-share-03.server.hulu.com:6667"
        - "kafka-las-share-04.server.hulu.com:6667"
        - "kafka-las-share-05.server.hulu.com:6667"
      version: "0.10.2"
      # message topic selection + partitioning
      topic: "vai-caposv2-prod-las"
      worker: 8
      required_acks: 1
      compression: none
      max_message_bytes: 1000000
      bulk_max_size: 2048
      channel_buffer_size: 4096
      keep_alive: 60
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: Default
      tolerations:
      - operator: Exists
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.11.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            cpu: 4
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: varlib-mount-dockercontainers
          mountPath: /mnt/volume1/docker/docker/containers
          readOnly: true
        - name: varlib-mount-cricontainers
          mountPath: /mnt/volume1/docker/pods-log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
      - name: varlib-mount-dockercontainers
        hostPath:
          path: /mnt/volume1/docker/docker/containers
      - name: varlib-mount-cricontainers
        hostPath:
          path: /mnt/volume1/docker/pods-log
This filebeat-daemonset.yaml file already existed on the master node. I applied it with this command:
kubectl apply -f filebeat-daemonset.yaml
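To double-check that the rollout finished and that the pods actually picked up this ConfigMap, checks along these lines can be used (the pod name here is only an example, taken from the output further down):

# wait for the DaemonSet rollout to finish
kubectl rollout status daemonset/filebeat -n kube-system
# confirm the mounted config inside one of the pods matches the ConfigMap above
kubectl exec -n kube-system filebeat-4pft9 -- head -n 20 /etc/filebeat.yml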
After applying it, I can see the Filebeat pods running on all the nodes in the cluster:
vai.deploy@caposv2-prod-ai-master01:/opt$ kubectl get daemonset -n kube-system
NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
calico-node                      34        34        33      34           33          <none>          164d
filebeat                         34        34        33      34           33          <none>          23d
nvidia-device-plugin-daemonset   10        10        9       10           9           <none>          176d
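Note that the DaemonSet reports 33 of 34 Filebeat pods ready. A quick way to see which node the non-ready pod is scheduled on would be something like:

# list all Filebeat pods with the node each one runs on
kubectl get pods -n kube-system -l k8s-app=filebeat -o wide
# or show only the ones that are not Running
kubectl get pods -n kube-system -l k8s-app=filebeat -o wide | grep -v Running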
I can see logs being captured from one set of nodes, whose names look like 'ai-general-xx', but I cannot see logs from the other nodes, whose names look like 'vai-computing-con-xxx'.
I described one of these nodes using kubectl describe node vai-computing-con-012,
and I can see a Filebeat pod running on that vai-computing-con node:
Non-terminated Pods: (7 in total)
  Namespace          Name                                              CPU Requests   CPU Limits   Memory Requests   Memory Limits   AGE
  ---------          ----                                              ------------   ----------   ---------------   -------------   ---
  kube-system        calico-node-gn6pc                                 250m (0%)      0 (0%)       0 (0%)            0 (0%)          164d
  kube-system        filebeat-4pft9                                    100m (0%)      4 (3%)       100Mi (0%)        2Gi (0%)        19d
  monitoring         node-exporter-fzflb                               112m (0%)      270m (0%)    200Mi (0%)        220Mi (0%)      167d
  nimbus             caposv2-3606523f4cd84c25b392d8a66d12647d-lx2t9    8 (7%)         0 (0%)       64Gi (8%)         64Gi (8%)       6h26m
  nimbus             caposv2-7864b8706ee74e3fb3fdec4529587196-7jxwt    16 (15%)       0 (0%)       180Gi (23%)       180Gi (23%)     83m
  rook-ceph-admin    csi-cephfsplugin-8xnls                            0 (0%)         0 (0%)       0 (0%)            0 (0%)          167d
  rook-ceph-admin    csi-rbdplugin-p2gxm                               0 (0%)         0 (0%)       0 (0%)            0 (0%)          167d
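Since the DaemonSet mounts both the default Docker log locations and the /mnt/volume1 ones, it may also be worth checking directly on a vai-computing-con node where the container runtime actually writes its logs (this assumes SSH access to the node; the paths are the ones mounted in the manifest above):

# run on vai-computing-con-012 itself
ls -l /var/log/containers | head
ls -l /var/lib/docker/containers 2>/dev/null | head
ls -l /mnt/volume1/docker/docker/containers 2>/dev/null | head
ls -l /mnt/volume1/docker/pods-log 2>/dev/null | head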
I can also see the pod's logs when I execute this command: kubectl logs filebeat-4pft9 -n kube-system, and I see no errors in them. Here is a part of the logs:
2024-01-09T16:41:41.517Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":8}},"cpuacct":{"total":{"ns":22051632}},"memory":{"mem":{"usage":{"bytes":-155648}}}},"cpu":{"system":{"ticks":5260,"time":{"ms":2}},"total":{"ticks":14150,"time":{"ms":23},"value":14150},"user":{"ticks":8890,"time":{"ms":21}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"95411b4f-6455-4547-be4d-42d3b86bd48c","uptime":{"ms":33210056}},"memstats":{"gc_next":21179168,"memory_alloc":10662112,"memory_total":930745080,"rss":69300224},"runtime":{"goroutines":23}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"scans":3},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.78,"15":0.74,"5":0.64,"norm":{"1":0.0075,"15":0.0071,"5":0.0062}}}}}}
2024-01-09T16:42:11.517Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":12}},"cpuacct":{"total":{"ns":8068913}},"memory":{"mem":{"usage":{"bytes":-163840}}}},"cpu":{"system":{"ticks":5260,"time":{"ms":6}},"total":{"ticks":14160,"time":{"ms":8},"value":14160},"user":{"ticks":8900,"time":{"ms":2}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"95411b4f-6455-4547-be4d-42d3b86bd48c","uptime":{"ms":33240056}},"memstats":{"gc_next":21179168,"memory_alloc":11488064,"memory_total":931571032,"rss":69300224},"runtime":{"goroutines":23}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"scans":3},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.86,"15":0.75,"5":0.67,"norm":{"1":0.0083,"15":0.0072,"5":0.0064}}}}}}
2024-01-09T16:42:41.517Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":9}},"cpuacct":{"total":{"ns":6832351}}},"cpu":{"system":{"ticks":5270,"time":{"ms":6}},"total":{"ticks":14170,"time":{"ms":8},"value":14170},"user":{"ticks":8900,"time":{"ms":2}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":9},"info":{"ephemeral_id":"95411b4f-6455-4547-be4d-42d3b86bd48c","uptime":{"ms":33270057}},"memstats":{"gc_next":21179168,"memory_alloc":12444280,"memory_total":932527248,"rss":69300224},"runtime":{"goroutines":23}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0},"scans":3},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.59,"15":0.73,"5":0.62,"norm":{"1":0.0057,"15":0.007,"5":0.006}}}}}}
2024-01-09T16:43:11.517Z INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":10}},"cpuacct":{"total":{"ns":8659954}}},"cpu":{"system":{"ticks":5280,"time":{"ms":6}},"total":{"ticks":14180,"time":{"ms":7},"value":14180},"use^Z
[1]+ Stopped kubectl logs filebeat-4pft9 -n kube-system
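One detail that stands out in these metrics is filebeat.harvester.open_files: 0 and running: 0, i.e. this pod does not seem to be harvesting any files at all. A way to check whether the input glob matches anything inside the container would be something like (the pod name is taken from above):

# check whether the glob from filebeat.inputs matches anything inside the pod
kubectl exec -n kube-system filebeat-4pft9 -- ls /var/log/containers
# the entries there are usually symlinks; check that they resolve inside the container
kubectl exec -n kube-system filebeat-4pft9 -- sh -c 'ls -lL /var/log/containers/*.log | head'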
But I cannot see the logs from this node being delivered to the Kafka topic specified in the filebeat-daemonset.yaml file.
What and where do I need to look to get the logs from all nodes captured in the Kafka topic? What needs to be fixed?