How to keep the hostname added by Fluent Bit in event logs after parsing in Fluentd


I have configured Fluent Bit in an application pod to send the nginx access log to Fluentd, which is deployed as a separate pod. Fluentd is responsible for sending the logs to Elasticsearch. Before they go to Elasticsearch, I want each event to include the hostname of the pod where Fluent Bit is installed and shipping the logs from. However, the hostname is no longer present after parsing in Fluentd, and I don't know how to parse two key_name values together. Could anyone help me keep the hostname in the event logs? Thanks in advance.

Below are the Fluent Bit and Fluentd configurations.

This is the Fluent Bit configuration:

[SERVICE]
    # Flush
    # =====
    # set an interval of seconds before to flush records to a destination
    flush        1

    # Daemon
    # ======
    # instruct Fluent Bit to run in foreground or background mode.
    daemon       Off

    # Log_Level
    # =========
    # Set the verbosity level of the service, values can be:
    #
    # - error
    # - warning
    # - info
    # - debug
    # - trace
    #
    # by default 'info' is set, that means it includes 'error' and 'warning'.
    log_level    info

    # Parsers File
    # ============
    # specify an optional 'Parsers' configuration file
    parsers_file parsers.conf

    # Plugins File
    # ============
    # specify an optional 'Plugins' configuration file to load external plugins.
    plugins_file plugins.conf

    # HTTP Server
    # ===========
    # Enable/Disable the built-in HTTP Server for metrics
    http_server  Off
    http_listen  0.0.0.0
    http_port    2020

    # Storage
    # =======
    # Fluent Bit can use memory and filesystem buffering based mechanisms
    #
    # - https://docs.fluentbit.io/manual/administration/buffering-and-storage
    #
    # storage metrics
    # ---------------
    # publish storage pipeline metrics in '/api/v1/storage'. The metrics are
    # exported only if the 'http_server' option is enabled.
    #
    storage.metrics on

    # storage.path
    # ------------
    # absolute file system path to store filesystem data buffers (chunks).
    #
    # storage.path /tmp/storage

    # storage.sync
    # ------------
    # configure the synchronization mode used to store the data into the
    # filesystem. It can take the values normal or full.
    #
    # storage.sync normal

    # storage.checksum
    # ----------------
    # enable the data integrity check when writing and reading data from the
    # filesystem. The storage layer uses the CRC32 algorithm.
    #
    # storage.checksum off

    # storage.backlog.mem_limit
    # -------------------------
    # if storage.path is set, Fluent Bit will look for data chunks that were
    # not delivered and are still in the storage layer, these are called
    # backlog data. This option configure a hint of maximum value of memory
    # to use when processing these records.
    #
    # storage.backlog.mem_limit 5M



[INPUT]
    name tail
    tag  my.log
    path /opt/nginx/logs/access.log

[FILTER]
    Name record_modifier
    Match my.log
    Record hostname ${HOSTNAME}


[OUTPUT]
    Name          forward
    Match         my.log
    Host          fluentd.default.svc.cluster.local
    Port          24284
    Shared_Key    secret
    Self_Hostname flb.local
    tls           on
    tls.verify    off
    Time_as_Integer On

This is the Fluentd configuration:

<source>
  @type         secure_forward
  self_hostname myserver.local
  shared_key    secret
  secure no
  use_record_host true
</source>


<filter my.log>
  @type parser
  key_name log
  <parse>
    @type regexp
    #expression /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
    #time_format %d/%b/%Y:%H:%M:%S %z
    #expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?)?$/
    #time_format %d/%b/%Y:%H:%M:%S %z
    expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?)?$/
    time_format %d/%b/%Y:%H:%M:%S %z
  </parse>
</filter>






<match my.log>
  @type elasticsearch
  host es.xxx.internal
  port 80
  include_timestamp true
  index_name fluentd.nginx.access
  logstash_format false
</match>

This is the nginx access log line I am trying to parse:

10.x.x.1 - - [17/Apr/2023:05:21:40 +0000] "GET /site_status.json HTTP/1.1" 200 16 "-" "kube-probe/1.21"

How can I keep the hostname field that Fluent Bit adds to each event? I don't understand how to handle the two keys, log and hostname, while parsing. Could anyone help me configure that?
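
One approach that may work, offered as a sketch rather than a verified fix for this setup: by default, Fluentd's parser filter replaces the whole record with the fields parsed from key_name, which would explain why hostname disappears. Its reserve_data option keeps the fields already in the record alongside the parsed ones, so key_name can stay set to log alone. The remove_key_name_field line is optional and assumes the raw log string is not needed once parsing succeeds. The expression and time_format are unchanged from the configuration above.

```
<filter my.log>
  @type parser
  key_name log
  # keep fields already present in the record (such as hostname)
  reserve_data true
  # optional: drop the raw "log" string after a successful parse
  remove_key_name_field true
  <parse>
    @type regexp
    expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?)?$/
    time_format %d/%b/%Y:%H:%M:%S %z
  </parse>
</filter>
```

If this behaves as described, each event should carry the parsed fields (remote, method, path, code and so on) together with the hostname field set by the record_modifier filter in Fluent Bit. The parsed host capture uses a different key from hostname, so the two should not overwrite each other.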
