Filtering CloudWatch logs for EKS containers
Reducing Kubernetes Logging Costs and Noise with Customized FluentBit Outputs: A Guide for EKS Users Sending Logs to CloudWatch
If you run containers on Amazon EKS, you are probably using Fluentd or FluentBit to send pod logs to CloudWatch. A common problem with this approach is a steep increase in logging costs, because the default CloudWatch agent and FluentBit Kubernetes setup ships all logs from all containers. Excluding logs by namespace or container is poorly documented, but by tweaking your input configuration you can cut logging costs and reduce log noise.
What is FluentBit and how does it relate to Kubernetes logging
FluentBit is a lightweight log processor from the same project family as Fluentd, the de facto logging agent for Kubernetes clusters. When running on Kubernetes, FluentBit reads container logs from the node's /var/log directory, which is mounted into the agent pod, and aggregates the results. It can then export logs to a number of logging destinations such as CloudWatch.
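As a rough illustration of how the agent gets access to those files, a FluentBit DaemonSet typically mounts the node's log directory with a hostPath volume. The fragment below is a sketch rather than the exact manifest from any particular install; the volume name varlog is an assumption.

```yaml
# Sketch of the relevant part of a FluentBit DaemonSet pod spec.
# The volume name "varlog" is illustrative, not taken from this article.
spec:
  template:
    spec:
      containers:
        - name: fluent-bit
          volumeMounts:
            # Node log directory, mounted read-only into the agent pod.
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
```

On Docker-based nodes the files under /var/log/containers are symlinks into /var/lib/docker/containers, so that directory is usually mounted in the same way.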
Installation and setup
Assuming you have already installed FluentBit on your AWS EKS Kubernetes cluster following the workshop instructions, we can inspect the configuration and customize it. The configuration is usually stored in a ConfigMap (named fluent-bit-config in the example below) under a key such as fluent-bit.conf. Here is the config from the MailSlurp cluster that runs our disposable email account services.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: amazon-cloudwatch
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush 5
        Log_Level info
        Daemon off
        Parsers_File parsers.conf
        HTTP_Server ${HTTP_SERVER}
        HTTP_Listen 0.0.0.0
        HTTP_Port ${HTTP_PORT}
        storage.path /var/fluent-bit/state/flb-storage/
        storage.sync normal
        storage.checksum off
        storage.backlog.mem_limit 5M
    @INCLUDE application-log.conf
  application-log.conf: |
    [INPUT]
        Name tail
        Tag application.*
        Exclude_Path /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path /var/log/containers/*
        Docker_Mode On
        Docker_Mode_Flush 5
        Docker_Mode_Parser container_firstline
        Parser docker
        DB /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit 50MB
        Skip_Long_Lines On
        Refresh_Interval 10
        Rotate_Wait 30
        storage.type filesystem
        Read_from_Head ${READ_FROM_HEAD}
    [FILTER]
        Name kubernetes
        Match application.*
        Kube_URL https://kubernetes.default.svc:443
        Kube_Tag_Prefix application.var.log.containers.
        Merge_Log On
        Merge_Log_Key log_processed
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
        Labels Off
        Annotations Off
    [OUTPUT]
        Name cloudwatch_logs
        Match application.*
        region ${AWS_REGION}
        log_group_name /aws/containerinsights/${CLUSTER_NAME}/application
        log_stream_prefix ${HOST_NAME}-
        auto_create_group true
        extra_user_agent container-insights
  parsers.conf: |
    [PARSER]
        Name docker
        Format json
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%LZ
    [PARSER]
        Name syslog
        Format regex
        Regex ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key time
        Time_Format %b %d %H:%M:%S
    [PARSER]
        Name container_firstline
        Format regex
        Regex (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%LZ
    [PARSER]
        Name cwagent_firstline
        Format regex
        Regex (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%LZ
Take note of the [INPUT] section in application-log.conf. Notice the following line:
Path /var/log/containers/*
This standard setup uses a wildcard pattern to match every container log file under /var/log/containers/, which is mounted inside the FluentBit agent. In the next section we modify the Path field or the Exclude_Path property to choose which containers are logged and to exclude namespaces or pods.
Excluding containers
The easiest way to exclude everything except the pods you want is to change the Path property in your input configuration to a comma-separated list of patterns for the container logs you want in the logging pipeline. Here is an example:
application-log.conf: |
  [INPUT]
      Name tail
      Tag application.*
      Exclude_Path /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
      Path /var/log/containers/my-container.log, /var/log/containers/my-other-container.log
      Docker_Mode On
      Docker_Mode_Flush 5
      Docker_Mode_Parser container_firstline
      Parser docker
      DB /var/fluent-bit/state/flb_container.db
      Mem_Buf_Limit 50MB
      Skip_Long_Lines On
      Refresh_Interval 10
      Rotate_Wait 30
      storage.type filesystem
      Read_from_Head ${READ_FROM_HEAD}
Note that the Path field now matches only the containers we want to log.
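Exact file names like those above rarely exist on a real node: the kubelet usually names the files in /var/log/containers as <pod-name>_<namespace>_<container-name>-<container-id>.log, so practical patterns tend to rely on wildcards. The fragment below is a sketch under that naming assumption; the namespace payments and the pod prefix noisy-worker are hypothetical names, not part of the MailSlurp setup.

```yaml
application-log.conf: |
  [INPUT]
      Name tail
      Tag application.*
      # Keep only logs from pods in the (hypothetical) "payments" namespace.
      Path /var/log/containers/*_payments_*.log
      # Within that namespace, skip pods whose name starts with the
      # (hypothetical) prefix "noisy-worker".
      Exclude_Path /var/log/containers/noisy-worker*_payments_*.log
      # Remaining tail settings (Parser, DB, Mem_Buf_Limit and so on)
      # stay as in the original [INPUT] section.
```

To drop a whole namespace while keeping the default wildcard Path, the same naming assumption suggests adding a pattern such as /var/log/containers/*_kube-system_*.log to Exclude_Path.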