My lab has a few Docker containers as follows:
name | Docker image |
---|---|
Fluentd | fluent/fluentd:v1.16-1 |
Fluent-bit | cr.fluentbit.io/fluent/fluent-bit |
Loki | grafana/loki |
Grafana | grafana/grafana-enterprise |
Caddy | caddy:builder |
My goal is to collect Caddy logs and visualize them in Grafana.
Scenario: Fluent-bit tails the logs and sends them to Fluentd. Then Fluentd pushes the logs to Loki. My aim is to use Fluentd as the central log collector.
The problem is the parsing of those logs Grafana-side.
The Caddy logs are in (nested) JSON format. Sample:
{"level":"info","ts":1712949034.535184,"logger":"http.log.access.log1","msg":"handled request","request":{"remote_ip":"172.18.0.1","remote_port":"39664","client_ip":"172.18.0.1","proto":"HTTP/1.1","method":"POST","host":"grafana.darknet.com","uri":"/api/short-urls","headers":{"Content-Length":["580"],"Origin":["http://grafana.darknet.com"],"Content-Type":["application/json"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"],"Accept":["application/json, text/plain, */*"],"X-Grafana-Org-Id":["1"],"Connection":["keep-alive"],"Accept-Language":["en-US,en;q=0.5"],"Accept-Encoding":["gzip, deflate"],"Referer":["http://grafana.darknet.com/explore?schemaVersion=1&panes=%7B%22Efb%22:%7B%22datasource%22:%22f779c221-7bd2-468d-9f9c-96e069b869f8%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bjob%3D%5C%22caddy.log.loki%5C%22%7D%20%7C%20json%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22f779c221-7bd2-468d-9f9c-96e069b869f8%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-1m%22,%22to%22:%22now%22%7D%7D%7D&orgId=1"],"X-Grafana-Device-Id":["f343e938e74b3a57997faff69d24de8a"],"Cookie":[]}},"bytes_read":580,"user_id":"","duration":0.011267887,"size":72,"status":200,"resp_headers":{"X-Xss-Protection":["1; mode=block"],"Date":["Fri, 12 Apr 2024 19:10:34 GMT"],"Content-Length":["72"],"Server":["Caddy"],"Cache-Control":["no-store"],"Content-Type":["application/json"],"X-Content-Type-Options":["nosniff"],"X-Frame-Options":["deny"]}}
I have tried two different configurations so far:
- Have Fluent-bit send the logs to Fluentd, then have Fluentd forward them to Loki (labelled job=caddy.log). Schema: Caddy --> Fluent-bit --> Fluentd --> Loki
- Have Fluent-bit send the logs straight to Loki (labelled job=caddy.log.loki). Schema: Caddy --> Fluent-bit --> Loki
Here is my Fluent-bit config, which sends the logs to both Loki and Fluentd at the same time, with a different job label on each path:
[INPUT]
    Name tail
    Path /var/log/caddy/*.log
    Parser json
    Tag caddy.log
    Path_Key log_filename

# send logs to Fluentd
[OUTPUT]
    Name forward
    Host fluentd
    Port 24224
    Match caddy.*

# send logs straight to Loki
[OUTPUT]
    name loki
    match caddy.*
    host loki
    port 3100
    labels job=caddy.log.loki
Fluentd config:
<source>
  @type forward
</source>

<match caddy.*>
  @type loki
  url "http://loki:3100"
  extra_labels {"job": "caddy.log"}
  <buffer>
    flush_interval 5s
    flush_at_shutdown true
  </buffer>
</match>
Then in Grafana I can browse the logs and I have the two labels available in the Explore window.
If I choose the label job=caddy.log.loki, the logs are displayed in plain JSON. I can parse them with this expression: {job="caddy.log.loki"} | json. Some of the nested JSON is extracted, e.g. request_client_ip, but not all of it: request.headers, for example, is missing, but I can live with that.
If I choose the label job=caddy.log, the logs are displayed in a "mixed" format. It appears that some transformation took place, but I am not sure where. I can use logfmt to parse the lines, but I am still left with some unparsed fields (request, resp_headers).
Questions:
- Why is it that the logs are no longer rendered in plain JSON once I add the Fluentd step?
- What would be the best way to ship and parse nested JSON logs in Loki/Grafana with Fluentd?
According to the Fluentd Loki output plugin docs, the default line_format is key_value. You did not specify the format in your Fluentd configuration, so the logs are sent not as JSON but in <key>=<value> format. Setting line_format json in the <match> block should keep each line as JSON.

You can try adding a Nest filter to your Fluent-bit configuration.
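The original filter snippet did not survive here, so the following is only a sketch of what a Nest filter for these logs might look like, assuming the goal is to lift the keys under request to the top level. The request_ prefix is an illustrative choice, not something taken from the question.

```
# Assumed example: lift the keys nested under "request" to the top level
[FILTER]
    Name          nest
    Match         caddy.*
    Operation     lift
    Nested_under  request
    Add_prefix    request_
```

With a filter like this, a field such as request.remote_ip would arrive as request_remote_ip. The lift operation only goes one level deep, so request.headers would still be a nested map (request_headers) unless a second filter lifts it as well.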
Note that the LogQL json parser without parameters skips arrays (https://grafana.com/docs/loki/latest/query/log_queries/#json), so if you want fields that contain arrays you have to specify them in the parameters.
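As a sketch of what that can look like for the Caddy sample in the question: the label names user_agent and req_headers are made up for illustration, and the expressions assume the original nested structure (the direct Fluent-bit to Loki path, job=caddy.log.loki). The chained key-then-index access in the first expression is my reading of the json parser's expression syntax, so treat it as something to verify in Explore.

```
{job="caddy.log.loki"} | json user_agent="request.headers[\"User-Agent\"][0]", req_headers="request.headers"
```

The first expression should pick the first element of the User-Agent array. The second should return the whole headers object, which Loki assigns to the label as a JSON string.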