I'm trying to centralise logging in an environment that uses multiple application technologies (Java, Rails and various DBs).
We want developers to bring up stacks with Docker Compose, but we want them to refer to a central log source (ELK) to debug issues, rather than trying to open shells into running Docker containers.
The applications all write to the file system rather than to STDOUT/STDERR, which removes all of the options associated with the Docker logging driver, and logspout too.
What we have done is configure the containers to have rsyslog include the application log files and forward those to logstash, which has a syslog input. This works in terms of moving the logs from A to B, but managing multi-technology logs in ELK based on the syslog input is horrible (e.g. trying to capture multiline Java stack traces, or MySQL slow queries).
Is there a better way to do this? Should I be running logstash in each container, so that I can apply filters and codecs directly to the log files, so that I don't have to rely on the syslog input?
Or is there some way to use the Docker logging driver with application log files that are written to the file system?
Recent versions of Docker support transmitting logs in 'GELF' format to a network port. Logstash has a GELF input. You could run Logstash on every node and have all Docker instances on the node forward to it.
As a Logstash input: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-gelf.html
For Docker output: https://docs.docker.com/engine/admin/logging/overview/#gelf
(The gelf-address is resolved from outside the container, i.e. from the Docker host's perspective, not from inside it.)
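To make it clearer what ends up at the Logstash GELF input, here is a rough Python sketch that builds a GELF 1.1 message by hand and sends it over UDP. The function names, the `container_name` field and the `127.0.0.1:12201` address are my own assumptions for illustration (12201 is the port commonly used for GELF); Docker's driver builds its own payload, so treat this as a way to test your Logstash input rather than a copy of what Docker sends.

```python
import gzip
import json
import socket
import time


def build_gelf_datagram(host, short_message, level=6, **extra):
    """Return a gzip-compressed GELF 1.1 payload as bytes."""
    record = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 = informational
    }
    # GELF expects additional (non-standard) fields to start with an underscore.
    for key, value in extra.items():
        record["_" + key] = value
    return gzip.compress(json.dumps(record).encode("utf-8"))


def send_gelf(datagram, address=("127.0.0.1", 12201)):
    """Send one unchunked datagram over UDP (the address is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, address)
```

Calling `send_gelf(build_gelf_datagram("web-1", "hello", container_name="rails"))` from the Docker host should produce one event in Logstash if the GELF input is listening on that address.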
You could also configure Logstash to parse the various JSON log files Docker produces by default.
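For reference, with the default json-file driver each line of those files is a JSON object with `log`, `stream` and `time` keys. A minimal Python sketch of reading one such line follows; the sample line and the function name are made up for illustration. Keep in mind these files only contain what the container wrote to STDOUT/STDERR, so this does not help on its own with applications that log to the file system.

```python
import json


def parse_docker_json_line(line):
    """Parse one line of a json-file log into (time, stream, message)."""
    entry = json.loads(line)
    # Docker keeps the trailing newline of each message inside the "log" value.
    return entry["time"], entry["stream"], entry["log"].rstrip("\n")


# Hypothetical sample line, in the shape the json-file driver writes.
sample = '{"log":"Started GET /health\\n","stream":"stdout","time":"2016-05-01T12:00:00.000000000Z"}'
print(parse_docker_json_line(sample))
```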
Another approach is to use what is called a sidecar in Kubernetes.
They provide a few different examples in their cluster logging concepts page.
How you choose to apply that concept is entirely dependent on your needs.
However, a simple proof of concept might work by:
You could also of course set up a central syslog listener (using logstash, or rsyslog for example), and do this without a sidecar.
This approach is also very much in the same vein as @Jason Martin's suggestion to use GELF.
Another use of a local sidecar might be to create a container running logstash with a file input that exposes a log volume (e.g. /var/log/, or /logs). You could then share that volume with other containers, to allow them to write their logs (e.g. /logs/$INSTANCE_ID/file.log), and have logstash parse them.
This last setup allows monitoring files rather than STDOUT/STDERR, but you will probably have to make your log directory chmod 1777 (or have several such sidecars).
The 'reverse' setup would also work, of course (but seems harder to manage/maintain): have your application containers expose a log volume, and have a logstash sidecar read the log volume's content.
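Whichever way round the volume is shared, the main benefit of reading the files directly is that multiline events (such as the Java stack traces mentioned in the question) can be reassembled before they are shipped. Logstash has a multiline codec for this; the Python sketch below only illustrates the idea. It assumes every new event starts with a timestamp like `2016-05-01 12:00:00`, which is a guess about your log format, and the names are hypothetical.

```python
import re

# Assumption: each new event starts with an ISO-style date and time.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def group_events(lines):
    """Yield complete events, folding continuation lines into the previous one."""
    buffer = []
    for line in lines:
        line = line.rstrip("\n")
        # A line that looks like a new event flushes whatever was buffered.
        if EVENT_START.match(line) and buffer:
            yield "\n".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


# Hypothetical excerpt from a file such as /logs/$INSTANCE_ID/file.log
sample = [
    "2016-05-01 12:00:00 ERROR Request failed\n",
    "java.lang.IllegalStateException: boom\n",
    "\tat com.example.Foo.bar(Foo.java:42)\n",
    "2016-05-01 12:00:01 INFO Recovered\n",
]
events = list(group_events(sample))
```

Here the three lines of the stack trace end up in a single event, and the INFO line becomes a second one. The same start-of-event pattern is roughly what you would hand to the multiline codec on a file input.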