I want to accumulate logs on a central server. Typing appropriate keywords into Google turns up all sorts of sophisticated tools like Fluentd and ELK - but I don't want to aggregate the logs or provide analytics; I only want to get them off the origin server as quickly, efficiently, and reliably as possible.
Piping the application's output through logger would let me ship the data as syslog - but that means UDP (not reliable), and I can't see how to de-multiplex the streams from multiple servers at the log host.
Piping the data through netcat makes it easy to use TCP, so some sort of high-availability setup should be possible, and it has less overhead than sending each log entry as an HTTP request - but this feels a little ad hoc and too minimalist.
Writing the logs to an NFS share has a lot going for it, apart from the security aspect: the origin of the log entries has too much control over the log file. And since NFSv4 is not an option, high availability is difficult.
Has anyone got a more robust and efficient solution?
(The logs all originate from applications on Linux, and a single log entry may span multiple lines.)
What is your OS? The rsyslog daemon is included in most Linux distros and has the built-in ability to forward via UDP, TCP, and RELP (Reliable Event Logging Protocol).
Since you've already mentioned that UDP is out for reliability reasons, consider TCP or RELP.
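For example, a minimal forwarding rule on each origin server might look like the sketch below (RainerScript syntax; the loghost name and ports are placeholders I've made up). The disk-assisted queue spools entries locally while the loghost is unreachable, so nothing is lost:

    # /etc/rsyslog.d/forward.conf on each origin server (sketch)
    module(load="omrelp")    # RELP output module

    # Forward everything via RELP; queue to disk if the loghost is down.
    action(type="omrelp"
           target="loghost.example.com"
           port="2514"
           queue.type="LinkedList"
           queue.filename="fwd_loghost"      # enables disk-assisted queueing
           queue.saveOnShutdown="on"
           action.resumeRetryCount="-1")     # retry forever

    # Plain-TCP alternative (legacy syntax, no extra module needed):
    # *.* @@loghost.example.com:514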
If I understand what you are asking here, the answer is to use a template for the output path that includes the hostname of the sender. (Also index by date to make it easier to expire old files.) You end up with something like:
/var/log/loghost/$yyyy/$MM/$dd/$hostname/messages
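On the loghost side, that layout translates into a dynamic-file template, roughly like this sketch (the listener ports and base path are assumptions, adjust to taste). Binding the network inputs to their own ruleset keeps the remote logs out of the local /var/log files:

    # Central loghost config (sketch)
    module(load="imtcp")     # plain TCP input
    module(load="imrelp")    # RELP input

    # One "messages" file per day per sending host.
    template(name="PerHostFile" type="string"
             string="/var/log/loghost/%$year%/%$month%/%$day%/%hostname%/messages")

    ruleset(name="remote") {
        action(type="omfile" dynaFile="PerHostFile")
    }

    input(type="imtcp"  port="514"  ruleset="remote")
    input(type="imrelp" port="2514" ruleset="remote")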
For application logs, you can read them with rsyslog's input file module, imfile.
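A sketch of such an input is below (the file path, tag, and regex are made-up examples). Note that imfile's startmsg.regex is what handles the multi-line entries you mentioned: any line that doesn't match the start-of-message pattern is joined to the previous entry, so a multi-line entry travels as a single record.

    # imfile input on an origin server (sketch)
    module(load="imfile")

    input(type="imfile"
          File="/var/log/myapp/app.log"
          Tag="myapp:"
          Severity="info"
          Facility="local0"
          # New entries start with an ISO date; anything else is a
          # continuation line and gets appended to the previous message.
          startmsg.regex="^[0-9]{4}-[0-9]{2}-[0-9]{2}")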
There are some excellent cookbooks/examples at the maintainer's website. You seem to just need to be pointed in the right direction, so I haven't been too specific. I'd start here and come back with more specific questions if you get stuck.
https://www.rsyslog.com/