I've been looking through RFC5424 for the formally specified marker that ends a syslog event.
Unfortunately, I couldn't find it. So if I wanted to implement a small syslog server that reacts to certain messages, what is the marker that ends a message? (Yes, an event is commonly a single line, but I couldn't find that in the specification.)
Clarification:
I call it an event because I associate a message with a single line. An event could be something like
Type: foo
Source: webservers
whereas a message to me is this:
Type: foo Source: webservers
https://www.rfc-editor.org/rfc/rfc5424#section-6 defines:
SYSLOG-MSG = HEADER SP STRUCTURED-DATA [SP MSG]
Neither STRUCTURED-DATA nor MSG tells me how these fields end. In particular, MSG is defined as MSG-ANY / MSG-UTF8, which expands to virtually anything. Nothing says that a newline marks the end (or an 8 or an a, for that matter). Given the example messages (section 6.5):
This is one valid message, or two valid messages, depending on whether you say that a HEADER element must never occur in any MSG element:
Literal whitespace (is the space between the first "-" and the second "<34>1" an end marker?):
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - <34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47
\t stands for a tab (is the tab an end marker?):
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 -\t<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47
\n stands for a newline (is the newline an end marker?):
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 -\n<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47
Either I'm misreading the RFC or it simply isn't mentioned. The sizes specified in the RFC only state the minimum message length that a receiver is expected to handle.
ANSWER?: Apparently I was reading the wrong RFC. One needs to go to the specific transport RFCs and follow those; https://www.rfc-editor.org/rfc/rfc5426#section-3.1 says it all for the UDP transport.
@joechip: Since your comments and answer led me to actually read a bit more in the transport RFCs, I'll be happy to accept your answer if you update it a bit in that direction :)
Well, what do you mean by "syslog event"? If you are referring to syslog messages, RFC5424 unambiguously defines the syslog message syntax in its section 6, as it is to be transmitted from one syslog application to another.
If you are referring to how they are stored in the log files by the receiving syslog application, typical syslog implementations simply separate one record from the next with a newline, and this is not usually a configurable behavior. Furthermore, a syslog record's text field can itself include newlines, which complicates the task of parsing the log file correctly. It can usually be parsed nonetheless, because each syslog record starts with the usual sequence of date, time, host and tag, while a newline inside a syslog record would not normally be followed by text resembling that sequence.
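The heuristic described above can be sketched as follows. This is only an illustration: it assumes the log file uses the traditional "Mmm dd hh:mm:ss " timestamp at the start of each record, and looks_like_record_start is a hypothetical name, not part of any syslog implementation.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical heuristic: does this line begin with "Mmm dd hh:mm:ss "?
 * Assumes the traditional timestamp layout; other layouts need other checks. */
bool looks_like_record_start(const char *line)
{
    static const char *const months[] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    bool month_ok = false;

    if (strlen(line) < 16)
        return false;

    for (size_t i = 0; i < 12; i++) {
        if (strncmp(line, months[i], 3) == 0) {
            month_ok = true;
            break;
        }
    }
    if (!month_ok || line[3] != ' ')
        return false;

    /* Day of month: two digits, or a space followed by one digit. */
    if (!(line[4] == ' ' || isdigit((unsigned char)line[4])))
        return false;
    if (!isdigit((unsigned char)line[5]) || line[6] != ' ')
        return false;

    /* Time of day: hh:mm:ss occupies positions 7 through 14. */
    for (int i = 0; i < 8; i++) {
        char c = line[7 + i];
        if (i == 2 || i == 5) {
            if (c != ':')
                return false;
        } else if (!isdigit((unsigned char)c)) {
            return false;
        }
    }
    return line[15] == ' ';
}
```

A log reader could call this on every line and treat a line for which it returns false as a continuation of the previous record. It remains a heuristic: a message body that happens to contain a timestamp-like line would still be split.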
I think that the ability to change the syslog stored-record separator would be a useful feature, but any occurrence of such a separator inside the record itself would have to be escaped for this to be useful. Adding so much structure to a plain text file is bound to be a compromise. If you care much about this issue, perhaps you should support writing to log files in some well-defined binary format (e.g., sqlite could be useful here).
Edit: A more careful examination of RFC5424 section 6 shows that a syslog message can have two forms:
HEADER SP STRUCTURED-DATA
or
HEADER SP STRUCTURED-DATA SP MSG
By expanding the ABNF specification, we can easily see that the first form ends in either "-" or "]". There could be other "-" and "]" chars before this final char, so it can't be taken for a syslog message terminator.
The ending of the second form depends on how MSG ends. MSG can be either a UTF-8 string (as specified in RFC 3629, which defines no string terminator) or an arbitrary octet stream ending in any value. Evidently, no termination symbol is specified for this form either.
But the fact is that there is no need for a syslog message terminator, no matter what form the message is in, because the message length is communicated out-of-band by the transport layer. When the UDP packet is sent by the application, the syslog message must already be prepared according to spec and stored in a buffer. This buffer is passed by the application to a function or method in order to send it, and the number of bytes to send is passed too. For example, in C we have:
In this example, len is the number of bytes that should be taken from the buffer buf and sent to the remote host.
Likewise, on the syslog server another function or method is called, such as this one:
This function returns the length in bytes of the UDP payload received in buffer buf. If the application reads past this returned length, it will get garbage (or a segmentation fault). To avoid reading beyond this limit, it is usual to put a null byte ('\0') at position buf[siz] right after the siz=recvfrom(...) call, which requires the buffer to be at least one byte larger than the maximum length passed to recvfrom. This way, any later function call that uses buf as a string will work properly. This null-termination only applies to strings, of course, and not to octet streams. And this null byte is, as I said, usually not transmitted over the network but only added by the receiving application.
In the case of the syslog server as a receiving application, most syslog servers might add this null-termination for their internal handling of the received string (if they treat it as a string at all), but in any case this null value is left out when the string is appended to the logfile so as not to disrupt text processing of the logfile as a whole.
In section 6.1 they define a message length. I would figure that when you get the complete message you'd have the header and data and it would add up to that length.
Beyond that, I see no facility in there for multiple messages. So I'd figure each message is an event. There is no multi-message tracking of any sort and no specified coding for start, middle, and ending messages. Syslog tracks logged messages, it doesn't really have a higher-level event concept.