I have an application creating reports that are centralized using syslog, and I'm storing them in a Postgres database for custom use.
The database has a specific format (the data I'm centralizing are in a kind of CSV, each column with a specific meaning). So far, so good: the data are correctly inserted into the database, in the correct format.
If a badly formatted message makes it to syslog (for instance, with text instead of an int), I get an insert error because the type is invalid, which is expected... BUT the following messages of the batch are also silently dropped (I believe it's because the insert is transactional):
Sep 2 16:10:45 my-computer postgres[7642]: [2-2] 2011-09-02 16:10:45 CEST STATEMENT: insert into RudderSysEvents (executionDate, nodeId, configurationRuleId, policyInstanceId, serial, Component, KeyValue, executionTimeStamp, eventType, msg, Policy) values ('2011-09-02T16:10:45.592739+02:00','bla', '' , '', '', '', '', '', '', '', 'sdfsfsf' )
Sep 2 16:10:45 nicolas-laptop postgres[7643]: [2-1] 2011-09-02 16:10:45 CEST ERROR: invalid input syntax for integer: "" at character 224
My question is: how can I avoid this?
I'm contemplating these solutions:
- Making the insert non-transactional on the rsyslog side, or using one-line transactions:
  $ActionQueueSize 1
  $ActionQueueType Direct
  $MainMsgQueueSize 1
  $MainMsgQueueType Direct
  But it didn't work (and I suspect it would be a performance killer).
- Checking the content of the fields with a regexp before inserting them.
  Well, it's a difficult task, especially since I'm checking by $programname and $msg, so I can't really use a regexp:
  if $programname startswith 'rudder' and $msg startswith ' R: @@' then
- Relaxing the constraints in the database, and then having either a trigger copy the valid data into another table, or a program parse this content and insert the relevant lines into a new table.
  Well, I'm not super keen on this solution.
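For the last option, the validation step of such a parsing program could look roughly like the Python sketch below. It is only an illustration: the field layout (seven `@@`-separated fields with an integer in fifth position, then a timestamp, then `##origin@#text`) is my reading of the filter quoted in the edit further down, and the function name `parse_report` and the dictionary keys are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

PREFIX = "R: @@"  # same prefix as the rsyslog 'startswith' test


def parse_report(msg: str) -> Optional[dict]:
    """Return the parsed fields, or None if the message is malformed."""
    msg = msg.strip()
    if not msg.startswith(PREFIX):
        return None
    # Everything after the first "@#" is the free-text part of the report.
    body, sep, text = msg[len(PREFIX):].partition("@#")
    if not sep:
        return None
    # "##" separates the @@-delimited fields from the trailing origin field.
    head, sep, origin = body.partition("##")
    if not sep or not origin:
        return None
    parts = head.split("@@")
    if len(parts) != 8 or not all(parts):
        return None
    *fields, stamp = parts
    # Assumption: the fifth field is the one stored in an integer column.
    if not re.fullmatch(r"[0-9]+", fields[4]):
        return None
    try:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S%z")
    except ValueError:
        return None
    return {
        "fields": fields,
        "serial": int(fields[4]),
        "timestamp": when,
        "origin": origin,
        "text": text,
    }
```

Only the lines for which this returns something other than `None` would be copied into the strictly typed table. Unlike the filter, the sketch does not check which characters each field contains.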
Oh, and I'm using rsyslog 4.6.4-2.
Thank you!
Edit: In the end, I worked around this problem by filtering on the message with a rather complex regular expression:
:msg, ereregex, "R: @@[a-zA-Z0-9\-]+?@@[a-zA-Z0-9_\-]{1,64}?@@[a-zA-Z0-9\-]+@@[a-zA-Z0-9\-]+?@@[0-9]+?@@[a-zA-Z0-9\-]+?@@[a-zA-Z0-9\-]+?@@[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}[+-][0-9]{2}:[0-9]{2}##[a-zA-Z0-9\-]+?@#.*" :ompgsql:localhost,rudder,rudder,Normation;RudderDbLinuxReportFormat
This is not really easy to maintain, but it works and hasn't broken since. Thank you for your suggestion, snap; I'll look into using a stored procedure in the not-so-distant future.
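One way to make a pattern like this less painful to maintain might be to assemble it from named pieces and test it offline against sample messages. Here is a minimal Python sketch of that idea. The names `TOKEN`, `TOKEN_64`, `INTEGER`, `TIMESTAMP` and `is_valid_report` are made up for the example, and the lazy `+?` quantifiers are written as plain `+`, which does not change whether a message matches in Python. Whether the generated pattern can be pasted back unchanged into the rsyslog filter depends on the regex dialect rsyslog uses, so that would need checking.

```python
import re

# Building blocks, mirroring the character classes of the rsyslog filter.
TOKEN = r"[a-zA-Z0-9-]+"
TOKEN_64 = r"[a-zA-Z0-9_-]{1,64}"
INTEGER = r"[0-9]+"
TIMESTAMP = (
    r"[0-9]{4}-[0-9]{2}-[0-9]{2} "
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}[+-][0-9]{2}:[0-9]{2}"
)

# The @@-separated part of a report, in order.
HEAD = [TOKEN, TOKEN_64, TOKEN, TOKEN, INTEGER, TOKEN, TOKEN, TIMESTAMP]

REPORT_RE = re.compile("R: @@" + "@@".join(HEAD) + "##" + TOKEN + "@#.*")


def is_valid_report(msg: str) -> bool:
    """True if the message contains a well-formed report (unanchored search)."""
    return REPORT_RE.search(msg) is not None


if __name__ == "__main__":
    # Print the assembled pattern so it can be compared with the filter.
    print(REPORT_RE.pattern)
```

A handful of known-good and known-bad messages run through `is_valid_report` can then act as a regression check whenever the format changes.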
Just an idea; I'm not sure whether it helps:
How about modifying the default SQL INSERT template so that it calls a PL/pgSQL stored procedure with enough logic to handle the garbled input? There is some information here: http://www.rsyslog.com/doc/ommysql.html (it is for MySQL, but it applies similarly to the PostgreSQL module).