I am trying to set up a Python script to query a DB for all updates over the past 5 minutes. Given the number of updates, it must be as precise as possible (our records are timestamped to the microsecond). There are two ways I'm considering tackling this. The first is using cron. However, this relies on cron being precise enough to run at exactly the same interval every time. So if the first execution time is 00:00:00.123456, it would need to run again at 00:05:00.123456. Otherwise, records could be missed in the gaps in between.
The other option is finding a way to "snap" the SQL query to the nearest minute, rounded down. But if I can use cron, I'd prefer to do that to keep things as simple as possible.
I've had cron scripts that, for one reason or another, needed to output the time, and I have sometimes seen them run a second late (I only ever tracked it down to the second). I do not know if this is due to cron or to variations in how long it takes to load and execute the script. I imagine it is a little of both. Either way, relying on the SQL query to execute at the exact same microsecond is not going to work.
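You can have your Python script get the current time and then round it down to the nearest 5-minute boundary. For example, if it retrieves the current time as 00:05:03.123, drop the seconds (and any minutes past a multiple of 5) and query from 00:00:00 up to, but not including, 00:05:00. With a half-open range like that, a record stamped exactly 00:05:00 is picked up by the next run, so it is neither missed nor counted twice.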
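Here is a rough sketch of that rounding, in case it helps. The function name last_complete_window, the table name updates, and the column updated_at are all made up for illustration, and the %s placeholder style depends on your DB driver, so treat this as a starting point and not a drop-in solution.

```python
from datetime import datetime, timedelta

def last_complete_window(now, minutes=5):
    """Return (start, end) of the most recent fully elapsed window."""
    # Floor the minute to a multiple of `minutes` and zero out
    # the seconds and microseconds.
    end = now.replace(
        minute=now.minute - now.minute % minutes,
        second=0,
        microsecond=0,
    )
    start = end - timedelta(minutes=minutes)
    return start, end

# A cron job would pass datetime.now(); a fixed time is used here
# so the result is easy to follow.
start, end = last_complete_window(datetime(2024, 1, 1, 0, 5, 3, 123000))

# Half-open range: start is included, end is excluded.
# Hypothetical table and column names; adjust to your schema.
query = "SELECT * FROM updates WHERE updated_at >= %s AND updated_at < %s"
params = (start, end)
```

As long as cron fires some time inside each 5-minute slot, every run covers exactly one slot, no matter how late the script actually starts.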
Either that, or you could record the timestamp of the last record you retrieved, and then next time query from the current time back to that timestamp.
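A minimal sketch of the last-timestamp idea follows, using an in-memory SQLite table so it runs on its own. The table updates, the column updated_at, and the helper fetch_new_rows are hypothetical names, and the timestamps are assumed to be stored as fixed-width text so that they sort correctly. Instead of bounding the query by the current time, this version simply asks for everything newer than the stored timestamp.

```python
import sqlite3

def fetch_new_rows(conn, last_seen):
    """Return rows newer than `last_seen` and the updated last-seen timestamp."""
    rows = conn.execute(
        "SELECT id, updated_at FROM updates "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    if rows:
        # The newest row retrieved becomes the starting point for the next run.
        last_seen = rows[-1][1]
    return rows, last_seen

# Demo data standing in for the real DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE updates (id INTEGER PRIMARY KEY, updated_at TEXT)")
conn.executemany(
    "INSERT INTO updates (updated_at) VALUES (?)",
    [("2024-01-01 00:00:01.000001",), ("2024-01-01 00:04:59.999999",)],
)

# First run: start from a timestamp older than any record.
first_batch, mark = fetch_new_rows(conn, "1970-01-01 00:00:00.000000")

# A new record arrives before the second run.
conn.execute(
    "INSERT INTO updates (updated_at) VALUES (?)",
    ("2024-01-01 00:05:00.000002",),
)
second_batch, mark = fetch_new_rows(conn, mark)
```

Since each cron run is a fresh process, the timestamp would need to be saved somewhere between runs, such as a small file or a one-row table. Also note that the strict > comparison could skip a record sharing the exact timestamp of the last one seen, or one committed late with an older timestamp, so check whether that can happen with your data.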
Edit: removed the first sentence of the second paragraph - it didn't really make sense there.