My problem: cfengine3 policies are manually created/updated on the server (policy hub), and the clients regularly (roughly every 5 minutes) pull them from server:/var/cfengine/masterfiles to their respective someclient:/var/cfengine/inputs (as they should).
But this sometimes behaves inconsistently. An updated file on the server may not be reflected on the clients until a good while later. It can take 30 minutes or more until a client suddenly "sees" the update. This happens especially if what I have created/updated is in a subdirectory under ./masterfiles.
I have checked with tcpdump that each client is in fact communicating with the master server via the cfengine port (5308) every 5 minutes.
I cannot see why the policy files are not updated at those times.
Has anyone experienced the same, or does anyone have a suggestion? Thanks.
(Just upgraded to cfengine 3.3.1, mixed CentOS/Fedora isolated environment - the rest of the network runs happily on cf2).
David,
Are the updated files being copied, and the local cf-agent doesn't react to the changes? Or are the updated files not being copied until much later?
The one reason I can think of off the top of my head is that the clocks on the systems are out of sync. Check /var/cfengine/inputs/cf_promises_validated - this file is populated with the last time the promises were checked on the server, and the clients reload their local policies using this timestamp.
You may also want to post your question in the CFEngine Help forum, where it's bound to be seen by a larger number of CFEngine experts :)
David,
I also suspect clock skew, like Diego. Search for cf_promises_validated at http://cfengine.com/blog/cfengine-330-release-notes which may give you some resources. The key is the copy promises in your failsafe.cf.
You mentioned you just upgraded your CFEngine to 3.3.1.
In 3.3.1, /var/cfengine/masterfiles/cf_promises_validated contains a timestamp (in previous versions it was a blank file, I guess). This means you can change the way the file is compared from "mtime" to "digest" in your current failsafe.cf to avoid the system clock problem. See also /var/cfengine/share/CoreBase/failsafe.cf, where body copy_from u_rcp already uses "digest" for compare.
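To illustrate why switching from "mtime" to "digest" sidesteps clock skew, here is a minimal Python sketch. It is not CFEngine code, only a simplified model of the two comparison rules: the function names, file names, and the 30-minute offset are made up for the illustration, and the mtime rule is assumed to mean "copy only if the source looks newer".

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def needs_copy_mtime(source, dest):
    """Timestamp rule (assumed): copy only if the source looks newer."""
    return os.path.getmtime(source) > os.path.getmtime(dest)


def needs_copy_digest(source, dest):
    """Content rule: copy whenever the contents differ, whatever the clocks say."""
    return file_digest(source) != file_digest(dest)


def demo():
    """Simulate a client whose copy is stamped 30 minutes ahead of the hub."""
    with tempfile.TemporaryDirectory() as tmp:
        hub = os.path.join(tmp, "hub_copy")        # hypothetical file names
        client = os.path.join(tmp, "client_copy")
        with open(hub, "w") as fh:
            fh.write("new policy\n")
        with open(client, "w") as fh:
            fh.write("old policy\n")
        # The hub file is really newer, but the client's timestamp is
        # 1800 seconds in the "future" because its clock runs ahead.
        os.utime(hub, (1_000_000, 1_000_000))
        os.utime(client, (1_001_800, 1_001_800))
        return needs_copy_mtime(hub, client), needs_copy_digest(hub, client)


if __name__ == "__main__":
    print(demo())
```

In this model the timestamp rule refuses to copy until the hub's file time passes the client's, which would look like a delay of about the size of the skew, while the content rule copies immediately.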
Like the others, I suspect clock skew. Check your times. You can also delete the cf_promises_validated file in /var/cfengine/inputs on the remote agent. Then it will see that the cf_promises_validated file in /var/cfengine/masterfiles on the policy hub is different and will proceed with a full policy update.
As of 3.3.0 cf_promises_validated should contain a datetime stamp so that we aren't relying on proper time sync for policy updates.
Check your failsafe.cf if you are using the default bootstrap policy.
The policy as of 3.3.1 or 3.3.0 (I am unsure which version generated the failsafe I am looking at) has a promise with handle "check_valid_update" that uses u_rcp to update the cf_promises_validated file if necessary. If it repairs the promise, it raises the validated_updates_ready class/context, which the promise with handle update_files_inputs_dir requires before it updates the rest of the policy. Check in body copy_from u_rcp what is used for the compare attribute. If it's digest, then it should be using the content of the file instead of just the timestamps on it.
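The two-stage behaviour described above can be modelled roughly as follows. This is a Python sketch of the logic only, not the actual failsafe.cf: the function name, the dict representation, and the file names other than cf_promises_validated are hypothetical, and the marker is assumed to be compared by content.

```python
MARKER = "cf_promises_validated"


def run_update(hub_files, client_files):
    """Model one client update run; return True if the policy was refreshed.

    Both arguments are dicts mapping a relative path to file content.
    """
    # Stage 1 (like "check_valid_update"): compare only the marker file.
    if client_files.get(MARKER) == hub_files.get(MARKER):
        # The marker is unchanged, so "validated_updates_ready" is never
        # raised and the rest of the policy is not looked at.
        return False
    # Stage 2 (like "update_files_inputs_dir"): the marker was repaired,
    # so everything else is copied as well.
    client_files.update(hub_files)
    return True


if __name__ == "__main__":
    hub = {MARKER: "stamp-1", "promises.cf": "v1", "site/web.cf": "v1"}
    client = dict(hub)

    hub["site/web.cf"] = "v2"          # edit on the hub, marker not yet refreshed
    print(run_update(hub, client), client["site/web.cf"])

    hub[MARKER] = "stamp-2"            # hub revalidates and rewrites the marker
    print(run_update(hub, client), client["site/web.cf"])

    del client[MARKER]                 # removing the client's marker forces a sync
    print(run_update(hub, client))
```

Under these assumptions, an edit under masterfiles reaches the clients only once their copy of the marker differs from the hub's. That is why either a stalled marker comparison or a missing marker refresh would show up as a delayed update, and why deleting the client's cf_promises_validated forces a full copy.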