I'm trying to choose a configuration management system for 500 to 2,000 geographically distributed hosts. Because network reliability varies, some hosts may be temporarily unavailable at any given time. For this reason, my initial choice was Chef: it uses a "pull" model, so when hosts come online and check in, they immediately get the current configuration.
However, if my hosts only poll the Chef server for new configuration every 30 minutes, rapid deployments are impossible. Also, I am not a Rubyist. I would prefer a push-based model, where I can push configuration to hosts as quickly as possible. The natural choices therefore seem to be Ansible or SaltStack (probably SaltStack). My question is: how do Ansible and SaltStack handle failed or down hosts? Is there some way to keep retrying a push indefinitely until a host comes back online? Are there existing patterns for handling eventual consistency of down hosts with either of these tools? Thanks!
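
To make the question concrete, this is roughly the behaviour I'm after, written as a plain Python sketch. It is not how Ansible or SaltStack actually work. `push_until_converged` and the `push` callback are hypothetical names standing in for whatever the tool does to apply configuration to a single host, and the 60-second retry interval is an arbitrary assumption.

```python
import time
from typing import Callable, Iterable, Set


def push_until_converged(
    hosts: Iterable[str],
    push: Callable[[str], bool],
    retry_interval: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Keep pushing to unreachable hosts until every host has succeeded.

    `push` is a hypothetical callback that returns True when the host
    accepted the configuration and False when it was down or failed.
    Returns the number of rounds that were needed.
    """
    pending: Set[str] = set(hosts)
    rounds = 0
    while pending:
        rounds += 1
        # Try every host that has not yet accepted the configuration.
        succeeded = {host for host in pending if push(host)}
        pending -= succeeded
        if pending:
            # Some hosts are still down; wait before the next attempt.
            sleep(retry_interval)
    return rounds
```

In other words, hosts that are up get the change right away, and hosts that are down are retried until they come back, so the fleet eventually converges without me tracking failures by hand.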