I have a Linux-based web infrastructure consisting of 15 virtual machines and over 50 different services. It is fully controlled by Chef. Most of the services are developed in-house.
Currently the deployment process is triggered by a shell script. A build system (a mix of Python and shell scripts) packages the services as .deb files and puts those packages into a repository. It then runs `apt-get update` on all 15 nodes, because the standard Chef `apt` cookbook only refreshes the APT cache once a day and we definitely do not want to run `apt-get update` unconditionally on every `chef-client` wake-up. Finally, the build system restarts the `chef-client` daemons on all 15 nodes (we need this step because of Chef's pull-based nature).
The current process has a number of drawbacks we want to address. First, it is asynchronous: the deployment script neither waits for the Chef clients to finish their runs nor checks the `chef-client` logs afterwards, so we do not even know whether a deployment succeeded. Second, we would rather not force `chef-client` restarts on all nodes, since we usually deploy only a small number of packages. And third, I am not quite sure that using `chef-client` for deployment is legitimate at all; perhaps we have been doing it wrong from the start.
Please share your thoughts/experience.
I don't think you need to restart the client; `chef-client --once` should be sufficient. Also, if I were you, I'd craft a data bag that marks which packages need to be deployed and base the `apt-get` runs on that bag's data.
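For illustration, here is a minimal recipe sketch along those lines. The data bag name (`deploy`), the item name (`pending`) and its layout are assumptions, not anything from your setup:

```ruby
# Sketch only: bag/item names and layout are assumptions.
# Assumes a data bag "deploy" with an item "pending" shaped roughly like:
#   { "id": "pending",
#     "packages": { "my-service": "1.2.3-1", "other-service": "0.9.0-1" } }

pending = data_bag_item('deploy', 'pending')['packages'] || {}

# Refresh the APT cache only when there is actually something to deploy,
# rather than on every chef-client wake-up.
execute 'apt-get update' do
  not_if { pending.empty? }
end

# Install (or upgrade to) exactly the versions marked in the data bag.
pending.each do |pkg_name, pkg_version|
  package pkg_name do
    version pkg_version
    action :install
  end
end
```

Your build system would then only have to update that data bag item (e.g. with `knife data bag from file`) and trigger `chef-client --once` on the affected nodes, instead of restarting the daemons everywhere.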
As far as success/failure reporting goes, what you want is a Chef report handler that sends the outcome of each run back to some central point of aggregation.
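A minimal sketch of such a handler, assuming a hypothetical HTTP collection endpoint, could look like this:

```ruby
require 'chef/handler'
require 'net/http'
require 'json'
require 'uri'

# Sketch only: the endpoint URL is an assumption.
class DeployReportHandler < Chef::Handler
  ENDPOINT = URI('http://deploy-monitor.example.com/chef-runs')

  # Chef invokes #report at the end of a run when the handler is registered.
  def report
    payload = {
      node:         run_status.node.name,
      success:      run_status.success?,
      elapsed_time: run_status.elapsed_time,
      updated:      (run_status.updated_resources || []).map(&:to_s),
      error:        run_status.exception && run_status.exception.message
    }
    # Post the run summary to the central aggregation point.
    Net::HTTP.post(ENDPOINT, payload.to_json, 'Content-Type' => 'application/json')
  end
end
```

You would register it on each node, e.g. via the chef_handler cookbook, or by requiring the file in client.rb and adding it to both `report_handlers` and `exception_handlers` so you hear about failed runs as well as successful ones.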