I'm trying to use Jenkins to run a Salt execution module command; if any minion fails to execute the command, I want the Jenkins job to fail. Jenkins just follows the general shell scripting practice of failing on a nonzero exit code, so to make it work Salt should too.
And that is where I'm stuck. Running something like this works as expected:
root@salt-master:~# salt --batch-size 1 --failhard -G 'ec2_roles:stage' cmd.run 'exit 0'
Executing run on ['stage-12']
jid:
20170209212325270060
retcode:
0
stage-12:
Executing run on ['stage-13']
jid:
20170209212325423735
retcode:
0
stage-13:
Executing run on ['stage-197']
jid:
20170209212325590982
retcode:
0
stage-197:
root@salt-master:~# echo $?
0
root@salt-master:~# salt --batch-size 1 --failhard -G 'ec2_roles:stage' cmd.run 'exit 1'
Executing run on ['stage-12']
{'stage-12': {'jid': '20170209212334018054', 'retcode': 1, 'ret': ''}}
ERROR: Minions returned with non-zero exit code.
root@salt-master:~# echo $?
1
But when I try to run an execution module like the following test:
# mymodule.py
from salt.exceptions import CommandExecutionError

def testfailure():
    raise CommandExecutionError('fail!')
I get the following result:
root@salt-master:~# salt --batch-size 1 --failhard -G 'ec2_roles:stage' mymodule.testfailure
Executing run on ['stage-12']
jid:
20170210023059009796
stage-12:
ERROR: fail!
Executing run on ['stage-13']
jid:
20170210023059179183
stage-13:
ERROR: fail!
Executing run on ['stage-197']
jid:
20170210023059426845
stage-197:
ERROR: fail!
root@salt-master:~# echo $?
0
I'm not sure how you handle errors in your module, but I'd like to shed some light on this.
There is a dunder dictionary available, __context__. When you run an execution module, the __context__ dictionary persists across all module executions until the modules are refreshed. State modules behave similarly. The dictionary can have a key, 'retcode', which seems to be the return code the Salt minion/client should return, and which you are missing. I can see it used in some execution modules. One example from the nspawn module:
Now, the bad news. I tested it on an old SaltStack release, 2015.8.12, and it more or less works, but only without raising an exception:
Executing the module returns an exit code higher than 0:
When you raise an exception, it stops working and always returns 0.
Executing the module returns an exit code equal to 0, although it shouldn't:
I also tested it on the latest available release, 2016.11.3, and the behavior is the same. IMO, this is a bug. I reported it here.
AFAIK, exit codes are a general issue with Salt. There is a set of tickets in their GitHub bug tracker about it. The best way I've seen to find out whether Salt states were applied successfully is the one used by salt-kitchen. In a nutshell, it is a simple wrapper around the salt command that greps the output for specific messages. The grep command is the following:
In your case you can also add a match on the string ERROR:. You probably also need to invert grep's exit code, since it is 0 when a match is found. You could do this with a simple trick explained in this question. So in the end your salt command may look like:
This will show the full output of salt, suppress grep's output, and return the inverted exit code of grep, meaning 1 if any of the error messages are found and 0 if not.
For anyone still trying to solve this, you can do the following:
salt '*' state.highstate --retcode-passthrough
or, locally on a minion (salt-call takes no target):
salt-call state.highstate --retcode-passthrough
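If the exit code still comes back as 0, as in the mymodule.testfailure run in the question, the output-matching approach described above can also be done in Python rather than with grep. This is only a rough sketch: the function names are made up for illustration, and the only marker it looks for is the ERROR: string visible in the output shown in the question, so other failure messages would have to be added by hand.

```python
import subprocess
import sys

# Assumption: "ERROR:" is the only marker checked, taken from the
# output shown in the question. Extend the tuple as needed.
ERROR_MARKERS = ("ERROR:",)


def exit_code_for(output, markers=ERROR_MARKERS):
    """Return 1 if any marker appears in the output, otherwise 0."""
    return 1 if any(marker in output for marker in markers) else 0


def run_and_check(argv):
    """Run a command, echo its combined output, and derive an exit code.

    A matched marker forces 1; otherwise the command's own exit code
    is passed through unchanged.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    sys.stdout.write(proc.stdout)
    return exit_code_for(proc.stdout) or proc.returncode
```

In a Jenkins shell step, a small script would call sys.exit(run_and_check([...])) with the full salt command line as the argument list, so the job fails whenever a marker is found or salt itself exits nonzero.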