First of all, I am not looking for a product recommendation (well, I am, but it's more that I'm looking for a solution to a problem rather than trying to pick something from an existing list).
The environment: I am using Jenkins to run a complex, multi-stage job. The job creates and provisions servers, deploys the OS and the application stack on them, populates the stack with demo data, runs standard tests, then some additional negative tests, then scales the stack with some more provisioned servers and runs some more tests.
The problem: A failure might occur at any one of these stages, ranging from a mere reporter misconfiguration to the complete failure of a step. Either way, I can do one of two things in Jenkins: exit(1), or keep going and dump the problem to the console. Neither is great, because I end up with either a broken, incomplete job (if I exit) or a nice green false positive, where the only way to confirm it's actually green is to read a few megs of log dumps from the console.
What I'm looking for: A way for Jenkins, or some other system, to monitor and report the success/failure of each step in a job separately while keeping the job in one piece (i.e. I cannot start a new job for each step). That way I could see, for example, that provisioning and OS deployment worked, that the stack deployed with errors but completed, and that the negative tests failed, instead of the current single green/red status per run with no detail unless I go in and read the logs.
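To make the desired behaviour concrete, here is a rough sketch in Python of the kind of per-step bookkeeping I have in mind. It is not tied to Jenkins or any plugin; the stage names, the `StageWarning` exception, and the `run_stages` helper are all made up for illustration. The idea is only that every step runs, each one gets its own status, and the overall exit code is derived at the end.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Status(Enum):
    OK = "ok"
    WARN = "completed with errors"
    FAILED = "failed"


class StageWarning(Exception):
    """Hypothetical: raised by a stage that finished but hit non-fatal errors."""


def run_stages(stages: List[Tuple[str, Callable[[], None]]]) -> Dict[str, Status]:
    """Run every stage in order and record one status per stage.

    A failing stage does not stop the run; its status is recorded instead.
    """
    results: Dict[str, Status] = {}
    for name, stage in stages:
        try:
            stage()
            results[name] = Status.OK
        except StageWarning:
            results[name] = Status.WARN
        except Exception:
            results[name] = Status.FAILED
    return results


def exit_code(results: Dict[str, Status]) -> int:
    """Return 0 only if every stage is OK, so a green run is really green."""
    return 0 if all(s is Status.OK for s in results.values()) else 1


# Placeholder stages standing in for the real provisioning/deploy/test steps.
def provision() -> None:
    pass


def deploy_stack() -> None:
    raise StageWarning("reporter misconfigured")


def negative_tests() -> None:
    raise RuntimeError("negative test failed")


if __name__ == "__main__":
    results = run_stages([
        ("provision", provision),
        ("deploy_stack", deploy_stack),
        ("negative_tests", negative_tests),
    ])
    for name, status in results.items():
        print(f"{name}: {status.value}")
    raise SystemExit(exit_code(results))
```

What I am after is essentially this summary, but surfaced in the job's UI per step rather than buried in the console output.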
Any suggestions are welcome, whether it's a Jenkins plugin I'm not aware of, another system, or anything else.