Let's say I bought two Intel Xeons and installed them into server-class hardware... If one CPU failed, would the other still function and pick up the slack, thereby providing fault tolerance?
This does not seem very likely, but I figured I would ask instead of making any assumptions.
In a normal dual-socket system, no, although there are servers that do permit hot-swapping of processors and RAM. So these things do exist, but they're at the very, very high end of the market.
It's not really a big deal - of everything in your server that can fail, the processor is right at the bottom of the list, next to those little brass risers that hold the motherboard off the chassis.
Talking about x86 commodity hardware: if a CPU fails while the system is running, things will normally grind to a halt. However, the system will function fine after a reboot, albeit somewhat slower.
Multiple CPUs are mostly there for parallel processing, not really for fault tolerance. But it's nice to have a system that still boots should a CPU (or more) fail.
I would say it's a bit more likely that your CPU fails than Mark Henderson suggests, but it is still very unlikely. In my experience it mostly happens when the system frequently overheats and shuts itself down (that's quite easy in a badly air-conditioned office server room). CPUs don't tend to like that a lot.
Of course if you had a nice IBM mainframe or similar, hot swapping a CPU (board) is "easy" enough.
If a CPU were to fail - which is extremely unlikely, per the other answers - there is basically nothing that the system could do to recover. Depending on the way it fails it could end up corrupting memory in strange ways, or destroying the process table, or who knows what else. If you were to have some sort of active monitoring system that keeps tabs on the CPU to make sure it's working well (and able to, say, roll back any changes made by the CPU during its death throes), that would also be another system that can fail, and determining software failure programmatically is pretty dang difficult (basically the only way you can practically do it is by having another CPU doing the exact same stuff at the exact same time and compare the results - which will then end up slowing things down such that there's no point to having another CPU to begin with).
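The "do the exact same stuff twice and compare" idea can be sketched in a few lines. This is only an illustration, assuming two plain function calls stand in for two CPUs; real lockstep hardware compares results cycle by cycle in silicon. The names `run_in_lockstep`, `LockstepMismatch`, `checksum` and `faulty_checksum` are made up for this example.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class LockstepMismatch(Exception):
    """Raised when the two redundant runs disagree."""


def run_in_lockstep(primary: Callable[[], T], replica: Callable[[], T]) -> T:
    # Assumption: each callable stands in for one CPU doing the same work.
    first = primary()
    second = replica()
    if first != second:
        # With only two copies we can detect a disagreement,
        # but we cannot tell which of the two is the broken one.
        raise LockstepMismatch(f"results differ: {first!r} vs {second!r}")
    return first


def checksum(data: bytes) -> int:
    # Hypothetical workload: a trivial 16-bit checksum.
    return sum(data) % 65536


def faulty_checksum(data: bytes) -> int:
    # Simulates a CPU that flips the lowest bit of its result.
    return checksum(data) ^ 1
```

Calling `run_in_lockstep(lambda: checksum(b"abc"), lambda: checksum(b"abc"))` returns the agreed value, while swapping in `faulty_checksum` for the replica raises `LockstepMismatch`. Note that both "CPUs" are spent on a single unit of work, which is the slowdown described above.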
That said, as rare as a CPU failure is, increasing the CPU count in a system will actually make your failure rate go up, as now you have twice as many things that can fail. You also have other subsystems that can fail, such as those which keep the CPUs' caches synchronized, and the increased power consumption and thermal output also contribute to overall system failure (and of course, active cooling fans are another point of failure).
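To put a rough number on "twice as many things that can fail", here is a small sketch. It assumes CPU failures are independent and ignores the extra subsystems mentioned above; the per-CPU failure probability used in the usage note is a made-up figure for illustration, not measured data.

```python
def prob_any_failure(p_single: float, n_cpus: int) -> float:
    """Probability that at least one of n_cpus fails.

    Assumes each CPU fails independently with probability p_single
    over the period of interest (an illustrative simplification).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if n_cpus < 0:
        raise ValueError("n_cpus must be non-negative")
    # P(at least one fails) = 1 - P(none fail)
    return 1.0 - (1.0 - p_single) ** n_cpus
```

With a hypothetical `p_single` of 0.001, `prob_any_failure(0.001, 1)` is 0.001 and `prob_any_failure(0.001, 2)` is about 0.002, so for small probabilities the chance of seeing some CPU fail roughly doubles with the second socket.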
You'll have to define exactly what kind of failures you want to handle. If we regard a collection of cores/CPUs/computers working together as a network, one type of failure is that a node simply stops answering. A much more severe failure is when a node starts to corrupt data and sends faulty information to the others. This is called a Byzantine failure, and in the worst case it's actively disrupting the operation of the network through strategic "lies". It's relatively easy to show that no system could handle a third or more of its nodes going Byzantine.
What you need to do is decide exactly what kind of failures you're expecting, design your system with that in mind, and accept the fact that the problem of handling an arbitrary number of malicious nodes is unsolvable. In your case, you need at least four CPUs to tolerate one of them being faulty.
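The "four CPUs for one faulty" figure follows from the classic bound for Byzantine agreement without signed messages: with f faulty nodes you need at least 3f + 1 nodes in total. A minimal sketch of that arithmetic, with function names invented for this example:

```python
def min_nodes_for_byzantine_faults(faulty: int) -> int:
    """Smallest node count that can tolerate `faulty` Byzantine nodes.

    Uses the classic n >= 3f + 1 bound (unsigned messages assumed).
    """
    if faulty < 0:
        raise ValueError("faulty must be non-negative")
    return 3 * faulty + 1


def max_tolerated_faults(nodes: int) -> int:
    """Largest number of Byzantine nodes that `nodes` nodes can tolerate."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Faulty nodes must be strictly fewer than a third of all nodes.
    return (nodes - 1) // 3
```

So `min_nodes_for_byzantine_faults(1)` gives 4, while `max_tolerated_faults(2)` gives 0: a dual-socket box cannot outvote even a single lying CPU.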
On a side note: in quantum physics there are no impossibilities, but if we have to wait longer than the age of the universe to statistically have a chance of observing a certain behavior, we don't have to say that it's possible. Keep that in mind when you design your system. ;)
CPU failure is mighty rare. A failure would probably result in other problems at the OS level. I would not think of this as any form of fault tolerance.
As the other answers say, it is very rare for a CPU to fail, and in the average server you can't hot-swap one. What you can probably do is run the server with one CPU until the failed one is replaced. Of course, this procedure is entirely offline, and you need to shut the server down to do it.