This is a multi-step question, so bear with me.
I am considering building a small (10-20 node) cluster. However, I want normal programs (not designed for clusters) to be able to take advantage of the extra processing power. Ideally, I would like to run a single hypervisor across the whole cluster. As far as I can tell, there is no good existing solution that can take a normal program and run it faster on a cluster.
Therefore, I am brainstorming how I would go about designing such a system, and whether it is feasible to attempt. The inherent problem with clustering seems to be that it takes more time to move data between nodes than it does to process it (e.g., it might take 2 seconds to transfer a problem from one node to another, but only 1 second to solve it on the first node). However, I have thought of a possible solution.
Let's say it is theoretically possible for all the nodes in a cluster to boot off of the same disk, so they all have direct access to identical data and identical programs. Secondly, let's suppose the Linux kernel could be modified to send each new command to a different slave node, cycling through all the nodes in round-robin fashion. Given these two conditions, a user could log into a terminal on the master node and run commands in a normal (non-cluster-oriented) way, while the load of those commands would be spread more or less evenly across the cluster.
So with that introduction, here are my two questions:
- Is it possible to create an environment in which all the computers boot off of a single disk (probably a NAS)? (I am aware of PXE, but as far as I can tell it does not provide persistent storage; it only serves the OS image.) If it is currently possible, how can it be done?
- Is it possible to modify the kernel to delegate each new command to a separate node? (It might be possible to do this by modifying the bash binary instead of the kernel itself - I am not sure.) If so, please expound on how.
That's the most complicated question I have ever tried to ask on Stack Exchange, so I expect people to have questions in the comments. However, if this solution could actually be implemented, it could potentially revolutionize virtualization.
Sure: a combination of PXE with an NFS-mounted share for persistent storage.
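As a rough sketch (the IP address, paths, and filenames below are made up for illustration, and this assumes a kernel/initramfs built with NFS-root support), a PXELINUX entry can point the kernel at an NFS root exported by the NAS:

```
# /srv/tftp/pxelinux.cfg/default on the PXE server (hypothetical layout)
DEFAULT linux
LABEL linux
    KERNEL vmlinuz
    APPEND initrd=initrd.img root=/dev/nfs nfsroot=192.168.1.10:/srv/nfsroot ip=dhcp rw
```

```
# /etc/exports on the NAS (192.168.1.10 is a made-up address)
/srv/nfsroot 192.168.1.0/24(rw,sync,no_root_squash,no_subtree_check)
```

One caveat: many nodes sharing a single writable root is itself a problem, so in practice each node usually gets its own root, or a read-only root plus per-node overlays.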
Yes.
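In fact, you would not even need to touch the kernel: a user-space wrapper can do it. Here is a minimal sketch of the round-robin idea, assuming passwordless SSH, made-up hostnames, and a shared NFS filesystem so paths resolve identically on every node:

```
#!/usr/bin/env bash
# rr-run: run the given command on the next node in round-robin order.
# Assumes passwordless SSH and an identical (NFS-shared) filesystem everywhere.
NODES=(node01 node02 node03)     # hypothetical hostnames
STATE=/tmp/rr-run.index          # remembers where the rotation left off

[ $# -gt 0 ] || { echo "usage: rr-run command [args...]" >&2; exit 64; }

idx=$(cat "$STATE" 2>/dev/null || echo 0)
node=${NODES[idx % ${#NODES[@]}]}
echo $(( (idx + 1) % ${#NODES[@]} )) > "$STATE"

# Re-run the command on the chosen node from the same working directory.
exec ssh "$node" "cd $(printf '%q' "$PWD") && $(printf '%q ' "$@")"
```

Invoked as `rr-run make`, `rr-run gzip bigfile`, and so on, each command lands on a different node. What it does not do is make any single command run faster.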
Simple, right? Not so much.
The question you should really be asking is: how much work would it take to do this?
Companies have been working on this problem for 30 years, with billions of dollars invested, and it still isn't solved. That's not to say it couldn't be solved, but it is a massively complex problem.
Systems with multiple nodes, independent OS images, and shared storage do exist. For example, OpenVMS with VMScluster is inherently a multi-node system. However, that is not a Linux operating system, and applications would need to be designed for that platform and written with multi-node awareness.
Linux usually scales either up or out. To scale up, add more CPUs and memory to a single OS image; think of a big database server. To scale out, add more machines, each with its own OS image, network them together, and use some kind of job-management software or other load balancer; think of a cloud application or high-performance computing. Edit: Note that HPC applications in particular communicate through message-passing libraries such as MPI, often over interconnects with remote direct memory access (RDMA). This requires them to be developed for and built against those message libraries.
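To illustrate the job-management route (hostnames are made up; this assumes passwordless SSH and the kind of shared NFS storage discussed above), a tool like GNU parallel can spread independent jobs across nodes without the programs themselves being cluster-aware:

```
# Compress many log files, distributing the jobs across three worker nodes.
# -S lists the SSH-reachable nodes; with shared NFS storage, every node
# sees the same paths, so no file transfer is needed.
parallel -S node01,node02,node03 gzip -9 {} ::: /mnt/shared/data/*.log
```

This helps exactly when the workload splits into many independent commands; a single large program still runs on one node, which is why HPC codes are written against MPI in the first place.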