I use a distributed user-space filesystem (GlusterFS), and I would like to make sure that the GlusterFS processes always have the computing power they need.
Each execution node of my grid has 2 CPUs, with 4 cores per CPU and 2 threads per core, so Linux sees 16 "processors".
My goal is to guarantee that GlusterFS processes have enough processing power to be reliable, responsive and fast. (There is no marketing here, just the dreams of a sysadmin ;-)
I am considering two main points:
- GlusterFS processes
- I/O for data access (on local or remote disks)
I thought about binding the GlusterFS instances to a specific "processor" (CPU pinning).
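For instance, something like this with taskset (just a sketch; the process name glusterfsd and the CPU list are assumptions about my setup):

```
# Pin an already-running GlusterFS brick daemon to logical CPUs 0 and 1.
# pgrep -o picks the oldest matching process; with several bricks per
# node, each glusterfsd would need its own taskset call.
taskset -cp 0,1 "$(pgrep -o glusterfsd)"
```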
I would like to be sure that:
- No grid job will impact the kernel or the GlusterFS instances
- Researchers' jobs won't be affected by system processes (I'd like to reserve a pool of cores for job execution and be sure that no system process will use those CPUs; see the sketch after this list)
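The closest mechanism I have found for this is cpusets, e.g. through the cset tool (again only a sketch; the core split and the grid_job command are placeholders for my 16-CPU layout):

```
# Shield CPUs 4-15 for grid jobs; CPUs 0-3 stay available to the
# kernel, system daemons and GlusterFS. --kthread=on also moves the
# movable kernel threads out of the shielded set.
cset shield --cpu=4-15 --kthread=on

# Run a researcher's job inside the shielded set
# (grid_job stands for the real batch scheduler command).
cset shield --exec grid_job
```

An alternative seems to be the isolcpus=4-15 kernel boot parameter, but then the batch system would have to place jobs on the isolated cores explicitly.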
But what about I/O? As we handle a huge amount of data (several terabytes), we will get a lot of interrupts.
How can I distribute the handling of these interrupts across my processors? What are the "best practices"?
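From what I have read, IRQ affinity can be steered through /proc/irq; here is what I have in mind (the IRQ number and the CPU mask below are just examples, not values from my real nodes):

```
# Steer a disk controller's interrupts to CPUs 0-3. IRQ 24 is only an
# example; /proc/interrupts lists the real IRQ numbers per device.
echo f > /proc/irq/24/smp_affinity    # hex bitmask 0xf = CPUs 0,1,2,3

# Check which CPUs are actually servicing interrupts:
cat /proc/interrupts
```

I suppose irqbalance would rewrite these masks, so it would have to be disabled or restricted (e.g. via IRQBALANCE_BANNED_CPUS). Is that the right approach?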
Thanks for your comments!