I share a server with HAL. The server has 32 GB of memory.
I rarely use more than 1 GB of memory, and when I do, it is for a few minutes at a time, and I don't mind sending such jobs to the back of the line.
HAL reads/writes large files (e.g. using gunzip). This can take up to 100% of the memory and CPU, intermittently, for hours. It is usually done overnight, but while it is running, even simple commands such as cd take 30 seconds, and opening emacs can take minutes.
I would like to be able to reserve 1 GB for use by processes that need << 1 GB (like a text editor). I would also like to stay out of HAL's way, and I see no reason why this should be an issue.
HAL says that a queueing system (like PBS) cannot be used to give the read/write a low priority, e.g. to leave 1 GB of memory always available when large jobs are running. In his words:
the script used to gunzip snags all the processors it can because the data is large... queueing would not solve this... during transfer of files from (that server) to (this server), an inflation step does lots of read/write
Why couldn't queuing solve this problem? What could?
You could have a job queuing system or modify the kernel's scheduling approach.
I'm going to ignore those options and suggest that you use ionice, or more specifically that HAL uses it to lower his priority. It sounds like you're having a disk access issue rather than a memory issue.
Regular nice may also be an option, as it will indirectly affect disk priority (from the ionice man page: "The priority within the best effort class will be dynamically derived from the cpu nice level of the process: io_priority = (cpu_nice + 20) / 5.") The software atop is also really handy for getting an overview of what's bottlenecking and whether it's regular IO or swapping to disk that is at issue.
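If HAL is willing, a minimal sketch of what that could look like; the script name and PID below are placeholders, not anything from the original setup:

    # Launch the decompression job with the lowest CPU and disk priority.
    # "inflate_transfer.sh" stands in for whatever HAL actually runs.
    nice -n 19 ionice -c 2 -n 7 ./inflate_transfer.sh

    # Or demote a job that is already running (12345 is a made-up PID):
    renice -n 19 -p 12345
    ionice -c 2 -n 7 -p 12345

    # Watch what the box is actually starved of (CPU, disk, or swap):
    atop 2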
First, gzip and gunzip do not work the way you think they do: the algorithm gzip uses is block based, and while its footprint may grow a little when chugging through a large compressed file, even uncompressing a 1 GB .gz file only chews up about 15 MB of RAM (total process size) on my machine.

Second, unless you're sucking the entire file into RAM, simply reading or writing a large file won't chew up much memory. The OS may hold the data in the filesystem cache, but cached data is evicted the moment a program needs that RAM. Only data held in a program's working memory counts toward "memory pressure" (used RAM, plus or minus a few other factors).
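If you want to check that claim on your own data, GNU time reports a process's peak memory use; the archive name here is just a placeholder:

    # /usr/bin/time is GNU time, not the shell builtin; -v prints
    # "Maximum resident set size (kbytes)", i.e. peak RAM actually used.
    /usr/bin/time -v gunzip -c bigfile.gz > /dev/null

The number stays small no matter how large the archive is, because gzip works block by block.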
Stop trying to outsmart your operating system's pager: The kernel will swap out tasks to ensure that whoever is currently executing has RAM in which to work. Yes, this means you will be hitting disk if you're using more RAM than you have available. The solution is to limit the amount of RAM you're using by running fewer programs, or to add more RAM.
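It is also easy to see whether the machine is actually swapping while HAL's job runs; if it isn't, memory is not your problem:

    # si/so are pages swapped in/out per second; near-zero values while
    # things feel slow point away from memory pressure.
    vmstat 1 5

    # How much swap is in use right now:
    free -h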
The concept of "reserving" RAM is fundamentally flawed from an OS design perspective: there could be no other activity going on, but HAL's program would be unable to touch the "reserved" RAM, so now it has to swap to disk. For want of (e.g.) 1 KB, HAL's program is now making constant disk hits paging data in and out of RAM, and your performance goes through the floor.
You can artificially limit HAL's RAM usage (ulimit), but when he hits the hard limit his programs will probably not react well (think: malloc(): Unable to allocate XXXXX bytes followed by an ungraceful exit).

You can, as rvs mentioned in their comment, virtualize the environment and ensure that HAL's processes only have access to the resources available to their VM, but this simply moves the problem (HAL's VM will begin swapping, and swapping in a VM is, by necessity, even slower than on bare metal).
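For completeness, a sketch of what the ulimit approach looks like in practice; the 2 GB figure and the job name are arbitrary examples, not a recommendation:

    # In the shell that launches the job: cap virtual memory.
    # In bash, ulimit -v takes 1024-byte blocks, so this is roughly 2 GB.
    ulimit -v 2097152

    # Any allocation beyond the cap now fails; a program that doesn't check
    # malloc()'s return value will typically die right here.
    ./some_big_job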
In the Real World, Jeff is probably right: you're hitting disk I/O limits rather than RAM limits. Decompressing a file is a huge amount of disk I/O (read in the compressed file, pass it through the CPU and a tiny bit of RAM, write out the uncompressed file).
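If you want to confirm that before pointing fingers, watch the disk while HAL's job runs; iostat is part of the sysstat package, and iotop needs root:

    # %util near 100 and large await values mean the disk itself is saturated.
    iostat -x 1

    # Per-process disk usage, showing only processes currently doing I/O:
    sudo iotop -o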
nice (to affect CPU priority) and ionice, if supported (to affect disk priority), will alleviate your problem.

Lecture
Not for nothing, but I recall this same question from my operating system design textbook (although the example wasn't gzip/gunzip). While there's a slim chance you're actually encountering this problem in the real world, I have my doubts: it's simply too contrived an example.
Remember that Server Fault is for system administrators and desktop support professionals, people who manage or maintain computers in a professional capacity (see the FAQ), not for CS/301 homework. If this is an actual real-world problem then I apologize: you may be the 1 in 10,000 who actually encountered this corner case in a production environment.