I've got a CentOS 6.5 x86-64 KVM server with a bunch of guest VMs of different breeds, mostly EL5 and EL6. However, exactly one of them keeps crashing every couple of days with:
pthread_create failed: Resource temporarily unavailable
Here is the full log from /var/log/libvirt/qemu/vws3-pp.log:
2014-07-24 21:27:27.451+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none
/usr/libexec/qemu-kvm -name vws3-pp,process=qemu:vws3-pp -S -M rhel6.5.0
-enable-kvm -m 1536 -redhat-disable-KSM -realtime mlock=on
-smp 1,sockets=1,cores=1,threads=1 -uuid d11de823-8bab-4e8d-8457-61ef7ab877a7
-nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vws3-pp.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-drive file=/vm/prod/vws3-pp-disk1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writethrough
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=32
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0
-vnc 127.0.0.1:9,password -vga cirrus
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
char device redirected to /dev/pts/5
pthread_create failed: Resource temporarily unavailable <==== ### HERE ####
2014-07-29 15:29:52.063+0000: shutting down
There are 8 other VMs on the box and all of them have run happily for months; only this one crashes every few days. There is nothing special about this VM - pretty standard LAMP, not overloaded - and I can't think of any significant difference between it and the other VMs, which exhibit no problems. Some of those are very busy but still rock solid.
Somewhere on the net I found a suggestion to set max_processes = 4096 in /etc/libvirt/qemu.conf and restart the box. I did that, but it didn't help: the VM crashed again this morning for no apparent reason.
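One thing worth ruling out is that the setting is still commented out in the config file. The following is a minimal sketch, not a definitive check: active_setting is a hypothetical helper name, and the only path it assumes is the /etc/libvirt/qemu.conf mentioned above.

```shell
# Print active (uncommented) assignments of KEY in FILE.
# Returns non-zero if there are none.
# Usage: active_setting KEY FILE   (hypothetical helper, not part of libvirt)
active_setting() {
    grep -E "^[[:space:]]*$1[[:space:]]*=" "$2"
}

conf=/etc/libvirt/qemu.conf
if [ -r "$conf" ]; then
    active_setting max_processes "$conf" \
        || echo "max_processes is not set in $conf"
fi
```

If this prints nothing but the "not set" message, the line is missing or still prefixed with a comment character.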
NEW INFO:
As it turns out, the VM always dies while rdiff-backup is running from a remote backup server, and in most cases the last entry in rdiff-backup-data/backup.log (on the remote side, i.e. not affected by the crash) is:
Processing changed file tmp
Incrementing mirror file /extpool/backup/vws3-pp/tmp
This is even though /tmp/** is excluded from the backup. It may actually be failing in /usr, which is the next entry alphabetically in /, who knows...
Backup runs every night but the VM crashes only about once a week.
What does rdiff-backup do that is so strange that it makes a KVM guest die with "pthread_create failed: Resource temporarily unavailable"?
Any ideas?
Check to see if the limits are actually set after restart.
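One way to do that, as a rough sketch: read the limits the kernel reports for the running process from /proc. show_limits is a hypothetical helper name. The locked-memory line is included only because the guest is started with mlock=on; whether it matters here is an assumption.

```shell
# Print the limits most likely to matter for pthread_create,
# plus the current thread count of the process.
# Usage: show_limits PID   (hypothetical helper)
show_limits() {
    grep -E '^Max (processes|open files|locked memory)' "/proc/$1/limits"
    grep '^Threads:' "/proc/$1/status"
}

# With no argument, this inspects the current shell as a demo.
show_limits "${1:-$$}"
```

On the host, pass the PID of the qemu-kvm process for the guest instead, for example one found with pgrep -f 'qemu-kvm -name vws3-pp'. If "Max processes" does not show the value configured in qemu.conf, the setting was not applied to that process.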