Is there any practical limit on the number of concurrent LXC containers you can run on one box?
Obviously, this depends on the node spec (CPU, RAM, ...), but what I am really wondering about is the cost of running a process directly on the OS vs. running the same process in a dedicated container on the same box. If you had to put a number on the overhead difference, what would it be? Is this a good way to figure out the maximum number of LXC containers you should run on one server?
As was said above, the overhead is minimal. Switching namespaces usually comes with no overhead (those fields already exist even for host processes), so the main overhead comes from the few extra resources that need to be created by LXC, typically 4-5 bind-mounts, maybe a couple of tmpfs mounts, two VETH devices and a loopback device.
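One rough way to see those extra resources is to count the mounts and network interfaces visible inside a container and compare them with the host. The following is only a sketch, assuming a Linux system with /proc and /sys mounted; the function names `count_mounts_by_type` and `list_net_devices` are made up for illustration and are not part of LXC.

```python
import os
from collections import Counter


def count_mounts_by_type(mounts_text):
    """Count mount entries per filesystem type in /proc/<pid>/mounts text."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Fields are: device, mount point, fstype, options, ...
            # Bind mounts are listed under the type of the underlying
            # filesystem, so they do not show up as a separate "bind" type.
            counts[fields[2]] += 1
    return counts


def list_net_devices(sys_net="/sys/class/net"):
    """Return the network interface names visible in this network namespace."""
    if not os.path.isdir(sys_net):
        return []
    return sorted(os.listdir(sys_net))


if __name__ == "__main__":
    mounts_path = "/proc/self/mounts"
    if os.path.exists(mounts_path):
        with open(mounts_path) as f:
            print(dict(count_mounts_by_type(f.read())))
    print(list_net_devices())
```

Running it once on the host and once inside a container should give a feel for how few extra objects a container adds.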
As libraries will typically be shared between containers, even starting init and a bunch of other processes doesn't cost that many resources.
All that to say, it's pretty hard to answer your question :) If you take a single process and compare it running on the host or running in a container, the overhead for that specific process will be 0. The actual LXC overhead comes from that process' parent and additional resources the process may use (network devices, ...).
Anyway, I haven't done any crazy benchmarking recently, but a few months back I managed to run around a thousand simple apache2 containers, each complete with an init system, on a simple Pentium 4 box with 4GB of RAM. Those containers were sharing their rootfs to make optimal use of shared memory, but everything started fine nonetheless.
Oh, and about the pid_max limit: it's not really a limit, as it can be raised well past the default nowadays (up to 4194304, i.e. 2^22, on 64-bit systems), so the 32768 limit is a thing of the past.
The overhead is very low and you can run a very high number of containers that don't do anything, but that wouldn't be very interesting. How many useful containers you can run on your system depends on how you define useful. You'll have to test real use cases and see.
One practical limit, though, is the number of pids allowed. You can see this by looking at /proc/sys/kernel/pid_max. But it's probably not a useful number, for the reasons explained above.
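To put that limit in context, you can compare pid_max with the number of processes currently running. This is a minimal sketch, assuming a Linux /proc; the helper names `parse_pid_max` and `count_pids` are hypothetical.

```python
import os


def parse_pid_max(text):
    """Parse the contents of /proc/sys/kernel/pid_max into an integer."""
    return int(text.strip())


def count_pids(proc_entries):
    """Count numeric entries (one per process) in a /proc directory listing.

    Threads also take ids from the same pid space but are not listed at the
    top level of /proc, so this undercounts the ids actually in use.
    """
    return sum(1 for name in proc_entries if name.isdigit())


if __name__ == "__main__":
    path = "/proc/sys/kernel/pid_max"
    if os.path.exists(path):
        with open(path) as f:
            pid_max = parse_pid_max(f.read())
        in_use = count_pids(os.listdir("/proc"))
        print(f"pid_max={pid_max} processes={in_use}")
```

On the host this covers every container's processes as well, since they all draw from the same kernel-wide pid space.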