(I've read a lot about 64-bit versus 32-bit OS/Apps, but this question is specifically about databases.)
I'm trying to understand the pros and cons of 32-bit versus 64-bit databases, and in particular, under what conditions it starts to make sense to use 64-bit installations.
The database systems that I am interested in are: SQL Server 2008, MySQL, and PostgreSQL 9.0.
I have read that pre-9.0 versions of PostgreSQL only come in 32-bit for Windows, and this article about running 32-bit PostgreSQL on 64-bit Windows clears up some of my confusion, but I'm looking for more info.
When would I benefit from using 64-bit databases (e.g. database size/disk space, available system memory, types of data scenarios that are known to benefit from it, which database engine is being used, etc.)?
There is, today, no scenario where it makes sense to install a 32-bit SQL Server version if one has the choice.
Databases are special in this respect, as they want to use a lot of memory as cache if necessary: a lot more than the meager 2 GB / 3 GB a 32-bit process can give them. PAE is not the same. Even ignoring limits, PAE memory is not equal to real memory for SQL Server (it is only used for ONE thing: caching db pages).
A 32-bit OS is on the same level: it makes no sense at all to install a 32-bit OS on modern hardware.
PostgreSQL benefits from having a 64-bit build in two main ways. First, data types that can fit into 64 bits (larger integers and timestamp types mainly) can be passed around more efficiently, directly in registers rather than using pointers. Second, it's possible to allocate more memory for the database's dedicated buffer cache. The point of diminishing returns on that tunable (shared_buffers) is usually around 8GB, but it will be limited to <2GB on a 32-bit system.
However, if you are on Windows, PostgreSQL doesn't handle shared memory as efficiently as on UNIX-ish platforms. The point of diminishing returns generally ends up being <=512MB of dedicated memory for the database whether you have a 32-bit or 64-bit build of PostgreSQL. You'll do better to leave the rest for the operating system cache rather than dedicate it to the database. Accordingly, there really isn't that much of a performance gain going from 32 to 64 bits with PostgreSQL on Windows; the main tunable that would normally benefit from having more RAM available doesn't actually utilize it very well.
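The rules of thumb above can be written down as a small helper. This is only a sketch: the function names `shared_buffers_ceiling_mb` and `clamp_shared_buffers_mb` are hypothetical, and the numbers are just the rough figures quoted in this answer (about 8GB on 64-bit UNIX-ish systems, under 2GB on 32-bit builds, about 512MB on Windows), not anything PostgreSQL itself reports or enforces.

```python
def shared_buffers_ceiling_mb(is_64bit: bool, is_windows: bool) -> int:
    """Rough size (in MB) past which raising shared_buffers is unlikely to help.

    The values are the rules of thumb from this answer, not PostgreSQL limits.
    """
    if is_windows:
        # On Windows the answer puts the useful ceiling at <=512MB,
        # regardless of whether the build is 32-bit or 64-bit.
        return 512
    if is_64bit:
        # Diminishing returns are "usually around 8GB" on 64-bit builds.
        return 8 * 1024
    # 32-bit builds are limited to "<2GB"; 2047 is an illustrative stand-in
    # for "just under 2GB", not an exact limit.
    return 2 * 1024 - 1


def clamp_shared_buffers_mb(requested_mb: int, is_64bit: bool, is_windows: bool) -> int:
    """Clamp a requested shared_buffers size (in MB) to the ceiling above."""
    return min(requested_mb, shared_buffers_ceiling_mb(is_64bit, is_windows))


# Example: asking for 16GB on three different kinds of installation.
for label, is_64bit, is_windows in [
    ("64-bit UNIX-ish", True, False),
    ("32-bit UNIX-ish", False, False),
    ("64-bit Windows", True, True),
]:
    print(label, clamp_shared_buffers_mb(16 * 1024, is_64bit, is_windows), "MB")
```

Whatever value comes out is a starting point to benchmark against your own workload, not a recommendation.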
I run MySQL on 64-bit architecture because I want it to utilize more than 4GB of memory per process as efficiently as possible. Generally speaking, this should apply to all databases.
One of the primary differences between the architectures is that wider addressing allows more memory to be handled. While Intel's Physical Address Extension (PAE) allows more than 4GB of physical memory to be addressed, up to a maximum of 64GB, each process is still limited to a 4GB address space.
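The 4GB and 64GB figures fall straight out of the address widths. The arithmetic below is a quick illustration; it assumes the 36-bit physical addresses that PAE originally provided on x86, and the helper name `addressable_gib` is made up for this example.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_gib(address_bits: int) -> float:
    """Amount of memory (in GiB) reachable with addresses of the given width."""
    return 2 ** address_bits / GIB


# A 32-bit pointer can name 2**32 distinct bytes.
print("32-bit virtual address space:", addressable_gib(32), "GiB")

# Assumption: PAE widens *physical* addresses to 36 bits, while pointers
# inside a program stay 32 bits wide.
print("36-bit PAE physical memory:  ", addressable_gib(36), "GiB")
```

So PAE lets the machine hold more RAM in total, but a single 32-bit database process still cannot point at more than 4GB of it at once.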
Wikipedia has a comparison of 64-bit versus 32-bit, which includes more low level details.
Note that if you only have 64-bit MySQL client libraries, you will get "wrong architecture" errors when trying to link them together with 32-bit code. This happened to me when I tried to install the Python bindings ("pip install MySQL-python").
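Before installing bindings, it can help to check which architecture your Python interpreter was built for, since that is what has to match the client libraries. A minimal check using only the standard library (the helper name `interpreter_bits` is mine, not part of any MySQL tooling):

```python
import platform
import struct


def interpreter_bits() -> int:
    """Return the pointer width, in bits, of the running Python interpreter."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *)
    # for this interpreter build: 4 on a 32-bit build, 8 on a 64-bit build.
    return struct.calcsize("P") * 8


print("Python interpreter:", interpreter_bits(), "bit")
# platform.machine() describes the hardware/OS, which can be 64-bit even
# when the interpreter itself is a 32-bit build.
print("Machine type:", platform.machine())
```

If this reports 32 bits while only 64-bit MySQL client libraries are installed, that mismatch is the likely cause of the error.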
One can use a 64-bit MySQL server with a 32-bit MySQL client, and it's a shame that the MySQL Community Server doesn't include both 32-bit and 64-bit versions of the client library. The correct solution is to install additional 32-bit MySQL client libraries. However, since the easiest way to install MySQL seems to be the MySQL Community Server binary download, and given that the 64-bit installer comes with 64-bit client libraries only, the path of least resistance is to just download the 32-bit installer.
(all this, assuming that you will always use very small data sets)
For so many things, 32-bit is a win (as long as you can live with the address space), but databases are one thing where even small ones can get a real boost from running in 64-bit. Granted, I don't know a thing about MS SQL Server, but I have seen benchmarks: for example, on a Sun 5 (an older 64-bit Sun desktop), 32-bit was generally a little faster, except for MySQL, which was 30% faster in 64-bit.