We are building an application where each client will have their own database. None of the databases are particularly large (20MB to 400MB each), but there will be ~5,000 to start and at any one time 100 or so will be active.
Our team has been debating how best to set up the system. Clients only access their database once every 2 weeks (401k/finance processing), and only use it for 10-30 minutes at a time. Operations are evenly spread between reads/writes.
Half of our team feels that as a result we should spread the databases across multiple cheap servers and just use SQL Express ... they say that memory/caching wouldn't be as useful given the short period of time each database is used (we don't have the budget for the full SQL Standard on more than 1 server).
Is this the case? A higher memory limit is really the only advantage I see MSSQL Standard bringing us (we already have scripts for doing backups/restores, schema upgrades, migrating data, etc).
Update
I'm particularly interested in performance characteristics of multiple databases vs. one database. Wouldn't the end user experience be better hitting a single 200MB database than a 1TB database (even if both were well indexed)? This also means we can easily backup/restore single client databases very very fast, right? Would we need to tune SQL Server to better handle the 'thousands of databases' scenario?
I'd throw it all on one server; maintaining several cheap SQL Express servers will be a pain. You can spread the databases, logs, and tempdb across different RAID disk arrays. You should consider moving tempdb to its own array, as it may be in use by all the databases at once.
Check out the SQL Server 2008 Resource Governor to make sure no single user can bring the server to a crawl. It is only in Enterprise edition, though.
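For reference, Resource Governor has three moving parts: a resource pool, a workload group, and a classifier function that routes logins into the group. The pool/group names, percentages, and login check below are invented for illustration; this shell sketch just prints the T-SQL you would run through SSMS or sqlcmd:

```shell
#!/bin/sh
# Sketch only: names and limits are hypothetical, and Resource Governor
# requires Enterprise edition. This script only prints the T-SQL.
RG_SQL=$(cat <<'EOF'
-- Cap any single client workload at 30% CPU/memory (illustrative numbers)
CREATE RESOURCE POOL ClientPool WITH (MAX_CPU_PERCENT = 30, MAX_MEMORY_PERCENT = 30);
CREATE WORKLOAD GROUP ClientGroup USING ClientPool;
GO
-- Classifier (lives in master): route everyone except 'sa' into ClientGroup
CREATE FUNCTION dbo.rgClassifier() RETURNS sysname WITH SCHEMABINDING AS
BEGIN
    RETURN CASE WHEN SUSER_NAME() = N'sa' THEN N'default' ELSE N'ClientGroup' END;
END
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rgClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
EOF
)
printf '%s\n' "$RG_SQL"
```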
If you offer hosting for multi-tenant client access, you also have the option of volume licensing as a Service Provider (SPLA), which can turn out to be significantly cheaper than buying a per-CPU Standard license, and the cost can effectively be passed on through your per-month service pricing.
Deploying multiple SQL Express instances for hosting actually goes against Microsoft's recommendations.
Although it is a bit outdated with regard to SQL Server 2008 and R2, I recommend reading the white paper SQL Server 2005 Deployment Guidance for Web Hosting Environments; the guidance applies to other hosting and multi-tenant environments, not only to Web hosting.
Jess,
Without knowing more about your environment or usage patterns, I will say that you may see improved performance with each client having their own database (thousands of smaller databases vs. one large database). You could potentially reduce the chances of table and row locking, as clients will only be hitting their own unique set of tables rather than sharing one. Disk I/O would still be a limiting factor, however.
Also, security would be clearer, as each database would have its own unique set of permissions for each client. As you stated, backups and restores would be much faster with the smaller databases, but the setup and maintenance of those backup jobs would be extremely complex (though it sounds like you have already accounted for that).
If you have the hardware, I would highly recommend setting up different RAID arrays for your data, logs, and tempdb (as Sam has suggested). If you are using Direct-Attached Storage or a SAN and can afford extra arrays, you might even consider splitting the actual files for your databases across different arrays.
HTH, Dan
Your biggest challenge will probably be backups: if you go with SQL Express, you have no SQL Server Agent to run them, so you'll need to rely on Windows Scheduled Tasks and some fancy scripting.
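To make that concrete, here is a minimal sketch of such a script (database names, instance name, and backup path are all hypothetical; a real Windows deployment would more likely be PowerShell or a .cmd file driven by Task Scheduler). It only prints the generated T-SQL; the commented sqlcmd line shows how it would actually be executed:

```shell
#!/bin/sh
# Sketch of a backup driver for SQL Express, meant to be launched from a
# Windows Scheduled Task. Names and paths are hypothetical placeholders.
BACKUP_DIR="D:/Backups"
STAMP=$(date +%Y%m%d_%H%M)

# In practice the list would come from something like:
#   sqlcmd -S .\SQLEXPRESS -E -Q "SELECT name FROM sys.databases WHERE database_id > 4"
DATABASES="Client0001 Client0002 Client0003"

BACKUP_SQL=""
for DB in $DATABASES; do
    BACKUP_SQL="$BACKUP_SQL
BACKUP DATABASE [$DB] TO DISK = N'$BACKUP_DIR/${DB}_${STAMP}.bak' WITH INIT, CHECKSUM;"
done

# Dry run: print the generated T-SQL. To execute for real, each statement
# would be handed to:  sqlcmd -S .\SQLEXPRESS -E -Q "<statement>"
printf '%s\n' "$BACKUP_SQL"
```

From there you would swap the final print for a real sqlcmd invocation and add per-database log backups on a tighter schedule.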
If you use SQL Standard/Enterprise edition and try to use the built-in maintenance plans to back up all the databases, it will do them one at a time, which could take a while. The same goes for log backups.
Don't even think about using mirroring with that many databases on a server.
I would lean towards more servers, with a well thought out failover strategy.
Another consideration to keep in mind is the peripheral cost of maintaining thousands of SQL databases spread across "multiple cheap servers". As the number of servers grows, your rack space requirements increase, as does the cost of powering and cooling your data center with all the additional hardware. Not to mention the increase in administrative cost/time in having to maintain multiple servers versus one.
One thing to consider here is adding multiple files to tempdb, and spreading database files across multiple disks; SQL Server can then spread reads and writes across the spindles. See https://stackoverflow.com/questions/719869/how-to-spread-tempdb-over-multiple-files
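As a sketch of the tempdb side (the path, file count, and sizes below are invented; a common rule of thumb is one data file per logical core, up to about eight, all sized the same so SQL Server's proportional fill works evenly), generating the extra-file statements looks like:

```shell
#!/bin/sh
# Generate T-SQL to add extra tempdb data files, one per core.
# Path and sizes are hypothetical; this only prints the statements,
# which you would then run via sqlcmd or SSMS.
TEMPDB_DIR="T:/TempDB"    # ideally its own array, as suggested above
CORES=4                   # tempdb already has one file; add CORES-1 more

TEMPDB_SQL=""
i=2
while [ "$i" -le "$CORES" ]; do
    # Keep every file the same size so writes spread evenly
    TEMPDB_SQL="$TEMPDB_SQL
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev$i, FILENAME = N'$TEMPDB_DIR/tempdev$i.ndf', SIZE = 512MB, FILEGROWTH = 256MB);"
    i=$((i + 1))
done
printf '%s\n' "$TEMPDB_SQL"
```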
Please don't take this wrong. You may have good reasons for doing what you are doing. From where I am, I cannot see why anybody would want to have several thousand databases. It looks to me like there is a design flaw in your application and now you are looking for ways of overcoming it without fixing it. I would seriously recommend reconsidering the fundamental database design.
Having said this: If there are good (possibly legal) reasons to have so many databases, consider this: A decent MS SQL Server can easily handle several hundred small databases (500 x 400MB = 200GB of data, which is not a problem). That should generate more than enough income to not have to worry about additional SQL Server licenses. You could then split them simply by naming or any other method you choose.
Or, you might consider using MySQL, where licensing is not such a problem (although I realise that this isn't an option if you make heavy use of stored procedures).