My company is building a database from a large publicly available data set. When we complete it, we will have something like 500GB of data, but the data will never grow beyond that. It takes advantage of Postgres's polygon manipulation features, and as such has to stay in Postgres.
How can we host this database in the most cost-effective way possible?
EDIT: I should mention that we want to host this database in the cloud, since we don't have our own onsite servers.
EDIT 2: Sorry, let me elaborate. This database will be integrated into a SaaS webapp, so potentially many users will be hitting the database at the same time. However, once we have it in place, the data will rarely change, and if it does change, it will only ever be added to, never deleted. Something like Linode, which we use for hosting the rest of the site, doesn't have enough storage space. We want to optimize cost, but secondarily we would prefer to not touch any hardware ourselves, so buying a large drive would be less than ideal.
That depends heavily on the usage pattern.
But really, a 500 GB SSD does not cost that much, and it has the huge advantage of a TON of IO; you would need dozens (plural) of conventional drives to match that.
I would say this is about the best you can do: get a nice 512 GB SSD.
Use Amazon instances. 500 GB of space is cheap, around 50 USD per month (10 cents per GB), and you can do things like attach multiple volumes to spread the PostgreSQL data across them.
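To make the arithmetic explicit, here is a minimal sketch in Python. The 10 cents per GB rate is just the figure quoted above, not a current price list, and the function name is made up for illustration.

```python
# Rough monthly storage cost, using the rate quoted in this answer.
# ASSUMPTION: 10 cents per GB per month; check the provider's current pricing.

def monthly_storage_cost_usd(size_gb: int, cents_per_gb_month: int = 10) -> float:
    """Return the monthly cost in USD for size_gb of block storage."""
    total_cents = size_gb * cents_per_gb_month  # integer math, no float drift
    return total_cents / 100


if __name__ == "__main__":
    print(monthly_storage_cost_usd(500))  # 50.0
```

This only covers the storage itself. The instance running Postgres, and any I/O or bandwidth charges, would come on top of that.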
There are also services like newservers.com where you can start instance-like real servers with real disks, which removes the need for your own datacenter.
Local storage is definitely filesystem efficient.
Local SSD is a definite speed advantage.
A lot of RAM is cheap at this point in time.
For your planned 500 GB, you actually need more than 500 GB to cover filesystem overhead, regardless of the statement that "the data will never grow beyond that".
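As a rough illustration of the overhead point, the sketch below works out how much raw capacity to provision so that 500 GB of data still fits once some percentage of the disk is lost to overhead. The percentages are placeholders I picked for the example, not measurements; the real figure depends on the filesystem, indexes, WAL and so on.

```python
# Capacity to provision so that data_gb fits after overhead is subtracted.
# ASSUMPTION: overhead_pct is a hypothetical planning figure, not a measured one.

def provisioned_gb(data_gb: int, overhead_pct: int) -> int:
    """Smallest whole number of GB such that (100 - overhead_pct)% of it
    is at least data_gb."""
    if not 0 <= overhead_pct < 100:
        raise ValueError("overhead_pct must be in the range 0..99")
    usable_pct = 100 - overhead_pct
    # Integer ceiling division: ceil(data_gb * 100 / usable_pct)
    return -(-data_gb * 100 // usable_pct)


if __name__ == "__main__":
    for pct in (5, 10, 20):
        print(pct, provisioned_gb(500, pct))
```

With 20% set aside, 500 GB of data needs a 625 GB volume, so a 512 GB disk would already be too small under that assumption.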
On top of that storage, you need redundancy and backups, plus uninterruptible power to run the setup.
You also have to pay attention to filesystem logs.
Running locally is indeed costly and labor-intensive; I just hope this helps someone make up their mind about the cost.
If you don't have an onsite server, just get a nice dedicated server with SSDs. I'm not sure whether you really mean a cloud provider like Rackspace or Amazon, or whether a server in a datacenter is enough for you?