Say, I have a big database globalclients
which is very very big.
Now, I divide this big database to 100 sub-databases, as:
chinaclients
japanclients
koreaclients
indiaclients
brazilclients
etc.
Each one running in the corresponding country.
Now, I want to do some analysis over all databases. I want to merge them all into the master-database at first: some tables are merged entirely, and some tables are aggregated before merging:
customer_contact: merged entirely
product_sales: `select sum(sales), ... group by month` before merging
Do you have any suggestion? What's the english terms related to this technique? "synchronize" or something else?
It will be a great bonus if I can upload only the difference from the sub-database.
I don't know whether I can do it in database-level or not, "synchronize" the distributed databases in Java would be a complex task. is there any support by PostgreSQL-8.4?
You could use PL/Proxy to set up a frontend, and then write some functions that run the queries that you need. It could be quite tricky, though. You should plan this architecture carefully.
SQL queries can be run across multiple tables like this:
You can also use UNION to sequentially retrieve data from multiple queries:
I don't know about postgresql, but you might be able to create a VIEW table that gives you a way to access all the tables in a merged format to run simpler queries on.