Does anyone have an idea of MongoDB performance? I've been using MySQL for years and I'm trying to determine whether it's better to have more collections or more documents (referring to very large quantities of data). In MySQL it is better to have more rows than to accommodate more tables.
For instance, we could have two tables (collections) of customers, one per hotel, or a single table/collection with an additional column identifying the hotel.
Perhaps two collections could be better because each collection is smaller on disk than one combined collection, so it's faster to query?
Note: do not forget that MongoDB limits the size of a single document. I think the limit is 16 MB, but you'll need to check the documentation.
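To get a feel for how close a document is to that limit, here is a rough sketch in Python. It only estimates the size from the compact JSON encoding, which is not the same as the real BSON size, and the `headroom` margin and the sample `hotel` document are made up for illustration. For an exact figure you would measure the encoded BSON instead (I believe the `bson` package that ships with PyMongo can do that, but check its docs).

```python
import json

# MongoDB's maximum BSON document size: 16 MiB.
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def approx_document_bytes(doc):
    """Rough estimate: UTF-8 length of compact JSON (NOT the exact BSON size)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def fits_in_one_document(doc, headroom=0.5):
    """True if the estimate stays under a fraction of the limit.

    `headroom` is an arbitrary safety margin chosen for this sketch.
    """
    return approx_document_bytes(doc) <= MAX_BSON_DOCUMENT_BYTES * headroom


# Hypothetical hotel document with 1000 embedded customers.
hotel = {
    "_id": "hotel-1",
    "customers": [{"id": i, "name": "customer %d" % i} for i in range(1000)],
}
print(approx_document_bytes(hotel), fits_in_one_document(hotel))
```

If the estimate gets anywhere near the limit, that is a sign the embedded data probably belongs in its own collection.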
The entire point of MongoDB is to store your data denormalized and to avoid 'joins', but if your data is entirely separate, it should still be stored in separate collections.
On our site, we have a few different collections, and one of them is linked to another via a reference. Whether the references get resolved for you depends on which driver you are using.
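If your driver doesn't resolve references for you, doing it by hand is just a second lookup. Here is a minimal sketch of the idea using plain Python dicts as stand-ins for two collections. The `hotel_id` field and the sample data are hypothetical, and against a real database each lookup would be a query on the other collection.

```python
# In-memory stand-ins for two collections, keyed by _id (hypothetical data).
hotels = {
    "hotel-1": {"_id": "hotel-1", "name": "Seaside Inn"},
}
customers = {
    "cust-1": {"_id": "cust-1", "name": "Ada", "hotel_id": "hotel-1"},
    "cust-2": {"_id": "cust-2", "name": "Lin", "hotel_id": "hotel-1"},
}


def resolve_hotel(customer):
    """Follow the manual reference: a second lookup by the stored hotel _id."""
    return hotels.get(customer["hotel_id"])


def customers_of(hotel_id):
    """The reverse direction: filter customers on the reference field."""
    return [c for c in customers.values() if c["hotel_id"] == hotel_id]


print(resolve_hotel(customers["cust-1"])["name"])
print(sorted(c["_id"] for c in customers_of("hotel-1")))
```

The cost of this layout is the extra round trip per reference, which is what embedding avoids.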
The other thing to consider is how you will be updating the data.
With the older MMAPv1 storage engine, MongoDB memory-maps its data files and lets your OS decide which parts of the data stay in memory and which are paged out to disk (newer versions default to WiredTiger, which manages its own cache). Either way, there shouldn't be much difference in performance between one large collection and two medium-sized collections if the size of the entire dataset is the same. One consideration here is indexes: if you combine the data into one collection and a single index can cover your queries, you may be able to look up the data more quickly.
So, you could have a collection with one document per hotel, each containing a property called 'customers' which is an array of hashes with the details of each customer, and you can push and pull items on that array (or you can make it a hash keyed on a unique customer identifier for easier access). Don't forget about the 16 MB limit though.
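As a sketch of what those updates could look like, the helpers below just build the filter and update documents using MongoDB's `$push`, `$pull` and `$set` operators. The function names, the `id` field inside each customer, and the sample values are my own assumptions for illustration, not anything from your schema.

```python
def push_customer(hotel_id, customer):
    """Filter + update documents that append a customer to the embedded array."""
    return {"_id": hotel_id}, {"$push": {"customers": customer}}


def pull_customer(hotel_id, customer_id):
    """Filter + update documents that remove array entries with the given id."""
    return {"_id": hotel_id}, {"$pull": {"customers": {"id": customer_id}}}


def set_customer_by_key(hotel_id, customer_id, details):
    """Variant where `customers` is a hash (sub-document) keyed by customer id.

    The id is used in a dotted path, so it must not contain "." itself.
    """
    return {"_id": hotel_id}, {"$set": {"customers.%s" % customer_id: details}}


# Hypothetical usage: build the documents for adding one customer.
query, update = push_customer("hotel-1", {"id": 42, "name": "Ada"})
print(query, update)
```

With PyMongo you would pass each pair to something like `collection.update_one(query, update)`. Other drivers have an equivalent call, so check yours.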
It might be easier to answer your question if I had more context and details about what you are trying to store and what kind of queries you need to run against the data.