I have written an application that uses MySQL and Sphinx for storing and searching information.
The application's database weighs about 300GB (mostly TEXT, MEDIUMTEXT and VARCHAR columns), and my Sphinx index weighs 30GB.
My problem started when I planned for a ~100GB database and ended up with a 300GB one, which (I assume) can't be loaded into memory.
I'm looking for the best architecture that will give me the highest performance.
My application first receives a query, runs it against the Sphinx index, gets a batch of IDs back from Sphinx, queries the DB for several text fields for those IDs, and sends the results to a .NET ranker (a special algorithm).
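Roughly, the flow looks like the sketch below. This is only an illustration, not my production code: the index, table and column names, the credentials and the ranker call are placeholders, and I'm assuming Sphinx is reachable over SphinxQL on its default port 9306 and MySQL over pymysql.

```python
import pymysql

def rank_with_dotnet_service(rows):
    # Placeholder for the call into the .NET ranking algorithm.
    return rows

# Sphinx speaks the MySQL protocol (SphinxQL), default port 9306.
sphinx = pymysql.connect(host="127.0.0.1", port=9306)
mysql = pymysql.connect(host="127.0.0.1", port=3306,
                        user="app", password="secret", db="mydb")

def search(query, limit=1000):
    # Step 1: full-text search in Sphinx, fetching only document IDs.
    with sphinx.cursor() as cur:
        cur.execute("SELECT id FROM my_index WHERE MATCH(%s) LIMIT %s",
                    (query, limit))
        ids = [row[0] for row in cur.fetchall()]
    if not ids:
        return []

    # Step 2: fetch the text fields for those IDs from MySQL
    # (simple primary-key lookup, no joins or complex queries).
    placeholders = ",".join(["%s"] * len(ids))
    with mysql.cursor() as cur:
        cur.execute(
            f"SELECT id, title, body FROM documents WHERE id IN ({placeholders})",
            ids)
        rows = cur.fetchall()

    # Step 3: hand the rows to the .NET ranker.
    return rank_with_dotnet_service(rows)
```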
To get the best performance I used a Xeon W3520 server with 2 SSD disks and 24GB of RAM for MySQL. It isn't cheap, I'm out of space, and I can't add more disks, so I need another server.
Do you think there will be a major performance impact if I use a server with regular disks for MySQL? I know I need fast disks for Sphinx (my full-text search engine), but I'm only selecting rows from the DB by an indexed field (the ID) with no complicated queries, so I may be able to host the DB on a cheaper server with enough storage.
Can you spot my bottleneck? The .NET application needs mostly RAM and CPU, the DB needs a lot of space and/or RAM, and Sphinx needs fast disks and RAM.
- Is it possible to load the MySQL index into RAM? (see the sketch after this list)
- What can I do to mitigate the fact that the DB's size is 300GB?
- Will a server with slower disks hurt performance by much? (I can't benchmark it without spending hundreds of dollars on hardware.)
- Does it matter whether I use a Linux or a Windows machine for MySQL?
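For the first question, this is roughly how I'm comparing the size of my indexes to the memory MySQL is allowed to use. It is only a sketch: the schema name and credentials are placeholders, and I'm assuming InnoDB.

```python
import pymysql

conn = pymysql.connect(host="127.0.0.1", user="app", password="secret", db="mydb")

with conn.cursor() as cur:
    # Total size of all indexes vs. all row data for this schema.
    cur.execute("""
        SELECT SUM(index_length) / 1024 / 1024 / 1024 AS index_gb,
               SUM(data_length)  / 1024 / 1024 / 1024 AS data_gb
        FROM information_schema.tables
        WHERE table_schema = %s
    """, ("mydb",))
    index_gb, data_gb = cur.fetchone()

    # How much RAM InnoDB is currently allowed to use for caching.
    cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
    _, pool_bytes = cur.fetchone()
    pool_gb = int(pool_bytes) / 1024 ** 3

print(f"indexes: {index_gb:.1f} GB, data: {data_gb:.1f} GB, buffer pool: {pool_gb:.1f} GB")
# If the indexes fit comfortably inside the buffer pool, the ID lookups
# should rarely touch disk even though the full 300GB of rows cannot.
```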
What is the question here? Do not get me wrong, but there is exactly ONE way to do that - put the database on the fastest SSD you can get.
Does it make sense? No idea, but you are not asking for a "fast enough" scenario, you are asking for the HIGHEST performance.
You know, this is not true. A 512GB server is not particularly large by today's standards.
Sure it will. Normally IOPS - IO performance - is THE limiting factor for a database when the database is not loaded into RAM.
Define "regular disc". An array of 15k RPM SAS disc is "regular" for a high performance database, every disc about 3-4 times as many IOPS than your regular disc, but then.... with a SMALL database like that put it into a 512gb SSD and get 50.000 IOPS instead of 500 or so a 15k SAS drive gives you.
ANY database that cannot serve everything from RAM is normally limited by discs, mostly because discs are terrifically SLOW SLOW SLOW. Getting the fastest disc subsystem possible is standard in larger database installations, but you have a small database, so a simple SSD is good enough. It is STILL a lot less performance than RAM, sadly.
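To put rough numbers on that, here is a back-of-envelope calculation only. The SSD and 15k SAS figures are the ones above, the "regular disc" figure follows from the 3-4x claim, and the five random page reads per uncached query is a pure assumption - measure your own workload.

```python
# Back-of-envelope: when the working set does not fit in RAM, throughput is
# roughly IOPS / (random page reads per uncached query).
reads_per_query = 5  # ASSUMPTION - substitute your own measurement

for name, iops in [("regular disc", 150), ("15k SAS disc", 500), ("SSD", 50_000)]:
    print(f"{name:>12}: ~{iops // reads_per_query:,} uncached queries/sec")
```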
Realize this is tiny? I have an 850GB database at home, and my last project was a 21,000GB database. Depending on what I do with some data I have access to now, I am about to get 44,000GB of highly compressed data that I would need to process. 300GB is "get memory" small.
Maybe it does, but it is totally irrelevant for THIS question. Windows may have - depending on configuration - a larger overhead, but it won't make a difference. When you deal with systems like this, then, brutally speaking, an additional 128MB of overhead makes zero difference.
Do you actually know what your server is doing? Do you know where the performance bottlenecks are? If you don't have any sort of monitoring and trend tracking, you're blind.
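For example, even a tiny script that polls a couple of InnoDB counters over a minute tells you whether reads are served from RAM or from disk. A sketch only, assuming InnoDB and with placeholder credentials:

```python
import pymysql, time

conn = pymysql.connect(host="127.0.0.1", user="app", password="secret")

def counters():
    with conn.cursor() as cur:
        cur.execute("""SHOW GLOBAL STATUS WHERE Variable_name IN
                       ('Innodb_buffer_pool_read_requests',
                        'Innodb_buffer_pool_reads')""")
        return dict(cur.fetchall())

before = counters()
time.sleep(60)
after = counters()

logical = int(after['Innodb_buffer_pool_read_requests']) - int(before['Innodb_buffer_pool_read_requests'])
physical = int(after['Innodb_buffer_pool_reads']) - int(before['Innodb_buffer_pool_reads'])

# A rising physical/logical ratio means the buffer pool is too small for
# the working set and reads are falling back to disk.
print(f"logical reads: {logical}, reads that had to hit disk: {physical}")
```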
Maybe just adding proper indexing for the data can help you?
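For instance, run EXPLAIN on the exact ID lookup your application issues and confirm it really hits the primary key. Table and column names below are placeholders:

```python
import pymysql

conn = pymysql.connect(host="127.0.0.1", user="app", password="secret", db="mydb")
with conn.cursor(pymysql.cursors.DictCursor) as cur:
    cur.execute("EXPLAIN SELECT id, title, body FROM documents WHERE id IN (1, 2, 3)")
    for row in cur.fetchall():
        # 'key' should show PRIMARY and 'rows' should be tiny; anything else
        # means the query is scanning far more than the handful of IDs.
        print(row['key'], row['type'], row['rows'])
```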
IO probably is indeed your limit, but maybe you have a lot of spare CPU cycles, and just compressing the blobs kept in the database can increase the memory-cache hit ratio at the expense of [de]compressing on the fly, which will probably be unnoticeable.
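A rough illustration of the idea with zlib; whether it pays off depends entirely on how compressible your text is and how much CPU you really have spare (the sample string below is obviously not representative):

```python
import zlib

text = ("some typical mediumtext row from the database ... " * 200).encode("utf-8")

compressed = zlib.compress(text, level=6)
print(f"original: {len(text)} bytes, compressed: {len(compressed)} bytes "
      f"({len(compressed) / len(text):.0%})")

# On the way out of the database you pay a decompress per row,
# but every cached page now holds several times more rows.
restored = zlib.decompress(compressed)
assert restored == text
```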
I would argue that running MySQL/Sphinx under Linux will give you a better chance to see and track what's going on inside.