I am looking for some valid points for choosing between Oracle vs. SQL Server to handle around 5-6 terabytes of data. This data will accumulate over a period of 8 months; any data older than 8 months is purged from the database.
I am considering Oracle 11g Standard Edition and SQL Server 2008 Standard Edition.
I am not looking for ease of development; I am purely interested in how each handles this huge volume of data while still providing good performance. If any parameter apart from performance makes a difference, please highlight it.
I know there is a big price difference, but that is not a factor if the performance difference is significant.
I hope to get unbiased answers and no religious war.
You won't like this answer, but neither. Oracle and MS SQL Server are roughly equal in terms of large-scale data handling (SQL Server can take an edge in ease of use, Oracle in utility), and when it really comes down to the raw data, a well-tuned PostgreSQL can actually edge them both out by a very small margin.
But if you really want a "big" database, the kind that is proven to work well when you are making legitimate use of 64-bit ID columns and terabytes of data, then it's (IBM) DB2.
(So far as the religious wars go, I'm an SQL Server guy, but even I know its limits)
I guess it will really depend on what your application developers and ops engineers can support; I imagine that you won't keep this data set on a single server, for redundancy and performance reasons.
If you're writing 6 TB over 8 months, that isn't actually a massive insert rate: roughly 25 GB a day, or well under 1 MB/s sustained, so the data churn won't be a problem for any decent hardware.
Most of the performance will come out of correct design, which will to some extent depend on the features of the specific product used.
Without doing a correct design for both, then comparing the fully prototyped applications with production volumes of data, you're not going to be able to compare them, and I guess that's going to be cost-ineffective (developer effort to build TWO prototypes and test them at full data load on production-grade hardware).
I imagine that either would do the job quite nicely, as well as anything else you care to name. It's all going to be about correct application design, using the specific features of the database properly.
It completely depends on what this data is, how it's stored, and what you are doing with it.
Assuming it's a single table, then Oracle certainly has the 'chops' to do it, but you'll want Enterprise Edition with the Partitioning option. With that, you can partition your data into months, weeks, days, or hours, whatever is needed. It's then an easy job to maintain your rolling 8-month window (I'm assuming it's a rolling window, not that in 8 months' time the whole thing will be scrapped).
With any solution, you will want to look at how you are going to implement this rolling window; some databases don't take kindly to you deleting a large percentage of their rows.
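For illustration, here's a minimal sketch of that rolling window using Oracle 11g interval partitioning (the table and column names are hypothetical). The point is that purging the oldest month becomes a partition drop, a fast metadata operation rather than a huge DELETE:

    -- Oracle 11g (Enterprise Edition with the Partitioning option).
    -- Hypothetical table: a new monthly partition is created
    -- automatically as rows arrive past the last boundary.
    CREATE TABLE readings (
      reading_id   NUMBER(19)     NOT NULL,   -- 64-bit surrogate key
      recorded_at  DATE           NOT NULL,
      payload      VARCHAR2(4000)
    )
    PARTITION BY RANGE (recorded_at)
    INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
    (PARTITION p_initial VALUES LESS THAN (DATE '2010-01-01'));

    -- Purge the oldest month: a metadata operation, not a mass DELETE.
    -- PARTITION FOR (...) addresses whichever partition holds that date.
    ALTER TABLE readings DROP PARTITION FOR (DATE '2009-12-15')
      UPDATE GLOBAL INDEXES;

Because interval partitioning creates new partitions automatically, the only scheduled maintenance is the monthly drop at the trailing edge.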
Also consider that you'll likely want a solution which can back up that data and recover from failure in a reasonable amount of time.
I'm also thinking that for a database of that size you may need features found in the Enterprise edition of SQL Server.
http://www.microsoft.com/sqlserver/2008/en/us/compare-std-ent.aspx
Enterprise tends to have the scalability features; I'm sure the same may be true of Oracle as well.
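To make that concrete, here's a minimal sketch of the same rolling-window idea in SQL Server 2008 Enterprise, using a partition function and scheme (all table names and boundary dates below are hypothetical):

    -- SQL Server 2008 Enterprise: monthly range partitioning.
    CREATE PARTITION FUNCTION pf_monthly (datetime)
    AS RANGE RIGHT FOR VALUES ('2010-01-01', '2010-02-01', '2010-03-01');

    CREATE PARTITION SCHEME ps_monthly
    AS PARTITION pf_monthly ALL TO ([PRIMARY]);

    CREATE TABLE readings (
      reading_id  BIGINT   NOT NULL,
      recorded_at DATETIME NOT NULL,
      payload     NVARCHAR(4000)
    ) ON ps_monthly (recorded_at);

    -- Empty staging table with identical structure, on the same
    -- filegroup, used to purge the oldest month without a mass DELETE.
    CREATE TABLE readings_stage (
      reading_id  BIGINT   NOT NULL,
      recorded_at DATETIME NOT NULL,
      payload     NVARCHAR(4000)
    ) ON [PRIMARY];

    -- Switch the oldest partition out (metadata only), then discard it.
    ALTER TABLE readings SWITCH PARTITION 1 TO readings_stage;
    TRUNCATE TABLE readings_stage;
    ALTER PARTITION FUNCTION pf_monthly() MERGE RANGE ('2010-01-01');

In practice you'd also SPLIT a new range at the leading edge each month so incoming data keeps landing in its own partition; note that table partitioning is an Enterprise-only feature in SQL Server 2008.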
There are many possible factors influencing the answer, and without more information it is hard to give a definitive one. Examples of these factors are the source of the data and the speed at which it needs to be recorded, how often it needs to be queried, how easy it is to segment, etc.
In general, Oracle certainly has the reputation of handling larger loads better and of running on more powerful hardware than SQL Server.
However, I think that for volumes as large as you are describing, it would be wise to also consider other options unless your organisation mandates the use of only those two products.
You are probably better off looking at an "exotic" database product specifically designed to handle those kinds of volumes, such as Vertica, or even considering non-relational products designed for massive volumes as used by the cloud service providers, such as Amazon Elastic MapReduce and the Google App Engine datastore. These products are gaining traction in industries that deal with huge volumes of data, such as telecommunications, financial services, and telematics.
You haven't mentioned whether you'll be using this database for online transaction processing (OLTP) or for something closer to data warehousing and business intelligence. There are definitely purpose-built options for both; Teradata comes to mind for handling very large amounts of data for BI, for instance.
I can't speak to "5-6 TB of data", but I currently have 1,700 full-time fat-client users (an application built in .NET) hammering against a 1.5 TB database on 64-bit Itanium SQL Server.
It performs fine. I think the scaling question is not so much the size of the DB as the number of users and transactions per second.
Oracle can do clustering to expand transactions-per-second capacity (in certain circumstances), but I wouldn't necessarily prefer either one regarding raw DB size.