I'm working on implementing Paul Randal's method of manually spreading DBCC CHECKDB over several days, from his excellent article CHECKDB From Every Angle: Consistency Checking Options for a VLDB. In short, the strategy consists of the following (roughly sketched in T-SQL after the list):
- Divide the tables in the database equally among 7 buckets (by number of pages)
- Run a DBCC CHECKALLOC twice a week
- Run a DBCC CHECKCATALOG once a week
- Run a DBCC CHECKTABLE on one bucket each day of the week
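Here's roughly what I have in mind for the scheduled jobs. dbo.CheckBuckets is just a placeholder name for whatever bucket-assignment table comes out of the other question, so treat this as a sketch rather than working code:

```sql
-- Hypothetical layout: dbo.CheckBuckets(bucket_id, schema_name, table_name)
-- is a placeholder for the bucket-assignment table.

-- Sunday and Wednesday (for example): allocation checks
DBCC CHECKALLOC WITH NO_INFOMSGS;

-- Sunday only: system catalog checks
DBCC CHECKCATALOG WITH NO_INFOMSGS;

-- Every day: CHECKTABLE each table in today's bucket
DECLARE @bucket int, @schema sysname, @table sysname, @cmd nvarchar(1000);
SET @bucket = DATEPART(weekday, GETDATE());   -- 1-7, one bucket per day

DECLARE bucket_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT schema_name, table_name
    FROM dbo.CheckBuckets
    WHERE bucket_id = @bucket;

OPEN bucket_cursor;
FETCH NEXT FROM bucket_cursor INTO @schema, @table;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @cmd = N'DBCC CHECKTABLE (''' + QUOTENAME(@schema) + N'.'
             + QUOTENAME(@table) + N''') WITH NO_INFOMSGS;';
    EXEC (@cmd);
    FETCH NEXT FROM bucket_cursor INTO @schema, @table;
END;
CLOSE bucket_cursor;
DEALLOCATE bucket_cursor;
```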
I've got a separate StackOverflow question regarding a good algorithm to divide the tables into buckets (chime in there if you have a working TSQL script), but this question is about making sure I'm checking the right things. The Books Online documentation for CHECKDB says that in addition to doing the CHECKALLOC, CHECKCATALOG and CHECKTABLE, it also:
- Validates the contents of every indexed view in the database.
- Validates link-level consistency between table metadata and file system directories and files when storing varbinary(max) data in the file system using FILESTREAM. (SQL 2008 only)
- Validates the Service Broker data in the database.
So here are my questions:
Are there ways to perform these additional checks separately? Should I even worry about it? (Indexed views are probably a more immediate concern to me than the other two)
Any significant alterations I need to make to my strategy if I want to implement this across SQL2000, 2005, and 2008 databases?
CHECKALLOC and CHECKCATALOG seem to run pretty quickly. Any reason not to run these 2 checks every day?
Thanks!
Not sure how you would do the checks in 1). Perhaps watch CHECKDB in Profiler to see what it runs for those checks under the covers, and see whether that code is something you could duplicate?
Regarding 2): while I tend to think of partitioning based on time (monthly), such as for a general ledger application, would placing tables into filegroups based on criticality (more critical to less critical, more actively used to less actively used, etc.) make more sense for you? This presumes that this month's heavy-traffic table group would be the same next month. Perhaps used_page_count from the DMV sys.dm_db_partition_stats would help confirm the initial table-to-filegroup placement; see the sketch below.
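Something like this quick sketch would roll used pages up per user table (2005/2008 only, since that DMV doesn't exist in 2000):

```sql
-- Used pages per user table, largest first, to guide filegroup placement
SELECT  OBJECT_SCHEMA_NAME(ps.object_id) AS schema_name,
        OBJECT_NAME(ps.object_id)        AS table_name,
        SUM(ps.used_page_count)          AS used_pages
FROM    sys.dm_db_partition_stats AS ps
WHERE   OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
GROUP BY ps.object_id
ORDER BY used_pages DESC;
```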
3) If the extra time doesn't creep into your production availability window, why not?
You say that this is an estate-wide implementation. Unless you really have no maintenance window and all the databases are VLDBs (10 TB+), it seems to me that this is over-complicating things. If most of your instances are small enough, run a standard CHECKDB job and have a special job only for the really large databases; they should be the exceptions.
Have you thought about running CHECKDB against restores into pre-prod? Or using PHYSICAL_ONLY most of the time? Or running CHECKDB against a backup file using a third-party tool such as Redgate's SQL Virtual Restore?
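For example, something along these lines (just a sketch; the database name is a placeholder):

```sql
-- Most days: physical-only check (allocation and page-level checks, much cheaper)
DBCC CHECKDB (N'YourBigDatabase') WITH PHYSICAL_ONLY, NO_INFOMSGS;

-- Less frequently, or against a restored copy: the full logical check
DBCC CHECKDB (N'YourBigDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```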