Let's say I have a 'MyDB' SQL Server 2005 database (simple recovery model) on which I do a full backup on Sunday and a differential backup every other night:
BACKUP DATABASE [MyDB] TO DISK = N'c:\Database Backups\MyDB\MyDB_Full.bak'
WITH NOFORMAT, INIT, NAME = N'MyDB.BAK', SKIP, NOREWIND, NOUNLOAD, STATS = 10
and
BACKUP DATABASE [MyDB] TO DISK = N'c:\Database Backups\MyDB\MyDB_Diff.bak'
WITH NOINIT, DIFFERENTIAL, NAME= 'MyDB.BAK', STATS= 10
What does the differential backup process use to decide what data gets backed up on the differential nights? Does it need the mydb_full.bak file to do its business?
If I wanted to save disk space, could I zip up the mydb_full.bak file to a .zip file after it's created without adversely affecting the differential backups, and if I needed to restore, just unzip the full backup before starting?
No - differential backups don't use the full backup file itself as a reference. You can (and should!) safely move your full backup dump to another machine or whatever you like.
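For the restore side, the sequence would look roughly like the sketch below. The paths are the ones from the question; the FILE number is a made-up placeholder, since the differential command uses NOINIT and so appends a new backup set to the same file every night. Check the HEADERONLY output for the real position of the set you want.

```sql
-- Sketch only: FILE = 3 is a hypothetical position, not a value from the question.

-- 1. List the backup sets that have been appended to the differential file.
RESTORE HEADERONLY
FROM DISK = N'c:\Database Backups\MyDB\MyDB_Diff.bak';

-- 2. Restore the full backup (unzipped back to a .bak first) and leave the
--    database in the restoring state so a differential can be applied on top.
RESTORE DATABASE [MyDB]
FROM DISK = N'c:\Database Backups\MyDB\MyDB_Full.bak'
WITH NORECOVERY;

-- 3. Restore the chosen differential and bring the database online.
RESTORE DATABASE [MyDB]
FROM DISK = N'c:\Database Backups\MyDB\MyDB_Diff.bak'
WITH FILE = 3, RECOVERY;
```

Only the most recent differential is needed on top of the full backup, because each differential contains everything changed since that full backup.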
SQL Server keeps an internal bitmap of changed extents, the differential changed map (DCM), which tracks the parts of the database that have been modified since the last full backup. When you run a differential backup, it consults that bitmap and writes only those changed extents to the backup.
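If you want to see which full backup the next differential will be based on, the database's own catalog exposes it. A minimal sketch, assuming the database name from the question:

```sql
USE [MyDB];

-- Each data file records the full backup that serves as its differential base.
-- This information lives in the database, not in the .bak file on disk.
SELECT name,
       differential_base_lsn,
       differential_base_guid,
       differential_base_time
FROM sys.database_files
WHERE type_desc = 'ROWS';   -- data files only; log files have no differential base
```

Keep in mind that any later full backup (unless it is taken WITH COPY_ONLY) becomes the new base, so the differentials taken after it can only be restored on top of that newer full backup.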
I just ran a quick test on this: I created a full backup of one of my test databases, deleted the backup file, and then ran a differential backup. To my surprise, the differential backup ran fine, so I don't think it relies on the last full backup file itself.
So it does seem like you'd be able to zip your full backups without a problem. I am curious to hear exactly how it determines its starting point for differential backups, so I'm hoping someone can enlighten us.