I have a client with a very basic single server setup that includes a small SQL Server Express database.
I recently configured Symantec Backup Exec 2010 Quickstart edition for them. This is the free, feature-limited OEM version of Backup Exec that does not include the SQL Server Agent (or any application agent). However, it does support VSS via the Advanced Open File Option (AOFO). In this situation, I've always configured a scheduled task to dump the databases and back up the dumps so I can be sure I have a consistent backup.
However, after running an initial test job of the whole box with AOFO enabled, I noticed that it happily backed up the SQL data files, including the MDF/LDF files, and simply gave me a very softly worded "recommendation" that I might want to consider purchasing the SQL Server Agent as it had detected SQL Server data. This is surprised me on two fronts:
- It was my understanding that Backup Exec automatically excludes MDF/LDF files from flat-file backups using the Active File Exclusion or 'AFE' feature unless you manually disabled this via a registry key. My understanding was that this was because it's always a bad idea to back them up directly. The fact they weren't excluded was therefore odd and perhaps intentional.
- The lack of any serious warning stating that the SQL data may be inconsistent, unrestorable or whatever. Just a polite recommendation regarding the SQL Agent.
This led me to wonder, is it actually safe to back up the SQL Server data directly if you're using VSS (via AOFO)? After all, that would presumably mean the SQL Server VSS Writer is called to ensure the data files are app-consistent before the snapshot is taken. Backup Exec seems to have 'let' me do it, despite having recognised the SQL data as such.
I appreciate that using a dedicated SQL Agent provides a number of benefits, but purely on the question of taking a basic, consistent, restorable backup, is this safe?
There seems to be very little in terms of a definitive answer to the question, with some conflict. Obviously in the absence of one I'll be taking the tried and tested route, but it's got me thinking.
Here's some of what I've found so far:
Think about what is happening. the SQL database at any one time could be writing a complex transaction to multiple rows and tables that exist all over the MDF file. If the backup is taken while these complex transactions are going on part of the database is going to be backed up before the first part of the transaction is written and the other after. If it is a table with a lot of columns it could theoretically be copying that part of the file while a row is being written. Now you have an inconsistent database in a way the database logs can't know about. Volume Shadow Copy is going to make a copy of the volume but can freeze the volume for ... well they say a minute but..... There are also a ton of dependencies and gotcha's. Depend on it if you want but I'm not sure why. https://www.brentozar.com/archive/2018/01/perils-vss-snaps/
Conceptual view of what is going on in SQL backup. A real SQL Backup stops writing changes to the database while making the backup. It continues to log all changes and read those appropriately from the logs mixed with database reads. Once the database is backed up and a new log is started and the logs to that point are backed up. When you restore this the database is restored then the log files that were happening were applied. This way everything is stable and reliable. There are different nuances to this based on simple or full recovery model but the basics are the same. The file being backed up is guaranteed to be non-changing. This is all done by the database, specifically for the nuances of database integrity. That is worth relying on.