My experience with ZFS has generally been that it just works, so I expect the answer will be, it’s not a problem — but I have a data pool which will ruin my January if it fubars, so I want to double-check.
This is a question that could actually come up in two different situations involving a separate data pool. Right now I’m dealing with the first, but I’ve also wondered about the second:
- The storage for the system disk (i.e., the one holding `rpool`) fails, but storage for the data pool is fine, so you want to restore the system disk from backups but just keep going with the live storage of the data pool.
- You have Solaris running in a VM and want to roll back to a snapshot the hypervisor has taken (not a ZFS snapshot of `rpool`), but the data pool is stored on disks that are in “independent” mode, RDMs, etc., so they will not be rolled back.
In both of these situations, when Solaris is booted back up, it’s going to see a data pool that it knows about but which is in a state it had never (as far as it would remember) put it into.
I’m primarily concerned with the case where the system was cleanly shut down before the system disk is rewound, and where it had also been cleanly shut down prior to the image it’s being rewound to. I’d expect switching between running states to be a bit trickier.
Note also that in my particular case, the pool’s storage geometry and paths to the storage have not changed. Again, I’d expect this to be trickier if they had.
I wouldn’t even be asking this with Windows and NTFS, because that’s a comparatively simplistic, decoupled system, so it’s hard to see why it wouldn’t work. However, Solaris seems to keep some kind of pool metadata out of band, as evidenced by the fact that you’re supposed to `zpool export` and `zpool import` when you move pools between systems (something I’ve never done in that manner, thanks to VMware). My knowledge of this metadata and its purpose is limited, so it’s hard for me to reason about the impact in this situation. (An explanation of this would be great!)
I actually still have access to the pre-rollback system. It’s sitting in a VMFS datastore backed by an HP SmartArray that threw a 1716 POST warning after an ill-fated preventive-maintenance disk change (which lost data, because SmartArray is dumber than ZFS). All the important VMs still seem fine, and scans of their filesystems found no errors, but I plan to restore the array from a very recent backup anyway: I have reason to suspect that ESXi silently zeros bad sectors instead of passing the errors through to the guests, and I don’t want some zeroed sector lurking somewhere to bite me in the butt later.
For the Solaris VM, I don’t have to worry about zeroed sectors, because ZFS would catch that, but most of the other VMs use dumb filesystems. The backup is an image of the whole VMware datastore, though, so fixing them will roll back the Solaris VM too. Actually, I did a scrub on the `rpool` of this VM and it found no errors, so, hell, if I wanted I could just stash its VMDK somewhere else and copy it back in after the rollback, and then this whole question would be moot. I guess that’s what I’ll do if nobody answers, lol. But it’s something I’ve wondered about for a while, so I’ll still ask.
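For the record, the scrub was nothing exotic, just the standard routine, watched via `zpool status` until it finished with zero errors:

```
# Scrub the root pool, then poll until it reports no errors
zpool scrub rpool
zpool status -v rpool
```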
So, the question is: can I just go ahead and roll back the system disk’s storage and be done with it? Or would I have to export the pool from the pre-rollback system, roll back, delete the pool before attaching its storage, then attach the storage and import the pool? I don’t like the sound of the latter, partly because both CIFS and iSCSI are being served from that pool and I don’t remember offhand how I set those up, or even how to do so, so if they break I’ll be mad. (Can you tell we don’t have a full-time sysadmin? lol)
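In case it helps frame an answer, here’s my best guess at what the careful route would look like. `tank` is a stand-in for my real pool name, and the `stmfadm`/`itadm` lines are my attempt at recording the COMSTAR iSCSI state on Solaris 11, which I’d want double-checked before anyone trusts them:

```
# Before touching anything: record the share/iSCSI config so it can
# be recreated if the round trip loses something
zfs get -r sharesmb,sharenfs tank   # CIFS/NFS share settings per dataset
stmfadm list-lu -v                  # COMSTAR logical units backed by the pool
itadm list-target -v                # iSCSI target definitions

# Cleanly detach the pool from the pre-rollback system
zpool export tank

# ...roll back / restore the system disk, boot Solaris...

# Reattach the pool on the rolled-back system
zpool import tank   # add -f only if it insists the pool is in use elsewhere
```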
The VM is running an older version, Solaris 11.0.
(Incidentally, it’s older partly because of this same question. I wanted to snapshot the VM prior to attempting an upgrade in case I borked it, but then I worried about how a rolled-back system would react to the independent pool, so I just left it alone. And yeah, I realize I could also snapshot the `rpool`, but that doesn’t give the same level of comfort to someone who doesn’t work with Solaris daily.)
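By “snapshot the `rpool`” I mean something like the following, where `pre-upgrade` is just a name I invented; my understanding is that a `beadm` boot environment is the more idiomatic route on Solaris 11:

```
# Create a fallback boot environment before upgrading
beadm create pre-upgrade
beadm list

# Or, cruder: a recursive snapshot of the root pool
zfs snapshot -r rpool@pre-upgrade
```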