A server restart produced this output on the console:
ZFS: i/o error - all block copies unavailable
ZFS: failed to read pool zroot directory object
gptzfsboot: failed to mount default pool zroot
FreeBSD/x86 boot
ZFS: i/o error - all block copies unavailable
ZFS: can't find dataset 0
Default: zroot/<0x0>
boot:
I booted the host from the usb livecd and mounted the /etc directory under /tmp to enable ssh access:
ifconfig -a # get available i/f names
ifconfig em0 inet 192.168.216.46
route add default 192.168.216.1
hostname vhost06.internal
mkdir /tmp/etc
mount_unionfs /tmp/etc /etc
echo 'PermitRootLogin yes' >> /etc/ssh/sshd_config
passwd
Changing local password for root
New Password:
Retype New Password:
service sshd onestart
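A quick check that sshd is actually listening (my addition here, not part of the original session):
sockstat -4 -l | grep :22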
There are no zfs pools available to import:
root@vhost06:~ # zpool status
no pools available
root@vhost06:~ # zpool list
no pools available
root@vhost06:~ # zfs list
no datasets available
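One diagnostic worth adding at this point: run zpool import with no arguments, which only scans the devices for importable pools and lists what it finds, without importing anything; -d points the scan at an alternate device directory:
zpool import # list importable pools
zpool import -d /dev # search /dev explicitly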
gpart shows this geometry:
gpart show
=> 40 15628053088 ada0 GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 ada1 GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 ada2 GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 ada3 GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 diskid/DISK-VAGWJ6VL GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 diskid/DISK-VAGWV89L GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 diskid/DISK-VAHZAD2L GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 40 15628053088 diskid/DISK-VAH3PXYL GPT (7.3T)
40 1024 1 freebsd-boot (512K)
1064 984 - free - (492K)
2048 16777216 2 freebsd-swap (8.0G)
16779264 15611273216 3 freebsd-zfs (7.3T)
15628052480 648 - free - (324K)
=> 1 30240767 da0 MBR (14G)
1 1600 1 efi (800K)
1601 2012560 2 freebsd [active] (983M)
2014161 28226607 - free - (13G)
=> 0 2012560 da0s2 BSD (983M)
0 16 - free - (8.0K)
16 2012544 1 freebsd-ufs (983M)
=> 1 30240767 diskid/DISK-00241D8CE51BB011B9A694C1 MBR (14G)
1 1600 1 efi (800K)
1601 2012560 2 freebsd [active] (983M)
2014161 28226607 - free - (13G)
=> 0 2012560 diskid/DISK-00241D8CE51BB011B9A694C1s2 BSD (983M)
0 16 - free - (8.0K)
16 2012544 1 freebsd-ufs (983M)
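Another check that would have been useful here (an addition, not something I ran at the time) is to dump the ZFS labels from one of the freebsd-zfs partitions, which confirms whether the pool metadata is still intact:
zdb -l /dev/ada0p3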
How do I recover from here?
<------ end of original question
I have made some progress and managed to import and mount one dataset: iocage. This is a boot-on-ZFS system, but I cannot find the dataset that contains the root filesystem, so I cannot get at /var/log to see whether anything useful is there:
mkdir /tmp/zroot # /tmp is a writable file system
zpool import -f zroot # force the zpool import
zfs set mountpoint=/tmp/zroot zroot # mount the imported pool in a writable fs
zfs mount -a # find and mount all the datasets
ll /tmp/zroot
total 12
drwxr-xr-x 9 root wheel 11 Feb 27 13:09 iocage/
Fortunately, all of the absolutely critical material is in /tmp/zroot/iocage, as the host simply acts as a platform for the jail. However, the absence of the root dataset is bothersome to me.
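In hindsight, the likely explanation is that a stock install sets canmount=noauto on zroot/ROOT/default, so zfs mount -a skips the root dataset. Listing the pool with that property, and checking which dataset the loader is told to boot, should reveal it:
zfs list -r -o name,mountpoint,canmount zroot
zpool get bootfs zroot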
A zpool status showed zroot with no errors.
I next transferred the iocage dataset to another system using zfs send:
zfs snapshot -r zroot/iocage@vh6iocsend1
zfs send -R zroot/iocage@vh6iocsend1 | ssh 192.168.216.45 zfs receive zroot/iocagev6
This took a while but it has completed successfully.
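On the receiving host, a quick sanity check that everything arrived:
zfs list -r zroot/iocagev6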
Now I need to get the problem host started. This host was restarted yesterday at noon without a problem. I do not recall running freebsd-update fetch, but even if I had, there was nothing for fetch to deliver, as the system was already at 12.1-p2.
I still need help getting the host to boot.
<----------
Additional notes:
I was able to mount the entire zpool using the altroot option of zpool import:
- Boot into the live cd shell.
- Import the zfs pool(s) but do not allow the import to auto-mount any datasets: zpool import -o altroot=/tmp/altroot -N -a
- Mount the root (/) dataset first: zfs mount zroot/ROOT/default
- Now mount the remaining datasets: zfs mount -a
The entire zroot pool's file system is now accessible at /tmp/altroot.
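In particular, the logs I was originally after are now readable:
ls -l /tmp/altroot/var/log
tail /tmp/altroot/var/log/messages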
I have used this with zfs send to move the contents of /var to another host. Actually, I sent the entire pool.
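For reference, sending the whole pool looked roughly like this (snapshot and target names are illustrative; -u keeps the received datasets unmounted):
zfs snapshot -r zroot@vh6full1
zfs send -R zroot@vh6full1 | ssh 192.168.216.45 zfs receive -u zroot/vh6full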
However, the original system still will not boot.
As a last resort I pulled the four hdds from the host that failed to boot and placed them in an identically configured server. That server booted from those hdds. Evidently a hardware problem.
There is still an anomaly on the new host: if bay 1 is occupied then the system will only boot if at least one of bay 0 or bay 2 is likewise occupied. The configuration 0-,1A,2-,3B will not boot. I do not have a clue as to why this is so.
This error (the server cannot boot from the ZFS root pool, yet the pool appears completely intact when imported on another machine) usually indicates that the boot files (pieces of the kernel and the other files the loader searches for, not the loader blocks themselves) have migrated beyond roughly the 1024th gigabyte of the disk. The boot loader is known to be able to reach them below that point; beyond it, it cannot.
This is a widely known issue with the FreeBSD gptzfsboot loader. These files happen to sit near the start of the partition right after installation, so the setup acts as a primed time bomb with an unknown fuse: at some point ZFS relocates them out of the loader's reach.
It is therefore recommended (by experienced users, though unfortunately not by the Handbook and not by bsdinstall) to keep the ZFS root pool smaller than one terabyte; such small root pools are not known to be affected.
Another solution, at least as reported, is to switch to the UEFI boot loader: it is far larger and considerably more capable when it comes to handling big disks, whereas gptzfsboot is very small and lacks the needed functionality.
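If reinstalling with a smaller root pool is not an option, one stopgap sometimes suggested is to rewrite the stage loaders from a newer FreeBSD release, whose gptzfsboot copes better with large pools. A sketch, to be repeated for every disk in the pool (ada0 through ada3 in the question), with -i 1 matching the freebsd-boot partition index shown by gpart:
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0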