I launched an EBS-backed Ubuntu instance on Amazon EC2 using Ubuntu's own latest AMI for 10.04 Lucid, ami-ad36fbc4.
After getting the instance up, I ran the command sudo aptitude safe-upgrade, which seems to have upgraded the kernel from vmlinuz-2.6.32-318-ec2 to vmlinuz-2.6.32-340-ec2.
Now the instance won't boot; it hangs at the following message: Waiting for root file system ...
If I detach the EBS volume and edit /boot/grub/menu.lst to remove the entries referencing vmlinuz-2.6.32-340-ec2, the instance will boot again.
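The manual edit described above could be scripted roughly as follows. This is only a sketch under stated assumptions: drop_kernel_entries is a hypothetical helper name, a stanza is assumed to run from one "title" line to the next, and the path to menu.lst depends on where the detached volume is mounted.

```shell
#!/bin/sh
# Sketch: print a GRUB legacy menu.lst with every stanza that references
# a given kernel version removed. Nothing is modified in place.
# Assumptions: a stanza starts at a "title" line and ends at the next one;
# any trailing lines after a dropped stanza are dropped with it.
# drop_kernel_entries is a hypothetical helper, not part of GRUB or Ubuntu.
drop_kernel_entries() {
    version="$1"   # e.g. 2.6.32-340-ec2
    menu="$2"      # path to menu.lst on the mounted volume
    awk -v ver="$version" '
        function emit() {
            # Print the buffered stanza unless it referenced the kernel.
            if (buf != "" && !drop) printf "%s", buf
            buf = ""
            drop = 0
        }
        /^title[ \t]/ { emit() }
        {
            buf = buf $0 "\n"
            if (index($0, "vmlinuz-" ver) > 0) drop = 1
        }
        END { emit() }
    ' "$menu"
}
```

It would be used by writing the result to a new file, for example drop_kernel_entries 2.6.32-340-ec2 menu.lst > menu.lst.new, and then comparing it with the original before replacing anything, so that a backup of the original menu.lst is kept.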
So the questions are:
- Why is this happening?
- Isn't safe-upgrade supposed to be conservative enough not to break things?
- Or should I just not be using safe-upgrade on an EC2 instance? And if so, why not?
PS: A related issue I read up on while researching this was "System boot hangs on Waiting for root file system - Procedure to recover from /dev/hda that became /dev/sda" (see section 4.8), but as you can see from the menu.lst, the entries are referred to by LABEL=cloudimg-rootfs and not /sda/a and /hda/a.
For reference, the grub menu file is as follows:
title Ubuntu 10.04.3 LTS, kernel 2.6.32-340-ec2
root (hd0)
kernel /boot/vmlinuz-2.6.32-340-ec2 root=LABEL=cloudimg-rootfs ro xencons=hvc0 console=hvc0
initrd /boot/initrd.img-2.6.32-340-ec2
title Ubuntu 10.04.3 LTS, kernel 2.6.32-340-ec2 (recovery mode)
root (hd0)
kernel /boot/vmlinuz-2.6.32-340-ec2 root=LABEL=cloudimg-rootfs ro single
initrd /boot/initrd.img-2.6.32-340-ec2
title Ubuntu 10.04.3 LTS, kernel 2.6.32-318-ec2
root (hd0)
kernel /boot/vmlinuz-2.6.32-318-ec2 root=LABEL=cloudimg-rootfs ro xencons=hvc0 console=hvc0
initrd /boot/initrd.img-2.6.32-318-ec2
title Ubuntu 10.04.3 LTS, kernel 2.6.32-318-ec2 (recovery mode)
root (hd0)
kernel /boot/vmlinuz-2.6.32-318-ec2 root=LABEL=cloudimg-rootfs ro single
initrd /boot/initrd.img-2.6.32-318-ec2
title Ubuntu 10.04.3 LTS, memtest86+
root (hd0)
kernel /boot/memtest86+.bin
And the boot console looks like this (when it hangs):
i-3121e5b7
2011-11-27T19:20:03+0000
Xen Minimal OS!
start_info: 0xac4000(VA)
nr_pages: 0x26700
shared_inf: 0xbb4b2000(MA)
pt_base: 0xac7000(VA)
nr_pt_frames: 0x9
mfn_list: 0x990000(VA)
mod_start: 0x0(VA)
mod_len: 0
flags: 0x0
cmd_line: root=/dev/sda1 ro 4
stack: 0x94f860-0x96f860
MM: Init
_text: 0x0(VA)
_etext: 0x5ff6d(VA)
_erodata: 0x78000(VA)
_edata: 0x80b00(VA)
stack start: 0x94f860(VA)
_end: 0x98fe68(VA)
start_pfn: ad3
max_pfn: 26700
Mapping memory range 0xc00000 - 0x26700000
setting 0x0-0x78000 readonly
skipped 0x1000
MM: Initialise page allocator for c01000(c01000)-26700000(26700000)
MM: done
Demand map pfns at 26701000-2026701000.
Heap resides at 2026702000-4026702000.
Initialising timer interface
Initialising console ... done.
gnttab_table mapped at 0x26701000.
Initialising scheduler
Thread "Idle": pointer: 0x2026702010, stack: 0x26640000
Initialising xenbus
Thread "xenstore": pointer: 0x20267027c0, stack: 0x26650000
Dummy main: start_info=0x96f960
Thread "main": pointer: 0x2026702f70, stack: 0x26660000
"main" "root=/dev/sda1" "ro" "4"
vbd 2049 is hd0
******************* BLKFRONT for device/vbd/2049 **********
backend at /local/domain/0/backend/vbd/526/2049
Failed to read /local/domain/0/backend/vbd/526/2049/feature-barrier.
Failed to read /local/domain/0/backend/vbd/526/2049/feature-flush-cache.
16777216 sectors of 512 bytes
**************************
Booting 'Ubuntu 10.04.3 LTS, kernel 2.6.32-340-ec2'
root (hd0)
Filesystem type is ext2fs, using whole disk
kernel /boot/vmlinuz-2.6.32-340-ec2 root=LABEL=cloudimg-rootfs ro xencons=hvc0
console=hvc0
initrd /boot/initrd.img-2.6.32-340-ec2
xc_dom_probe_bzimage_kernel: kernel is not a bzImage
close blk: backend at /local/domain/0/backend/vbd/526/2049
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.32-340-ec2 (buildd@yellow) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #40-Ubuntu SMP Wed Nov 16 14:36:38 UTC 2011 (Ubuntu 2.6.32-340.40-ec2 2.6.32.46+drm33.20)
[ 0.000000] Command line: root=LABEL=cloudimg-rootfs ro xencons=hvc0 console=hvc0
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] Xen-provided physical RAM map:
[ 0.000000] Xen: 0000000000000000 - 0000000026f00000 (usable)
[ 0.000000] last_pfn = 0x26f00 max_arch_pfn = 0x80000000
[ 0.000000] init_memory_mapping: 0000000000000000-0000000026f00000
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] RAMDISK: 01844000 - 03293000
[ 0.000000] (3 early reservations) ==> bootmem [0000000000 - 0026700000]
[ 0.000000] #0 [0001844000 - 00033e9000] Xen provided ==> [0001844000 - 00033e9000]
[ 0.000000] #1 [0001000000 - 00018237b8] TEXT DATA BSS ==> [0001000000 - 00018237b8]
[ 0.000000] #2 [00033e9000 - 0003523000] PGTABLE ==> [00033e9000 - 0003523000]
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000000 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal 0x00100000 -> 0x00100000
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0x00000000 -> 0x00026700
[ 0.000000] 0: 0x00026f00 -> 0x00026f00
[ 0.000000] NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 18 pages/cpu @ffff880003298000 s44248 r8192 d21288 u73728
[ 0.000000] pcpu-alloc: s44248 r8192 d21288 u73728 alloc=18*4096
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 155259
[ 0.000000] Kernel command line: root=LABEL=cloudimg-rootfs ro xencons=hvc0 console=hvc0
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.000000] Initializing CPU#0
[ 0.000000] allocated 6379520 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Software IO TLB disabled
[ 0.000000] Memory: 574464k/637952k available (4836k kernel code, 8192k absent, 54588k reserved, 2084k data, 228k init)
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] NR_IRQS:96
[ 0.000000] Xen reported: 2666.760 MHz processor.
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [hvc0] enabled
[ 0.230003] Calibrating delay using timer specific routine.. 5347.09 BogoMIPS (lpj=26735464)
[ 0.230055] Security Framework initialized
[ 0.230073] AppArmor: AppArmor initialized
[ 0.230089] Mount-cache hash table entries: 256
[ 0.230209] Initializing cgroup subsys ns
[ 0.230215] Initializing cgroup subsys cpuacct
[ 0.230218] Initializing cgroup subsys memory
[ 0.230228] Initializing cgroup subsys devices
[ 0.230230] Initializing cgroup subsys freezer
[ 0.230259] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 0.230262] CPU: L2 cache: 6144K
[ 0.230271] SMP alternatives: switching to UP code
[ 0.255645] Freeing SMP alternatives: 39k freed
[ 0.255834] Brought up 1 CPUs
[ 0.255922] devtmpfs: initialized
[ 0.256333] NET: Registered protocol family 16
[ 0.256945] Brought up 1 CPUs
[ 0.257349] PCI: Fatal: No config space access function found
[ 0.257353] PCI: setting up Xen PCI frontend stub
[ 0.257605] bio: create slab <bio-0> at 0
[ 0.257681] vgaarb: loaded
[ 0.257889] suspend: event channel 9
[ 0.258172] xen_mem: Initialising balloon driver.
[ 0.260364] PCI: System does not support PCI
[ 0.260368] PCI: System does not support PCI
[ 0.260432] NET: Registered protocol family 8
[ 0.260435] NET: Registered protocol family 20
[ 0.260451] NetLabel: Initializing
[ 0.260455] NetLabel: domain hash size = 128
[ 0.260456] NetLabel: protocols = UNLABELED CIPSOv4
[ 0.260490] NetLabel: unlabeled traffic allowed by default
[ 0.260505] Switching to clocksource xen
[ 0.261840] AppArmor: AppArmor Filesystem Enabled
[ 0.262007] NET: Registered protocol family 2
[ 0.262083] IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.262363] TCP established hash table entries: 131072 (order: 9, 2097152 bytes)
[ 0.263136] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[ 0.263553] TCP: Hash tables configured (established 131072 bind 65536)
[ 0.263559] TCP reno registered
[ 0.263629] NET: Registered protocol family 1
[ 0.263708] platform rtc_cmos: registered platform RTC device (no PNP device found)
[ 0.263814] audit: initializing netlink socket (disabled)
[ 0.263838] type=2000 audit(1322421419.386:1): initialized
[ 0.269569] Trying to unpack rootfs image as initramfs...
[ 0.279699] VFS: Disk quotas dquot_6.5.2
[ 0.279731] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.279885] DLM (built Nov 16 2011 14:40:41) installed
[ 0.279994] JFS: nTxBlock = 4920, nTxLock = 39360
[ 0.289416] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 0.289643] SGI XFS Quota Management subsystem
[ 0.299611] Slow work thread pool: Starting up
[ 0.299651] Slow work thread pool: Ready
[ 0.299659] GFS2 (built Nov 16 2011 14:41:38) installed
[ 0.299675] msgmni has been set to 1230
[ 0.299847] alg: No test for stdrng (krng)
[ 0.299858] io scheduler noop registered
[ 0.299860] io scheduler anticipatory registered
[ 0.299862] io scheduler deadline registered (default)
[ 0.299871] io scheduler cfq registered
[ 0.314987] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 0.315818] brd: module loaded
[ 0.316148] loop: module loaded
[ 0.316216] Xen virtual console successfully installed as hvc0
[ 0.316254] Event-channel device installed.
[ 0.324444] Freeing initrd memory: 26940k freed
[ 0.338978] netfront: Initialising virtual ethernet driver.
[ 0.340057] PPP generic driver version 2.4.2
[ 0.340628] Equalizer2002: Simon Janes ([email protected]) and David S. Miller ([email protected])
[ 0.340767] tun: Universal TUN/TAP device driver, 1.6
[ 0.340769] tun: (C) 1999-2004 Max Krasnyansky <[email protected]>
[ 0.341644] i8042.c: No controller found.
[ 0.341704] mice: PS/2 mouse device common for all mice
[ 0.341758] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
[ 0.341810] Driver for 1-wire Dallas network protocol.
[ 0.341865] device-mapper: uevent: version 1.0.3
[ 0.341932] device-mapper: ioctl: 4.15.0-ioctl (2009-04-01) initialised: [email protected]
[ 0.342186] NET: Registered protocol family 17
[ 0.342285] registered taskstats version 1
[ 0.355601] xen-vbd: registered block device major 8
[ 0.440415] XENBUS: Device with no driver: device/console/0
[ 0.440429] /build/buildd/linux-ec2-2.6.32/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[ 0.440534] Freeing unused kernel memory: 228k freed
[ 0.440675] Write protecting the kernel read-only data: 6492k
Loading, please wait...
[ 0.460565] udev: starting version 151
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Waiting for root file system ...
I'm sorry this response is so delayed. A few comments first:
The AMI you listed is no longer current (simply due to time passing and Ubuntu refreshing its images on EC2). If you're interested in finding the most current official AMIs, please see
https://askubuntu.com/questions/53582/how-do-i-know-what-ubuntu-ami-to-launch-on-ec2
The kernel you were running is no longer current for 10.04 (again, simply due to maintenance on Ubuntu).
So, all that said, running aptitude safe-upgrade should be safe on EC2. I verified that doing so works when using the AMI you listed above on both a t1.micro and an m1.large. At this point in time, that results in kernel '2.6.32-341.42' rather than the '2.6.32-340.40' you got.
I tried to reproduce your issue explicitly by downloading and installing the same version of the kernel via the Launchpad archive. Both my t1.micro and my m1.large instance rebooted into 2.6.32-340 after a simple
sudo dpkg -i linux-image-2.6.32-340-ec2_2.6.32-340.40_amd64.deb && sudo reboot
Again, aptitude safe-upgrade and apt-get dist-upgrade should be perfectly safe on EC2. If they're not, please open bugs.
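To confirm which kernel an instance actually came up on after a reboot like the one above, something along these lines could be run on the instance. It is a minimal sketch, not an official tool: first_menu_kernel and check_running_kernel are hypothetical helper names, and it assumes GRUB boots the first entry in menu.lst (that is, "default 0").

```shell
#!/bin/sh
# Sketch: compare the kernel named in the first menu.lst entry with the
# kernel that is currently running.
# Assumption: GRUB boots the first "kernel /boot/vmlinuz-..." entry.
# Both function names are hypothetical helpers.
first_menu_kernel() {
    # Print the version suffix of the first vmlinuz kernel line,
    # e.g. 2.6.32-340-ec2; prints nothing if no such line exists.
    awk '$1 == "kernel" && $2 ~ /vmlinuz-/ {
        sub(/^.*vmlinuz-/, "", $2)
        print $2
        exit
    }' "$1"
}

check_running_kernel() {
    expected=$(first_menu_kernel "$1")
    running=$(uname -r)
    if [ "$expected" = "$running" ]; then
        echo "ok: running $running"
    else
        echo "mismatch: first menu.lst entry is $expected, running $running"
    fi
}
```

On the instance this would be invoked as check_running_kernel /boot/grub/menu.lst.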