I'm working with a robotic platform (for use in the RoboCup competition) and we run Ubuntu Server 13.10. It's critical during matches that our robots are able to boot as quickly as possible. What's more, if a robot power cycles during a fall, it has 10 seconds to come back to life before being taken off the field.
Currently the boot time is around 15 seconds and I'd like to know what I can do to reduce that time. Hopefully I can learn something too.
Here is the full output of dmesg, which I suppose helps break down what is going on at each time step.
There are a few gaps in the output:
- At 0.41 there's a gap of about 0.8 seconds
- At 3.64 there's a gap of about 2.8 seconds
- At 9.67 there's a gap of about 0.8 seconds
- At 12.9 there's a gap of about 1.4 seconds
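Gaps like the ones listed above can also be found automatically. The following is a minimal sketch, assuming the dmesg output uses the usual `[   seconds.micros]` prefix on each line; the function name `find_gaps`, the 0.5 second threshold, and the sample lines are illustrative and not taken from the real log.

```python
import re

# Matches the kernel timestamp prefix, e.g. "[    0.410000] some message".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_gaps(lines, threshold=0.5):
    """Return (start_time, gap_length) pairs for gaps of at least `threshold` seconds."""
    gaps = []
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if match is None:
            continue  # skip lines without a kernel timestamp
        current = float(match.group(1))
        if previous is not None and current - previous >= threshold:
            gaps.append((previous, current - previous))
        previous = current
    return gaps


# Hypothetical sample lines, shaped like dmesg output.
SAMPLE = [
    "[    0.410000] first message",
    "[    1.210000] second message",
    "[    1.250000] third message",
    "a line without a timestamp",
    "[    4.050000] fourth message",
]

for start, length in find_gaps(SAMPLE):
    print(f"gap of {length:.2f}s after {start:.2f}s")
```

To run it against a real boot, the output of dmesg could be saved to a file and the open file object passed in place of `SAMPLE`. The messages on either side of each gap are usually the best hint as to which driver or service is waiting.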
eth is not required, though wlan is.
Can anyone decode this into actionable advice for how to get this bot booted faster? And if there is not enough information in these messages, what else might I try to investigate further?
EDIT: I modified my binary (which starts automatically as an upstart job) to write to the syslog. Starting with the first message logged after power-on, it shows:
Oct 30 12:51:52 darwin6 kernel: imklog 5.8.11, log source = /proc/kmsg started.
....
Oct 30 12:52:12 darwin6 kernel: [ 34.276716] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Oct 30 12:52:13 darwin6 dhclient: Internet Systems Consortium DHCP Client 4.2.4
Oct 30 12:52:13 darwin6 dhclient: Copyright 2004-2012 Internet Systems Consortium.
Oct 30 12:52:13 darwin6 dhclient: All rights reserved.
Oct 30 12:52:13 darwin6 dhclient: For info, please visit https://www.isc.org/software/dhcp/
Oct 30 12:52:13 darwin6 dhclient:
Oct 30 12:52:13 darwin6 dhclient: Listening on LPF/wlan0/00:0d:f0:95:0d:4d
Oct 30 12:52:13 darwin6 dhclient: Sending on LPF/wlan0/00:0d:f0:95:0d:4d
Oct 30 12:52:13 darwin6 dhclient: Sending on Socket/fallback
Oct 30 12:52:13 darwin6 dhclient: DHCPDISCOVER on wlan0 to 255.255.255.255 port 67 interval 3 (xid=0x3cd422f3)
Oct 30 12:52:13 darwin6 dhclient: DHCPREQUEST of 192.168.0.3 on wlan0 to 255.255.255.255 port 67 (xid=0x3cd422f3)
Oct 30 12:52:13 darwin6 dhclient: DHCPOFFER of 192.168.0.3 from 192.168.0.1
Oct 30 12:52:13 darwin6 avahi-daemon[833]: Joining mDNS multicast group on interface wlan0.IPv6 with address fe80::20d:f0ff:fe95:d4d.
Oct 30 12:52:13 darwin6 avahi-daemon[833]: New relevant interface wlan0.IPv6 for mDNS.
Oct 30 12:52:13 darwin6 avahi-daemon[833]: Registering new address record for fe80::20d:f0ff:fe95:d4d on wlan0.*.
Oct 30 12:52:14 darwin6 dhclient: DHCPACK of 192.168.0.3 from 192.168.0.1
Oct 30 12:52:14 darwin6 avahi-daemon[833]: Joining mDNS multicast group on interface wlan0.IPv4 with address 192.168.0.3.
Oct 30 12:52:14 darwin6 avahi-daemon[833]: New relevant interface wlan0.IPv4 for mDNS.
Oct 30 12:52:14 darwin6 avahi-daemon[833]: Registering new address record for 192.168.0.3 on wlan0.IPv4.
Oct 30 12:52:14 darwin6 dhclient: bound to 192.168.0.3 -- renewal in 41314 seconds.
Oct 30 12:52:21 darwin6 ntpdate[1294]: adjust time server 91.189.94.4 offset -0.243167 sec
Oct 30 12:52:46 darwin6 kernel: [ 68.451644] perf samples too long (2504 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
Oct 30 12:53:49 darwin6 boldhumanoid[1455]: Starting boldhumanoid process
The strange thing is that I also timed with a stopwatch. From turning the power on to my process starting (the robot coming alive) was 35 seconds. It seems that something is causing the syslog write to be delayed by a further 100 seconds, going by the timestamps.
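One way to line the syslog wall-clock times up with time since boot is to use a kernel line that carries both a wall-clock stamp and an uptime stamp. The sketch below is only an illustration: the helper names are made up, the two log lines are copied from the excerpt above, and the year 2000 is an arbitrary placeholder because syslog stamps omit the year.

```python
import re
from datetime import datetime, timedelta

# Syslog kernel line: "<Mon DD HH:MM:SS> <host> kernel: [ <uptime>] ..."
KERNEL_LINE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: \[\s*(\d+\.\d+)\]"
)


def parse_stamp(stamp, year=2000):
    """Parse a syslog stamp; the year is a placeholder (assumption)."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def estimate_boot_time(kernel_line):
    """Wall-clock time at which the kernel's uptime counter was zero."""
    match = KERNEL_LINE.match(kernel_line)
    if match is None:
        raise ValueError("not a kernel line with an uptime stamp")
    wall = parse_stamp(match.group(1))
    return wall - timedelta(seconds=float(match.group(2)))


def uptime_at(syslog_line, boot_time):
    """Seconds since boot for any syslog line (stamp is the first 15 chars)."""
    return (parse_stamp(syslog_line[:15]) - boot_time).total_seconds()


# Both lines are taken from the syslog excerpt in the question.
REFERENCE = ("Oct 30 12:52:12 darwin6 kernel: [ 34.276716] "
             "IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready")
PROCESS = ("Oct 30 12:53:49 darwin6 boldhumanoid[1455]: "
           "Starting boldhumanoid process")

boot = estimate_boot_time(REFERENCE)
print(f"process start logged at {uptime_at(PROCESS, boot):.1f}s after boot")
```

By this estimate the process start message is stamped at about 131 seconds of uptime, roughly 96 seconds later than the 35 seconds measured with the stopwatch, which is in line with the delay described above. Syslog stamps have one-second resolution and ntpdate adjusts the clock slightly during boot, so the figure is approximate.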
I am not sure a server installation of Ubuntu is really the right way to go, because what you are describing sounds like a job for Arch Linux or similar to me.
You will need to do a lot of customization on your system, which is when a "take a base system and add anything needed" approach is easier than a "use a full-fledged installation and remove anything unnecessary (e.g. apport, apparmor, dhcp ...)" one.
But anyway, there actually IS a wiki entry all about reducing boot time. The commands may not translate one-to-one to your Ubuntu system, but this entry might point you in the right direction:
https://wiki.archlinux.org/index.php/Improve_boot_performance
I am sorry, but reducing device-specific boot time is not something a short answer on askubuntu can handle. You will have to look at every item in your startup routine, decide whether it is necessary, and disable the system components accordingly.
This is one of those applications where you basically start subtracting until you break something and then back off. For instance, I don't think your robot is likely using zeroconf, which is what the avahi daemon provides, so uninstall it. Do you really need networking? Or better, boot into single user mode (append 1 to the kernel command line) and then see how many services you have to turn on before your application starts working.
Which size division are you in? I realize this may not be the answer you're looking for here, but for something time critical like standing up within 10 seconds, I would go with an embedded platform.
If you have the space / power budget, you should consider sticking something like an Arduino on the robot, and programming some open-loop stand-up behaviors. You can still use Ubuntu on the main board. I'd suggest something like this:
There are answers here which can help you improve boot time (turn stuff off until you break it, basically), but even then, if losing power is a common enough problem, something like this could be very helpful.
Try reducing the storage buffer to 4 MB. In /etc/sysctl.conf write:
vm.dirty_bytes = 4194304
vm.dirty_background_bytes = 1048576
Also, maybe reduce the writeback time to 3 seconds:
vm.dirty_expire_centisecs = 300
vm.dirty_writeback_centisecs = 300
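For anyone who wants to try other values, the numbers above are just megabytes converted to bytes and seconds converted to hundredths of a second. A small sketch of that arithmetic follows; the function name and its parameters are hypothetical, while the four sysctl keys are the standard kernel `vm.dirty_*` settings.

```python
def dirty_page_settings(buffer_mb=4, background_mb=1, writeback_seconds=3):
    """Render sysctl.conf lines for the dirty-page limits and timers."""
    bytes_per_mb = 1024 * 1024
    centisecs = writeback_seconds * 100  # the kernel counts in 1/100 s
    return [
        f"vm.dirty_bytes = {buffer_mb * bytes_per_mb}",
        f"vm.dirty_background_bytes = {background_mb * bytes_per_mb}",
        f"vm.dirty_expire_centisecs = {centisecs}",
        f"vm.dirty_writeback_centisecs = {centisecs}",
    ]


# With the defaults this reproduces the values suggested above.
print("\n".join(dirty_page_settings()))
```

The printed lines can be pasted into /etc/sysctl.conf; whether these particular values shorten boot on a given robot is something to measure rather than assume.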
Please comment if this helped or not. Thank you.