ntp questions - Page 1

Bron Gondwana

Asked: 2012-07-01 08:15:09 +0800 CST

Anyone else experiencing high rates of Linux server crashes during a leap second day?

363

NOTE: if your server still has issues due to confused kernels and you can't reboot, the simplest solution proposed, provided GNU date is installed on your system, is: date -s now. This resets the kernel's internal "time_was_set" variable and fixes the CPU-hogging futex loops in Java and other userspace tools. I have straced this command on my own system and confirmed it's doing what it says on the tin.

POSTMORTEM

Anticlimax: only thing that died was my VPN (openvpn) link to the cluster, so there was an exciting few seconds while it re-established. Everything else was fine, and starting up ntp went cleanly after the leap second had passed.

I have written up my full experience of the day at http://blog.fastmail.fm/2012/07/03/a-story-of-leaping-seconds/

If you look at Marco's blog at http://my.opera.com/marcomarongiu/blog/2012/06/01/an-humble-attempt-to-work-around-the-leap-second - he has a solution for phasing the time change over 24 hours using ntpd -x to avoid the 1 second skip. This is an alternative smearing method to running your own ntp infrastructure.
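For a rough feel of what "phasing the change over 24 hours" means in numbers, here is a small arithmetic sketch. It assumes a simple linear smear, which is my illustration and not necessarily how Marco's scripts or ntpd -x behave; the function names are made up. The 500 ppm figure is ntpd's maximum slew rate.

```python
# Illustrative arithmetic only: absorbing one leap second by slewing the clock.
# Assumes a linear smear; function names are hypothetical.

LEAP_SECONDS = 1.0
MAX_SLEW_PPM = 500.0  # ntpd's maximum slew rate, in parts per million


def required_ppm(window_seconds: float, correction: float = LEAP_SECONDS) -> float:
    """Frequency offset (ppm) needed to absorb `correction` seconds over the window."""
    return correction / window_seconds * 1e6


def fastest_slew_seconds(correction: float = LEAP_SECONDS,
                         max_ppm: float = MAX_SLEW_PPM) -> float:
    """Shortest time in which `correction` seconds can be slewed at `max_ppm`."""
    return correction / (max_ppm * 1e-6)
```

required_ppm(86400) comes out at about 11.6 ppm, well under the 500 ppm limit, while fastest_slew_seconds() shows that even at full rate a one-second correction takes about 2000 seconds.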

Just today, Sat June 30th, 2012, starting soon after the start of the day GMT, we've had a handful of servers in different datacentres, managed by different teams, all go dark: not responding to pings, screen blank.

They're all running Debian Squeeze - with everything from stock kernel to custom 3.2.21 builds. Most are Dell M610 blades, but I've also just lost a Dell R510 and other departments have lost machines from other vendors too. There was also an older IBM x3550 which crashed and which I thought might be unrelated, but now I'm wondering.

The one crash which I did get a screen dump from said:

[3161000.864001] BUG: spinlock lockup on CPU#1, ntpd/3358
[3161000.864001]  lock: ffff88083fc0d740, .magic: dead4ead, .owner: imapd/24737, .owner_cpu: 0

Unfortunately the blades all supposedly had kdump configured, but they died so hard that kdump didn't trigger - and they had console blanking turned on. I've disabled console blanking now, so fingers crossed I'll have more information after the next crash.

Just want to know if it's a common thread or "just us". It's really odd that they're different units in different datacentres bought at different times and run by different admins (I run the FastMail.FM ones)... and now even different vendor hardware. Most of the machines which crashed had been up for weeks/months and were running 3.1 or 3.2 series kernels.

The most recent crash was a machine which had only been up about 6 hours running 3.2.21.

THE WORKAROUND

Ok people, here's how I worked around it.

disabled ntp: /etc/init.d/ntp stop
created http://linux.brong.fastmail.fm/2012-06-30/fixtime.pl (code stolen from Marco, see blog posts in comments)
ran fixtime.pl without an argument to see that there was a leap second set
ran fixtime.pl with an argument to remove the leap second

NOTE: depends on adjtimex. I've put a copy of the squeeze adjtimex binary at http://linux.brong.fastmail.fm/2012-06-30/adjtimex — it will run without dependencies on a squeeze 64 bit system. If you put it in the same directory as fixtime.pl, it will be used if the system one isn't present. Obviously if you don't have squeeze 64-bit… find your own.

I'm going to start ntp again tomorrow.

As an anonymous user suggested - an alternative to running adjtimex is to just set the time yourself, which will presumably also clear the leapsecond counter.
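To make "there was a leap second set" concrete: the kernel keeps a clock status word, and two of its bits mark a pending leap second. The sketch below only decodes and clears those bits in an integer. It is not fixtime.pl, and the helper names are made up. The bit values are the STA_INS and STA_DEL constants from linux/timex.h. Getting the status number from the "status" line of adjtimex --print and writing it back with adjtimex --status are my assumptions about that tool, so check your local man page.

```python
# Leap-second bits of the kernel clock status word (values from <linux/timex.h>).
STA_INS = 0x0010  # insert a leap second at the end of the UTC day
STA_DEL = 0x0020  # delete a leap second at the end of the UTC day


def leap_pending(status: int) -> str:
    """Report which leap-second adjustment, if any, the status word requests."""
    if status & STA_INS:
        return "insert"
    if status & STA_DEL:
        return "delete"
    return "none"


def clear_leap(status: int) -> int:
    """Return the status word with both leap-second bits cleared."""
    return status & ~(STA_INS | STA_DEL)
```

For example, a status of 8209 (0x2011) has STA_INS set, and clear_leap(8209) gives 8193 (0x2001), leaving the other bits untouched. ntp has to be stopped first, as in the steps above, or it may set the flag again.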

John Bachir

Asked: 2011-01-11 12:39:11 +0800 CST

Why is ntpd not updating the time on my server?

24

I have ntpd running on my server. It's all the default settings, except I commented out its ability to be a server to other machines:

# restrict -4 default kod notrap nomodify nopeer noquery
# restrict -6 default kod notrap nomodify nopeer noquery
restrict default ignore

If I run ntpdate -q ntp.ubuntu.com, I'm told that my machine's clock is off by 7 seconds.

What's going on? How can I diagnose what's happening, is there a log I can turn on?

more info #1

# ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 91.189.94.4     193.79.237.14    2 u   30   64    7  108.518   -0.136   0.361
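A note on reading that output: the reach column is an octal shift register of the last eight polls, and delay, offset and jitter are in milliseconds. The first character of each peer line is the tally code, with * marking the peer ntpd is currently synced to. Below is a small parsing sketch; the Peer type and function names are made up for illustration.

```python
from typing import NamedTuple


class Peer(NamedTuple):
    tally: str        # first character of the line; "*" marks the current sync peer
    remote: str
    stratum: int
    reach: int        # 8-bit shift register of recent polls, printed in octal
    offset_ms: float  # ntpq reports offset in milliseconds


def parse_peer(line: str) -> Peer:
    """Parse one peer line from `ntpq -np` (not the header or separator lines)."""
    tally, fields = line[0], line[1:].split()
    # Columns: remote refid st t when poll reach delay offset jitter
    return Peer(tally, fields[0], int(fields[2]), int(fields[6], 8), float(fields[8]))


def successful_polls(reach: int) -> int:
    """Count how many of the last eight polls got an answer."""
    return bin(reach & 0xFF).count("1")
```

Applied to the line above, reach 7 is binary 00000111, so only the last three polls were answered (which is what you would see shortly after a restart). The blank tally code means this peer has not been selected as the sync source, and the offset is -0.136 ms.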

more info #2

Here's what this looked like when I asked the question:

# ntpdate -q ntp.ubuntu.com
server 91.189.94.4, stratum 2, offset 7.191308, delay 0.13310
10 Jan 20:38:09 ntpdate[31055]: step time server 91.189.94.4 offset 7.191308 sec

And here's what it looks like now, after restarting ntpd a couple times (I'm assuming that's what fixed it):

# ntpdate -q ntp.ubuntu.com
server 91.189.94.4, stratum 2, offset 0.000112, delay 0.13164
10 Jan 20:47:03 ntpdate[31419]: adjust time server 91.189.94.4 offset 0.000112 sec

more info #3

I uninstalled ntp and installed openntpd and ran /usr/sbin/ntpd -d, and I'm seeing output like this:

reply from 64.73.32.134: offset 6.715003 delay 0.041152, next query 30s
reply from 208.53.158.34: offset 6.700224 delay 0.036263, next query 31s
adjusting local clock by 6.734120s
reply from 72.18.205.156: offset 6.708575 delay 0.035885, next query 30s
reply from 64.73.32.134: offset 6.701463 delay 0.044199, next query 33s

Which to me pretty clearly indicates that I'm not able to set the time on my server (although, with regular ntp, it does seem to update sometimes...).

more info #4

My VPS provider says:

The latest kernels should not lock your system to our dom0's clock, to be on the safe side you can set xen.independent_wallclock = 1 in your sysctl.conf.

Which I suppose still does not address the issue of the VPS needing a CPU available in order to do correct timing calculations.

John Bachir

Asked: 2011-01-11 12:29:44 +0800 CST

How can I compare an ntp server's time to my server's time?

33

I have ntpd running on a box. I want to see how the time on the box compares to the time retrieved from ntp.ubuntu.com. Is there an easy way to do this?
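One way to do the comparison without changing anything on the box is a one-shot SNTP query. This is a minimal sketch, not a replacement for ntpd or ntpdate: it sends a single client packet and applies the standard NTP offset and delay formulas to the four timestamps. The function names are made up, and it assumes UDP port 123 is reachable from the box.

```python
import socket
import struct
import time

NTP_EPOCH_DELTA = 2208988800  # seconds between 1900-01-01 (NTP) and 1970-01-01 (Unix)


def to_unix(seconds: int, fraction: int) -> float:
    """Convert a 64-bit NTP timestamp (32.32 fixed point) to Unix time."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """t1 = client send, t2 = server receive, t3 = server send, t4 = client receive."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def query(host: str = "ntp.ubuntu.com", timeout: float = 5.0):
    """Send one SNTP client packet and return (offset, delay) in seconds."""
    packet = b"\x1b" + 47 * b"\0"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(packet, (host, 123))
        data, _ = sock.recvfrom(512)
        t4 = time.time()
    fields = struct.unpack("!12I", data[:48])
    t2 = to_unix(fields[8], fields[9])    # receive timestamp
    t3 = to_unix(fields[10], fields[11])  # transmit timestamp
    return offset_and_delay(t1, t2, t3, t4)
```

Calling query() returns the offset in seconds. A positive offset means the server's clock is ahead of the local one.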

2 revs, user640

Asked: 2009-06-19 11:46:32 +0800 CST

Is anyone using GPS for time sync?

37

I've come across a handful of GPS NTP servers, as well as some inexpensive solutions using off-the-shelf receivers and software. Right now I'm just using NTP with a list of servers over the Internet. What is the advantage to using GPS instead (considering the alternative is free)?

Jeff Atwood

Asked: 2009-05-01 10:16:30 +0800 CST

Windows Server unable to synchronize NTP time reliably

41

Why does Windows Server (2008, in this case, but I've seen the same problem in 2003) seem to have problems synchronizing time? I've seen this error in my System log across a variety of servers:

The time service has not synchronized the system time for 86400 seconds because none of the time service providers provided a usable time stamp. The time service will not update the local system time until it is able to synchronize with a time source. If the local system is configured to act as a time server for clients, it will stop advertising as a time source to clients. The time service will continue to retry and sync time with its time sources. Check system event log for other W32time events for more details. Run 'w32tm /resync' to force an instant time synchronization.

Under Control Panel, Date and Time, the Internet Time settings are set to synchronize with time-nw.nist.gov; the last successful sync was 2 days ago, indicating there's some kind of problem. But if I click the "update now" button on that dialog, it does indeed update the time!

So why can't windows server reliably time sync via NTP in the background without me manually intervening? What am I doing wrong?
