SnapOverflow


Questions tagged [leapsecond] (server)

U. Windl
Asked: 2022-02-10 02:03:35 +0800 CST

Can I force a re-read of the leap-seconds file immediately?

  • 0

After realizing that my leap-seconds file had expired, I updated it. However, ntpq still reports expire=202112280000, so ntpd has not re-read the updated file.

I suspect that the file is being checked once per day only.

Thus the question: is there an ntpq configuration command that forces an immediate re-read of the leap-seconds file?

I'm using ntp 4.2.8p15 on Linux.

Update:

As a matter of fact, the new leap-seconds file was read one hour after the last expiration log message (I have expire=202206280000 now), but still I'd like to see an answer. So it seems the file is checked every hour.

ntpd leapsecond
  • 0 Answers
  • 92 Views
amosk
Asked: 2017-01-14 13:51:12 +0800 CST

How to fix leap second sleep issue without reboot

  • 5

After the latest leap second insertion (2016-12-31 23:59:60), our CentOS 7 application, whose worker threads sleep for 1 second between jobs, started waking those threads immediately instead of after one second. In general, every sleep now wakes up 1 second earlier than expected.

The simplest and working solution is to reboot the box. But that is not desirable in our case. Is there a way to fix this without rebooting?

PS. For reference, here's a simple program in C++ that reproduces the issue.

#include <boost/date_time.hpp>
#include <boost/thread.hpp>
#include <cstdlib>
#include <iostream>

using namespace std;


// this has to be run in a thread to be able to detect the issue
void check_thread()
{
    // signed values, so that the subtraction below cannot wrap around
    long long expected_delay = 1000;
    cout << "Expected delay: " << expected_delay << " ms" << endl;
    boost::posix_time::ptime t1 = boost::posix_time::microsec_clock::universal_time();
    boost::this_thread::sleep(boost::posix_time::milliseconds(1000));
    boost::posix_time::ptime t2 = boost::posix_time::microsec_clock::universal_time();
    long long actual_delay = (t2 - t1).total_milliseconds();
    cout << "Actual delay: " << actual_delay << " ms" << endl;
    // a sleep that returns almost immediately differs from the expected delay by ~1000 ms
    if (std::llabs(expected_delay - actual_delay) > 900) {
        cout << "Too big delay difference: " << (expected_delay - actual_delay) << endl;
        cout << "Possible leap second issue" << endl;
    }
    else {
        cout << "No issues found" << endl;
    }
}

int main()
{
    boost::thread_group g;
    g.create_thread(check_thread);
    g.join_all();
    return 0;
}

Building:

g++ sleep_test.cpp -Wl,-Bstatic -lboost_thread -lboost_system -lboost_date_time -Wl,-Bdynamic -rdynamic -pthread
linux leapsecond
  • 2 Answers
  • 202 Views
Diomidis Spinellis
Asked: 2016-12-26 12:44:20 +0800 CST

How do I configure a Unix system to run on TAI time?

  • 5

I want to configure a Unix system to run on International Atomic Time (TAI) in order to be able to see the year-end leap second properly reported as 2016-12-31 23:59:60. I know this will cause the system's timestamps to be incompatible with POSIX ones, but I'm doing this as an experiment. I have already copied the timezone file from /usr/share/zoneinfo/right/ to /etc/localtime. These are my questions.

  • How can I accurately set the system's time? I understand that it must be set to TAI seconds, rather than UTC seconds. Is it possible to do this via NTP? Currently, the system displays the time 36 seconds off from the correct one.
  • Will the displayed time continue to be correct after 2017-02-01? Do the zoneinfo/right timezone files need to be updated?
ntp leapsecond
  • 2 Answers
  • 1715 Views
User402841
Asked: 2012-07-16 08:31:33 +0800 CST

Fit-PC bricked due to leap second, how to prevent the second one from failing?

  • 3

I've got three Fit-PCs in use. They are being used as light-weight Linux servers. Unfortunately, on Jun 30, the first of them failed to start due to the leap-second bug. I tried rebooting it a few times, but the screen remained blank after the third bootup-attempt. This appeared to be hardware-related and we took it to a repair-man. He told us something had overheated and that the motherboard was broken. He was able to recover the data, but the fit-pc was written off.

The second Fit-PC was unable to reboot a few days later (first time we actually tried to reboot). With apparently sheer luck, it rebooted on the third attempt, and it is now working fine.

The third Fit-PC had not given any problems. When I found out the other ones failed due to the Leap-Second, I actually thought we were lucky with this third one. Fact is, the recent slowness of the server was most likely due to this same bug, and now that I rebooted this machine (first time after Jun 30), it's giving me the exact same symptoms as the other ones. These symptoms are:

  • Initial reboot attempt fails; OS does not load.
  • I connect a screen to see what is going on. Remains black.
  • I reboot again. I now see the regular loading screen ("Intel Atom..."), but this freezes.
  • I try to reboot again.
  • Screen now simply does not activate at all. It does not show any sign of life. The monitor simply acts as if nothing is sending any signal, so I have no way to interact with the CPU whatsoever.

I've tried rebooting about 4 times now, and I very much fear the same problem as before. Where I live, Fit-PCs are uncommon, and I am not sure there are qualified techs who actually know how to repair them (I am not even sure the other tech's diagnosis was correct). So I am asking: do you also think the motherboard overheated and this is yet another bricked Fit-PC, or is there something else I can try?

EDIT: Using Ubuntu 12.04 on all of the Fit-PCs.

EDIT:

I also considered a power-failure. But there are a few inconsistencies:

  • the servers are on three different sites,
  • no power surge was reported and no other hardware was affected - weather was sunny and calm,
  • the only similarity between the three machines was that they started acting odd ever since Jun 30 (the third one was having high loads but I failed to recognize this until the first reboot since Jun 30, which I did today).

I could also not find other Fit-PCs affected by the leap-second, but am simply not sure what else could cause this...

linux leapsecond
  • 3 Answers
  • 1041 Views
Bron Gondwana
Asked: 2012-07-01 08:15:09 +0800 CST

Anyone else experiencing high rates of Linux server crashes during a leap second day?

  • 363
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.

*NOTE: if your server still has issues due to confused kernels, and you can't reboot, the simplest solution proposed, with GNU date installed on your system, is: date -s now. This will reset the kernel's internal "time_was_set" variable and fix the CPU-hogging futex loops in Java and other userspace tools. I have straced this command on my own system and confirmed it's doing what it says on the tin.*

POSTMORTEM

Anticlimax: only thing that died was my VPN (openvpn) link to the cluster, so there was an exciting few seconds while it re-established. Everything else was fine, and starting up ntp went cleanly after the leap second had passed.

I have written up my full experience of the day at http://blog.fastmail.fm/2012/07/03/a-story-of-leaping-seconds/

If you look at Marco's blog at http://my.opera.com/marcomarongiu/blog/2012/06/01/an-humble-attempt-to-work-around-the-leap-second - he has a solution for phasing the time change over 24 hours using ntpd -x to avoid the 1 second skip. This is an alternative smearing method to running your own ntp infrastructure.


Just today, Sat June 30th, 2012, starting soon after the start of the day GMT, we've had a handful of servers in different datacentres, managed by different teams, all go dark: not responding to pings, screen blank.

They're all running Debian Squeeze - with everything from stock kernel to custom 3.2.21 builds. Most are Dell M610 blades, but I've also just lost a Dell R510 and other departments have lost machines from other vendors too. There was also an older IBM x3550 which crashed and which I thought might be unrelated, but now I'm wondering.

The one crash which I did get a screen dump from said:

[3161000.864001] BUG: spinlock lockup on CPU#1, ntpd/3358
[3161000.864001]  lock: ffff88083fc0d740, .magic: dead4ead, .owner: imapd/24737, .owner_cpu: 0

Unfortunately the blades all supposedly had kdump configured, but they died so hard that kdump didn't trigger - and they had console blanking turned on. I've disabled console blanking now, so fingers crossed I'll have more information after the next crash.

Just want to know if it's a common thread or "just us". It's really odd that they're different units in different datacentres bought at different times and run by different admins (I run the FastMail.FM ones)... and now even different vendor hardware. Most of the machines which crashed had been up for weeks/months and were running 3.1 or 3.2 series kernels.

The most recent crash was a machine which had only been up about 6 hours running 3.2.21.

THE WORKAROUND

Ok people, here's how I worked around it.

  1. disabled ntp: /etc/init.d/ntp stop
  2. created http://linux.brong.fastmail.fm/2012-06-30/fixtime.pl (code stolen from Marco, see blog posts in comments)
  3. ran fixtime.pl without an argument to see that there was a leap second set
  4. ran fixtime.pl with an argument to remove the leap second

NOTE: depends on adjtimex. I've put a copy of the squeeze adjtimex binary at http://linux.brong.fastmail.fm/2012-06-30/adjtimex — it will run without dependencies on a squeeze 64 bit system. If you put it in the same directory as fixtime.pl, it will be used if the system one isn't present. Obviously if you don't have squeeze 64-bit… find your own.

I'm going to start ntp again tomorrow.

As an anonymous user suggested - an alternative to running adjtimex is to just set the time yourself, which will presumably also clear the leapsecond counter.

linux debian ntp server-crashes leapsecond
  • 5 Answers
  • 152304 Views
