I have two networked voxls which I am trying to synchronize using chrony. The voxls boot up with vastly different system times, like years apart. I would expect chrony to synchronize the times when the service starts using makestep, but after starting chrony, I still observe a large difference in system time.
The configurations are as follows:
#server 10.0.0.102
makestep 1.0 3
driftfile /var/lib/chrony/drift
rtcsync
allow 10.0.0
local stratum 8
manual
logdir /var/log/chrony
#client 10.0.0.101
server 10.0.0.102 iburst maxpoll 5 prefer
makestep 1.0 3
driftfile /var/lib/chrony/drift
rtcsync
logdir /var/log/chrony
When chrony starts up, I would expect it to use makestep to synchronize the client in one fell swoop, and I do see a time adjustment in the systemctl status output:
root@voxl1:~# systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2023-02-01 21:34:52 UTC; 83 years 0 months ago
Process: 3086 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 3088 (chronyd)
CGroup: /system.slice/chronyd.service
└─3088 /usr/sbin/chronyd
Feb 01 21:34:52 voxl1 systemd[1]: Starting NTP client/server...
Feb 01 21:34:52 voxl1 chronyd[3088]: chronyd version 2.4 starting (+CMDMON +NTP +REFCLOCK +RTC -PRIVDROP -...EBUG)
Feb 01 21:34:52 voxl1 chronyd[3088]: Frequency -0.681 +/- 0.232 ppm read from /var/lib/chrony/drift
Feb 01 21:34:52 voxl1 systemd[1]: Started NTP client/server.
Feb 01 21:34:56 voxl1 chronyd[3088]: Selected source 10.0.0.102
Feb 01 21:34:56 voxl1 chronyd[3088]: System clock wrong by 2619696428.415401 seconds, adjustment started
Feb 07 11:02:04 voxl1 chronyd[3088]: System clock was stepped by 2619696428.415401 seconds
If I use chronyc tracking or chronyc sources to observe the time offsets, the report indicates that the times are synchronized within 100 microseconds.
root@voxl1:~# chronyc tracking
Reference ID : 10.0.0.102 (10.0.0.102)
Stratum : 9
Ref time (UTC) : Sun Feb 07 11:08:34 2106
System time : 0.000066503 seconds slow of NTP time
Last offset : -0.000076736 seconds
RMS offset : 0.000044063 seconds
Frequency : 0.785 ppm slow
Residual freq : -0.216 ppm
Skew : 0.987 ppm
Root delay : 0.004293 seconds
Root dispersion : 0.000069 seconds
Update interval : 129.8 seconds
Leap status : Normal
root@voxl1:~# chronyc sources -v
210 Number of sources = 1
.-- Source mode '^' = server, '=' = peer, '#' = local clock.
/ .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| / '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
|| .- xxxx [ yyyy ] +/- zzzz
|| Reachability register (octal) -. | xxxx = adjusted offset,
|| Log2(Polling interval) --. | | yyyy = measured offset,
|| \ | | zzzz = estimated error.
|| | | \
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 10.0.0.102 8 6 377 46 -77us[ -109us] +/- 1953us
However, if I then print the date, it doesn't match the time server at all.
client 10.0.0.101
root@voxl1:~# date
Sun Feb 7 11:12:00 UTC 2106
server 10.0.0.102
root@voxl2:~# date
Thu Jan 1 04:43:02 UTC 1970
I then tried to trigger a manual chronyc makestep, but that also didn't seem to have an effect.
Why is my date not the same? Did makestep work as intended? Is there a limit to how far chronyc makestep can step the clock?
Edit: I have a hypothesis, but I don't know how to test it. I think I might be seeing an underflow error. January 1, 1970 is the Unix Epoch. My hypothesis is that when chrony first tries to synchronize the client on startup, it makes an underflow error, and I see the systemctl message
Feb 01 21:34:56 voxl1 chronyd[3088]: System clock wrong by 2619696428.415401 seconds, adjustment started
Feb 07 11:02:04 voxl1 chronyd[3088]: System clock was stepped by 2619696428.415401 seconds
That incorrect step pushes the client to 2106, and chrony now thinks it's synchronized with the server which is why further makesteps have no effect and the offset appears small.
Any ideas how to test this hypothesis?
Yes, there is a limit. The same limit that means NTP will roll over in the year 2036.
The NTP timestamp format uses 32 bits of seconds (plus 32 bits of fractional seconds), so it wraps every 2^32 seconds, or about 136 years. One such span is known as an NTP era. Half of that, plus or minus 68 years, is the largest time delta that is safe without the implementation making assumptions about which era you are in.
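The era figures can be reproduced with a short calculation. This is only a worked example, not anything chrony runs; the 365.2425-day average year is my own choice for converting seconds to years.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_YEAR = 365.2425 * 86400  # assumed average (Gregorian) year

era_seconds = 2 ** 32                # range of the 32-bit seconds field
era_years = era_seconds / SECONDS_PER_YEAR
half_era_years = era_years / 2

# NTP era 0 starts at 1900-01-01 00:00:00 UTC and ends one era later.
era0_start = datetime(1900, 1, 1, tzinfo=timezone.utc)
era0_end = era0_start + timedelta(seconds=era_seconds)

print(f"one era:  {era_years:.1f} years")       # one era:  136.1 years
print(f"half era: {half_era_years:.1f} years")  # half era: 68.1 years
print(era0_end)                                 # 2036-02-07 06:28:16+00:00
```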
In practice, implementations are more conservative and assume the era changed before the limits of the data structure are reached. chrony's configure script defaults to treating 50 years before its build date as the start of the assumed era. In other words, for about three years now, 1970 has been assumed to be in a different NTP era. It isn't, but usually one can assume that clocks have been set sometime in the past five decades.
chrony calculated a delta that would put the time before this era, so it assumed the era had rolled over and did the math mod 136 years. This year minus 1970 is 53 years. 136 minus 53 is 83, which is the enormous offset of about 83 years in your output.
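As a rough illustration of that era assumption, here is a minimal sketch of mapping a timestamp into the 2^32-second window that starts at a pivot date. It is not chrony's actual code, and the pivot (`ERA_SPLIT`, set here to 1971-01-01) is a hypothetical value; the real one depends on when your chronyd binary was built.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ERA_SECONDS = 2 ** 32

# Hypothetical pivot: 50 years before an assumed build date of 2021-01-01.
ERA_SPLIT = int(datetime(1971, 1, 1, tzinfo=timezone.utc).timestamp())

def unwrap(unix_seconds: int, split: int = ERA_SPLIT) -> int:
    """Map a timestamp into the window [split, split + 2**32)."""
    return split + (unix_seconds - split) % ERA_SECONDS

# The server's clock from the question: 1970-01-01 04:43:02 UTC.
server = int(datetime(1970, 1, 1, 4, 43, 2, tzinfo=timezone.utc).timestamp())
interpreted = unwrap(server)
as_date = EPOCH + timedelta(seconds=interpreted)

print(interpreted - server == ERA_SECONDS)  # True: shifted by exactly one era
print(as_date)                              # 2106-02-07 11:11:18+00:00
```

Any timestamp earlier than the pivot gets pushed forward by a whole era, which is how a 1970 server clock can be read as a date in 2106.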
A different way to see that this is an NTP era issue is to compare the server and client timestamps. Convert both to UNIX epoch seconds (date +%s from GNU coreutils), subtract 2^32 from the larger, then subtract the smaller, and they are only 42 apart.
A server time in 1970 is extraordinary. As of 2023, we are more than 1.6 billion seconds past 1970-01-01 00:00:00 UTC.
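The era comparison of the two date outputs can be checked in Python, which also gives a way to test the hypothesis in the question. The sketch uses only values copied from the question; I am assuming the journal timestamps are in UTC and I drop the fractional part of the logged step.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ERA_SECONDS = 2 ** 32

# The two `date` outputs from the question, as UNIX epoch seconds.
client = int(datetime(2106, 2, 7, 11, 12, 0, tzinfo=timezone.utc).timestamp())
server = int(datetime(1970, 1, 1, 4, 43, 2, tzinfo=timezone.utc).timestamp())
diff = (client - ERA_SECONDS) - server
print(diff)  # 42

# Client clock when the error was logged, plus the logged step.
before = int(datetime(2023, 2, 1, 21, 34, 56, tzinfo=timezone.utc).timestamp())
step = 2619696428  # whole seconds of 2619696428.415401
after = before + step

print(EPOCH + timedelta(seconds=after))                # 2106-02-07 11:02:04+00:00
print(EPOCH + timedelta(seconds=after % ERA_SECONDS))  # 1970-01-01 04:33:48+00:00
```

The stepped time matches the "Feb 07 11:02:04" log line, and taken mod 2^32 it lands in the early hours of 1970-01-01, a few minutes before the server's later date output. That is consistent with the client following the server exactly, one era ahead.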
Use a real time clock. Whatever the client started with before it got stepped seems reasonable. It does not need to be accurate; getting the decade correct would be an improvement. Even if the hardware or software starts with a hard-coded date, that is correctable, similar to an RTC with a dead battery.
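If you want to catch a badly wrong clock before chronyd starts, a plausibility check along these lines is one option. It is a sketch only: `FLOOR`, `WINDOW`, and `clock_is_plausible` are names I made up, and the floor date is an assumption you would replace with something like your image build date.

```python
import time
from datetime import datetime, timezone

# Hypothetical floor: a date known to be in the past when the image was built.
FLOOR = datetime(2023, 1, 1, tzinfo=timezone.utc).timestamp()
# Hypothetical window: accept clocks up to about 50 years after the floor.
WINDOW = 50 * 365.2425 * 86400

def clock_is_plausible(now: float, floor: float = FLOOR) -> bool:
    """True if `now` falls within [floor, floor + WINDOW)."""
    return floor <= now < floor + WINDOW

print(clock_is_plausible(16982))           # server's 1970 clock: False
print(clock_is_plausible(2 ** 32 + 17024)) # client's 2106 clock: False
print(clock_is_plausible(time.time()))     # depends on the machine running this
```

A boot script could use a check like this to set the clock to the floor date before starting chronyd, which is enough to get the decade right.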
Add more reliable time sources for your NTP server. If you have internet access, add pool 2.pool.ntp.org to chrony.conf. And, with a clear view of the sky, satellite navigation antennas can add accurate clocks without needing to go over IP.