I have tried several guides on how to set up a local ntp server on ubuntu but none seem to work correctly. My servers are drifting heavily in time for some reason and I have to keep their time close together because I run databases that require this.
- I have 8 ubuntu 14.04 LTS servers, none of them has internet access
- I want to run a ntp server on one (or more if that is better) of the servers and have all other servers connect to the ntp server(s) to set the time
Currently, my server (ip .24) runs this /etc/ntp.conf:
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
# Give localhost full access rights
restrict 127.0.0.1
# Give machines on our network access to query us
restrict 192.168.178.0 mask 255.255.255.0 nomodify notrap
broadcast 192.168.178.0
And on the "clients":
# Point to our network's master time server
server 192.168.178.24 iburst
fudge 192.168.178.24 stratum 10
restrict default ignore
restrict ::1
restrict 127.0.0.1
restrict 192.168.178.24 mask 255.255.255.255 nomodify notrap noquery
driftfile /var/lib/ntp/drift
minpoll 4
maxpoll 5
Note: I have used Multi-Tabbed Putty to send the following commands to all ntp clients at the same time.
I have stopped the ntp services on all machines except the server, used sudo ntpdate 192.168.178.24 to let them fetch the date, and restarted the ntp services afterwards. This succeeded. All servers showed the same date straight after the command finished. After about 10 minutes, however, my servers show the following time:
Fr 30. Sep 11:16:53 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016 (server .24)
Fr 30. Sep 11:16:50 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
Fr 30. Sep 11:17:05 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
Fr 30. Sep 11:15:33 CEST 2016
How can I get them to sync properly to the ntp server? And how can I lower the polling interval? It looks like my servers are running out of sync fast, so I need them to retrieve the "correct" time again...
By "correct" time I mean a time that is the same for all servers. It does not necessarily need to be the exact real-world time.
Edit: I have tried the suggested configuration settings. As far as I understand, this is what my server/client configs should look like. In the meantime, I have seen that my .24 server is actually the one whose clock drifts more. The .20 server is the most accurate one, so I am now using the .20 server to host the ntp server. Sorry for the confusion.
Server config:
# Use the local clock
server 127.127.1.0 prefer
fudge 127.127.1.0
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
# Give localhost full access rights
restrict default
# Give machines on our network access to query us
restrict 192.168.178.0 mask 255.255.255.0 nomodify notrap
broadcast 192.168.178.0
For the clients:
# Point to our network's master time server
server 192.168.178.20 iburst
restrict default
driftfile /var/lib/ntp/drift
minpoll 4
maxpoll 5
Output of ntpq -c as and ntpq -c pe on the server:
ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 41906 963a yes yes none sys.peer sys_peer 3
2 41907 8811 yes none none reject mobilize 1
ntpq -c pe
remote refid st t when poll reach delay offset jitter
==============================================================================
*LOCAL(0) .LOCL. 5 l 60 64 377 0.000 0.000 0.000
192.168.178.0 .BCST. 16 u - 64 0 0.000 0.000 0.000
Five clients show output similar to this (these servers drift in time):
ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 62104 9024 yes yes none reject reachable 2
ntpq -c pe
remote refid st t when poll reach delay offset jitter
==============================================================================
hadoop20.xx LOCAL(0) 6 u 27 64 377 0.151 63591.8 33407.0
For two (most likely?) working clients:
ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 7757 963a yes yes none sys.peer sys_peer 3
ntpq -c pe
remote refid st t when poll reach delay offset jitter
==============================================================================
*hadoop20.xx LOCAL(0) 6 u 18 64 377 0.183 7.883 3.015
Edit 2:
I have run sudo service ntp stop, then sudo ntpdate 192.168.178.20, waited for ntpdate to finish, and then run sudo service ntp start on all clients. There are still only 2 succeeding clients and 5 rejecting clients.
The rejecting clients show this output. The delay + offset values look high because the failing clients drift in time. Maybe they are not trusting the server to update the time because the delay/offset is so high?
ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 20981 905a yes yes none reject sys_peer 5
ntpq -c pe
remote refid st t when poll reach delay offset jitter
==============================================================================
hadoop20.xx LOCAL(0) 6 u 34 64 3 0.166 18665.9 16201.3
I have also tried the answer at https://askubuntu.com/a/256004; it works for about 30 seconds, then the state changes to "reject" again! The same happens with ntpdate -s 192.168.178.20. It is most likely related to the ntp clients rejecting the time of the server. Is there a way to FORCE them to change the time?
Don't do this. Seriously. Just don't. People keep coming up with the idea that NTP is designed to allow a bunch of machines all to have the same time. It isn't. It's designed, quite carefully, to allow many machines to all have the closest thing they can to the correct time, which is not the same thing.
If you have access to a window, you can build a half-decent stratum-1 server for about £50, or a good one for £100. You would do much better to build something like that, then point the other clients at it. Correct timestamps are much better than merely self-consistent ones, not least for forensics.
But if you absolutely must do what you're doing, then you need to realise that you're perverting ntpd, and this will mean understanding what you're doing.
On the server, the line server 127.127.1.0 prefer means "use the local undisciplined clock as if it were authoritative", which is what you want. I'm not sure why you're forcing it to stratum 10, though; consider dropping the stratum 10 and letting the driver supply its default stratum of 5.
On the clients, the line fudge 192.168.178.24 stratum 10 makes no sense at all. fudge 127.127.x.y is reserved for forcing the use of various kinds of local clock drivers. It makes no sense to give it any other address. Drop the fudge line from the clients, and just point them at the server. You're also using a closed network, so drop all the security stuff until you get this working.
If that still doesn't seem to work, we'll need to see the output of ntpq -c as and ntpq -c pe on both the server and a badly-behaving client, after at least ten minutes of uninterrupted running.
Edit: you write in a comment below that "I think the offset/jitter is really high because the failing clients drift in time".
I think you may be right. This chap's blog suggests he had the same experience: that the client clock was so bad that it fooled the local ntpd into thinking that the server was unreliable.
Given that the clients which are failing to sync (marking the server as "reject") are the ones whose clocks drift fastest, I think you're seeing the same effect. His solution was to use adjtimex to manually tune the kernel clock (adjusting the tick value) until the system clock was less wayward, at which point ntpd had a chance to recognise the server as being OK, and sync to it. You should probably give that a try on the worst client first, and see if it helps.
I was able to get an acceptable time difference by following the steps listed below:
Steps
1. Install chrony on both machines.
2. Assuming the server IP address is 192.168.1.87, configure the client (/etc/chrony/chrony.conf) as follows:
server 192.168.1.87 iburst
keyfile /etc/chrony/chrony.keys
driftfile /var/lib/chrony/chrony.drift
log tracking measurements statistics
logdir /var/log/chrony
3. Configure the server (/etc/chrony/chrony.conf) as follows, assuming your client IP is 192.168.1.14:
keyfile /etc/chrony/chrony.keys
driftfile /var/lib/chrony/chrony.drift
log tracking measurements statistics
logdir /var/log/chrony
local stratum 8
manual
allow 192.0.0.0/24
allow 192.168.1.14
4. Restart chrony on both computers:
sudo systemctl stop chrony
sudo systemctl start chrony
5. Check the result on the client side:
5.1 sudo systemctl status chrony
5.2 chronyc tracking
You may ditch NTP completely, set the time manually on the "server" and issue this command:
Loop it over all your "client" IPs and you are done!
Explanation: local time will be "copied" to remote machine via SSH.
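A minimal sketch of such a loop, assuming key-based SSH logins and passwordless sudo on the clients. The client addresses, the DRY_RUN switch and the function names are illustrative placeholders, not taken from the answer above. The time is sent as seconds since the epoch, which GNU date accepts as @SECONDS, so time zone settings on either side do not matter.

```shell
#!/bin/sh
# Hypothetical client list; replace with your own addresses.
CLIENTS="${CLIENTS:-192.168.178.21 192.168.178.22}"

# Build the command that sets a remote clock to this machine's current
# time, expressed as seconds since the epoch (GNU date: "@SECONDS").
remote_date_cmd() {
    printf 'sudo date -s @%s' "$(date +%s)"
}

# Copy the local time to every client over SSH.
# DRY_RUN defaults to 1, which only prints the ssh commands;
# set DRY_RUN=0 to actually execute them.
push_time() {
    for ip in $CLIENTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $ip $(remote_date_cmd)"
        else
            ssh "$ip" "$(remote_date_cmd)"
        fi
    done
}

push_time
```

Run it once as-is to review the commands it would issue, then with DRY_RUN=0 to set the clocks. The timestamp is taken before the SSH connection is established, so each client ends up behind the "server" by roughly the connection setup time, and the clocks will drift apart again until the script is rerun.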