In the past, I have noticed that most of the computers on my local network do not agree on the time; they can differ from one another by up to a second or so.
So I want to make sure that a stack of servers (say 48 1U computers) can all have the exact same clock. I know I can use NTP for this, and I know I can have one server get its time from an atomic clock and have the others synchronize to that one server.
I have one main concern with that technique though: if that one server breaks, my time synchronization stops... not good at all.
Is there a proper way to make sure 48 computers all have their clocks synchronized with an accuracy of about 0.5µs? If not 0.5µs, what can we hope for? (e.g. 0.5ms?)
NTP can use a pool of servers (e.g. http://www.pool.ntp.org/en/ but you could build your own) to avoid the one-server-is-down case, and the servers can also maintain a local clock if for some reason all of their parent servers are unavailable.
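To make the "several sources plus a local fallback" idea concrete, here is a minimal sketch that renders the relevant `ntp.conf` lines. It assumes the classic `ntpd` daemon; the helper name `render_ntp_conf` and the fallback stratum of 10 are illustrative choices of mine, not something from the answer above. `127.127.1.0` is ntpd's undisciplined local clock driver; newer setups may prefer orphan mode instead.

```python
def render_ntp_conf(upstreams, fallback_stratum=10):
    """Return ntp.conf text with several upstream servers and a local-clock fallback.

    Assumption: classic ntpd syntax. The fallback stratum is deliberately high
    so the local clock is only used when no upstream server is reachable.
    """
    if not upstreams:
        raise ValueError("at least one upstream server is required")
    lines = [f"server {host} iburst" for host in upstreams]
    # Undisciplined local clock driver, used only as a last resort.
    lines.append("server 127.127.1.0")
    lines.append(f"fudge 127.127.1.0 stratum {fallback_stratum}")
    return "\n".join(lines) + "\n"


# The public pool exposes the names 0.pool.ntp.org through 3.pool.ntp.org.
POOL = [f"{i}.pool.ntp.org" for i in range(4)]
conf = render_ntp_conf(POOL)
print(conf)
```

With four upstream entries, losing any single server still leaves three sources for the daemon to compare.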
I actually found a blog post with the information I was looking for. They mention that the computers in their cluster are properly synchronized to within 1ms using NTP.
It looks like PTP, as suggested by Michael Hampton, follows the same strategy: it uses one computer, the grandmaster, as the source for time synchronization, as opposed to trying to get the correct absolute time on every computer. As a result, if the grandmaster is off by 10ms from what the world considers absolute real time, all nodes will be off by 10ms too, but they will still agree with each other.
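The point about the grandmaster's error can be checked with a little arithmetic. The sketch below is only an illustration: the per-node synchronization errors are made-up numbers, and the function names are hypothetical. It shows that the grandmaster's absolute error shifts every node equally, so the spread between nodes does not change.

```python
def node_offsets(grandmaster_error_ms, sync_errors_ms):
    """Offset of each node from true time: grandmaster error plus the node's own sync error."""
    return [grandmaster_error_ms + e for e in sync_errors_ms]


def max_spread(offsets_ms):
    """Largest disagreement between any two nodes."""
    return max(offsets_ms) - min(offsets_ms)


# Made-up per-node errors relative to the grandmaster, in milliseconds.
sync_errors = [0.25, -0.5, 0.125, 0.0]

offsets_bad_gm = node_offsets(10.0, sync_errors)   # grandmaster is 10ms off
offsets_good_gm = node_offsets(0.0, sync_errors)   # grandmaster is perfect

spread_bad_gm = max_spread(offsets_bad_gm)
spread_good_gm = max_spread(offsets_good_gm)
print(spread_bad_gm, spread_good_gm)
```

Every node ends up roughly 10ms away from real time in the first case, yet the nodes are exactly as close to each other as in the second.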
The solution proposed in that document is to:
1) Set up one computer to retrieve absolute time with NTP. If that one computer goes down, the clocks may start drifting, but they will not become inaccurate relative to each other; they will only drift compared to absolute real time.
On this computer (the grandmaster), you use `server` definitions. Also set up this computer as an NTP server, say `local.ntp`.
2) Set up the other computers as peers.
You do not need to have all 48 computers connected to each other. Instead, each computer would have between 3 and 5 peers, with each computer using a slightly different set (c1, c2, c3, then c2, c3, c4, etc.). As a result you get a peer-to-peer network whose members synchronize with each other as closely as possible, with a few computers (3 to 5) linking to the node defined in (1), i.e. `local.ntp`, to get the time as close as possible to real time.
The `local.ntp` reference can itself be viewed as a peer (you may even be able to make it a peer?).
P.S. The use of `restrict` is strongly advised when using `peer` on a semi-public network to prevent others from accessing your NTP network.