I have a 3-node replica set (a primary and two secondaries), all running MongoDB 4.0.18 on the MMAPv1 storage engine. I am trying to switch the replica set over to WiredTiger.
I read through the MongoDB tutorial on how to Change Replica Set to WiredTiger. That tutorial explains how to convert each node in situ: take it offline, reconfigure it, bring it back online. I am not following those instructions as-is; instead, I want to introduce new nodes to the replica set and (when all seems well) decommission the older nodes from the set.
I launched a new AWS EC2 instance with MongoDB configured for WiredTiger and manually added it to the replica set, following the Add Members to a Replica Set tutorial. (In essence: `rs.add({ host: ip_of_new_instance + ":27017", priority: 0, votes: 0 })`.)
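For reference, the new member's settings can be confirmed from the primary with standard shell helpers:

```
// Verify the new member was added with the intended priority/votes.
rs.conf().members.forEach(function (m) {
    print(m.host + "  priority=" + m.priority + "  votes=" + m.votes);
});
```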
The new node switches state from `OTHER` to `STARTUP2`, populates its `dbPath` directory with many new `collection-*` and `index-*` files, and eventually switches state to `SECONDARY`. All looks well. I can see all of the collections and documents via the `mongo` shell when running `mongo db_name` from the new node, and I can still access the primary by running `mongo 'mongodb://username:[email protected]:27017/db_name?authSource=admin&replicaSet=rs0'`.
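(I watched the state transitions with something like this, run from any member's shell; the fields are standard `rs.status()` output:)

```
// Print each member's host and current replication state.
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr);   // e.g. OTHER, STARTUP2, SECONDARY
});
```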
HOWEVER, the moment the new node transitions from `STARTUP2` to `SECONDARY`, my application starts to fail, reporting this Mongo error:
`Cache Reader No keys found for HMAC that is valid for time: { ts: Timestamp(1591711351, 1) } with id: 6817586637606748161`
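From what I've read, the HMAC keys used to sign cluster times are stored in the `admin.system.keys` collection, so one check I can think of (I'm not certain it's the right one) is whether that collection actually contains a key matching the id in the error:

```
// Inspect the cluster-time signing keys (requires read access to "admin").
var adminDb = db.getSiblingDB("admin");
adminDb.system.keys.find().pretty();
// Each document's _id is a key id; compare against the id in the error
// message (6817586637606748161).
```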
I have not been able to reproduce this Mongo error outside of the application (Rocket.Chat, built on the Meteor framework). Perhaps the problem lies there. Or perhaps the application is doing something I haven't tried from the `mongo` shell, e.g. tailing the oplog. [Update: I tried that, but I'm not sure I'm doing it right: `db.oplog.rs.find().tailable({ awaitData: true })` returns a dozen documents before prompting for `it`; see the sketch below.]
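In case it matters how I tailed it, this is roughly what I ran (a rough sketch, starting from the newest oplog entry; I'm not sure it mirrors what the application's driver does):

```
// Rough attempt at tailing the oplog from the mongo shell.
var oplog = db.getSiblingDB("local").oplog.rs;
var last = oplog.find().sort({ $natural: -1 }).limit(1).next();
var cursor = oplog.find({ ts: { $gt: last.ts } }).tailable({ awaitData: true });
while (cursor.hasNext()) {
    printjson(cursor.next());
}
```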
If, however, I start the new node's process from scratch and change just one thing (set `storage.engine` to `mmapv1` instead of `wiredTiger`), then all works well and my application functions properly. I don't know why the application works when all nodes are running `mmapv1` but fails when there is a `wiredTiger` node, especially since the storage engine is supposed to be internal to each node and opaque to the client.
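(To rule out confusion about which node runs which engine, each node reports its engine in `db.serverStatus()`:)

```
// Run on each node to confirm its storage engine.
db.serverStatus().storageEngine.name   // "wiredTiger" or "mmapv1"
```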
I notice a strange discrepancy between running `mmapv1` and `wiredTiger`. The node running `wiredTiger` includes two keys (`operationTime` and `$clusterTime`) in the response to certain commands (e.g. `db.adminCommand({ getParameter: '*' })`). None of the `mmapv1` nodes (new or old) include those keys in their responses. Since the Mongo error message in my application's logs includes a reference to time, I'm very suspicious that the presence of `$clusterTime` only on the `wiredTiger` node is somehow related to the underlying problem.
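For concreteness, this is the kind of comparison I ran on each node:

```
// Run on each node and compare: does the response carry cluster-time fields?
var res = db.adminCommand({ getParameter: '*' });
print("operationTime present: " + res.hasOwnProperty("operationTime"));
print("$clusterTime present:  " + res.hasOwnProperty("$clusterTime"));
```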
I'm not sure how to troubleshoot this. I've been googling for solutions, but I have not found any strong leads, only a few references to that error message, none of which seem entirely on target:
- https://stackoverflow.com/questions/60876115/error-while-converting-a-mongodb-cluster-into-a-replica-set
- https://developer.mongodb.com/community/forums/t/error-while-converting-a-cluster-into-a-replica-set/2022 (duplicate of above)
- https://jira.mongodb.org/browse/SERVER-32845 "Arbiter fails when receiving an isMaster command with a $clusterTime"
- https://jira.mongodb.org/browse/SERVER-33947 "Arbiter replies "No keys found for HMAC that is valid for time" to isMaster with clusterTime"
- https://jira.mongodb.org/browse/SERVER-32639 "Arbiters in standalone replica sets can't sign or validate clusterTime with auth on once FCV checks are removed" (the previous two are considered duplicates of this one, although it does not contain the error message itself)