I have an application that constantly pumps data into MongoDB. The MongoDB instance runs as a 2-member replica set, each member backed by a 3 TB gp2 EBS volume.
As you can see in the following graphs of

```
irate(node_disk_reads_completed_total{}[1m])
irate(node_disk_writes_completed_total{}[1m])
```

the read rate is low and steady, which is expected, but the writes seem to be under-performing.
Even at the peaks we never reach the theoretical 3 IOPS/GB × 3000 GB = 9000 IOPS. The application itself spends most of its wall-clock time pumping more and more data into the DB, so from the app's perspective the DB is the clear bottleneck.
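For reference, here is a minimal sketch of the gp2 baseline IOPS math I'm basing that 9000 figure on (per the AWS gp2 docs: 3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000; volumes of 1 TiB or less can additionally burst to 3000):

```python
def gp2_baseline_iops(size_gib: int) -> int:
    # gp2 baseline: 3 IOPS per GiB, clamped to the [100, 16000] range.
    return min(max(3 * size_gib, 100), 16_000)

print(gp2_baseline_iops(3000))  # 3 TB volume -> 9000
```

So the volume size alone should allow 9000 sustained IOPS, well below the 16,000 gp2 ceiling.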
So why can't it go faster? And why the wave pattern? I would expect that writing to the WAL (the journal) would provide a constant stream of write activity, and that things like periodic fsyncs to disk would not cause such extreme up-and-down swings.
Could it be that syncing between the replicas is causing pauses in throughput? But I am using the default write concern, which should be w: 1, j: true,
and so should not require waiting for the replicas.
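One way I tried to rule this out (a sketch, assuming mongosh against the primary on MongoDB 4.4+, where this command exists) is to ask the server which write concern it is actually applying:

```javascript
// Report the cluster-wide default read/write concern (MongoDB 4.4+).
db.adminCommand({ getDefaultRWConcern: 1 })
```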
Anything else that I might be missing?