I'm trying to import ~400M docs into Elasticsearch from CouchDB using the CouchDB river plugin. Everything starts out great, with an indexing rate of around 5k docs/s, but when I come back after a few hours it has dropped to around 20 docs/s. The system runs on a beefy box, an x1.xlarge, and all it's doing is Elasticsearch. The index has 20 shards with no replicas to help with indexing, and index refreshing is disabled. Heap is set to use 65% of memory, and we are using the latest Java 7 from Oracle.
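For reference, here is a minimal sketch of the index settings described above, written as a Python dict that would be sent as the index-creation body. The helper name `build_index_settings` is made up for illustration, and this assumes the standard `number_of_shards`, `number_of_replicas` and `refresh_interval` index settings; the body actually used on our cluster may differ in detail.

```python
import json


def build_index_settings(shards=20, replicas=0, refresh_interval="-1"):
    """Return an index-creation body for a bulk-import-friendly index.

    Hypothetical helper: the defaults mirror the setup described in the
    post (20 shards, no replicas, refresh disabled), not a recommendation.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                # "-1" disables the periodic refresh during the import
                "refresh_interval": refresh_interval,
            }
        }
    }


body = build_index_settings()
print(json.dumps(body, indent=2))
```

Once the import finishes, `refresh_interval` and `number_of_replicas` would presumably be set back to normal values through the index settings update API.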
Which settings do I need to tune to help the initial data import? I have played with the bulk timeout and size but still can't find the sweet spot.
Any help would be great. Zuhaib