I'm looking for a way to improve my Spark cluster's performance. I read the following at http://spark.apache.org/docs/latest/hardware-provisioning.html:
We recommend having 4-8 disks per node
I have tried with both one and two disks, but with 2 disks the execution time doubles. Can anyone explain why?
This is my configuration: a single machine with 140 GB of RAM, 2 disks, and 32 CPU cores (I know it is an unusual setup), running a standalone Spark cluster with 1 worker.
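In case it matters, this is roughly how I point Spark's scratch space at both disks. The mount points below are placeholders for my actual disk paths, and I understand that in standalone mode the SPARK_LOCAL_DIRS environment variable on the worker takes precedence over spark.local.dir set in the application:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: spread shuffle/spill directories across both disks.
    // "/mnt/disk1" and "/mnt/disk2" are hypothetical mount points.
    val spark = SparkSession.builder()
      .appName("two-disk-test")
      .master("spark://master-host:7077") // assumed standalone master URL
      .config("spark.local.dir", "/mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp")
      .getOrCreate()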