On an HPC Lustre filesystem, we occasionally see stalls where even opening a terminal and typing "ls" can take minutes to return. Any process that touches the filesystem suffers random, massive latency (though generally no actual errors), while processes that do not touch it (like dragging windows around in an X Window session) remain responsive.
What can cause Lustre to intermittently exhibit such excessive latency? (Must it be a hardware failure, or could it be a misconfiguration, a nearly full filesystem, or just a nasty usage pattern from some distributed parallel job that day?)
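To put numbers on the stalls described above, a small probe along the following lines could be run periodically against a directory on the affected filesystem. This is only a minimal sketch: the names `time_listing` and `probe` and the one-second `threshold` are hypothetical choices for illustration, and it does not diagnose the cause. It only records how long a listing takes, so slow periods can later be compared with job schedules or server logs.

```python
import os
import time


def time_listing(path):
    """Return the seconds taken to list `path` and stat each entry.

    This is roughly the metadata work that `ls -l` performs.
    """
    start = time.perf_counter()
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                entry.stat(follow_symlinks=False)
            except OSError:
                # The entry may have vanished between listing and stat; skip it.
                pass
    return time.perf_counter() - start


def probe(path, samples=5, interval=1.0, threshold=1.0):
    """Time `samples` listings of `path`, sleeping `interval` seconds between them.

    Returns (timings, slow_count), where slow_count is the number of
    listings that took longer than `threshold` seconds (an assumed cutoff).
    """
    timings = []
    for i in range(samples):
        timings.append(time_listing(path))
        if i < samples - 1:
            time.sleep(interval)
    slow_count = sum(1 for t in timings if t > threshold)
    return timings, slow_count
```

Calling `probe` on a directory in the Lustre mount and on a local directory at the same moment would show whether the delay is specific to the shared filesystem, which matches the symptom that non-filesystem activity stays responsive.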