As the computing horsepower of GPUs increases, so does the demand for input/output (I/O) throughput and the need to strong scale with low-latency, high-bandwidth communication among GPUs. A GPU-accelerated supercomputer transforms a compute-bound problem into an I/O-bound problem. In multi-node GPU clusters, slow CPU single-thread performance is in the critical path of data access from local…
]]>