Supercomputers are the engines of groundbreaking discoveries. From predicting extreme weather to advancing disease research and designing safer, more efficient infrastructures, these machines simulate complex systems that are impractical to test in the real world due to their size, cost, and material requirements. Since the introduction of the GPU in 1999, NVIDIA has continually pushed the…
]]>The open-source llama.cpp code base was originally released in 2023 as a lightweight but efficient framework for performing inference on Meta Llama models. Built on the GGML library released the previous year, llama.cpp quickly became attractive to many users and developers (particularly for use on personal workstations) due to its focus on C/C++ without the need for complex dependencies.
]]>GPUs continue to get faster with each new generation, and it is often the case that each activity on the GPU (such as a kernel or memory copy) completes very quickly. In the past, each activity had to be separately scheduled (launched) by the CPU, and associated overheads could accumulate to become a performance bottleneck. The CUDA Graphs facility addresses this problem by enabling multiple GPU…
]]>GROMACS, a scientific software package widely used for simulating biomolecular systems, plays a crucial role in comprehending important biological processes important for disease prevention and treatment. GROMACS can use multiple GPUs in parallel to run each simulation as quickly as possible. Over the past several years, NVIDIA and the core GROMACS developers have collaborated on a series of…
]]>GROMACS, a simulation package for biomolecular systems, is one of the most highly used scientific software applications worldwide, and a key tool in understanding important biological processes including those underlying the current COVID-19 pandemic. In a previous post, we showcased recent optimizations, performed in collaboration with the core development team, that enable GROMACS to…
]]>GROMACS—one of the most widely used HPC applications— has received a major upgrade with the release of GROMACS 2020. The new version includes exciting new performance improvements resulting from a long-term collaboration between NVIDIA and the core GROMACS developers. As a simulation package for biomolecular systems, GROMACS evolves particles using the Newtonian equations of motion.
]]>The performance of GPU architectures continue to increase with every new generation. Modern GPUs are so fast that, in many cases of interest, the time taken by each GPU operation (e.g. kernel or memory copy) is now measured in microseconds. However, there are overheads associated with the submission of each operation to the GPU – also at the microsecond scale – which are now becoming significant…
]]>