Fortran – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-07-03T22:20:47Z http://www.open-lab.net/blog/feed/ Miko Stulajter <![CDATA[Using Fortran Standard Parallel Programming for GPU Acceleration]]> http://www.open-lab.net/blog/?p=48632 2023-12-05T21:53:22Z 2022-06-12T21:28:55Z Standard languages have begun adding features that compilers can use for accelerated GPU and CPU parallel programming, for instance, do concurrent loops and...]]>

Standard languages have begun adding features that compilers can use for accelerated GPU and CPU parallel programming, for instance, do concurrent loops and array math intrinsics in Fortran. This is the fourth post in the Standard Parallel Programming series, which aims to instruct developers on the advantages of using parallelism in standard languages for accelerated computing: Using standard…

Source

]]>
Michael Wolfe <![CDATA[Detecting Divergence Using PCAST to Compare GPU to CPU Results]]> http://www.open-lab.net/blog/?p=22165 2022-08-21T23:40:47Z 2020-11-18T16:00:00Z Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. The first...]]> PCAST helps to quickly isolate divergence between CPU and GPU results so you can isolate bugs or verify your results are OK even if they aren't identical.

Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. The first is testing changes to parts of a program, new compile-time flags, or a port to a new compiler or to a new processor. You might want to test whether a new library gives the same result, or test the safety of adding OpenMP parallelism��

Source

]]>
Guray Ozen <![CDATA[Accelerating Fortran DO CONCURRENT with GPUs and the NVIDIA HPC SDK]]> http://www.open-lab.net/blog/?p=22198 2023-06-12T21:13:52Z 2020-11-16T16:00:00Z Fortran developers have long been able to accelerate their programs using CUDA Fortran or OpenACC. For more up-to-date information, please read Using Fortran...]]>

Fortran developers have long been able to accelerate their programs using CUDA Fortran or OpenACC. For more up-to-date information, please read Using Fortran Standard Parallel Programming for GPU Acceleration, which aims to instruct developers on the advantages of using parallelism in standard languages for accelerated computing. Now with the latest 20.11 release of the NVIDIA HPC SDK��

Source

]]>
Brent Leback <![CDATA[Bringing Tensor Cores to Standard Fortran]]> http://www.open-lab.net/blog/?p=19380 2023-06-12T21:14:42Z 2020-08-07T19:35:38Z Tuned math libraries are an easy and dependable way to extract the ultimate performance from your HPC system. However, for long-lived applications or those that...]]>

Tuned math libraries are an easy and dependable way to extract the ultimate performance from your HPC system. However, for long-lived applications or those that need to run on a variety of platforms, adapting library calls for each vendor or library version can be a maintenance nightmare. A compiler that can automatically generate calls to tuned math libraries gives you the best of both��

Source

]]>
Brent Leback <![CDATA[Tensor Core Programming Using CUDA Fortran]]> http://www.open-lab.net/blog/?p=14140 2023-02-13T17:46:24Z 2019-04-02T13:00:36Z The CUDA Fortran compiler from PGI now supports programming Tensor Cores with NVIDIA's Volta V100 and Turing GPUs. This enables scientific programmers using...]]>

The CUDA Fortran compiler from PGI now supports programming Tensor Cores with NVIDIA's Volta V100 and Turing GPUs. This enables scientific programmers using Fortran to take advantage of FP16 matrix operations accelerated by Tensor Cores. Let's take a look at how Fortran supports Tensor Cores. Tensor Cores offer substantial performance gains over typical CUDA GPU core programming on Tesla V100…

Source

]]>
Ronald M. Caplan <![CDATA[Using OpenACC to Port Solar Storm Modeling Code to GPUs]]> http://www.open-lab.net/blog/?p=11070 2023-05-19T19:20:48Z 2018-07-16T14:20:50Z Solar storms consist of massive explosions on the Sun that can release the energy of over 2 billion megatons of TNT in the form of solar flares and Coronal Mass...]]>

Solar storms consist of massive explosions on the Sun that can release the energy of over 2 billion megatons of TNT in the form of solar flares and Coronal Mass Ejections (CMEs). CMEs eject billions of tons of magnetized plasma into space, and while most of them miss Earth entirely, there have been some in the past that would have inflicted great damage on our modern technological society had they…

Source

]]>
Brad Nemire <![CDATA[GPU-Accelerated PC Solves Complex Problems Hundreds of Times Faster Than Massive CPU-only Supercomputers]]> https://news.www.open-lab.net/?p=7578 2022-08-21T23:42:26Z 2016-07-19T20:47:18Z Russian scientists from Lomonosov Moscow State University used an ordinary GPU-accelerated desktop computer to solve complex quantum mechanics equations in just...]]>

Russian scientists from Lomonosov Moscow State University used an ordinary GPU-accelerated desktop computer to solve complex quantum mechanics equations in just 15 minutes that would typically take two to three days on a large CPU-only supercomputer. Senior researchers Vladimir Pomerantcev and Olga Rubtsova and professor Vladimir Kukulin used a GeForce GTX 670 with CUDA and the PGI CUDA Fortran…

Source

]]>
Brad Nemire <![CDATA[Performance Portability for GPUs and CPUs with OpenACC]]> http://news.www.open-lab.net/?p=6632 2022-08-21T23:41:33Z 2015-10-29T22:30:49Z New PGI compiler release includes support for C++ and Fortran applications to run in parallel on multi-core CPUs or GPU accelerators. OpenACC gives scientists...]]>

New PGI compiler release includes support for C++ and Fortran applications to run in parallel on multi-core CPUs or GPU accelerators. OpenACC gives scientists and researchers a simple and powerful way to accelerate scientific computing applications incrementally. With the PGI Compiler 15.10 release, OpenACC enables performance portability between accelerators and multicore CPUs.

Source

]]>
Paresh Kharya <![CDATA[Introducing the NVIDIA OpenACC Toolkit]]> http://www.open-lab.net/blog/parallelforall/?p=5569 2022-11-28T18:20:54Z 2015-07-13T07:01:55Z Programmability is crucial to accelerated computing, and NVIDIA's CUDA Toolkit has been critical to the success of GPU computing. Over three million CUDA...]]>

Programmability is crucial to accelerated computing, and NVIDIA's CUDA Toolkit has been critical to the success of GPU computing. Over three million CUDA Toolkits have been downloaded since its first launch. However, there are many scientists and researchers yet to benefit from GPU computing. These scientists have limited time to learn and apply a parallel programming language, and they often have…

Source

]]>
Mark Harris <![CDATA[Six Ways to SAXPY]]> http://www.parallelforall.com/?p=40 2023-02-13T18:13:03Z 2012-07-02T11:03:25Z For even more ways to SAXPY using the latest NVIDIA HPC SDK with standard language parallelism, see N Ways to SAXPY: Demonstrating the Breadth of GPU...]]>

Source

]]>
Mark Harris <![CDATA[An OpenACC Example (Part 2)]]> http://www.parallelforall.com/?p=21 2023-05-18T22:12:51Z 2012-03-26T06:39:14Z You may want to read the more recent post Getting Started with OpenACC by Jeff Larkin. In my previous post I added 3 lines of OpenACC directives to a...]]>

You may want to read the more recent post Getting Started with OpenACC by Jeff Larkin. In my previous post I added 3 lines of OpenACC directives to a Jacobi iteration code, achieving more than 2x speedup by running it on a GPU. In this post I'll continue where I left off and demonstrate how we can use OpenACC directive clauses to take more explicit control over how the compiler parallelizes our…

Source

]]>
Mark Harris <![CDATA[An OpenACC Example (Part 1)]]> http://www.parallelforall.com/?p=19 2023-05-18T22:12:40Z 2012-03-20T06:37:33Z You may want to read the more recent post Getting Started with OpenACC by Jeff Larkin. In this post I'll continue where I left off in my introductory...]]>

You may want to read the more recent post Getting Started with OpenACC by Jeff Larkin. In this post I'll continue where I left off in my introductory post about OpenACC and provide a somewhat more realistic example. This simple C/Fortran code example demonstrates a 2x speedup with the addition of just a few lines of OpenACC directives, and in the next post I'll add just a few more lines to push…

Source

]]>
Mark Harris <![CDATA[OpenACC: Directives for GPUs]]> http://www.parallelforall.com/?p=12 2022-08-21T23:36:44Z 2012-03-13T05:56:45Z NVIDIA has made a lot of progress with CUDA over the past five years; we estimate that there are over 150,000 CUDA developers, and important science is being accomplished with the help of CUDA. But we have a long way to go to help everyone benefit from GPU computing. There are many programmers who can't afford the time to learn and apply a parallel programming language. Others…

Source

]]>