The training stage of deep learning (DL) models consists of learning numerous dense floating-point weight matrices, which results in a massive amount of floating-point computations during inference. Research has shown that many of those computations can be skipped by forcing some weights to be zero, with little impact on the final accuracy. In parallel, previous posts have shown that…
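The idea in this excerpt, zeroing weights so their multiply-accumulates can be skipped, is commonly realized as magnitude pruning. Below is a minimal NumPy sketch of that general technique, not the specific method the post goes on to describe; the `magnitude_prune` helper and the `sparsity` parameter are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights.

    `sparsity` is the fraction of entries forced to zero
    (e.g., 0.5 removes the half with the smallest absolute value).
    """
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value over the flattened array.
    threshold = np.partition(np.abs(weights), k - 1, axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

# Example: prune a random dense layer to roughly 50% sparsity.
w = np.random.randn(256, 256).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.5)
print(f"zeros: {np.mean(w_sparse == 0):.2%}")
```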
This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. When deploying a neural network, it's useful to think about how the network could be made to run faster or take less space. A more efficient network can make better…
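Since the excerpt introduces TensorRT 8.0 deployment, here is a hedged sketch of one common path: building an FP16 engine from an ONNX file with the TensorRT 8.x Python API. The file names and the 1 GiB workspace size are placeholder assumptions, and the linked post may use a different workflow.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# Explicit-batch network definition, as required for ONNX models.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # allow reduced precision for speed
config.max_workspace_size = 1 << 30   # 1 GiB scratch space (assumed)

# Serialize the optimized engine for later deployment.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```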
Deep neural networks achieve outstanding performance in a variety of fields, such as computer vision, speech recognition, and natural language processing. The computational power needed to process these neural networks is rapidly increasing, so efficient models and computation are crucial. Neural network pruning, removing unnecessary model parameters to yield a sparse network, is a useful way to…
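As a concrete example of pruning a model to obtain a sparse network, the sketch below uses PyTorch's built-in `torch.nn.utils.prune` utilities; the layer shape and the 30% pruning amount are illustrative assumptions, not values from the post.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(512, 512)  # assumed example layer

# Zero the 30% of weights with the smallest L1 magnitude.
prune.l1_unstructured(layer, name="weight", amount=0.3)
print(f"sparsity: {(layer.weight == 0).float().mean():.2%}")

# Make the pruning permanent (folds the mask into the weight tensor).
prune.remove(layer, "weight")
```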
Organizations of all kinds are incorporating AI into their research, development, product, and business processes. This helps them meet and exceed their particular goals, and also helps them gain experience and knowledge to take on even bigger challenges. However, traditional compute infrastructures aren't suitable for AI due to slow CPU architectures and varying system requirements for different…
Recent work has demonstrated that larger language models dramatically advance the state of the art in natural language processing (NLP) applications such as question-answering, dialog systems, summarization, and article completion. However, during training, large models do not fit in the available memory of a single accelerator, requiring model parallelism to split the parameters across multiple…
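One flavor of the model parallelism the excerpt mentions is tensor parallelism, where a single weight matrix is sharded across devices. The toy PyTorch sketch below splits a linear layer by output columns; the class name, shapes, and CPU fallback are assumptions for illustration, not the production implementation such posts describe.

```python
import torch
import torch.nn as nn

class ColumnParallelLinear(nn.Module):
    """Toy tensor parallelism: split a linear layer's output columns
    across devices so each holds only a shard of the weight matrix."""

    def __init__(self, in_features, out_features, devices):
        super().__init__()
        assert out_features % len(devices) == 0
        shard = out_features // len(devices)
        self.devices = devices
        self.shards = nn.ModuleList(
            nn.Linear(in_features, shard).to(d) for d in devices
        )

    def forward(self, x):
        # Each device computes its output slice; gather on the first device.
        outs = [m(x.to(d)) for m, d in zip(self.shards, self.devices)]
        return torch.cat([o.to(self.devices[0]) for o in outs], dim=-1)

# Example (falls back to CPU if fewer than two GPUs are available):
devs = ([f"cuda:{i}" for i in range(2)]
        if torch.cuda.device_count() >= 2 else ["cpu", "cpu"])
layer = ColumnParallelLinear(1024, 4096, devs)
y = layer(torch.randn(8, 1024))
print(y.shape)  # torch.Size([8, 4096])
```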
Today, during the 2020 NVIDIA GTC keynote address, NVIDIA founder and CEO Jensen Huang introduced the new NVIDIA A100 GPU based on the new NVIDIA Ampere GPU architecture. This post gives you a look inside the new A100 GPU, and describes important new features of NVIDIA Ampere architecture GPUs. The diversity of compute-intensive applications running in modern cloud data centers has driven…
Gradient boosting is a powerful machine learning algorithm used to achieve state-of-the-art accuracy on a variety of tasks such as regression, classification, and ranking. It has achieved notice in machine learning competitions in recent years by "winning practically every competition in the structured data category". If you don't use deep neural networks for your problem, there is a good chance…
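To make the excerpt's claim concrete, here is a short, self-contained gradient boosting example using XGBoost on synthetic regression data; the dataset and hyperparameters are illustrative assumptions, not values from the post. GPU acceleration is available through XGBoost's `tree_method`/`device` options, which vary by version.

```python
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic regression task (assumed for illustration).
X, y = make_regression(n_samples=10_000, n_features=20, noise=0.1,
                       random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient boosting: fit trees sequentially, each correcting the
# residual errors of the ensemble built so far.
model = xgb.XGBRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=6,
)
model.fit(X_train, y_train)
print("R^2:", model.score(X_test, y_test))
```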