Ayush Dattagupta – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ (updated 2024-12-20T21:13:45Z)

Accelerating GPU Analytics Using RAPIDS and Ray
Ayush Dattagupta, http://www.open-lab.net/blog/?p=94495 (updated 2024-12-20T21:13:45Z, published 2024-12-20T21:13:42Z)

RAPIDS is a suite of open-source GPU-accelerated data science and AI libraries that are well supported for scale-out with distributed engines like Spark and Dask. Ray is a popular open-source distributed Python framework commonly used to scale AI and machine learning (ML) applications. Ray particularly excels at simplifying and scaling training and inference pipelines and can easily target both…

Curating Trillion-Token Datasets: Introducing NVIDIA NeMo Data Curator
Ayush Dattagupta, http://www.open-lab.net/blog/?p=68797 (updated 2024-10-18T20:15:54Z, published 2023-08-08T18:33:00Z)

The latest developments in large language model (LLM) scaling laws have shown that when scaling the number of model parameters, the number of tokens used for training should be scaled at the same rate. The Chinchilla and LLaMA models have validated these empirically derived laws and suggest that previous state-of-the-art models have been under-trained regarding the total number of tokens used…

Accelerating Sequential Python User-Defined Functions with RAPIDS on GPUs for 100X Speedups
Ayush Dattagupta, http://www.open-lab.net/blog/?p=32421 (updated 2022-08-21T23:51:47Z, published 2021-06-07T15:00:00Z)

Custom “row-by-row” processing logic (sometimes called sequential User-Defined Functions) is prevalent in ETL workflows. The sequential nature of UDFs makes parallelization on GPUs tricky. This blog post covers how to implement the same UDF logic using RAPIDS to parallelize computation on GPUs and unlock 100x speedups. Typically, sequential UDFs revolve around records with the same…
