What does it take to win a Kaggle competition in 2025? In the April Playground challenge, the goal was to predict how long users would listen to a podcast—and the top solution wasn’t just accurate, it was fast. In this post, Kaggle Grandmaster Chris Deotte will break down the exact stacking strategy that powered his first-place finish using GPU-accelerated modeling with cuML. You’ll learn a…
]]>Feature engineering remains one of the most effective ways to improve model accuracy when working with tabular data. Unlike domains such as NLP and computer vision, where neural networks can extract rich patterns from raw inputs, the best-performing tabular models—particularly gradient-boosted decision trees—still gain a significant advantage from well-crafted features. However…
]]>Picture this: You’re browsing through an online store, looking for the perfect pair of running shoes. But with thousands of options available, where do you even begin? Suddenly, a section catches your eye: “Recommended for You.” Intrigued, you click and, within seconds, a curated list of running shoes tailored to your unique preferences appears. It’s as if the website understands your tastes…
]]>In recent years, transformers have emerged as a powerful deep neural network architecture that has been proven to beat the state of the art in many application domains, such as natural language processing (NLP) and computer vision. This post uncovers how you can achieve maximum accuracy with the fastest training time possible when fine-tuning transformers. We demonstrate how the cuML support…
]]>In this post, we summarize questions and answers from GTC sessions with NVIDIA’s Kaggle Grandmaster team. Additionally, we answer audience questions we did not get a chance to address during these sessions. Ahmet: I read the competition description and evaluation metric. Then I give myself several days to think about whether I have any novel ideas that I can try. If I do not have any interesting…
]]>This post was originally published on the RAPIDS AI blog. k-Nearest Neighbors classification is a straightforward machine learning technique that predicts an unknown observation by using the k most similar known observations in the training dataset. In the second row of the example pictured above, we find the seven digits 3, 3, 3, 3, 3, 5, 5 from the training data are most similar to the…
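The voting idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the RAPIDS cuML implementation the post refers to: the helper names `majority_vote` and `knn_predict`, the Euclidean distance metric, and the toy 2-D points are all hypothetical stand-ins for real digit images.

```python
import math
from collections import Counter


def majority_vote(labels):
    """Return the most common label among the neighbors."""
    label, _count = Counter(labels).most_common(1)[0]
    return label


def knn_predict(train_points, train_labels, query, k=7):
    """Predict a label for `query` from its k nearest training points.

    Uses brute-force Euclidean distance (an assumption; other metrics work too).
    Returns the predicted label and the labels of the k neighbors.
    """
    # Rank training indices by distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    neighbor_labels = [train_labels[i] for i in order[:k]]
    return majority_vote(neighbor_labels), neighbor_labels


# Toy data (hypothetical): five points labeled 3 and two labeled 5 lie close
# to the query; two points labeled 8 lie far away.
train_points = [
    (0.1, 0.0), (0.0, 0.2), (0.2, 0.1), (0.1, 0.3), (0.3, 0.0),
    (0.5, 0.5), (0.6, 0.4),
    (3.0, 3.0), (3.2, 2.9),
]
train_labels = [3, 3, 3, 3, 3, 5, 5, 8, 8]

prediction, neighbors = knn_predict(train_points, train_labels, (0.0, 0.0), k=7)
print(neighbors)   # [3, 3, 3, 3, 3, 5, 5]
print(prediction)  # 3
```

With seven neighbors voting 3, 3, 3, 3, 3, 5, 5, the majority label is 3, which matches the example in the text. A GPU library would replace the brute-force sort with a batched distance computation, but the voting step stays the same.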
]]>Recommender systems (RecSys) have become a key component in many online services, such as e-commerce, social media, news services, and online video streaming. However, with their growing importance, the growing scale of industry datasets, and more sophisticated models, the bar has been raised for the computational resources required by recommendation systems. To meet the computational demands…
]]>Kaggle is an online community that allows data scientists and machine learning engineers to find and publish data sets, learn, explore, build models, and collaborate with their peers. Members also enter competitions to solve data science challenges. Kaggle members earn the following medals for their achievements: Novice, Contributor, Expert, Master, and Grandmaster. The quality and quantity of…
]]>