Tree-ensemble models remain a go-to for tabular data because they’re accurate, comparatively inexpensive to train, and fast. But CPU-based Python inference quickly becomes the bottleneck once you need sub-10 ms latency or millions of predictions per second. The Forest Inference Library (FIL) first appeared in cuML 0.9 in 2019, and it has always been about one thing: blazing-fast…
The success of deep neural networks in multiple areas has prompted a great deal of thought and effort on how to deploy these models efficiently in real-world applications. However, efforts to accelerate the deployment of tree-based models (including random forest and gradient-boosted models) have received less attention, despite their continued dominance in tabular data analysis and their…